RE: [R] 3 little questions

2004-02-02 Thread Marc Schwartz
On Sun, 2004-02-01 at 20:22, Gabor Grothendieck wrote:
 See ?cor.test

I'll stand to be corrected here, but I do not believe that cor.test()
will work for Kendall's W, since it can only handle two variables.
Kendall's W is designed for = 3 variables as a generalization of
rho/tau and Friedman's test.

There are some suggestions I have seen that estimations of Kendall's W
can be done by using the average of multiple pairwise Kendall's tau
across the variables, with two formula variations depending upon whether
there is an even or odd number of variables AND in the case of no ties.

I have also seen a suggestion that the R^2 from a one-way ANOVA yields
an approximation of Kendall's W. In fact, the SAS macro I reference
below uses this approach. 

However, each of the aforementioned approaches can vary from W and so
carries various caveats under certain conditions.

I have not seen anything searching on r-help or CRAN for Kendall's W and
I have not coded it myself. However, I did find one reference on the
s-news list at:

http://www.biostat.wustl.edu/archives/html/s-news/2001-03/msg00197.html

and I also located a SAS Macro called %MAGREE at:

http://ewe3.sas.com/techsup/download/stat/magree.html

If you are familiar with SAS, that might be helpful as well.

Another reference which has various formulas and worked examples for W
is:

Nonparametric Measures of Association
Jean Dickson Gibbons
Sage, 1993

With respect to gamma, I have worked through a methodology to calculate
this and some other measures in R, that I have planned to add to the
CrossTable() function in the gregmisc package. Unfortunately, I have not
yet completed the coding for several of the measures and the associated
ASE's and p values due to lack of time.

I have done gamma however and the code is below, starting with the
functions to compute concordant and discordant pairs. These approaches
(using a cross-tabulation and matrix partitioning) can save a fair
amount of time, if the number of unique pairs in the data is
substantially less than the total number of pairs.

# Calculate CONcordant Pairs in a table
# cycle through x[r, c] and multiply by
# sum(x elements below and to the right of x[r, c])
# x = table
concordant - function(x)
{
  # get sum(matrix values  r AND  c)
  # for each matrix[r, c]
  mat.lr - function(r, c)
  { 
lr - x[(r.x  r)  (c.x  c)]
sum(lr)
  }

  # get row and column index for each
  # matrix element
  r.x - row(x)
  c.x - col(x)

  # return the sum of each matrix[r, c] * sums
  # using mapply to sequence thru each matrix[r, c]
  sum(x * mapply(mat.lr, r = r.x, c = c.x))
}

# Calculate DIScordant Pairs in a table
# cycle through x[r, c] and multiply by
# sum(x elements below and to the left of x[r, c])
# x = table
discordant - function(x)
{
  # get sum(matrix values  r AND  c)
  # for each matrix[r, c]
  mat.ll - function(r, c)
  { 
ll - x[(r.x  r)  (c.x  c)]
sum(ll)
  }

  # get row and column index for each
  # matrix element
  r.x - row(x)
  c.x - col(x)

  # return the sum of each matrix[r, c] * sums
  # using mapply to sequence thru each matrix[r, c]
  sum(x * mapply(mat.ll, r = r.x, c = c.x))
}


# Calculate Goodman-Kruskal gamma
# x = table
calc.gamma - function(x)
{
  c - concordant(x)
  d - discordant(x)

  gamma - (c - d) / (c + d)

  gamma
}

Here is an example of use. Keep in mind that x is the cross tabulation
of two vectors of measures, in this example, yielding a 3 x 3 table:

 x - matrix(c(70, 10, 27, 85, 134, 60, 15, 41, 100), ncol = 3)
 x
 [,1] [,2] [,3]
[1,]   70   85   15
[2,]   10  134   41
[3,]   27   60  100

 calc.gamma(x)
[1] 0.57045

If you have any questions on the above, let me know.

Hope this helps.

Marc Schwartz


On Sun, 2004-02-01 at 18:53, Siegfried.Macho wrote: 
 Dear R-helpers,
 
 3 questions:
 1. Is there a package that contains a routine for computing Kendall's W 
 (coefficient of concordance), with and without ties ?
 2. Is there a package that contains a routine for computing Goodman' s Gamma.
 3. I there a simple method for computing the number of ties as well as 
 their lengths within a vector fo ranks,
 e.g.
  r1 - rank(c(1, 3, 2, 3, 3, 2, 4))
 
 gives:
 
 [1] 1.0 5.0 2.5 5.0 5.0 2.5 7.0
 
 which contains 2 ties with length 2 and 3.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] CART: rapart vs bagging

2004-02-02 Thread Liaw, Andy
 From:  Qin Liu
 
 Hi, 
 
 Is here anyone knows the difference between rapart and 
 bagging when grow a
 CART tree? 

rpart produces one classification or regression tree.  Bagging is a general
technique for combining classifiers or regression models, usually trees,
from bootstrap samples.  The bagging() function in the `ipred' package just
loop over call to rpart().

Andy

 
 Thanks
 
 Qin
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Stepwise regression and PLS

2004-02-02 Thread Jinsong Zhao
Dear all,

I am a newcomer to R. I intend to using R to do stepwise regression and
PLS with a data set (a 55x20 matrix, with one dependent and 19
independent variable). Based on the same data set, I have done the same
work using SPSS and SAS. However, there is much difference between the
results obtained by R and SPSS or SAS.

In the case of stepwise, SPSS gave out a model with 4 independent
variable, but with step(), R gave out a model with 10 and much higher
R2. Furthermore, regsubsets() also indicate the 10 variable is one of
the best regression subset. How to explain this difference? And in the
case of my data set, how many variables that enter the model would be
reasonable?

In the case of PLS, the results of mvr function of pls.pcr package is
also different with that of SAS. Although the number of optimum latent
variables is same, the difference between R2 is much large. Why?

Any comment and suggestion is very appreciated. Thanks in advance!

Best wishes,

Jinsong Zhao

=
(Mr.) Jinsong Zhao
Ph.D. Candidate
School of the Environment
Nanjing University
No.22 Hankou Road, Najing 210093
P.R. China
E-mail: [EMAIL PROTECTED]

_

60
http://cn.rd.yahoo.com/mail_cn/tag/?http://cn.mail.yahoo.com

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ATTENZIONE: [La vostra email: MAIL TRANSACTION FAILED]

2004-02-02 Thread VaMailArmor
posta.powerhouse.it ha trovato ciò che segue nella email
da lei inviata:

Worm/MyDoom.A2 virus

La email non può essere inoltrata.

Il suo computer può essere infettato da un virus.


Mail-Info:
--8--
 Message-Id: [EMAIL PROTECTED]
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Date: Mon, 2 Feb 2004 09:16:49 +0100
 Subject: MAIL TRANSACTION FAILED
--8--

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] MATLAB to R

2004-02-02 Thread Bjørn-Helge Mevik
[EMAIL PROTECTED] writes:

 In MATLAB, I can write:

 for J=1:M
 Y(J+1)=Y(J)+ h * feval(f,T(J),Y(J));
 ...

 In R, I can write above as:

 for (J in 2:M)
 {
  y = y + h * f(t,y)
 ...
 }

Are you sure this gives the same result?  If Y and T in Matlab are
vectors, I believe

for (J in 1:M)
{
  y[J+1] - y[J] + h * f(tt[J], y[J])
  ...
}

is what you want.  (Don't use `t' as a variable; t() is the function
to transpose a matrix.)

 for J=1:M
 k1 = feval(f,T(J),Y(J));
 k2 = feval(f,T(J+1),Y(J)+ h * k1

I assume you mean k1(J) = ... and k2(J) = ...

 How do I write k2 in R?
 k1 = f(t,y)
 k2 = ?

## If f can take vector arguments:
k1 - f(tt[-M],y)
k2 - f(tt[-1], y+h*k1)
## Otherwise:
for (J in 1:M) {
  k1[J] - f(tt[J], y[J])
  k2[J] - f(tt[J+1], y[J] + h*k1[J])
}

-- 
Hth,
Bjørn-Helge Mevik

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] PLS discriminant analysis

2004-02-02 Thread xavière panhard
Hi everyone, 

we would like to perform a PLS discriminant analysis with R.
Does anyone knows if at least a PLS regression package is available?

Thanks by advance,

Xavière Panhard
University Hospital Bichat - Claude Bernard
Paris
[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] how to keep functions while remove all other commands

2004-02-02 Thread Simon Fear
As an alternative to Petr's approach of using separate
directories, one can also write a function (which I find quite
useful in its own right) to separate ls() output by mode, e.g.

lsd - function(splitby=mode, pos=1, ...) {
lsout - ls(pos=pos, ...)
tapply(lsout,
  sapply(lsout, function(x) {splitby(get(x))}),
  invisible)
}

Given that, you can use expressions such as

rm(list=lsd()$numeric)
rm(list=lsd(splitby=class, all.names=T)$integer)

or to remove functions,

rm(list=lsd()$function) # nb reserved word function needs quotes

this latter also removes `lsd` itself, so do also follow the advice
re keeping a copy of all your functions in a text file somewhere!

 -Original Message-
 On 24 Jan 2004 at 21:47, Yong Wang wrote:
 
  Dear all:
  a quick question:
  I am used to apply  rm(list=()) regularly to remove all old codes in
  preventing them creeping in current analysis.however, with that
  application, functions I wrote are also removed. please let me know
  how to keep the thing you want while remove those you don't.
  
 
Simon Fear 
Senior Statistician 
Syne qua non Ltd 
Tel: +44 (0) 1379 69 
Fax: +44 (0) 1379 65 
email: [EMAIL PROTECTED] 
web: http://www.synequanon.com 
  
Number of attachments included with this message: 0 
  
This message (and any associated files) is confidential and\...{{dropped}}

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ordering in dotplot

2004-02-02 Thread Luca De Benedictis
Dear R-friends,

the dataset I am using (data.it) is organized as follows

partnerstpbtpreg
hk0.641s
ger0.271d
tur0.271s
rom0.241s-f
por0.241s
spa0.231s
gre0.221d-f
aus0.171d
uk0.161s
be0.161d
arg0.151s
usa0.131d-f
fra0.131s
neth0.051s-f
pol0.051d
bra0.041s
ko0.0061s
un-0.0092d-f
svi-0.0222s-f
cin-0.0402   d
ru-0.0742d-f
mex-0.0772s
............
and plotting it using dotplot, I specified the script as:

library(lattice)
attach(data.ita)
dotplot(reg~stp | partner, data=data.ita, groups=btp,
   xlab=list(Data - 
it,cex=1.5),col=c(black,red),aspect=0.3,as.table=TRUE,xlim=c(-1,1))
detach(data.ita)

In the resulting plot the variable reg is ordered alphabetically. 
Instead, I would like the variable to be plotted in the following order: 
s, s-f, d, d-f.
How can I do it?

Many thanks
Luca
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ordering in dotplot

2004-02-02 Thread Chuck Cleland
Luca De Benedictis wrote:
the dataset I am using (data.it) is organized as follows
............
and plotting it using dotplot, I specified the script as:
library(lattice)
attach(data.ita)
dotplot(reg~stp | partner, data=data.ita, groups=btp,
   xlab=list(Data - 
it,cex=1.5),col=c(black,red),aspect=0.3,as.table=TRUE,xlim=c(-1,1))
detach(data.ita)

In the resulting plot the variable reg is ordered alphabetically. 
Instead, I would like the variable to be plotted in the following order: 
s, s-f, d, d-f.
Make that an ordered factor as follows:

data.it$reg - ordered(data.it$reg, levels=c(s, s-f, d, d-f))

hope this helps,

Chuck Cleland

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] array of variable length vectors

2004-02-02 Thread Giampiero Salvi
Hi,
I'd like to store N vectors of different lengths, and to be able to
access them with an index, and eventually free the memory for one
of them without modifying the indexes to the others.

In C this would be a vector of N pointers that point to memory cells
independently allocated.

For example

int *pv[3];

pv[0] = (int *) malloc(13 * sizeof(int));
pv[1] = (int *) malloc(7 * sizeof(int));
pv[2] = (int *) malloc(110 * sizeof(int));

free(pv[1])
...

What is the best data type (or class) in R to do such a thing?

Thank you!
Giampiero
_
Giampiero Salvi, M.Sc.  www.speech.kth.se/~giampi
Speech, Music and Hearing   Tel:  +46-8-790 75 62
Royal Institute of Technology   Fax:  +46-8-790 78 54
Drottning Kristinasv. 31,  SE-100 44,  Stockholm,  Sweden

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] array of variable length vectors

2004-02-02 Thread Barry Rowlingson
Giampiero Salvi wrote:
Hi,
I'd like to store N vectors of different lengths, and to be able to
access them with an index, and eventually free the memory for one
of them without modifying the indexes to the others.

int *pv[3];

pv[0] = (int *) malloc(13 * sizeof(int));
pv[1] = (int *) malloc(7 * sizeof(int));
pv[2] = (int *) malloc(110 * sizeof(int));
free(pv[1])
...
What is the best data type (or class) in R to do such a thing?
 A list, with vector elements (index starts at 1 in R):

 pv = list()
 pv[[1]] = real(13)
 pv[[2]] = real(7)
 pv[[3]] = real(110)
 then the equivalent of freeing the memory and keeping the indexing 
would be:

 pv[[2]] = real(0)

 and NOT

 pv[[2]] = NULL  (which deletes element 2)

 *BUT* I dont know if R will really free() the memory at that point. 
You may need to force the garbage collection with gc()

Baz

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Stepwise Regression and PLS

2004-02-02 Thread Peter Flom
Frank Harrell wrote

 I think you missed the point.  None of the variable
 selection procedures
 will provide results that have a fair probability of
 replicating in
 another sample.
 
 FH


And Jinsong Zhao answered

Do you mean different procedures will provide
different results? Maybe I don't understand your email
correctly. Now, I just hope I could get a reasonable
linear model using stepwise method in R, but I don't
know how to deal with collinear problem.


The problem is not with R, SAS, or SPSS, but with your desire to
produce a reasonable linear model using stepwise.  Stepwise does not,
in general, produce reasonable linear models, nor does it produce 
models that are generally replicable.

This issue has been discussed here in the past, but there have been
more extensive discussions on SAS-L, or in numerous statistics books,
including Dr. Harrell's excellent one.

HTH

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
www.peterflom.com
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] array of variable length vectors

2004-02-02 Thread Prof Brian Ripley
On Mon, 2 Feb 2004, Giampiero Salvi wrote:

 Hi,
 I'd like to store N vectors of different lengths, and to be able to
 access them with an index, and eventually free the memory for one
 of them without modifying the indexes to the others.
 
 In C this would be a vector of N pointers that point to memory cells
 independently allocated.
 
 For example
 
 int *pv[3];
 
 pv[0] = (int *) malloc(13 * sizeof(int));
 pv[1] = (int *) malloc(7 * sizeof(int));
 pv[2] = (int *) malloc(110 * sizeof(int));
 
 free(pv[1])
 ...
 
 What is the best data type (or class) in R to do such a thing?

Sounds like an R list.  However, in R you cannot free memory, but what 
you can do (carefully) is to change the list element to NULL and then 
memory will be salvaged at a future garbage collection.

z - vector(list, 3)
z[[1]] - integer(13)
z[[2]] - integer(7)
z[[3]] - integer(110)

then

z[1] - list(NULL)  # and not z[[1]] - NULL

will potentially release the memory allocated for the first element.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Stepwise Regression and PLS

2004-02-02 Thread Frank E Harrell Jr
On Sun, 1 Feb 2004 20:03:36 -0800 (PST)
Jinsong Zhao [EMAIL PROTECTED] wrote:

 
 --- Frank E Harrell Jr [EMAIL PROTECTED] wrote:
   
   For the case of stepwise regression, I have found
  that
   the subsets I got using regsubsets() are
  collinear.
   However, the variables in SPSS's result are not
   collinear. I wonder what I should do to get a same
  or
   better linear model.
  
  I think you missed the point.  None of the variable
  selection procedures
  will provide results that have a fair probability of
  replicating in
  another sample.
  
  FH
  ---
  Frank E Harrell Jr   Professor and Chair  
  School of Medicine
   Department of Biostatistics  
  Vanderbilt University
 
 Do you mean different procedures will provide
 different results? Maybe I don't understand your email
 correctly. Now, I just hope I could get a reasonable
 linear model using stepwise method in R, but I don't
 know how to deal with collinear problem.
 
 =
 (Mr.) Jinsong Zhao

No, I mean the SAME procedure will provide different results.  Use the
bootstrap, or use simulation to repeatedly sample from the same population
and the same true regression model.  You will see dramatically different
final models selected by same algorithm.  The algorithm is inherently
unstable unless perhaps you have a sample an order of magnitude larger
than the one you have.  See
http://www.pitt.edu/~wpilib/statfaq/regrfaq.html) which contains some good
references.

---
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] PLS discriminant analysis

2004-02-02 Thread Liaw, Andy
You could have at least look on CRAN, where you would have found pls.pcr,
which does PLS and PC regression.

As to using PLS for discrimination, you might want to look at the gpls
package in the BioConductor suite.

Andy

 From: xavière panhard
 
 Hi everyone, 
 
 we would like to perform a PLS discriminant analysis with R.
 Does anyone knows if at least a PLS regression package is available?
 
 Thanks by advance,
 
 Xavière Panhard
 University Hospital Bichat - Claude Bernard
 Paris


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] array of variable length vectors

2004-02-02 Thread Uwe Ligges
Giampiero Salvi wrote:

Hi,
I'd like to store N vectors of different lengths, and to be able to
access them with an index, and eventually free the memory for one
of them without modifying the indexes to the others.
In C this would be a vector of N pointers that point to memory cells
independently allocated.
For example

int *pv[3];

pv[0] = (int *) malloc(13 * sizeof(int));
pv[1] = (int *) malloc(7 * sizeof(int));
pv[2] = (int *) malloc(110 * sizeof(int));
free(pv[1])
...
What is the best data type (or class) in R to do such a thing?
See ?list

Uwe Ligges

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] for loops?

2004-02-02 Thread Catherine Stein

Hello R people!

How can one use a for loop (or something similar) in R?  As I type in each
line, I get syntax errors... I'm just confused how much to type in at each
 prompt.

Thanks for your help,
cathy



~~~
Catherine M. Stein
Research Assistant, Tuberculosis Research Unit
Doctoral Candidate in Genetic Epidemiology
Department of Epidemiology and Biostatistics
Case Western Reserve University
office: (216)368-0875 or (216)778-1378
e-mail: [EMAIL PROTECTED], or [EMAIL PROTECTED]
http://darwin.cwru.edu/~kasia

EPBI Student Resources Page:
http://hal.epbi.cwru.edu/stures/


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] for loops?

2004-02-02 Thread Pedro Rodrigues
On Mon, 2004-02-02 at 13:42, Catherine Stein wrote:
 Hello R people!
 
 How can one use a for loop (or something similar) in R?  As I type in each
 line, I get syntax errors... I'm just confused how much to type in at each
  prompt.
 
 Thanks for your help,
 cathy

Hello.

I believe you want something like:

for(i in 1:n){
...some lines here...
}

If you have only one line within the for loop you can use it without the
brackets.

If the environment is still open (as within the for loop) you will not
get a new  prompt but a + prompt to continue as if you were writing
on the same line.

Only when you close the brackets the prompt will return to  after
executing the for loop.

Hope this will help.

Pedro

-- 
---
   Pedro Pereira Rodrigues
   http://www.dcc.fc.up.pt/~prodrigues/

   Phone: +351 226078830 - Ext: 121
   Snail: Department of Computer Science
  Faculty of Sciences - University of Porto
  Rua do Campo Alegre, 823
  4150-180 Porto - Portugal

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] for loops?

2004-02-02 Thread Barry Rowlingson
Catherine Stein wrote:

How can one use a for loop (or something similar) in R?  As I type in each
line, I get syntax errors... I'm just confused how much to type in at each
 prompt.
 Have you read help(for) (you need to quote 'for' here to avoid a 
syntax error!)?

 If you'd shown us exactly what you'd typed we could probably help 
better. Suppose you want to loop from 1 to 10 and print it. You can do 
the following, where '' is the R prompt (dont type it):

  for(i in 1:10)print(i)
- ie all on one line
  for(i in 1:10){print(i)}
- with curly brackets
  for(i in 1:10)
 + print(i)
   - where '+' is the continuation prompt (dont type it) - R gives you 
this when it realises you havent written a complete expression yet.

  for(i in 1:10) {
 + print(i)
 + }
- curly brackets enclose as many expressions as you like inside the 
loop.

  for(i in 1:10)
 + { print(i)
 + }
- curly brackets anywhere. R works it out.

 You can even do, and I didn't think this would work...

  for(i in
 + 1:10)
 + {print(i)}
 So in short, I cant get it to give a syntax error :) What are you 
doing? What version of R, and what platform?

Baz

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] for loops?

2004-02-02 Thread Petr Pikal
Hi

On 2 Feb 2004 at 8:42, Catherine Stein wrote:

 
 Hello R people!
 
 How can one use a for loop (or something similar) in R?  As I type in
 each line, I get syntax errors... I'm just confused how much to type

Use curly braces
for (i in ...) {

your long commands
in several lines

}

An Introduction to R in doc directory is your best friend.

Cheers
Petr

 in at each  prompt.
 
 Thanks for your help,
 cathy
 
 
 
 ~~
 ~ Catherine M. Stein Research Assistant, Tuberculosis Research
 Unit Doctoral Candidate in Genetic Epidemiology Department of
 Epidemiology and Biostatistics Case Western Reserve University office:
 (216)368-0875 or (216)778-1378 e-mail: [EMAIL PROTECTED], or
 [EMAIL PROTECTED] http://darwin.cwru.edu/~kasia
 
 EPBI Student Resources Page:
 http://hal.epbi.cwru.edu/stures/
 ~~
 ~~
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html

Petr Pikal
[EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] axes in boxplots

2004-02-02 Thread David Andel
Hi

I am struggling with controlling the axes of boxplots. I need to produce 
two horizontal boxplots with the same x-axis for comparison purposes.
But when I generate a plot(c(-12,8), c(-6,1), type=n) first, then the 
following boxplot always overwrites it! I went into the code of 
boxplot.default, but even that without success.

Thanks for any hint!

David

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] filled contour + points

2004-02-02 Thread Ivailo Partchev
Hello

I have a small problem with filled contour plots. I'd like to plot point 
on top of that using points(). Trouble is, the x axis of the contour 
plot is modified to make room for the legend but points() is not aware 
of that. It could be easily tackled by using a linear transformation of 
x in points(), but does anyone know exactly  *what* transformation?

Kind regards

Ivailo Partchev

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] axes in boxplots

2004-02-02 Thread Uwe Ligges
David Andel wrote:

Hi

I am struggling with controlling the axes of boxplots. I need to produce 
two horizontal boxplots with the same x-axis for comparison purposes.
But when I generate a plot(c(-12,8), c(-6,1), type=n) first, then the 
following boxplot always overwrites it! I went into the code of 
boxplot.default, but even that without success.
	
Thanks for any hint!
 boxplot(dat1, dat2, horizontal=TRUE)
plots 2 parallel boxplots of dat1 and dat2 respectively.
Uwe Ligges

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Stepwise regression and PLS

2004-02-02 Thread Thomas Lumley
On Sun, 1 Feb 2004, [gb2312] Jinsong Zhao wrote:


 In the case of stepwise, SPSS gave out a model with 4 independent
 variable, but with step(), R gave out a model with 10 and much higher
 R2. Furthermore, regsubsets() also indicate the 10 variable is one of
 the best regression subset. How to explain this difference? And in the
 case of my data set, how many variables that enter the model would be
 reasonable?


Most likely because step() uses AIC and SPSS uses a p-value criterion, so
the models are `best' in different ways.   regsubsets() gives best models
of each size, so it doesn't address the 4 vs 10 issue.

This isn't what regsubsets() is intended for.  If you want a single model
for prediction, you need a method based on an honest estimate of
prediction error and if you want a single model to explain relationships
you need to think about relationships.

While people seem to want to use it for finding a single model,
the purpose of regsubsets() is to give you many models,  precisely as a
way around the problem of instability everyone else has pointed out.
Given a large number of models you can see what features
are common to them, or you can do a crude but reasonably effective
approximation to model averaging.


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] filled contour + points

2004-02-02 Thread Roger D. Peng
Rather than transform the points, you can use the 
`plot.axes' argument to filled.contour() and call points() 
from there.

-roger

Ivailo Partchev wrote:
Hello

I have a small problem with filled contour plots. I'd like to plot point 
on top of that using points(). Trouble is, the x axis of the contour 
plot is modified to make room for the legend but points() is not aware 
of that. It could be easily tackled by using a linear transformation of 
x in points(), but does anyone know exactly  *what* transformation?

Kind regards

Ivailo Partchev

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] problem building R on HPUX 11.23

2004-02-02 Thread Charles F. Fisher
When building the X11 modules under HPUX 11.23 I get the following errors

ld: Unsatisfied symbol Rf_isNull in file dataentry.lo
ld: Unsatisfied symbol Rf_length in file dataentry.lo
ld: Unsatisfied symbol Rf_warningcall in file devX11.lo
ld: Unsatisfied symbol UNIMPLEMENTED in file dataentry.lo
ld: Unsatisfied symbol R_alloc in file devX11.lo
ld: Unsatisfied symbol R_GlobalEnv in file dataentry.lo
ld: Unsatisfied symbol R_setX11Routines in file devX11.lo
ld: Unsatisfied symbol Rf_devNumber in file devX11.lo
ld: Unsatisfied symbol Rf_elt in file devX11.lo
ld: Unsatisfied symbol R_NaInt in file dataentry.lo

I assume I need to link in an R library built earlier; which library would 
I need?

Please reply directly, as I am not on the list.

Thanks
Chuck Fisher
[EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread cstrato
Dear all

Since I did not receive any answer to my general question (?),
let me ask a concrete question:
How can I fit the simple function y = a*sin(x)/b*x?

This is the code that I tried, but nls gives an error:

x - seq(1,10,0.1)
y - sin(x)/x
plot(x,y)
z - jitter(y,amount=0.1)
plot(x,z)
df - as.data.frame(cbind(x,z))
nf - nls(z ~ a*sin(x)/b*x, data=df,
  start=list(a=0.8,b=0.9), trace = TRUE)
I have followed the Puromycin sample which works fine:
Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
  start = list(Vm = 200, K = 0.1), trace = TRUE)
Do I make some mistake or is it not possible to fit sin(x)/x?

Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
cstrato wrote:
Dear R experts

This is a general question:
Does R have functions for nonlinear robust regression,
analogous to e.g. LTS?
Searching google I have found
1, an abstract to generalize LTS for nonlinear regression
models, see: http://smealsearch.psu.edu/1509.html
2, an AD-model builder, see: http://otter-rsch.com/admodel/cc1.html
but no mention of R/S
Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Unknown account

2004-02-02 Thread Eckerolinjen sys admin

The account is not in use. If you don't know the correct address, please send our mail 
to [EMAIL PROTECTED]  / With best regards, Rederi Ab Eckero

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread Spencer Graves
 You didn't tell us what the error message nls gave, but it should 
be obvious:  You can estimate either a or b in this model (or their 
ratio), but you can't estimate both.  See any good book on nonlinear 
regression, e.g., Bates and Watts (1988) Nonlinear Regression and Its 
Applications (Wiley). 

 With that simplification, the equation is linear. 

 However, I see another problem:  If x is 0, sin(x)/x is NaN.  
For a situation like this, I typically write a function to first compute 
sin(x)/x, then test for x being 0 or very small, and replace the value 
in such cases with a more accurate Taylor series or asymptotic expansion. 

 hope this helps. 
 spencer graves

cstrato wrote:

Dear all

Since I did not receive any answer to my general question (?),
let me ask a concrete question:
How can I fit the simple function y = a*sin(x)/b*x?

This is the code that I tried, but nls gives an error:

x - seq(1,10,0.1)
y - sin(x)/x
plot(x,y)
z - jitter(y,amount=0.1)
plot(x,z)
df - as.data.frame(cbind(x,z))
nf - nls(z ~ a*sin(x)/b*x, data=df,
  start=list(a=0.8,b=0.9), trace = TRUE)
I have followed the Puromycin sample which works fine:
Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
  start = list(Vm = 200, K = 0.1), trace = TRUE)
Do I make some mistake or is it not possible to fit sin(x)/x?

Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
cstrato wrote:

Dear R experts

This is a general question:
Does R have functions for nonlinear robust regression,
analogous to e.g. LTS?
Searching google I have found
1, an abstract to generalize LTS for nonlinear regression
models, see: http://smealsearch.psu.edu/1509.html
2, an AD-model builder, see: http://otter-rsch.com/admodel/cc1.html
but no mention of R/S
Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread Ravi Varadhan
You reall have only one parameter in your model, c = a/b. You can't 
identify both a and b from your model, therefore, you should fit the 
linear model:  lm(z ~ c* sin(x)/x)

Ravi.

- Original Message -
From: cstrato [EMAIL PROTECTED]
Date: Monday, February 2, 2004 2:28 pm
Subject: [R] Robust nonlinear regression - sin(x)/x?

 Dear all
 
 Since I did not receive any answer to my general question (?),
 let me ask a concrete question:
 
 How can I fit the simple function y = a*sin(x)/b*x?
 
 This is the code that I tried, but nls gives an error:
 
 x - seq(1,10,0.1)
 y - sin(x)/x
 plot(x,y)
 z - jitter(y,amount=0.1)
 plot(x,z)
 df - as.data.frame(cbind(x,z))
 nf - nls(z ~ a*sin(x)/b*x, data=df,
   start=list(a=0.8,b=0.9), trace = TRUE)
 
 I have followed the Puromycin sample which works fine:
 Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
   start = list(Vm = 200, K = 0.1), trace = TRUE)
 
 Do I make some mistake or is it not possible to fit sin(x)/x?
 
 Thank you in advance
 Best regards
 Christian
 _._._._._._._._._._._._._._._._
 C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
 V.i.e.n.n.a   A.u.s.t.r.i.a
 _._._._._._._._._._._._._._._._
 
 
 cstrato wrote:
  Dear R experts
  
  This is a general question:
  Does R have functions for nonlinear robust regression,
  analogous to e.g. LTS?
  
  Searching google I have found
  1, an abstract to generalize LTS for nonlinear regression
  models, see: http://smealsearch.psu.edu/1509.html
  2, an AD-model builder, see: http://otter-rsch.com/admodel/cc1.html
  but no mention of R/S
  
  Thank you in advance
  Best regards
  Christian
  _._._._._._._._._._._._._._._._
  C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
  V.i.e.n.n.a   A.u.s.t.r.i.a
  _._._._._._._._._._._._._._._._
  
  __
  [EMAIL PROTECTED] mailing list
  https://www.stat.math.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
  
 
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-
 guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread Peter Dalgaard
cstrato [EMAIL PROTECTED] writes:

 Dear all
 
 Since I did not receive any answer to my general question (?),
 let me ask a concrete question:
 
 How can I fit the simple function y = a*sin(x)/b*x?
 
 This is the code that I tried, but nls gives an error:
 
 x - seq(1,10,0.1)
 y - sin(x)/x
 plot(x,y)
 z - jitter(y,amount=0.1)
 plot(x,z)
 df - as.data.frame(cbind(x,z))
 nf - nls(z ~ a*sin(x)/b*x, data=df,
start=list(a=0.8,b=0.9), trace = TRUE)
 
 I have followed the Puromycin sample which works fine:
 Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
start = list(Vm = 200, K = 0.1), trace = TRUE)
 
 Do I make some mistake or is it not possible to fit sin(x)/x?

The expression only depends on a/b so you cannot estimate both.

Besides, you need to check up on operator precedence and
associativity: What you wrote is equivalent to a*sin(x)*x/b.

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] memory problem for R --Summary

2004-02-02 Thread Yun-Fang Juan
Thank you very much for the replies you have sent me regarding the memory
problem.
The following is the summary
(I tried to read all the messages through. I apologized if I overlooked your
message)

Cheers,

Yun-Fang

Backgrounds:
a. Data: 1million rows with 73 numeric attributes
b. Environment: R 1.7.1 on FreeBSD 4.3 with  2GB memory and double CPU
   Pentium III/Pentium III Xeon/Celeron
with  data seg size (kbytes) =1572864  limit

Suggested Solutions:
z. use SAS since SAS is not trying to read all the data into RAM.
a. random sampling from the large data set i.e. 10% of 1 million rows
(the option singular.ok=TRUE can be used in lm for singular matrice.)
b. use kalman filter with migration variance =0. ( see the dse package for
details)
c. add the following configuration: options(object.size=1e8)
   Results:  still OOM
d. if data is all numeric, add colClasses=numeric in read.table()
   Results: read.table read in the data successfully but I failed to access
the dataset after the loading
(even dataset[1:10,] didn't work)

- Original Message -
From: Liaw, Andy [EMAIL PROTECTED]
To: 'Yun-Fang Juan' [EMAIL PROTECTED]; Prof Brian Ripley
[EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, January 30, 2004 11:44 AM
Subject: RE: [R] memory problem for R


 You still have not read the posting guide, have you?

 See more below.

  From: Yun-Fang Juan

 [...]

  I tried 10% sample and it turned out the matrix became
  singular after I did that.
  Ther reason is some of the attributes only have zero values
  most of the time.
  The data i am using is web log data and after some
  transformation, they are all numeric.
  Can we specify some parameters in read.table so that the
  program will treat all the vars as numeric
  (with this context, hopefully that will reduce the memory
  consumption)  ?

 and you clearly have not read my (private) reply, either, in which I told
 you *exactly* how to do that, via the colClasses argument to read.table().

 Please take the help given to you seriously.  If you want attention, you
 have to pay attention.

 Andy

  thanks a lot,
 
  Yun-Fang


 --

 Notice:  This e-mail message, together with any attachments, contains
 information of Merck  Co., Inc. (One Merck Drive, Whitehouse Station, New
 Jersey, USA 08889), and/or its affiliates (which may be known outside the
 United States as Merck Frosst, Merck Sharp  Dohme or MSD and in Japan, as
 Banyu) that may be confidential, proprietary copyrighted and/or legally
 privileged. It is intended solely for the use of the individual or entity
 named on this message.  If you are not the intended recipient, and have
 received this message in error, please notify us immediately by reply
e-mail
 and then delete it from your system.
 --




__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] mvrnorm problem

2004-02-02 Thread Stuart V Jordan
I am trying to simulate draws from a multivariate normal using mvrnorm, and
am getting the following error message:
 
Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : 
non-conformable arrays

I do not understand why I am getting this message, since the vector of means
I am giving to the function is 13 by 1 and the variance matrix I am giving
to the function is 13 by 13...these matrices are therefore conformable,
right?
 
Is it possible that this is actually a problem with the
positive-definiteness of the variance matrix?  The matrix I am attempting to
use has eigenvalues that are close to zero (but positive).
 
Here's a transcript of the commands I've tried:
 
 mvrnorm(n = 1000,B,V)
Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : 
non-conformable arrays
 mvrnorm(n = 1000,t(B),V)
Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : 
non-conformable arrays

 V
   [,1]  [,2]  [,3] [,4]  [,5]
 [1,]  2.632324e+05 -1.286920e-01 -9.461830e-02  6.27173e-02 -2.625664e+00
 [2,] -1.286920e-01  1.164268e+00 -7.496500e-03  4.95040e-03 -1.557751e-01
 [3,] -9.461830e-02 -7.496500e-03  5.761000e-03 -6.66370e-03  6.366440e-02
 [4,]  6.271730e-02  4.950400e-03 -6.663700e-03  8.23150e-03 -6.252150e-02
 [5,] -2.625664e+00 -1.557751e-01  6.366440e-02 -6.25215e-02  1.753953e+02
 [6,]  1.017806e+01  8.797900e-02 -1.028916e+01  9.75542e+00 -8.614497e+03
 [7,] -1.286920e-01  1.164268e+00 -7.496500e-03  4.95040e-03 -1.557751e-01
 [8,]  3.93e-07  2.06e-09  1.15e-10 -1.39000e-10  5.55e-09
 [9,] -1.95e-07  3.68e-09 -2.05e-11  8.86000e-12 -6.11e-09
[10,]  4.53e-07  4.61e-10  9.25e-12 -1.5e-11  1.31e-08
[11,]  1.17e-06 -6.71e-09  6.74e-11 -1.68000e-10 -3.40e-09
[12,] -9.60e-07  3.97e-09 -9.52e-11  2.12000e-10 -3.43e-10
[13,] -4.97e-07 -2.07e-10 -3.10e-11  5.69000e-11 -2.27e-09
   [,6]  [,7] [,8] [,9][,10]
 [1,]  1.017806e+01 -1.286920e-01  3.93000e-07 -1.95000e-07  4.53000e-07
 [2,]  8.797900e-02  1.164268e+00  2.06000e-09  3.68000e-09  4.61000e-10
 [3,] -1.028916e+01 -7.496500e-03  1.15000e-10 -2.05000e-11  9.25000e-12
 [4,]  9.755420e+00  4.950400e-03 -1.39000e-10  8.86000e-12 -1.5e-11
 [5,] -8.614497e+03 -1.557751e-01  5.55000e-09 -6.11000e-09  1.31000e-08
 [6,]  4.980406e+05  8.797900e-02 -4.67000e-07  3.5e-07 -7.68000e-07
 [7,]  8.797900e-02  1.164268e+00  2.06000e-09  3.68000e-09  4.61000e-10
 [8,] -4.67e-07  2.06e-09  2.15049e-02  9.00950e-03  9.58560e-03
 [9,]  3.50e-07  3.68e-09  9.00950e-03  1.58849e-02  1.65230e-03
[10,] -7.68e-07  4.61e-10  9.58560e-03  1.65230e-03  1.53024e-02
[11,]  5.61e-07 -6.71e-09  4.26370e-03 -9.08500e-04  6.03300e-03
[12,] -2.72e-07  3.97e-09 -3.98220e-03  4.55100e-04 -5.55290e-03
[13,]  9.53e-08 -2.07e-10 -1.38413e-02 -9.42850e-03 -1.12002e-02
 [,11][,12][,13]
 [1,]  1.17000e-06 -9.6e-07 -4.97000e-07
 [2,] -6.71000e-09  3.97000e-09 -2.07000e-10
 [3,]  6.74000e-11 -9.52000e-11 -3.1e-11
 [4,] -1.68000e-10  2.12000e-10  5.69000e-11
 [5,] -3.4e-09 -3.43000e-10 -2.27000e-09
 [6,]  5.61000e-07 -2.72000e-07  9.53000e-08
 [7,] -6.71000e-09  3.97000e-09 -2.07000e-10
 [8,]  4.26370e-03 -3.98220e-03 -1.38413e-02
 [9,] -9.08500e-04  4.55100e-04 -9.42850e-03
[10,]  6.03300e-03 -5.55290e-03 -1.12002e-02
[11,]  9.29045e-02 -8.00488e-02 -2.21646e-02
[12,] -8.00488e-02  7.19209e-02  1.82813e-02
[13,] -2.21646e-02  1.82813e-02  1.76818e-02

 B
  [,1]  [,2] [,3]  [,4] [,5]  [,6]  [,7]
[1,] -22.02731 0.4982429 2.651566 -2.356899 31.65277 -3302.719 0.4982429
  [,8]   [,9] [,10][,11] [,12]  [,13]
[1,] 0.3675237 -0.2562356 0.4549983 4.901309 -4.464062 -0.0836918

 t(B)
   [,1]
 [1,]   -22.0273100
 [2,] 0.4982429
 [3,] 2.6515660
 [4,]-2.3568990
 [5,]31.6527700
 [6,] -3302.719
 [7,] 0.4982429
 [8,] 0.3675237
 [9,]-0.2562356
[10,] 0.4549983
[11,] 4.9013090
[12,]-4.4640620
[13,]-0.0836918

 
Thanks very much,
 
Stu Jordan
Princeton University

[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread Ravi Varadhan
A small correction to my previous email:
You actually specify the following call to lm:

y - sin(x)/x
lm(z ~ y - 1)

to make sure that the intercept is not estimated.

Ravi.

- Original Message -
From: Ravi Varadhan [EMAIL PROTECTED]
Date: Monday, February 2, 2004 2:46 pm
Subject: Re: [R] Robust nonlinear regression - sin(x)/x?

 You reall have only one parameter in your model, c = a/b. You 
 can't 
 identify both a and b from your model, therefore, you should fit 
 the 
 linear model:  lm(z ~ c* sin(x)/x)
 
 Ravi.
 
 - Original Message -
 From: cstrato [EMAIL PROTECTED]
 Date: Monday, February 2, 2004 2:28 pm
 Subject: [R] Robust nonlinear regression - sin(x)/x?
 
  Dear all
  
  Since I did not receive any answer to my general question (?),
  let me ask a concrete question:
  
  How can I fit the simple function y = a*sin(x)/b*x?
  
  This is the code that I tried, but nls gives an error:
  
  x - seq(1,10,0.1)
  y - sin(x)/x
  plot(x,y)
  z - jitter(y,amount=0.1)
  plot(x,z)
  df - as.data.frame(cbind(x,z))
  nf - nls(z ~ a*sin(x)/b*x, data=df,
start=list(a=0.8,b=0.9), trace = TRUE)
  
  I have followed the Puromycin sample which works fine:
  Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
start = list(Vm = 200, K = 0.1), trace = TRUE)
  
  Do I make some mistake or is it not possible to fit sin(x)/x?
  
  Thank you in advance
  Best regards
  Christian
  _._._._._._._._._._._._._._._._
  C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
  V.i.e.n.n.a   A.u.s.t.r.i.a
  _._._._._._._._._._._._._._._._
  
  
  cstrato wrote:
   Dear R experts
   
   This is a general question:
   Does R have functions for nonlinear robust regression,
   analogous to e.g. LTS?
   
   Searching google I have found
   1, an abstract to generalize LTS for nonlinear regression
   models, see: http://smealsearch.psu.edu/1509.html
   2, an AD-model builder, see: http://otter-
 rsch.com/admodel/cc1.html  but no mention of R/S
   
   Thank you in advance
   Best regards
   Christian
   _._._._._._._._._._._._._._._._
   C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
   V.i.e.n.n.a   A.u.s.t.r.i.a
   _._._._._._._._._._._._._._._._
   
   __
   [EMAIL PROTECTED] mailing list
   https://www.stat.math.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide! 
   http://www.R-project.org/posting-guide.html
   
  
  
  __
  [EMAIL PROTECTED] mailing list
  https://www.stat.math.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! http://www.R-
 project.org/posting-
  guide.html
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-
 guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ordering and plotting question

2004-02-02 Thread Simon Melov
Hi,
I am trying to plot several rows out of a list of thousands. I have 40 
columns, and about 16,000 rows with the following Df structure.

ID X01 X02 X03..X40
AI456 45 64 23...
AI943 14 3 45 ..
AI278 78 12 68..
BW768 -2 -7 34..
...
My question is, I have a list of 100 IDs generated elsewhere 
(Df-Ofinterest), I would like to plot the 100 IDs from that data 
frame over the 40 columns (40 points per ID, with each column being 1 x 
value). But I cant figure out how to retrieve the values from the main 
Df and plot them. Ive tried looking at order, rank etc, but these seem 
to apply to numbers, and not text strings.

thanks

Simon.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread cstrato
Dear all

Thank you very much this time for the fast response and
your many comments, and sorry for the stupid mistake.
1, The following gives the following error:
nf - nls(z ~ a*sin(x)/(b*x), data=df,
  start=list(a=0.8,b=0.9), trace = TRUE)
Error in nlsModel(formula, mf, start) : singular gradient matrix at 
initial parameter estimates

2, However, as Peter Dalgaard mentioned, the following
gives the correct result:
nf - nls(z ~ c*sin(x)/x, data=df,
  start=list(c=0.5), trace = TRUE)
2.113783 :  0.5
0.3187204 :  1.022591
3, Now to the question of robust nonlinear fitting:
Let me introduce some outliers:
z1 - z
z1[c(6,12,13,34,36,42,67,69,72,76)] - 
c(0.8,0.9,0.8,-0.5,-0.4,-0.6,0.5,0.6,0.8,0.7)
plot(x,z1)
df1 - as.data.frame(cbind(x,z1))

Now, the fit gives:
nf1 - nls(z1 ~ c*sin(x)/x, data=df1,
  start=list(c=0.5), trace = TRUE)
4.814774 :  0.5
2.072135 :  1.145962
The true result should be   c=1.0
fitting w/o outliers gives  c=1.023
fitting with outliers gives c=1.146
Can this fit considered to be robust?
Best regards
Christian
Peter Dalgaard wrote:

cstrato [EMAIL PROTECTED] writes:


Dear all

Since I did not receive any answer to my general question (?),
let me ask a concrete question:
How can I fit the simple function y = a*sin(x)/b*x?

This is the code that I tried, but nls gives an error:

x - seq(1,10,0.1)
y - sin(x)/x
plot(x,y)
z - jitter(y,amount=0.1)
plot(x,z)
df - as.data.frame(cbind(x,z))
nf - nls(z ~ a*sin(x)/b*x, data=df,
  start=list(a=0.8,b=0.9), trace = TRUE)
I have followed the Puromycin sample which works fine:
Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
  start = list(Vm = 200, K = 0.1), trace = TRUE)
Do I make some mistake or is it not possible to fit sin(x)/x?


The expression only depends on a/b so you cannot estimate both.

Besides, you need to check up on operator precedence and
associativity: What you wrote is equivalent to a*sin(x)*x/b.
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Nearest Neighbor Algorithm in R -- again.

2004-02-02 Thread Hans W Borchers
Several of the methods I use for analyzing large data sets, such as

 WinGamma: determining the level of noise in data
 Relief-F: estimating the influence of variables
depend on finding the k nearest neighbors of a point in a data frame or
matrix efficiently. (For large data sets it is not feasible to compute
the 'dist' matrix anyway.)
Seeing the proposed solution to [R] distance between two matrices
last month for finding _one_ nearest neighbor I came up with a solution
'nearest(A, n, k)' as appended.
Still, this is very clumsy and slow if you have to find the 3 nearest
neighbors for 1000 points in a data frame with 10 entries at least
-- about 2 secs per data point on my computer or half an hour for an
application from real life.
Are there better/faster ways to perform this task using R functions?
Even better, is there a free implementation of kd-trees that I could
utilize (the one I found did not conform to the ANSI C standard)?
Someome pointed to 'spdep' of the R-Sig-Geo project, but 'knearneigh'
only accepts 2D data points.
Hans Werner Borchers
ABB Corporate Research, Germany

require (class)
nearest - function (X, n, k=3)
##  Find k nearest neighbors of X[n, ] in the data frame
##  or matrix X, utilizing function knn1 k-times.
{
   N - nrow(X)
   # inds contains the indices of nearest neighbors
   inds - c(n); i - 0
   while (i  k) {
   # use knn1 for one index...
   j  - as.integer(knn1(X [-inds, ], X[n, ], 1:(N-length(inds
   # ...and change to true index of neighbor
   inds - c(inds, setdiff(1:N, inds)[j])
   i - i+1
   }
   # return nearest neighbor indices (without n, of course)
   return(inds[-1])
}
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] problems when compiling C code

2004-02-02 Thread Stephane DRAY
Hello,

I would like to call C code from R. My C code is divided in two files. In 
the file testpermut.c, I have the following lines:

#include adesub.h

In my working folder, I have the files:
- adesub.c which contains general functions
- adesub.h with the header of functions contained in adesub.c
- testpermut.c which call some functions defined in adesub.c
When I try to  create my dll (Work on Windows XP, R-1.8.1), I obtain error 
message:

$ Rcmd shlib testpermut.c
making adesub.d from adesub.c
making testpermut.d from testpermut.c
gcc   -Ic:/Rdev/R-1.8.1/src/include -Wall -O2   -c testpermut.c -o testpermut.o
ar cr testpermut.a *.o
ranlib testpermut.a
gcc  --shared -s  -o testpermut.dll testpermut.def 
testpermut.a  -Lc:/Rdev/R-1.8
.1/src/gnuwin32  -lg2c -lR
testpermut.a(testpermut.o.b)(.text+0x35):testpermut.c: undefined reference 
to `taballoc'
testpermut.a(testpermut.o.b)(.text+0x49):testpermut.c: undefined reference 
to `taballoc'
testpermut.a(testpermut.o.b)(.text+0x62):testpermut.c: undefined reference 
to `taballoc'
testpermut.a(testpermut.o.b)(.text+0x14c):testpermut.c: undefined reference 
to `freetab'
testpermut.a(testpermut.o.b)(.text+0x156):testpermut.c: undefined reference 
to `freetab'
testpermut.a(testpermut.o.b)(.text+0x160):testpermut.c: undefined reference 
to `freetab'
testpermut.a(testpermut.o.b)(.text+0x192):testpermut.c: undefined reference 
to `taballoc'
testpermut.a(testpermut.o.b)(.text+0x22b):testpermut.c: undefined reference 
to `taballoc'
testpermut.a(testpermut.o.b)(.text+0x23f):testpermut.c: undefined reference 
to `vecalloc'
testpermut.a(testpermut.o.b)(.text+0x24b):testpermut.c: undefined reference 
to `vecalloc'
testpermut.a(testpermut.o.b)(.text+0x339):testpermut.c: undefined reference 
to `freevec'
testpermut.a(testpermut.o.b)(.text+0x343):testpermut.c: undefined reference 
to `freevec'
testpermut.a(testpermut.o.b)(.text+0x34d):testpermut.c: undefined reference 
to `freetab'
testpermut.a(testpermut.o.b)(.text+0x412):testpermut.c: undefined reference 
to `taballoc'
make: *** [testpermut.dll] Error 1
$

The functions taballoc, freetab, vecalloc and freevec are defined in adesub 
files. So it seems that gcc does not make the links between my files. If I 
include the problematic functions in testpermut.c, gcc works perfectly and 
my dll is created.

Perhaps someone could explain me what is my problem although it is not an R 
problem but probably a misuse of gcc ?.

Thanks in advance.
Stéphane DRAY
-- 

Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada
Tel : 514 343 6111 poste 1233
E-mail : [EMAIL PROTECTED]
-- 

Web  http://www.steph280.freesurf.fr/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - sin(x)/x?

2004-02-02 Thread cstrato
Dear Ravi

Sorry, I forgot to mention that you have also  indicated
that I have only one parameter. Fitting using lm gives:
c=1.023 w/o and c=1.146 with outliers.
Maybe sin(x)/x was a bad example, how about trying to
fit a polynomial of degree n?
Best regards
Christian
Ravi Varadhan wrote:

A small correction to my previous email:
You actually specify the following call to lm:
y - sin(x)/x
lm(z ~ y - 1)
to make sure that the intercept is not estimated.

Ravi.

- Original Message -
From: Ravi Varadhan [EMAIL PROTECTED]
Date: Monday, February 2, 2004 2:46 pm
Subject: Re: [R] Robust nonlinear regression - sin(x)/x?

You reall have only one parameter in your model, c = a/b. You 
can't 
identify both a and b from your model, therefore, you should fit 
the 
linear model:  lm(z ~ c* sin(x)/x)

Ravi.

- Original Message -
From: cstrato [EMAIL PROTECTED]
Date: Monday, February 2, 2004 2:28 pm
Subject: [R] Robust nonlinear regression - sin(x)/x?

Dear all

Since I did not receive any answer to my general question (?),
let me ask a concrete question:
How can I fit the simple function y = a*sin(x)/b*x?

This is the code that I tried, but nls gives an error:

x - seq(1,10,0.1)
y - sin(x)/x
plot(x,y)
z - jitter(y,amount=0.1)
plot(x,z)
df - as.data.frame(cbind(x,z))
nf - nls(z ~ a*sin(x)/b*x, data=df,
 start=list(a=0.8,b=0.9), trace = TRUE)
I have followed the Puromycin sample which works fine:
Pur.wt - nls(rate ~ (Vm * conc)/(K + conc), data = Treated,
 start = list(Vm = 200, K = 0.1), trace = TRUE)
Do I make some mistake or is it not possible to fit sin(x)/x?

Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
cstrato wrote:

Dear R experts

This is a general question:
Does R have functions for nonlinear robust regression,
analogous to e.g. LTS?
Searching google I have found
1, an abstract to generalize LTS for nonlinear regression
models, see: http://smealsearch.psu.edu/1509.html
2, an AD-model builder, see: http://otter-
rsch.com/admodel/cc1.html  but no mention of R/S

Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-
project.org/posting-

guide.html
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-
guide.html




__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] mvrnorm problem

2004-02-02 Thread Peter Dalgaard
Stuart V Jordan [EMAIL PROTECTED] writes:

  mvrnorm(n = 1000,B,V)
 Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : 
 non-conformable arrays
  mvrnorm(n = 1000,t(B),V)
 Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) : 
 non-conformable arrays

You might, for at least two good reasons, have said that this is from
library(MASS). The point is that

 mvrnorm(n=10,matrix(c(1,1),1,2),diag(2))
Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) :
non-conformable arrays
 mvrnorm(n=10,matrix(c(1,1),2,1),diag(2))
Error in mu + eS$vectors %*% diag(sqrt(pmax(ev, 0)), p) %*% t(X) :
non-conformable arrays
 mvrnorm(n=10,c(1,1),diag(2))
[,1]   [,2]
 [1,]  0.5005327  1.1919216
 [2,]  2.8273925  2.7004788
 [3,]  2.6493970  1.1304274


and the docs quite clearly say that mu wants to be a vector, not a
matrix. 

Curiously enough, this works with rmvnorm from the mvtnorm package by
Genz, Bretz, and Hothorn, the difference being that this version adds
in the means with a sweep() operation, whereas mvrnorm just adds mu
(to the transpose of the ultimate result) and relies on recycling
rules. I.e. the point is that

 x - matrix(1:2,1,2)
 M - matrix(1:4,2)
 x+M
Error in x + M : non-conformable arrays
 t(x)+M
Error in t(x) + M : non-conformable arrays
 c(x)+M
 [,1] [,2]
[1,]24
[2,]46
 sweep(M,1,x,+)
 [,1] [,2]
[1,]24
[2,]46


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] sorting by date

2004-02-02 Thread Jeff Jorgensen
Hello,

I have set up a data.frame and one of the columns contains a date of the 
form (with slashes as separators):

mm/dd/

I would like to use formulas on other columns in the data.frame organized 
by date, for example:

tapply(var1, sort(date), mean)

However, when I try sort(date) it sorts based on the first two entries in 
the date field:

9/1/20019/1/20029/1/20039/2/2001 ...
5.6 7.5 6.4 7.0 ...
Instead of:

9/1/20019/2/20019/3/20019/4/2001 ...
5.6 6.1 7.2 6.8 ...
I would greatly appreciate any help in sorting chronologically.  Do I need 
to create separate columns for month, day, and year, and then use order() 
and then stipulate the hierarchy for which to sort the output?  Or, is 
there some other more efficient way?

Thanks,

Jeff

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Robust nonlinear regression - better example

2004-02-02 Thread cstrato
Dear all

Here is a hopefully better example with regards to
nonlinear robust fitting:
# fitting a polynomial:
x - seq(-10,10,0.2)
y - 10*x + 4*x*x - 2*x*x*x
plot(x,y)
z - jitter(y,amount=300)
plot(x,z)
df - as.data.frame(cbind(x,z))
nf - nls(z ~ a*x + b*x*x + c*x*x*x, data=df,
+   start=list(a=4,b=2,c=1), trace = TRUE)
127697531 :  4 2 1
2974480 :  10.972123  3.793426 -1.942278
# introducing outliers before fitting the  polynomial:
z1 - z
z1[c(16,22,23,34,36,42,67,69,72,76)] -
+ c(2000,1900,2000,1900,1600,1600,500,-2000,-1700,-1800)
plot(x,z1)
df1 - as.data.frame(cbind(x,z1))
nf1 - nls(z1 ~ a*x + b*x*x + c*x*x*x, data=df1,
+   start=list(a=4,b=2,c=1), trace = TRUE)
159359174 :  4 2 1
24098548 :  -59.053288   4.169518  -1.072027
# plotting the results:
y1 - 10.97*x + 3.79*x*x - 1.94*x*x*x
y2 - -59.05*x + 4.17*x*x - 1.07*x*x*x
oldpar - par(pty=s,mfrow=c(2,2),mar=c(5,5,4,1))
plot(x,y)
plot(x,z1)
plot(x,y1)
plot(x,y2)
par(oldpar)
In my opinion this fit could hardly be considered
to be robust.
Are there functions in R which can do robust fitting?
(Sorrowly, at the moment I could not test the package
nlrq mentioned by Roger Koenker)
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Nearest Neighbor Algorithm in R -- again.

2004-02-02 Thread Prof Brian Ripley
Why not modify the C code of knn? (Not knn1)

On Mon, 2 Feb 2004, Hans W Borchers wrote:

 Several of the methods I use for analyzing large data sets, such as
 
   WinGamma: determining the level of noise in data
   Relief-F: estimating the influence of variables
 
 depend on finding the k nearest neighbors of a point in a data frame or
 matrix efficiently. (For large data sets it is not feasible to compute
 the 'dist' matrix anyway.)
 
 Seeing the proposed solution to [R] distance between two matrices
 last month for finding _one_ nearest neighbor I came up with a solution
 'nearest(A, n, k)' as appended.
 
 Still, this is very clumsy and slow if you have to find the 3 nearest
 neighbors for 1000 points in a data frame with 10 entries at least
 -- about 2 secs per data point on my computer or half an hour for an
 application from real life.
 
 Are there better/faster ways to perform this task using R functions?
 Even better, is there a free implementation of kd-trees that I could
 utilize (the one I found did not conform to the ANSI C standard)?
 
 Someome pointed to 'spdep' of the R-Sig-Geo project, but 'knearneigh'
 only accepts 2D data points.
 
 Hans Werner Borchers
 ABB Corporate Research, Germany
 
 
 require (class)
 nearest - function (X, n, k=3)
 ##  Find k nearest neighbors of X[n, ] in the data frame
 ##  or matrix X, utilizing function knn1 k-times.
 {
 N - nrow(X)
 # inds contains the indices of nearest neighbors
 inds - c(n); i - 0
 while (i  k) {
 # use knn1 for one index...
 j  - as.integer(knn1(X [-inds, ], X[n, ], 1:(N-length(inds
 # ...and change to true index of neighbor
 inds - c(inds, setdiff(1:N, inds)[j])
 i - i+1
 }
 # return nearest neighbor indices (without n, of course)
 return(inds[-1])
 }
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ordering and plotting question

2004-02-02 Thread Marc Schwartz
On Mon, 2004-02-02 at 14:13, Simon Melov wrote:
 Hi,
 I am trying to plot several rows out of a list of thousands. I have 40 
 columns, and about 16,000 rows with the following Df structure.
 
 ID X01 X02 X03..X40
 AI456 45 64 23...
 AI943 14 3 45 ..
 AI278 78 12 68..
 BW768 -2 -7 34..
 ...
 
 My question is, I have a list of 100 IDs generated elsewhere 
 (Df-Ofinterest), I would like to plot the 100 IDs from that data 
 frame over the 40 columns (40 points per ID, with each column being 1 x 
 value). But I cant figure out how to retrieve the values from the main 
 Df and plot them. Ive tried looking at order, rank etc, but these seem 
 to apply to numbers, and not text strings.
 
 thanks
 
 Simon.


I am not entirely sure what type of plot you want, however the following
will enable you to subset the main dataframe, based upon matching ID's:

# Create an example vector of the ID's you want
df.index - c(AI456, AI278, BW768)

 df.index
[1] AI456 AI278 BW768
 
# Create a df with the subset of data you have above
df - data.frame(ID = c(AI456, AI943, AI278, BW768),
 X01 = c(45, 14, 78, -2),
 X02 = c(64, 3, 12, -7),
 X03 = c(23, 45, 68, 34))

 df
 ID X01 X02 X03
1 AI456  45  64  23
2 AI943  14   3  45
3 AI278  78  12  68
4 BW768  -2  -7  34

# Use which() and %in% to get a vector containing the
# row indices in df that match the three entries in df.index
found - which(df$ID %in% df.index)

 found
[1] 1 3 4

# Now subset df to include only those rows that match
MySubset - df[found, ]

 MySubset
 ID X01 X02 X03
1 AI456  45  64  23
3 AI278  78  12  68
4 BW768  -2  -7  34


MySubset now contains only those rows that match based upon your IDs in
the index vector.

If you know how to generate the plot you want from here, have at it. If
not, let me know what you wish to do and I can help further.

See ?which and ?%in% for more information. Also, see ?subset for
additional ways to subset dataframes using more complex logicals.

HTH,

Marc Schwartz

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sorting by date

2004-02-02 Thread Prof Brian Ripley
Convert to POSIXct and sort.

Note that tapply will coerce to a factor, so you need to create a factor 
with the levels sorted as you want them: just sorting date will not help.
Something like

udate - unique(date)
lev - udate[sort.list(as.POSIXct(strptime(udate, %m/%d/%Y)))]
date - factor(date, levels=lev)


On Mon, 2 Feb 2004, Jeff Jorgensen wrote:

 I have set up a data.frame and one of the columns contains a date of the 
 form (with slashes as separators):
 
 mm/dd/
 
 I would like to use formulas on other columns in the data.frame organized 
 by date, for example:
 
 tapply(var1, sort(date), mean)
 
 However, when I try sort(date) it sorts based on the first two entries in 
 the date field:
 
 9/1/2001  9/1/20029/1/20039/2/2001 ...
 5.6   7.5 6.4 7.0 ...
 
 Instead of:
 
 9/1/2001  9/2/20019/3/20019/4/2001 ...
 5.6   6.1 7.2 6.8 ...
 
 I would greatly appreciate any help in sorting chronologically.  Do I need 
 to create separate columns for month, day, and year, and then use order() 
 and then stipulate the hierarchy for which to sort the output?  Or, is 
 there some other more efficient way?

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sorting by date

2004-02-02 Thread Peter Dalgaard
Jeff Jorgensen [EMAIL PROTECTED] writes:

 Hello,
 
 I have set up a data.frame and one of the columns contains a date of
 the form (with slashes as separators):
 
 mm/dd/
 
 I would like to use formulas on other columns in the data.frame
 organized by date, for example:
 
 tapply(var1, sort(date), mean)

I don't think that does what I think you think it does!


 
 However, when I try sort(date) it sorts based on the first two entries
 in the date field:
 
 9/1/2001  9/1/20029/1/20039/2/2001 ...
 5.6   7.5 6.4 7.0 ...
 
 Instead of:
 
 9/1/2001  9/2/20019/3/20019/4/2001 ...
 5.6   6.1 7.2 6.8 ...
 
 I would greatly appreciate any help in sorting chronologically.  Do I
 need to create separate columns for month, day, and year, and then use
 order() and then stipulate the hierarchy for which to sort the output?
 Or, is there some other more efficient way?

You now know why the ISO standard has -mm-dd ... 

It's a bit awkward, but I think you need something like

pdate - as.POSIXct(strptime(date,%m/%d/%Y))
tapply(var1, pdate, mean)

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Robust nonlinear regression - better example

2004-02-02 Thread Spencer Graves
 1.  The question of linear vs. nonlinear means linear in the 
parameters to be estimated.  All the examples you have given so far are 
linear in the parameters to be estimated.  The fact that they are 
nonlinear in x is immaterial. 

 2.  With this hint and the posting guide 
http://www.R-project.org/posting-guide.html;, you may find more 
information.  A search there exposed much discussion of robust 
regression and even robust nonlinear regression, if you actually 
still need that.  In addition, I found useful information on robust 
regression in Venables and Ripley (2002) Modern Applied Statistics with 
S, 4th ed. (Springer). 

 hope this helps. 
 spencer graves

cstrato wrote:

Dear all

Here is a hopefully better example with regards to
nonlinear robust fitting:
# fitting a polynomial:
x - seq(-10,10,0.2)
y - 10*x + 4*x*x - 2*x*x*x
plot(x,y)
z - jitter(y,amount=300)
plot(x,z)
df - as.data.frame(cbind(x,z))
nf - nls(z ~ a*x + b*x*x + c*x*x*x, data=df,
+   start=list(a=4,b=2,c=1), trace = TRUE)
127697531 :  4 2 1
2974480 :  10.972123  3.793426 -1.942278
# introducing outliers before fitting the  polynomial:
z1 - z
z1[c(16,22,23,34,36,42,67,69,72,76)] -
+ c(2000,1900,2000,1900,1600,1600,500,-2000,-1700,-1800)
plot(x,z1)
df1 - as.data.frame(cbind(x,z1))
nf1 - nls(z1 ~ a*x + b*x*x + c*x*x*x, data=df1,
+   start=list(a=4,b=2,c=1), trace = TRUE)
159359174 :  4 2 1
24098548 :  -59.053288   4.169518  -1.072027
# plotting the results:
y1 - 10.97*x + 3.79*x*x - 1.94*x*x*x
y2 - -59.05*x + 4.17*x*x - 1.07*x*x*x
oldpar - par(pty=s,mfrow=c(2,2),mar=c(5,5,4,1))
plot(x,y)
plot(x,z1)
plot(x,y1)
plot(x,y2)
par(oldpar)
In my opinion this fit could hardly be considered
to be robust.
Are there functions in R which can do robust fitting?
(Sorrowly, at the moment I could not test the package
nlrq mentioned by Roger Koenker)
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] problems when compiling C code

2004-02-02 Thread Prof Brian Ripley
On Mon, 2 Feb 2004, Stephane DRAY wrote:

 Hello,
 
 I would like to call C code from R. My C code is divided in two files. In 
 the file testpermut.c, I have the following lines:
 
 #include adesub.h
 
 In my working folder, I have the files:
 - adesub.c which contains general functions
 - adesub.h with the header of functions contained in adesub.c
 - testpermut.c which call some functions defined in adesub.c
 
 When I try to  create my dll (Work on Windows XP, R-1.8.1), I obtain error 
 message:
 
 $ Rcmd shlib testpermut.c

You only included one of the files. You need

Rcmd SHLIB testpermut.c adesub.c

I believe.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] problems when compiling C code

2004-02-02 Thread DivineSAAM
Hello,

Do you have S Programming by Venables and Ripley, Springer, 2000?

There is an excellent discussion and examples there on compiling multifile codes.

I think your problem is in the order of the compilation of the multiple files.

Best,
/oal

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Re: packages

2004-02-02 Thread Barbara Bailey
Hi,

I am trying to make my own package of R functions, datasets and help 
files, which were originally in S and have been converted. As a Unix 
user, I am trying to make a package that installs on Windows and I am 
having some trouble. 

I have a zip file that seems to unzip fine, but I have to use the source 
command to make functions accessible.  

e.g. source(C:/Program Files/R/rw1081/library/slab/R/slab.q)

Is there a command that make happen? or is the .q extention causing trouble?

It turns out that the data() command works fine, but help files are not 
accessible either. 

Thanks for your help,
Barb Bailey

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] glm.poisson.disp versus glm.nb

2004-02-02 Thread Hanke, Alex
Dear list,
This is a question about overdispersion and the ML estimates of the
parameters returned by the glm.poisson.disp (L. Scrucca) and glm.nb
(Venables and Ripley) functions. Both appear to assume a negative binomial
distribution for the response variable.

Paul and Banerjee (1998) developed C(alpha) tests for interaction and main
effects, in an unbalanced two-way layout of counts involving two fixed
factors, when data are Poisson distributed, and when data are extra
dispersed. In R I coded their C(alpha) test statistic (called TNBI) for
interaction for the case where the counts are extra-dispersed, as well as
their test for extra-dispersion (called T_a). Using the Quine data set
(Quine, 1975) the authors collapse the orginal 4x2x2x2 study design into a
2x4 table and obtained estimates of TNBI=10.36 and T_a=90.81.
Using the dispersion estimate from glm.poisson.disp and the estimates for mu
I got exactly the same results for TNBI and T_a. This made me happy. Now I
thought to try the ML estimates from glm.nb to see if the results would be
the same but I am having difficulty relating the dispersion phi from
glm.poisson.disp to theta estimated by glm.nb.
According to the R help for glm.poisson.disp  Var(y_i) =  mu_i(1+mu_i*phi)
. The help for glm.nb lead me to a book by VR (1994) which indicates that
Var(y)=mu+mu^2/theta. From this I gathered that phi=1/theta but the
estimates do not relate to each other in this way unless one is in error. In
a document by L.P. Ammann he says a negative binomial model can be
specified with mean mu and dispersion phi by taking theta=mu/(phi-1). I had
a problem implementing this because in my mind mu is a vector whereas phi
and theta are scalars.

Consequently, I would like to know  how to get phi from theta so that I can
compare the glm.poisson.disp and glm.nb methods for estimating dispersion. 

Regards, Alex




Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] how to label plots?

2004-02-02 Thread David Andel
Hi

With main=Title I can write centered above the plot.
Is there also a way to write into the left (or right) upper corner?
I'd like to label my plots by (a), (b), ...
Thanks,
David
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Re: packages

2004-02-02 Thread Dirk Eddelbuettel
On Mon, Feb 02, 2004 at 05:35:12PM -0600, Barbara Bailey wrote:
 I am trying to make my own package of R functions, datasets and help 
 files, which were originally in S and have been converted. As a Unix 
 user, I am trying to make a package that installs on Windows and I am 
 having some trouble. 

AFAIK that is not guaranteed to work. Generally speaking, for any given
system, binary R packages (as opposed to source packages) are built on that
architecture.

There are exceptions, most notably the cross-builds for Windows i386 that
can be generated, given a suitable environment, on a Linux system. This is
documented in a few places; google should find it.

For the normal case, your best bet is to read some more of the fine 'Writing
R Extensions' manual, and to possibly study some of the examples provided by
the over 300 source packages available on CRAN.

Hth, Dirk

-- 
The relationship between the computed price and reality is as yet unknown.  
 -- From the pac(8) manual page

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] output from multcomp and lm

2004-02-02 Thread Hiroto Miyoshi
Dear R-users
(B
(BI analysed the same data set by two different ways;
(Banalysis of covariance by using lm and anova functions
(Band multiple comparison by using simtest function in
(Bthe multcomp library.
(B
(BThe output from the analysis of covariance is;
(B
(B   y-lm(D~Cond+Q1,data=x)
(B anova(y)
(BAnalysis of Variance Table
(B
(BResponse: D
(B Df Sum Sq Mean Sq F valuePr(F)
(BCond2 1017.8   508.9   4.7548  0.0135041 *
(BQ1   1 1652.7  1652.7 15.4417  0.0002969 ***
(BResiduals 44 4709.2   107.0
(B---
(BSignif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(B
(Bwhere Cond is a factor with three levels (A,B,C)
(Band Q1 is a covariate.
(B
(B
(B
(BNow, simtest showed the following output
(B
(B   o5-summary(simtest(D~Cond+Q1,conf.level=0.95,data=x,type="Tukey"))
(B   o5
(B
(B  Simultaneous tests: Tukey contrasts
(B
(BCall:
(Bsimtest.formula(formula = D ~ Cond + Q1, data = x, conf.level = 0.95,
(Btype = "Tukey")
(B
(B  Tukey contrasts for factor Cond, covariable:  Q1
(B
(BContrast matrix:
(B  CondA CondB CondC
(BCondB-CondA 0-1 1 0 0
(BCondC-CondA 0-1 0 1 0
(BCondC-CondB 0 0-1 1 0
(B
(B
(BAbsolute Error Tolerance:  0.001
(B
(BCoefficients:
(BEstimate t value Std.Err. p raw  p Bonf  p adj
(BCondB-CondA5.555  -1.4613.802  0.151   0.453   0.319
(BCondC-CondB   -5.248  -1.3653.661 0.179   0.453   0.319
(BCondC-CondA0.306  -0.0843.844  0.934   0.934   0.934
(B
(BThe results from two analyses seem so different that I am
(Bwondering why.  I do understand that multiple comparison may
(Bnot show any significant difference even when the overall analysis
(Bof (co)variance shows the statistical significance of a factor.
(B
(BHowever, in my analysis, overall analysis showed statistical significance of
(B1.4% level and mutiple comparison showed significance of 32% level
(BCould this happen? and why?  Please enlighten me.
(B
(BSincerely
(B
(BHiroto Miyoshi
$B;09%90?M(B
(B[EMAIL PROTECTED]
(B
(B__
(B[EMAIL PROTECTED] mailing list
(Bhttps://www.stat.math.ethz.ch/mailman/listinfo/r-help
(BPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] how to label plots?

2004-02-02 Thread Marc Schwartz
On Mon, 2004-02-02 at 19:28, Marc Schwartz wrote:
 You can play with the outer margin settings for more space if you wish
 and the 'adj' argument in mtext() manipulates the positioning of the
 text at the x,y coordinate of line, at.


Oops...That last line should read:

text at the x,y coordinate of at, line.

Sorry.

Marc

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to label plots?

2004-02-02 Thread Marc Schwartz
On Mon, 2004-02-02 at 17:25, David Andel wrote:
 Hi
 
 With main=Title I can write centered above the plot.
 Is there also a way to write into the left (or right) upper corner?
 I'd like to label my plots by (a), (b), ...
 
 Thanks,
 David


# Create a basic plot
plot(1:5)

# Now set the outer plot margin to 1 line at the top
# and 0 for bottom, left and right
# See ?par for more information
par(oma = c(0, 0, 1, 0))

# Now use mtext() to place text in the outer margin
# The outer margin coordinates go from 0 to 1
# See ?mtext for more information

#For the upper left hand corner, use the following:
mtext((a), side = 3, line = 0, at = 0, outer = TRUE, adj = 0)

#For the upper right hand corner, use the following:
mtext((b), side = 3, line = 0, at = 1, outer = TRUE, adj = 1)


You can play with the outer margin settings for more space if you wish
and the 'adj' argument in mtext() manipulates the positioning of the
text at the x,y coordinate of line, at.

HTH,

Marc Schwartz

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] running R from PHP

2004-02-02 Thread brook
I would like to construct a PHP script that runs R to generate a
graphics file.  Running R itself is no problem.  However, it seems
impossible to instantiate one of the graphics devices to create
output.  For example, the normal bitmap devices (e.g., jpeg, png,
etc.) are derived from X11, which requires a display.  This seems
true, even if no output is ever directed to a real display.  For some
reason, the postscript device seems to suffer from similar problems.

Is there a trick to creating a graphics device in the absence of an
actual display in order to create an image in a file?

Thanks for your help.

Cheers,
Brook

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] running R from PHP

2004-02-02 Thread Philippe Glaziou
@biology.nmsu.edu [EMAIL PROTECTED] wrote:
 I would like to construct a PHP script that runs R to generate a
 graphics file.  Running R itself is no problem.  However, it seems
 impossible to instantiate one of the graphics devices to create
 output.  For example, the normal bitmap devices (e.g., jpeg, png,
 etc.) are derived from X11, which requires a display.  This seems
 true, even if no output is ever directed to a real display.  For some
 reason, the postscript device seems to suffer from similar problems.
 
 Is there a trick to creating a graphics device in the absence of an
 actual display in order to create an image in a file?


If you need a bitmap graphic file, I would suggest the use of
ImageMagick:


cunegonde:~/tmp ls
foo

cunegonde:~/tmp cat foo
pdf(file=g.pdf)
plot(1:5)
dev.off()

cunegonde:~/tmp R --no-save foo/dev/null  convert g.pdf g.png

cunegonde:~/tmp ls -g
-rw---1 glaziou  3374 2004-02-03 11:58 g.pdf
-rw---1 glaziou  4115 2004-02-03 11:58 g.png
-rw---1 glaziou80 2004-02-03 11:58 foo


This works from a unix console without X running (the postcript
device works similarly on my machine). R can easily be fed this
way with a file and parameters passed from a php script.

-- 
Philippe Glaziou, MD
Epidemiologist
Institut Pasteur du Cambodge

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Nearest Neighbor Algorithm in R -- again.

2004-02-02 Thread Hans W Borchers
/Why not modify the C code of knn? (Not knn1)/
Because I thought it would not be a good idea to apply 'knn' a 1000
times or more.
Instead I was looking for a procedure that returns a structure making
it easy to find nearest neighbors for any point one wants. This way I
used 'dist' for smaller data sets.
The next step could be to extent nearest neighbors to data frames with
factors generalizing, for example, the 'daisy' function.
But you might be right that looking at the 'knn' implementation will
be a faster road for the moment.
Hans Werner Borchers
ABB Corporate Research, Germany
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html