Re: [R] data storage/cubes and pointers in R

2006-11-16 Thread Berton Gunter
Perhaps I do not understand, but the array (?array) and manipulation of
array objects and their components are a fundamental paradigm of R. How is
this not **exactly** what you want to do? Perhaps a specific reproducible
example might be informative...


Bert Gunter
Nonclinical Statistics
7-7374

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jens Scheidtmann
Sent: Thursday, November 16, 2006 12:35 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] data storage/cubes and pointers in R

Piet van Remortel <[EMAIL PROTECTED]> writes:

> Hi all,
>
[...]
> Intuitively, I would like to be able to slice the data in a 'data- 
> cube' kind of way to query, analyze, cluster, fit etc., which  
> resembles the database data-cube way of thinking common in de db  
> world these days. ( http://en.wikipedia.org/wiki/Data_cube )
>
> I have no knowledge of a package that supports such things in an  
> elegant way within R.  If this exists, please point me to it.
[...]

If non-R systems are an option for you, please have a look at PALO
http://www.palo.net/ or Mondrian http://sourceforge.net/projects/mondrian

Writing an interface to one of these systems may be easier than
implementing it in R...
 
HTH,

Jens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] how to get empty sequence for certain bounds

2006-11-15 Thread Berton Gunter

... seq(a, b, length = ifelse(a <= b, b - a + 1, 0))

seq(a, b, length = max(0, b - a + 1))

seems a bit simpler and more transparent.
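A quick illustration of the behaviour (a sketch; the argument's full name is length.out):

```r
## Return a..b, or a zero-length sequence when the bounds cross
emptyseq <- function(a, b) seq(a, b, length.out = max(0, b - a + 1))

emptyseq(1, 5)  # 1 2 3 4 5
emptyseq(5, 1)  # zero-length result, unlike 5:1 (which counts down)
```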

-- Bert

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404



Re: [R] the secret (?) language of lists

2006-11-15 Thread Berton Gunter
From c()'s Help docs:

"The default method combines its arguments to form a vector. 
The output type is determined from the highest type of the components in the
hierarchy NULL < raw < logical < integer < real < complex < character < list
< expression. "

Perhaps this could be clearer, but I read this as saying that if M is a
matrix, c(M) should give a vector whose type is that of M. Peter D's
idiom is perhaps too clever or arcane, but it is nevertheless documented.

As usual, Brian Ripley's comment is pertinent: as.vector() makes the silent
coercion explicit.
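A minimal illustration of the point under discussion (not from the original thread):

```r
m <- matrix(1:6, nrow = 2)
c(m)          # 1 2 3 4 5 6: dim attribute dropped, column-major order
as.vector(m)  # same result, with the coercion made explicit
```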
 


Bert Gunter
Nonclinical Statistics
7-7374

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 14, 2006 4:18 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Subject: Re: [R] the secret (?) language of lists

Berton Gunter wrote:

> - c() removes attributes
> 
> ** Documented in c()'s Help file

***REALLY*** ???  Where?  The only use of the word ``attribute'' in
the help file is in ``See Also: 'unlist' and 'as.vector' to produce
attribute-free vectors.''

Unless you already knew the fact that when m is a matrix c(m)
gives a vector strung out in column order, you'd be very unlikely to
discern that fact from the help file for c().  The only hint is that
the output is described as being a vector.  This is not exactly
spoon-feeding the user.  (Especially in view of the fact that a
matrix ***is*** a vector --- with a dim attribute.  Or so you
keep telling us.)

The help file talks about catenation (which is the essential role of
c()) and expounds at length about the type of the result.  The user
is going to think in terms of c(v1,v2,v3), etc., where v1, v2 and v3
are ``genuine'' vectors; it would never occur to him or her to
apply c() to a matrix.

There is no way on God's green earth that even the most diligent of
neophytes is going to work out that c() has the effect on matrices
that it does.  EVEN IF the poor neophyte is so sophisticated as to
have absorbed the idea that a matrix is a vector with a dim
attribute.  Which is perhaps a neat trick in designing data
structures but is not, to put it mildly, intuitive.

All too often it is useful to RTFM only if you already know the
answer to your question.

cheers,

Rolf Turner
[EMAIL PROTECTED]



Re: [R] the secret (?) language of lists

2006-11-14 Thread Berton Gunter
Peter Dalgaard wrote:

There are a few basic principles in play here; once you know them, the
rest follows. I'm not sure exactly where they are documented, but I'd
guess at Venables & Ripley's books (MASS, S Programming) at least, the
"Blue Book" on S, and possibly others.

The first suggestion requires that you know about the following:

- matrices are vectors with dim attributes, stored column-major

** This is clearly documented in An Introduction to R

- binding rows into matrices with rbind()

** Documented in rbind's Help file.

- c() removes attributes

** Documented in c()'s Help file

The second one requires

- rep function, and its each=

** Documented in rep's Help file

- vector recycling in arithmetic
** Documented in An Introduction to R and many Help files
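For readers following along, minimal examples of those two pieces (a sketch, not from the thread):

```r
rep(1:3, each = 2)  # 1 1 2 2 3 3
1:6 + c(0, 10)      # recycling: 1 12 3 14 5 16
```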

So, in fact, one does not need to go so far as MASS or S Programming -- or
heaven forfend! -- the Blue Book. Indeed I learned about all of this by just
reading the basic stuff before I even knew about these other (excellent and
valuable, I grant) resources. Of course, cleverness in using R's
well-documented capabilities is never guaranteed, but it is important to
recognize that one need not hunt for the docs: they're where they should be.


Which, I regret to say, leads me to echo Brian Ripley's pungent plea:

install.packages("fortunes"); library(fortunes); fortune("WTFM")

Cheers,
Bert Gunter 

 
> I am reminded of quote by Byron Ellis: "Contrary to popular belief  
> the speed of R's interpreter is rarely the limiting factor to R's  
> speed. People treating R like C is typically the limiting factor. You  
> have vector operations, USE THEM."  Not exactly the point, but close.
> 
> Thanks!
> 
> Jeff Spies
> http://www.nd.edu/~jspies/
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Missing data analysis in R

2006-10-31 Thread Berton Gunter
Have a look at J.L. Schafer: ANALYSIS OF INCOMPLETE MULTIVARIATE DATA (1997).

R already has several packages that deal with missing data imputation, I
believe. Search around (please excuse my laziness).

I don't know if you can get exactly what you want (an R/S-centric text),
though.


Bert Gunter
Nonclinical Statistics
7-7374

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Inman, Brant A. M.D.
Sent: Tuesday, October 31, 2006 9:18 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Missing data analysis in R


I am looking for a book that discusses the theory of multiple imputation
(and other methods of dealing with missing data) and, just as
importantly, how to implement these methods in R or S-Plus.  Ideally,
the book would have a structure similar to Faraway (Regression),
Pinheiro&Bates (Mixed Effects) and Wood (GAMs) and would be very modern
(i.e. published within the last couple of years).  

Any ideas?  If such a book does not exist, one of the experts on this
help list should write it! (I will gladly buy the first copy.)

Brant Inman
Mayo Clinic

[[alternative HTML version deleted]]



Re: [R] Bug in lowess

2006-10-13 Thread Berton Gunter
Folks:

This interesting discussion brings up an issue of what I have referred to for
some time as "safe statistics," by which I mean:

(Usually, but not necessarily, automated) statistical procedures that are
guaranteed to give either

(a) a "reasonable" answer; or
(b) no answer, with "useful" error messages emitted when possible.

All standard least squares procedures taught in basic statistics courses are
examples (from many different perspectives) of unsafe statistics.
Robustness/resistance clearly takes us some way along this path but, as is
clear from the discussion, not the whole way. The reasons I think this
is important are:

a) Based on my own profound ignorance/limitations, I think it's impossible
to expect those who need to use many different kinds of sophisticated
statistical analyses to understand enough of the technical details to be
able to actively and effectively guide their appropriate use when this
requires such guidance (e.g., least squares with extensive diagnostics;
overfitting in nonlinear regression);

b) The explosion of large, complex data in all disciplines, which
**requires** some sort of automated analysis (e.g., microarray data?).

Having said this, it is unclear to me even **whether** "safe statistics" is a
meaningful concept: can it ever be -- at all? But I believe one thing is
clear: a lot of people devote a lot of labor to "optimal" procedures that
are far too sensitive to the manifold peculiarities of real data to give
reliable, trustworthy results in practice without considerable expert
coaxing. We at least need a greater variety of less optimal but safer data
analysis procedures. R -- or rather its many contributors -- seems to me to
be the exception in recognizing and doing something about this.

And as a humble example of what I mean: I like simple running medians of
generally small span for "smoothing" sequential data (please don't waste
time giving me counterexamples of why this is bad or how it can go wrong --
I know there are many).
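As a concrete sketch of that humble example, using base R's runmed() on made-up data:

```r
set.seed(1)
y <- sin(seq(0, 4 * pi, length.out = 100)) + rnorm(100, sd = 0.2)
y[30] <- 10             # one gross outlier
sm <- runmed(y, k = 5)  # span-5 running medians; the spike is largely ignored
```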

I would appreciate anyone else's thoughts on this, pro or con, perhaps
privately rather than on the list if you view this as too far off-topic.

(NOTE: To be clear, these are my personal views, not those of my company or
colleagues.)

My best regards to all,

Bert

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
Sent: Friday, October 13, 2006 5:51 AM
To: Prof Brian Ripley
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Bug in lowess

Prof Brian Ripley wrote:
> Frank Harrell wrote:
> 
> [...]
> 
>> Thank you Brian.  It seems that no matter what is the right answer, the
>> answer currently returned on my system is clearly wrong.  lowess()$y
>> should be constrained to be within range(y).
> 
> Really?  That assertion is offered without proof and I am pretty sure is 
> incorrect.  Consider
> 
>> x <- c(1:10, 20)
>> y <- c(1:10, 5) + 0.01*rnorm(11)
>> lowess(x,y)
> $x
>  [1]  1  2  3  4  5  6  7  8  9 10 20
> 
> $y
>  [1]  0.9983192  1.9969599  2.9960805  3.9948224  4.9944158  5.9959855
>  [7]  6.9964400  7.9981434  8.9990607 10.0002567 19.9946117
> 
> Remember that lowess is a local *linear* fitting method, and may give 
> zero weight to some data points, so (as here) it can extrapolate.

Brian - thanks - that's a good example though not typical of the kind I 
see from patients.

> 
> After reading what src/appl/lowess.doc says should happen with zero 
> weights, I think the answer given on Frank's system probably is the 
> correct one.  Rounding error is determining which of the last two points 
> is given zero robustness weight: on a i686 system both of the last two 
> are, and on mine only the last is. As far as I can tell in 
> infinite-precision arithmetic both would be zero, and hence the value at 
> x=120 would be found by extrapolation from those (far) to the left of it.
> 
> I am inclined to think that the best course of action is to quit with a 
> warning when the MAD of the residuals is effectively zero.  However, we 
> need to be careful not to call things 'bugs' that we do not understand 
> well enough.  This might be a design error in lowess, but it is not 
> AFAICS a bug in the implementation.

Yes it appears to be a weakness in the underlying algorithm.

Thanks
Frank



Re: [R] Row comparisons to a new matrix?

2006-10-06 Thread Berton Gunter
?dist 


Bert Gunter
Nonclinical Statistics
7-7374

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Atte Tenkanen
Sent: Friday, October 06, 2006 1:54 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Row comparisons to a new matrix?

Hi,
Can somebody tell me, which is the fastest way to make comparisons between
all rows in a matrix (here A) and put the results to the new symmetric
matrix? I have here used cosine distance as an example, but the comparison
function can be any other, euclidean dist etc.

A <- rbind(c(2, 3), c(4, 5), c(-1, 2), c(5, 6))

## cosine() is not in base R; define it (or use, e.g., lsa::cosine)
cosine <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

n <- nrow(A)
M <- matrix(nrow = n, ncol = n)

for (i in 1:n) {
  for (j in 1:n) {
    M[i, j] <- cosine(A[i, ], A[j, ])
  }
}

Atte Tenkanen
University of Turku, Finland
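?dist covers the distance case directly; for the cosine comparison in the question, the double loop can also be replaced by a single matrix product (a sketch, not from the original posts):

```r
A <- rbind(c(2, 3), c(4, 5), c(-1, 2), c(5, 6))

## Euclidean distances between all rows:
D <- as.matrix(dist(A))

## Cosine similarity of all row pairs, vectorized:
norms <- sqrt(rowSums(A^2))
M <- (A %*% t(A)) / outer(norms, norms)
```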



Re: [R] glm with nesting

2006-10-05 Thread Berton Gunter
Jeffrey:

Please... May I repeat what Peter Dalgaard already said: consult a local
statistician. The structure of your study is sufficiently complicated that
your stat 101 training is inadequate. Get professional help, which this list
is not set up to provide (though it often does, through the good offices and
patience of many wise contributors).

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Jeffrey Stratford
> Sent: Thursday, October 05, 2006 7:46 AM
> To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
> Subject: Re: [R] glm with nesting
> 
> Harold and list,
> 
> I've changed a few things since the last time so I'm really starting
> from scratch.  
> 
> I start with
> 
> bbmale <- read.csv("c:\\eabl\\2004\\feathers\\male_feathers2.csv",
> header=TRUE)
> box <- factor(bbmale$box)
> chick <- factor(bbmale$chick)
> 
> Here's a sample of the data
> 
> box,chick,julian,cltchsz,mrtot,cuv,cblue,purbank,purban2,purban1,pgrassk,pgrass2,pgrass1,grassdist,grasspatchk
> 1,2,141,2,21.72290152,0.305723811,0.327178868,0.003813435,0.02684564,0.06896552,0.3282487,0.6845638,0.7586207,0,3.73
> 4,1,164,4,18.87699007,0.281863299,0.310935559,0.06072162,0.2080537,0.06896552,0.01936052,0,0,323.1099,0.2284615
> 4,2,164,4,19.64359348,0.294117388,0.316049817,0.06072162,0.2080537,0.06896552,0.01936052,0,0,323.1099,0.2284615
> 7,1,118,4,13.48699876,0.303649408,0.31765218,0.3807568,0.4362416,0.6896552,0.06864183,0.03355705,0,94.86833,0.468
> 12,1,180,4,21.42196378,0.289731361,0.317562294,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032
> 12,2,180,4,18.79487905,0.286052077,0.316367349,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032
> 12,3,180,4,12.83127682,0.260197475,0.292636914,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032
> 15,1,138,4,20.07161467,0.287632782,0.318671887,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
> 15,2,138,4,17.61146256,0.305581768,0.315848051,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
> 15,3,138,4,20.36397134,0.271795667,0.30539683,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
> 15,4,138,4,20.81940158,0.269468041,0.304160648,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
> 
> As you can see I have multiple boxes (> 70).  Sometimes I 
> have multiple
> chicks per box each having their own response  to mrtot, cuv, 
> and cblue
> but the same landscape variables for that box.  Chick number 
> is randomly
> assigned and is not an effect I'm interested in.  I'm really not
> interested in the box effect either.  I would like to know if 
> landscape
> affects the color of chicks (which may be tied into chick
> health/physiology).  We also know that chicks get bluer as the season
> progresses and that clutch size (cltchsz) has an effect so 
> I'm including
> that as covariates.  
> 
> Hopefully, this clears things up a bit. 
> 
> I do have the MASS and Mixed-Effects Models in S (Pinheiro's) texts in hand. 
> 
> Many thanks,
> 
> Jeff
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] fractional part

2006-09-28 Thread Berton Gunter

Note:

I should have said that for negative values this may have to be adjusted,
depending on how you define "fractional part," since -1.25 %% 1 is 0.75.

-- Bert Gunter 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Hans-Peter
> Sent: Thursday, September 28, 2006 8:40 AM
> To: R Help
> Subject: [R] fractional part
> 
> Hi all,
> 
> is there a function to get the fractional part of a number?
> 
> -- 
> Regards,
> Hans-Peter
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] fractional part

2006-09-28 Thread Berton Gunter
?"%%"

X%%1

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
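A quick check of the two common idioms (a sketch):

```r
x <- c(3.75, -1.25)
x %% 1        # 0.75 0.75  -- the modulo result is always in [0, 1)
x - trunc(x)  # 0.75 -0.25 -- keeps the sign of x
```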



> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Hans-Peter
> Sent: Thursday, September 28, 2006 8:40 AM
> To: R Help
> Subject: [R] fractional part
> 
> Hi all,
> 
> is there a function to get the fractional part of a number?
> 
> -- 
> Regards,
> Hans-Peter
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] "summarry.lm" and NA values

2006-09-21 Thread Berton Gunter
?vcov

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: r user [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, September 21, 2006 2:11 PM
> To: Berton Gunter; 'rhelp'
> Subject: "summarry.lm" and NA values
> 
> Gentlemen,
> 
> (I am using R 2.2.1 in a Windows environment.)
> 
> I apologize but I did not fully comprehend all of your
> answer.  I have a dataframe called "data1".  I run
> several liner regression using the lm function similar
> to:
> 
> reg <- ( lm(lm(data1[,2] ~., data1[,2:4])) )
> 
> 
> I see from generous answers below how I can use 
> "coef(reg)" to extract the coefficient estimates.  (If
> the coefficient for a variable is for some reason NA,
> "coef(reg)"  returns  NA for that coefficient, which
> is what I want.)
> 
> My question: 
> What is the best way to get the standard errors,
> including NA values that "go with" each of these
> coefficient estimates?  (i.e. If the coefficient
> estimate is NA, I similarly want the standard error to
> come back as NA, so that the length of coef(reg) is
> the same as the length of the vector that contains the
> standard errors. )
> 
> Thanks very much for all your help, and I apologize
> for my need of additional assistance.
> 
> 
> 
> 
> 
> 
> --- Berton Gunter <[EMAIL PROTECTED]> wrote:
> 
> > "Is there a way to..." always has the answer "yes"
> > in R (or C or any
> > language for that matter). The question is: "Is
> > there a GOOD way...?" where
> > "good" depends on the specifics of the situation. So
> > after that polemic,
> > below is an effort to answer, (adding to what Petr
> > Pikal already said):
> > 
> > -- Bert Gunter
> > Genentech Non-Clinical Statistics
> > South San Francisco, CA
> >  
> > "The business of the statistician is to catalyze the
> > scientific learning
> > process."  - George E. P. Box
> >  
> >  
> > 
> > > -Original Message-
> > > From: [EMAIL PROTECTED] 
> > > [mailto:[EMAIL PROTECTED] On
> > Behalf Of r user
> > > Sent: Tuesday, August 15, 2006 7:01 AM
> > > To: rhelp
> > > Subject: [R] question re: "summarry.lm" and NA
> > values
> > > 
> > > Is there a way to get the following code to
> > include
> > > NA values where the coefficients are "NA"?
> > > 
> > > ((summary(reg))$coefficients)
> > BAAAD! Don't so this. Use the extractor on the
> > object: coef(reg) 
> > This suggests that you haven't read the
> > documentation carefully, which tends
> > to arouse the ire of would-be helpers.
> > 
> > > 
> > > explanation:
> > > 
> > > Using a loop, I am running regressions on several
> > > "subsets" of "data1".
> > > 
> > > "reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )"
> > ??? There's an error here I think. Do you mean
> > update()? Do you have your
> > subscripting correct?
> > 
> > > 
> > > My regression has 10 independent variables, and I
> > > therefore expect 11 coefficients.
> > > After each regression, I wish to save the
> > coefficients
> > > and standard errors of the coefficients in a table
> > > with 22 columns.
> > > 
> > > I successfully extract the coefficients using the
> > > following code:
> > > "reg$coefficients"
> > Use the extractor, coef()
> > 
> > > 
> > > I attempt to extract the standard errors using :
> > > 
> > > aperm((summary(reg))$coefficients)[2,]
> > 
> > BAAAD! Use the extractor vcov():
> > sqrt(diag(vcov(reg)))
> > > 
> > > ((summary(reg))$coefficients)
> > > 
> > > My problem:
> > > For some of my subsets, I am missing data for one
> > or
> > > more of the independent variables.  This of course
> > > causes the coefficients and standard erros for
> > this
> > > variable to be "NA".
> > Not it doesn't, as Petr said.
> > 
> > One possible approach: Assuming that a variable is
> > actually missing (all
> > NA's), note that coef(reg) is a named vector, so
> > that the character string
> > names of the regressors actually used are availab
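A sketch of the NA-padding approach implied above, using a built-in dataset with an artificially aliased term (names and data are illustrative only, not from the original question):

```r
## Fit with a perfectly collinear regressor so one coefficient is NA
d <- mtcars
d$wt2 <- 2 * d$wt                 # aliased with wt
fit <- lm(mpg ~ wt + wt2, data = d)

b <- coef(fit)                    # full length, NA for the aliased term
se <- rep(NA_real_, length(b))
names(se) <- names(b)
se[!is.na(b)] <- sqrt(diag(vcov(fit)))  # vcov() covers only estimated terms
```

This keeps length(se) equal to length(coef(fit)), with NA exactly where the coefficient is NA.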

Re: [R] how to ignore "NA" or replace it by another value

2006-09-21 Thread Berton Gunter
Most R functions have arguments of the form "na.action" or "na.rm" that
allow you to specify how you treat NA's. In general, it's not a good idea to
replace NA's with numbers.

See also ?na.omit, ?na.action.
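For example (a sketch):

```r
x <- c(1, NA, 3)
mean(x)                # NA
mean(x, na.rm = TRUE)  # 2
x[is.na(x)] <- 0       # explicit replacement, if you really must
```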



-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Preuth
> Sent: Thursday, September 21, 2006 10:59 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] how to ignore "NA" or replace it by another value
> 
> Hello,
> 
> I`m a newbie to R so maybe this question is boring, but I 
> have a large 
> table with several empty missing values, which come out as 
> "NA". How can 
> i ignore them or replace them by another number?
> 
> Greetings, Thomas
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Statitics Textbook - any recommendation?

2006-09-20 Thread Berton Gunter
Notwithstanding Prof. Heiberger's admirable enthusiasm, I think the
canonical answer is probably MASS (Modern Applied Statistics with S) by
Venables and Ripley. It is very comprehensive, but depending on your
background, you may find it too telegraphic.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Iuri Gavronski
> Sent: Wednesday, September 20, 2006 1:22 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Statitics Textbook - any recommendation?
> 
> I would like to buy a basic statistics book (experimental design,  
> sampling, ANOVA, regression, etc.) with examples in R. Or 
> download it  
> in PDF or html format.
> I went to the CRAN contributed documentation, but there were only R  
> textbooks, that is, textbooks where R is the focus, not the  
> statistics. And I would like to find the opposite.
> Other text I am trying to find is multivariate data analysis (EFA,  
> cluster, mult regression, MANOVA, etc.) with examples with R.
> Any recommendation?
> 
> Thank you in advance,
> 
> Iuri.
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] prediction interval for new value

2006-09-18 Thread Berton Gunter
Peter et al.:
> 
> With those definitions (which are hardly universal), tolerance
> intervals are the same as prediction intervals with k == m == 1, which
> is what R provides.
> 
>  
  
I don't believe this is the case. See also:

http://www.itl.nist.gov/div898/handbook/prc/section2/prc263.htm

This **is** fairly standard, I believe. For example, see the venerable
classic text (INTRO TO MATH STAT) by Hogg and Craig.

To be clear, since I may also be misinterpreting, what I understand/mean is:

Peter's definition of a "tolerance/prediction interval" is a random interval
that with a prespecified confidence contains a future predicted value;

The definition I understand is a random interval that with a prespecified
confidence will contain a prespecified proportion of the distribution of
future values, e.g. a "95%/90%" tolerance interval will with 95%
confidence contain 90% of future values (and one may well ask, "which
90%?").
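For reference, the k = m = 1 prediction interval that R provides in the regression setting can be obtained directly (a sketch on a built-in dataset):

```r
fit <- lm(dist ~ speed, data = cars)
predict(fit, newdata = data.frame(speed = 21),
        interval = "prediction", level = 0.95)
## returns columns fit, lwr, upr
```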

Whether this is a useful idea is another issue: the parametric version is
extremely sensitive (as one might imagine) to the assumption of exact
normality; the nonparametric version relies on order statistics and is more
robust. I believe it is nontrivial and perhaps ambiguous to extend the
concept from the usual fixed distribution to the linear regression case. I
seem to recall some papers on this, perhaps in JASA, in the past few years.

As always, I welcome correction of any errors or misunderstandings herein.

Cheers to all,

Bert Gunter



[R] FW: R Reference Card and other help (especially useful for Newbies)

2006-09-15 Thread Berton Gunter
 
Hi all: 

  
Newbies (and others!) may find useful the R Reference Card made available by

Tom Short of EPRI Solutions at http://www.rpad.org/Rpad/Rpad-refcard.pdf  or
through 
the "Contributed" link on CRAN (where some other reference cards are also 
linked). It categorizes and organizes a bunch of R's basic, most used 
functions so that they can be easily found. For example, paste() is under 
the "Strings" heading and expand.grid() is under "Data Creation." For 
newbies struggling to find the right R function as well as veterans who 
can't quite remember the function name, it's very handy. 

Also don't forget R's other Help facilities: 

help.search("keyword or phrase") to search the **installed** man pages 

RSiteSearch("keyword or phrase") to search the CRAN website via Jonathan
Baron's search engine. This can also be done directly from CRAN by following
the "search" link there.

And, occasionally, find()/apropos() to search the **attached** packages for
functions using regexps. 

Though R certainly can be intimidating, please **do** try these measures
first before posting questions to the list. And please **do** read the other
basic R reference materials. Better and faster answers can often be found
this way.

  
-- Bert Gunter 
Genentech Non-Clinical Statistics 
South San Francisco, CA 
  
"The business of the statistician is to catalyze the scientific learning 
process."  - George E. P. Box



Re: [R] inserting columns in the middle of a dataframe

2006-09-13 Thread Berton Gunter
Please folks -- use indexing.

myframe<-myframe[,c(1,5,2,3,4)]

Which raises the question: why bother rearranging the columns at all, since
you can get them used, printed, etc. in any order you wish, anytime you want,
just by specifying the indices in the order you want them. I suspect the
question was motivated by too much SAS- or Excel-ism.
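For instance (a throwaway data frame, names purely illustrative):

```r
# Reordering columns by indexing -- no rearranged copy needed unless you want one
myframe <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15)
myframe[, c(1, 5, 2, 3, 4)]    # columns in the order a, e, b, c, d
myframe[, c("a", "e", "b")]    # column names work as indices too
```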

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Timothy Bates
> Sent: Wednesday, September 13, 2006 3:05 PM
> To: Jon Minton; r-help@stat.math.ethz.ch
> Subject: Re: [R] inserting columns in the middle of a dataframe
> 
> 
> > Is there a built-in and simple way to insert new columns in 
> a dataframe?
> 
> You do this by collecting the columns in the new order you desire, and
> making a new frame.
> 
> oldframe   <- data.frame(matrix(0:14,ncol=3))
> newcol  <- data.frame(20:24)
> names(newcol) <- "newcol"
> newframe <- data.frame(c(oldframe[1],newcol, oldframe[2:3]))
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Access Rows in a Data Frame by Row Name

2006-09-13 Thread Berton Gunter
The answer is yes, you can access rows of a data.frame by rowname in the
same way as columns, which you could have found by merely trying it. Don't
overlook the value of a little experimentation as the fastest way to an
answer.
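For the archives, a minimal illustration (toy data, row names invented):

```r
# Row names index rows exactly as column names index columns
x <- data.frame(a = 1:3, b = 4:6, row.names = c("r1", "r2", "r3"))
x["r2", ]              # one row, selected by its row name
x[c("r1", "r3"), "b"]  # a vector of row names works too
```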

-- Bert Gunter
Genentech
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Tony Plate
> Sent: Wednesday, September 13, 2006 11:02 AM
> To: Michael Gormley
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Access Rows in a Data Frame by Row Name
> 
> Matrix-style indexing works for both columns and rows of data frames.
> 
> E.g.:
>  > x <- data.frame(a=1:5, b=6:10, d=11:15)
>  > x
>a  b  d
> 1 1  6 11
> 2 2  7 12
> 3 3  8 13
> 4 4  9 14
> 5 5 10 15
>  > x[2:4,c(1,3)]
>a  d
> 2 2 12
> 3 3 13
> 4 4 14
>  >
> 
> Time spend reading the help document "An Introduction to R" will 
> probably be well worth it.  The relevant sections are "5 Arrays and 
> matrices", and "6.3 Data frames".
> 
> -- Tony Plate
> 
> Michael Gormley wrote:
> > I have created a data frame using the read.table command.  
> I want to be able to access the rows by the row name, or a 
> vector of row names. I know that you can access columns by 
> using the data.frame.name$col.name.  Is there a way to access 
> row names in a similar manner?
> > 
> > [[alternative HTML version deleted]]
> > 
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] About truncated distribution

2006-09-12 Thread Berton Gunter
> 
> But my question is a bit different. What I know is the mean 
> and sd after truncation. If I assume the distribution is 
> normal, how I am gonna develope the original distribution 
> using this two parameters?

You can't, as they are plainly not sufficient (you need to know the amount
of truncation also). If you have only the mean and sd, and neither the actual
data nor the truncation point, you're out of luck.
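To make that concrete: *if* the truncation point were known in addition to
the data, the original parameters could be estimated by maximum likelihood.
A rough sketch (all names hypothetical, assuming left-truncation at a known
point a, i.e. only x > a are observed):

```r
# Negative log-likelihood of N(mu, s) data observed only when x > a;
# par = c(mu, log(s)) so the sd stays positive during optimization
negll <- function(par, x, a) {
  mu <- par[1]; s <- exp(par[2])
  -sum(dnorm(x, mu, s, log = TRUE) -
       pnorm(a, mu, s, lower.tail = FALSE, log.p = TRUE))
}
# e.g.:  optim(c(mean(x), log(sd(x))), negll, x = x, a = a)
```

Without the raw data and the truncation point, none of this is available --
which is the point of the reply.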

-- Bert Gunter
Genentech


> Could anybody give me some advice? 
> Thanks in advance!
> 
> Jen
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Successive Graphs

2006-09-12 Thread Berton Gunter
... or use lattice and splom() instead. If the successive graphs bear some
relationship to each other, this might produce a more useful display, too.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Michael Prager
> Sent: Monday, September 11, 2006 2:38 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Successive Graphs
> 
> [EMAIL PROTECTED] wrote:
> 
> > Hello! I have written an R script on a Windows platform where I
> > calculate eight result matrices I plot using matplot. I 
> would like to
> > display the resulting plots successively, rather than 
> simultaneously,
> > and I was wondering if anyone could point me in the right 
> direction as
> > to how to do this. The graphs pop up in this manner by 
> default when I
> > run my script in S-PLUS, with tabs separating them so I can 
> view each
> > graph at my leisure. However when I run my script in R, 
> each graph pops
> > up only for a moment before it is replaced by the next 
> until I am left
> > with only the plot of the eighth matrix at the end of the 
> script. Thanks
> > in advance for your help!
> 
> Others have pointed out the R plot history mechanism, which is
> very nice.  A few additions.
> 
> If you are re-running the script often and want to get rid of
> old windows, you can put near the top of your script
> 
> graphics.off()
> 
> You can open a graphics window -- with history enabled -- with
> 
> windows(record=TRUE)
> 
> Plot history is saved.  If desired, you can clear the old
> history before making new plots with
> 
> .SavedPlots <- NULL
> 
> 
> Mike Prager
> Southeast Fisheries Science Center, NOAA
> Beaufort, North Carolina  USA
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] help: advice on the structuring of ReML models foranalysing growth curves

2006-09-05 Thread Berton Gunter

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Robinson
> Sent: Tuesday, September 05, 2006 7:25 AM
> To: Simon Pickett
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] help: advice on the structuring of ReML 
> models foranalysing growth curves
> 
> Hi Simon,
> 
> overall I think that lmer is a good tool for this problem.  It's
> impossible to reply definitively without the full details on the
> experimental design.
> 
> Caveat in place, I have questions and some suggestions.  Are
> treatment1 and treatment2 distinct factors, or two levels of a
> treatment, the dietary compound?  Also, what is broodsize?
> 
> If you want to nest chick id within brood, I think that you should
> include the interaction as a random factor.  If you'd like the age
> effects to differ between chicks then age should be on the left of id.
> 
> Thus, start with something like ...
> 
> model1 <- lmer(weight ~ treatment +  broodsize + sex + age
>+ (1|brood) + (age|id:brood), data=H) 


FWIW, this model can also easily be fit with the lme() function (in the nlme
package), as the random effects are strictly nested. The only advantage in
doing so is that the lme tools for examining the model are somewhat more
developed and extensive (or am I just more familiar with them?).

Cheers,
Bert

- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box



Re: [R] terms.inner

2006-09-05 Thread Berton Gunter
Terry:

errr...

> > at it, although I don't remember the form.  This Nixonesque 
> passion with
> > hiding things is one of the reasons I still prefer Splus.
> 

Two comments:

1) The use of namespaces is a well-established approach in computer science
to avoid naming conflicts (and probably other stuff I don't understand).

2) To be fair, S-Plus is a proprietary closed system, and so can and
presumably does control its naming of new functions so that naming conflicts
are avoided. R, which is an open system with literally hundreds of
contributed packages cannot do this, and so must use some methodology like
namespaces to do so. 


Cheers,
Bert

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box



[R] New quote ?

2006-09-01 Thread Berton Gunter

Is this a candidate for R's package of wise quotes (whose name I've
forgotten and can't find at the moment)?

"The point here is that all but the most 
uselss variables will measurably improve the fit in large problems with 
few variables."

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] problems with plot.data.frame

2006-08-31 Thread Berton Gunter
I perhaps should have added:

?plot.data.frame, ?plot.factor and ?UseMethod for S3 generic dispatch.
(plot.factor is the plotting method actually being called, and you are
getting a boxplot for each single value.)

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Monica Pisica
> Sent: Thursday, August 31, 2006 9:30 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] problems with plot.data.frame
> Importance: High
> 
> Hi list,
> 
> I have a question about 'plot'. I am trying to plot values 
> registered every 
> month - or every other month. If i build a data.frame called 
> mydata like 
> this (as an example)
> 
> jan  3   1   7
> mar  2   4   2
> may  1   3   2
> jul  3   7   4
> sep  5   2   3
> nov  3   1   5
> 
> and use the command line:
> 
> plot(mydata[c(1,3)])
> 
> I get a graph that has on the x axis my months in 
> alphabetical order - which 
> i don't want, and instead of points i have thick horizontal 
> lines. I've 
> tried everything i could and understood from the R help files 
> to give me 
> points and on x axis the month in my order instead of alpha order. No 
> success. What is the trick?
> 
> I fixed the month order by using numerals in front of them 
> like 01, 03, ... 
> etc, but this is not an elegant solution.
> 
> Any help will be much appreciated.
> 
> Monica
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] problems with plot.data.frame

2006-08-31 Thread Berton Gunter
You need to read R docs. Month is a factor. ?factor
As a result you are getting a bar plot. 
You can make month numeric and control axis labelling via ?axis
See also ?plot.default and ?par
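A sketch of that suggestion (data invented to match the poster's
description):

```r
mon    <- c("jan", "mar", "may", "jul", "sep", "nov")
mydata <- data.frame(month = mon,
                     v1 = c(3, 2, 1, 3, 5, 3),
                     v3 = c(7, 2, 2, 4, 3, 5))
m <- factor(mydata$month, levels = mon)  # keep calendar order, not alphabetical
plot(as.integer(m), mydata$v3, xaxt = "n", xlab = "month", ylab = "v3")
axis(1, at = seq_along(levels(m)), labels = levels(m))
```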

There are also undoubtedly functions available, either in base R or through
packages, that would do this directly. You might check the Hmisc and zoo
packages for starters.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Monica Pisica
> Sent: Thursday, August 31, 2006 9:30 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] problems with plot.data.frame
> Importance: High
> 
> Hi list,
> 
> I have a question about 'plot'. I am trying to plot values 
> registered every 
> month - or every other month. If i build a data.frame called 
> mydata like 
> this (as an example)
> 
> jan  3   1   7
> mar  2   4   2
> may  1   3   2
> jul  3   7   4
> sep  5   2   3
> nov  3   1   5
> 
> and use the command line:
> 
> plot(mydata[c(1,3)])
> 
> I get a graph that has on the x axis my months in 
> alphabetical order - which 
> i don't want, and instead of points i have thick horizontal 
> lines. I've 
> tried everything i could and understood from the R help files 
> to give me 
> points and on x axis the month in my order instead of alpha order. No 
> success. What is the trick?
> 
> I fixed the month order by using numerals in front of them 
> like 01, 03, ... 
> etc, but this is not an elegant solution.
> 
> Any help will be much appreciated.
> 
> Monica
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Check values in colums matrix

2006-08-24 Thread Berton Gunter
Absolutely. But do note that if the values in obj are the product of
numerical computations then columns of equal values may turn out to be only
**nearly** equal and so the sd may turn out to be **nearly** 0 and not
exactly 0. This is a standard issue in numerical computation, of course, and
has been commented on in this list at least dozens of times, but it's still
a gotcha for the unwary (so now dozens +1).
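A tiny illustration of the gotcha:

```r
# Column b holds values "equal" only in exact arithmetic, not in floating point
obj <- cbind(a = c(1, 1, 1), b = c(0.1 + 0.2, 0.3, 0.3))
apply(obj, 2, sd) == 0                                  # exact test fails for b
apply(obj, 2, function(x) isTRUE(all.equal(sd(x), 0)))  # tolerant test succeeds
```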

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
> Grothendieck
> Sent: Thursday, August 24, 2006 4:28 PM
> To: Muhammad Subianto
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Check values in colums matrix
> 
> Try sd(obj.tr) which will give a vector of standard 
> deviations, one per column.
> A column's entry will be zero if and only if all values in the column
> are the same.
> 
> On 8/24/06, Muhammad Subianto <[EMAIL PROTECTED]> wrote:
> > Dear all,
> > I apologize if my question is quite simple.
> > I have a dataset (20 columns & 1000 rows) which
> > some of columns have the same value and the others
> > have different values.
> > Here are some piece of my dataset:
> > obj <- cbind(c(1,1,1,4,0,0,1,4,-1),
> > c(0,1,1,4,1,0,1,4,-1),
> > c(1,1,1,4,2,0,1,4,-1),
> > c(1,1,1,4,3,0,1,4,-1),
> > c(1,1,1,4,6,0,1,5,-1),
> > c(1,1,1,4,6,0,1,6,-1),
> > c(1,1,1,4,6,0,1,7,-1),
> > c(1,1,1,4,6,0,1,8,-1))
> > obj.tr <- t(obj)
> > obj.tr
> > > obj.tr
> > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
> > [1,]    1    1    1    4    0    0    1    4   -1
> > [2,]    0    1    1    4    1    0    1    4   -1
> > [3,]    1    1    1    4    2    0    1    4   -1
> > [4,]    1    1    1    4    3    0    1    4   -1
> > [5,]    1    1    1    4    6    0    1    5   -1
> > [6,]    1    1    1    4    6    0    1    6   -1
> > [7,]    1    1    1    4    6    0    1    7   -1
> > [8,]    1    1    1    4    6    0    1    8   -1
> > >
> >
> > How can I do to check columns 2,3,4,6,7 and 9 have
> > the same value, and columns 1,5 and 8 have different values.
> >
> > Best, Muhammad Subianto
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] rgl: exporting to pdf or png does not work

2006-08-23 Thread Berton Gunter
Please contact the package maintainer.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Gaspard Lequeux
> Sent: Wednesday, August 23, 2006 2:15 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] rgl: exporting to pdf or png does not work
> 
> 
> Hej,
> 
> When exporting a image from rgl, the following error is encountered:
> 
> > rgl.postscript('testing.pdf', fmt="pdf")
> RGL: ERROR: can't bind glx context to window
> RGL: ERROR: can't bind glx context to window
> Warning messages:
> 1: X11 protocol error: GLXBadContextState
> 2: X11 protocol error: GLXBadContextState
> 
> The pdf file is created and is readable, but all the labels are gone.
> 
> Taking a snapshot (to png) gives 'failed' and no file is created.
> 
> Version of rgl used: 0.67-2 (2006-07-11)
> Version of R used: R 2.3.1; i486-pc-linux-gnu; 2006-07-13 01:31:16;
> Running Debian GNU/Linux testing (Etch).
> 
> /Gaspard
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] multivariate analysis by using lme

2006-08-21 Thread Berton Gunter
FWIW, a small story.

Many a moon ago, I had the great good fortune and honor of driving John
Tukey to periodic consulting sessions at Merck (talk about precious cargo!).
So I got to chat with him about stuff. Basically for the reasons already
elaborated, he also had no use for multivariate methods; but of course when
**HE** elaborated, one paid attention.

Cheers,

-- Bert Gunter

No, I'm not expressing an opinion. The discussion just reminded me...
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of hadley wickham
> Sent: Monday, August 21, 2006 12:14 PM
> To: Spencer Graves
> Cc: r-help@stat.math.ethz.ch; Hui-Ju Tsai
> Subject: Re: [R] multivariate analysis by using lme
> 
> >Only after doing the best I could with univariate 
> modeling would
> > I then consider multivariate modeling.  And then I'd want 
> to think very
> > carefully about whether the multivariate model(s) under 
> consideration
> > seemed consistent with the univariate results -- and what else they
> > might tell me that I hadn't already gotten from the 
> univariate model.
> >  If you've already done all this, I'm impressed.  In the 
> almost 30 years
> > since I realized I should try univariate models first and work up to
> > multivariate whenever appropriate, I've not found one 
> application where
> > the extra effort seemed justified.  R has made this much 
> easier, but I'm
> > still looking for that special application that would 
> actually require
> > the multivariate tools.
> 
> To add to Spencer's comments, I'd strongly recommend you look at your
> data before trying to model it.  The attached graph, a scatterplot of
> res1 vs res2 values conditional on c1 and c2, with point shape given
> by inter, reveals many interesting features of your data:
> 
>  * res1 and res2 values are highly correlated
>  * inter is constant for a given c1 and c2
>  * there are between 1 and 3 points for each level of inter - not very
> many and I don't think enough to investigate what the effect of inter
> is
> 
> The plot was created using the following code:
> 
> library(ggplot)
> s <- read.table("~/Desktop/sample.txt", header=T)
> s <- rename(s, c(two="value"))
> s$res2 <- NULL
> s <- as.data.frame(cast(s, ... ~ res1))
> 
> 
> qplot(X0, X1, c1 ~ c2, data=s, shape=factor(inter))
> 
> (note that you will need the latest version of ggplot available from
> http://had.co.nz/ggplot)
>



Re: [R] question about 'coef' method and fitted_value calculation

2006-08-21 Thread Berton Gunter
I'm ignorant about this, so no answer. But as you seem interested in
coefficient shrinkage, have you tried the lars or lasso2 package?

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
> Sent: Monday, August 21, 2006 2:40 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] question about 'coef' method and fitted_value calculation
> 
> Dear all,
> 
> I am trying to calculate the fitted values using a ridge model
> (lm.ridge(), MASS library). Since the predict() does not work 
> for lm.ridge
> object, I want to get the fitted_value from the coefficients 
> information.
> The following are the codes I use:
> 
>   fit = lm.ridge(myY~myX,lambda=lamb,scales=F,coef=T)
>   coeff = fit$coef
> 
> However, it seems that "coeff" (or "fit$coef") is not really the
> coefficients matrix. From the manual, "Note that these are not on the
> original scale and are for use by the 'coef' method...".
> 
> Could anyone please point out what is the 'coef' method the manual
> mentioned, and how should I get the fitted value? I have tried simple
> multiplication of the coeff and my X matrix ("coeff%*%X"). 
> But the results
> seems to be in the wrong scale.
> 
> Thanks so much!
> 
> Sincerely,
> Jeny
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Floating point imprecision in sum() under R-2.3.1?

2006-08-18 Thread Berton Gunter
> 
> But after Roger Peng's <[EMAIL PROTECTED]> **insightful** comment that the

... but as we are not in that **other** S language dialect, maybe it should
be his **peRceptive** comment. ;-)

(Sorry -- it's Friday)

-- Bert Gunter



Re: [R] dataframe of unequal rows

2006-08-18 Thread Berton Gunter
 
test.txt:

"V1""V2""V3""V4"
1   2   3   4
5   6   7   
8   9
10  11
12  13  14  15

The fields are delimited by tab characters ("\t")


In R:

> read.table(choose.files(),sep='\t',head=TRUE)

  V1 V2 V3 V4
1  1  2  3  4
2  5  6  7 NA
3 NA NA  8  9
4 10 NA NA 11
5 12 13 14 15

(I use choose.files() on Windows to select the file via the standard file
browser widget)

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: Sachin J [mailto:[EMAIL PROTECTED] 
> Sent: Friday, August 18, 2006 10:45 AM
> To: Berton Gunter; R-help@stat.math.ethz.ch
> Subject: RE: [R] dataframe of unequal rows
> 
> Bert,
>  
> I tried readLines. It reads the data as is, but cant access 
> individual columns. Still cant figure out how to accomplish 
> this. An example would be of great help.
>  
> PS: How do you indicate which fields are present in a record 
> with less than the
> full number? - Via known delimiters for all fields. 
> 
> TIA
> Sachin
>  
> 
> Berton Gunter <[EMAIL PROTECTED]> wrote:
> 
>   How do you indicate which fields are present in a 
> record with less than the
>   full number? Via known delimiters for all fields? Via 
> the order of values
>   (fields are filled in order and only the last fields in 
> a record can
>   therefore be missing)?
>   
>   If the former, see the "sep" parameter in read.table() 
> and friends.
>   If the latter, one way is to open the file as a 
> connection and use
>   readLines()(you would check how many values were 
> present and fill in the
>   NA's as needed).There may be better ways, though. 
> ?connections will get you
>   started.
>   
>   -- Bert Gunter
>   Genentech Non-Clinical Statistics
>   South San Francisco, CA
>   
>   "The business of the statistician is to catalyze the 
> scientific learning
>   process." - George E. P. Box
>   
>   
>   
>   > -Original Message-
>   > From: [EMAIL PROTECTED] 
>   > [mailto:[EMAIL PROTECTED] On Behalf 
> Of Sachin J
>   > Sent: Friday, August 18, 2006 9:14 AM
>   > To: R-help@stat.math.ethz.ch
>   > Subject: [R] dataframe of unequal rows
>   > 
>   > Hi,
>   > 
>   > How can I read data of unequal number of observations 
>   > (rows) as is (i.e. without introducing NA for columns of less 
>   > observations than the maximum. Example:
>   > 
>   > A B C D
>   > 1 10 1 12
>   > 2 10 3 12
>   > 3 10 4 12
>   > 4 10 
>   > 5 10 
>   > 
>   > Thanks in advance.
>   > 
>   > Sachin
>   > 
>   > 
>   > 
>   > 
>   > -
>   > 
>   > [[alternative HTML version deleted]]
>   > 
>   > __
>   > R-help@stat.math.ethz.ch mailing list
>   > https://stat.ethz.ch/mailman/listinfo/r-help
>   > PLEASE do read the posting guide 
>   > http://www.R-project.org/posting-guide.html
>   > and provide commented, minimal, self-contained, 
> reproducible code.
>   > 
>   
>   
> 
> 
> 
> 
>



Re: [R] dataframe of unequal rows

2006-08-18 Thread Berton Gunter
How do you indicate which fields are present in a record with less than the
full number? Via known delimiters for all fields? Via the order of values
(fields are filled in order and only the last fields in a record can
therefore be missing)?

If the former, see the "sep" parameter in read.table() and friends.
If the latter, one way is to open the file as a connection and use
readLines() (you would check how many values were present and fill in the
NA's as needed). There may be better ways, though. ?connections will get you
started.
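A sketch of the readLines() route (the file name and the tab delimiter are
assumptions):

```r
lines  <- readLines("mydata.txt")       # one element per record
fields <- strsplit(lines, "\t")         # split on the known delimiter
ncols  <- max(sapply(fields, length))   # width of the widest record
padded <- lapply(fields, function(f) c(f, rep(NA, ncols - length(f))))
df     <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
```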

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Sachin J
> Sent: Friday, August 18, 2006 9:14 AM
> To: R-help@stat.math.ethz.ch
> Subject: [R] dataframe of unequal rows
> 
> Hi,
>
>   How can I read data of unequal number of observations 
> (rows) as is (i.e. without introducing NA for columns of less 
> observations than the maximum. Example:
>
>   A   B    C   D
>   1   10   1   12
>   2   10   3   12
>   3   10   4   12
>   4   10
>   5   10
>
>   Thanks in advance.
>
>   Sachin
>
>
> 
>   
> -
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] nls convergence problem

2006-08-15 Thread Berton Gunter
>   Or, maybe there's something I don't understand about the 
> algorithm being used.

Indeed! So before making such comments, why don't you try to learn about it?
Doug Bates is a pretty smart guy,  and I think you do him a disservice when
you assume that he somehow overlooked something that he explicitly warned
you about. I am fairly confident that if he could have made the problem go
away, he would have. So I think your vent was a bit inconsiderate and
perhaps even intemperate. The R Core folks have produced a minor miracle
IMO, and we should all be careful before assuming that they have overlooked
easily fixable problems. They're certainly not infallible -- but they're a
lot less fallible than most of the rest of us when it comes to R.

> 
> Just my $0.02 and minority opinion,
> efg
> 

... and mine.

-- Bert Gunter



Re: [R] question re: "summarry.lm" and NA values

2006-08-15 Thread Berton Gunter
"Is there a way to..." always has the answer "yes" in R (or C or any
language for that matter). The question is: "Is there a GOOD way...?" where
"good" depends on the specifics of the situation. So after that polemic,
below is an effort to answer, (adding to what Petr Pikal already said):

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of r user
> Sent: Tuesday, August 15, 2006 7:01 AM
> To: rhelp
> Subject: [R] question re: "summary.lm" and NA values
> 
> Is there a way to get the following code to include
> NA values where the coefficients are "NA"?
> 
> ((summary(reg))$coefficients)
BAAAD! Don't do this. Use the extractor on the object: coef(reg).
This suggests that you haven't read the documentation carefully, which tends
to arouse the ire of would-be helpers.

> 
> explanation:
> 
> Using a loop, I am running regressions on several
> "subsets" of "data1".
> 
> "reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )"
??? There's an error here I think. Do you mean update()? Do you have your
subscripting correct?

> 
> My regression has 10 independent variables, and I
> therefore expect 11 coefficients.
> After each regression, I wish to save the coefficients
> and standard errors of the coefficients in a table
> with 22 columns.
> 
> I successfully extract the coefficients using the
> following code:
> "reg$coefficients"
Use the extractor, coef()

> 
> I attempt to extract the standard errors using :
> 
> aperm((summary(reg))$coefficients)[2,]

BAAAD! Use the extractor vcov(): sqrt(diag(vcov(reg)))
> 
> ((summary(reg))$coefficients)
> 
> My problem:
> For some of my subsets, I am missing data for one or
> more of the independent variables.  This of course
> causes the coefficients and standard erros for this
> variable to be "NA".
No, it doesn't, as Petr said.

One possible approach: Assuming that a variable is actually missing (all
NA's), note that coef(reg) is a named vector, so that the character string
names of the regressors actually used are available. You can thus check for
what's missing and add them as NA's at each return. Though I confess that I
see no reason to put things in a matrix rather than just using a list. But
that's a matter of personal taste, I suppose.
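A minimal, self-contained sketch of this padding approach (the regressor names x1..x3 and the helper pad.coefs() are illustrative, not from the thread):

```r
# Hypothetical full regressor set; in the real problem this would be
# the 11 terms (intercept + 10 variables) expected in every fit.
all.terms <- c("(Intercept)", "x1", "x2", "x3")

pad.coefs <- function(reg) {
  b  <- coef(reg)             # named vector: only the terms actually fit
  se <- sqrt(diag(vcov(reg))) # matching standard errors
  out <- rep(NA_real_, 2 * length(all.terms))
  names(out) <- c(paste0("b.", all.terms), paste0("se.", all.terms))
  out[paste0("b.",  names(b))]  <- b  # fill by name; missing terms stay NA
  out[paste0("se.", names(se))] <- se
  out
}

set.seed(42)
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
row <- pad.coefs(lm(y ~ x1 + x2, data = d))  # x3 never entered -> NA slots
```

Because the filling is done by name, the order of terms in each fit does not matter.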

> 
> Is there a way to include the NA standard errors, so
> that I have the same number of standard errors and
> coefficients for each regression, and can then store
> the coefficients and standard errors in my table of 22
> columns?
> 



Re: [R] Help with workaround for: Function '`[`' is not in the derivatives table

2006-08-14 Thread Berton Gunter
I think this is the sort of problem which is most elegantly handled by
computing on the language. Here is an INelegant solution: 

>  A <- c(1, 2, 3)

> for(i in 1:3)assign(paste('A',i,sep=''),A[i])

>  E <- expression(A1 * exp(A2*X) + A3) ## could also use substitute() here, I think,
## instead of explicitly assigning the coefficients

> X <- c(0.5, 1.0, 2.0)

>  eval(E)
[1]  5.718282 10.389056 57.598150

>  D(E, "A2")
A1 * (exp(A2 * X) * X)
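
A runnable recap of the sketch above, with the derivative also evaluated (not part of the original reply):

```r
# Assign A1, A2, A3 from the vector, build the expression, then both
# evaluate it and evaluate its derivative with respect to A2.
A <- c(1, 2, 3)
for (i in seq_along(A)) assign(paste("A", i, sep = ""), A[i])
E <- expression(A1 * exp(A2 * X) + A3)
X <- c(0.5, 1.0, 2.0)
fE <- eval(E)            # the function values
dE <- eval(D(E, "A2"))   # A1 * (exp(A2 * X) * X), evaluated at X
```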


Bert Gunter
Genentech
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Earl F. Glynn
> Sent: Monday, August 14, 2006 3:44 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Help with workaround for: Function '`[`' is not 
> in the derivatives table
> 
> # This works fine:
> > a <- 1
> 
> > b <- 2
> 
> > c <- 3
> 
> 
> 
> > E <- expression(a * exp(b*X) + c)
> 
> 
> 
> > X <- c(0.5, 1.0, 2.0)
> 
> 
> 
> > eval(E)
> 
> [1]  5.718282 10.389056 57.598150
> 
> 
> 
> > D(E, "b")
> 
> a * (exp(b * X) * X)
> 
> > eval(D(E, "b"))
> 
> [1]   1.359141   7.389056 109.196300
> 
> 
> 
> # But if (a,b,c) are replaced with (A[1], A[2], A[3]), how 
> can I get a 
> derivative using "D"?
> 
> 
> 
> > A <- c(1, 2, 3)
> 
> > E <- expression(A[1] * exp(A[2]*X) + A[3])
> 
> > X <- c(0.5, 1.0, 2.0)
> 
> > eval(E)
> 
> [1]  5.718282 10.389056 57.598150
> 
> 
> 
> # Why doesn't this work?  Any workarounds?
> 
> > D(E, "A[2]")
> 
> Error in D(E, "A[2]") : Function '`[`' is not in the derivatives table
> 
> 
> 
> If I want to have a long vector of coefficients, A, (perhaps 
> dozens) how can 
> I use "D" to compute partial derivatives?
> 
> 
> 
> Thanks for any help with this.
> 
> 
> 
> efg
> 
> 
> 
> Earl F. Glynn
> 
> Scientific Programmer
> 
> Stowers Institute for Medical Research
> 



Re: [R] Fast way to load multiple files

2006-08-14 Thread Berton Gunter
A reproducible example here would help (please see posting guide). A guess:
is your filelist a list of (quoted) character strings? Correct pathnames to
the files with correct separators for your OS?
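A likely explanation, with a sketch of a fix (my guess, since no code was shown): load() assigns into the evaluation frame it is called from, so objects loaded inside lapply() vanish with that frame. Passing envir = .GlobalEnv keeps them:

```r
# Two throwaway workspace files to stand in for the real ones
f1 <- tempfile(fileext = ".RData")
f2 <- tempfile(fileext = ".RData")
a <- 1; save(a, file = f1)
b <- 2; save(b, file = f2)
rm(a, b)

filelist <- c(f1, f2)
# envir = .GlobalEnv makes load() assign into the workspace instead of
# into lapply()'s temporary evaluation frame
invisible(lapply(filelist, load, envir = .GlobalEnv))
```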

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Eiger
> Sent: Monday, August 14, 2006 1:00 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Fast way to load multiple files
> 
> Hi,
> 
> Instead of having to program a loop to load several 
> workspaces in a directory, it would be nice to store the 
> filenames in a list "filelist" and then to apply "load" to this list
> "lapply( filelist, load)"
> Unfortunately, although it seems that R is loading the files, 
> the contained objects are not available in the workspace afterwards.
> Any hints what I'm doing wrong or how to circumvent the problem?
> Peter
> --
> 



Re: [R] How to export data to Excel Spreadsheet?

2006-08-07 Thread Berton Gunter
You can also usually copy and paste to/from the Windows clipboard by using
file='clipboard' in file i/o or via description = 'clipboard' using
connections. I haven't checked all details of this, so there may be some
glitches.  

-- Bert Gunter

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
Sent: Monday, August 07, 2006 5:23 AM
To: R-Help
Subject: Re: [R] How to export data to Excel Spreadsheet?

On 8/7/06, Xin <[EMAIL PROTECTED]> wrote:
>I try to export my output's data to Excel spreadsheet. My outputs are:
>
>  >comb3
>[,1] [,2] [,3]
>   [1,] "a"  "b"  "c"
>   [2,] "a"  "b"  "d"
>   [3,] "a"  "b"  "e"
>   [4,] "a"  "b"  "f"
>   [5,] "a"  "b"  "g"

See

? write.table
? write.csv

Paul
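
A hedged sketch of that route, reconstructing comb3 from the output shown (tempfile() stands in for a real path such as "comb3.csv"):

```r
# comb3 reconstructed from the output shown above
comb3 <- cbind(rep("a", 5), rep("b", 5), c("c", "d", "e", "f", "g"))
out <- tempfile(fileext = ".csv")        # stands in for "comb3.csv"
write.csv(comb3, file = out, row.names = FALSE)
# the resulting file opens directly in Excel
```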



Re: [R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Berton Gunter
I know I'm coming late to this, but ...

> > Is someone able to suggest to me a transformation to overcome the
> > problem of heteroscedasticity?

It is not usually useful to worry about this. In my experience, the gain in
efficiency from using an essentially ideal weighted analysis vs. an
approximate unweighted one is usually small and unimportant (transformation
to simplify a model is another issue ...). Of far greater importance usually
is the loss in efficiency due to the presence of a few "unusual" extreme
values; have you carefully checked to make sure that none of the large
sample variances you have are due merely to the presence of a small number
of highly discrepant values?


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box



Re: [R] listing of permutations

2006-08-02 Thread Berton Gunter
I seem to be on a roll of being dumb today. Sorry for posting my previous
silly solution to Erin's permutation problem. Please **do** ignore it.

-- Bert
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Berton Gunter
> Sent: Wednesday, August 02, 2006 10:32 AM
> To: 'Erin Hodgess'; r-help@stat.math.ethz.ch
> Subject: Re: [R] listing of permutations
> 
> Erin:
> You got 2 (so far) pre-packaged functions .Here's an 
> obscenely inefficient
> but short un-prepackaged way to do it:
> 
> k<-4
> z<- do.call('expand.grid',as.data.frame(matrix(rep(1:k,k),nc=k)))
> results<- z[apply(z,1,function(x)length(unique(x))==k),]
> 
> It is too inefficient to make public, though.
> 
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>  
>  
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Erin Hodgess
> > Sent: Wednesday, August 02, 2006 9:57 AM
> > To: r-help@stat.math.ethz.ch
> > Subject: [R] listing of permutations
> > 
> > Dear R People:
> > 
> > Suppose I have the 4 numbers: 1,2,3,4.
> > 
> > I would like to create a listing of the permutations
> > of 4 items taken 4 at a time.
> > 
> > Is there a built in function for that, please?
> > 
> > Thanks in advance!
> > R 2-3-1 for Windows or Linux
> > Sincerely,
> > Erin Hodgess
> > Associate Professor
> > Department of Computer and Mathematical Sciences
> > University of Houston - Downtown
> > mailto: [EMAIL PROTECTED]
> > 
> 



Re: [R] listing of permutations

2006-08-02 Thread Berton Gunter
Erin:
You got 2 (so far) pre-packaged functions .Here's an obscenely inefficient
but short un-prepackaged way to do it:

k<-4
z<- do.call('expand.grid',as.data.frame(matrix(rep(1:k,k),nc=k)))
results<- z[apply(z,1,function(x)length(unique(x))==k),]

It is too inefficient to make public, though.
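
For the record, a quick sanity check of the snippet above at a smaller size (k = 3), where the brute force stays cheap:

```r
k <- 3
z <- do.call("expand.grid", as.data.frame(matrix(rep(1:k, k), ncol = k)))
results <- z[apply(z, 1, function(x) length(unique(x)) == k), ]
nrow(results)  # factorial(k) rows: one per permutation
```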

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Erin Hodgess
> Sent: Wednesday, August 02, 2006 9:57 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] listing of permutations
> 
> Dear R People:
> 
> Suppose I have the 4 numbers: 1,2,3,4.
> 
> I would like to create a listing of the permutations
> of 4 items taken 4 at a time.
> 
> Is there a built in function for that, please?
> 
> Thanks in advance!
> R 2-3-1 for Windows or Linux
> Sincerely,
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: [EMAIL PROTECTED]
> 



Re: [R] rpad, leaps, regsubsets

2006-08-02 Thread Berton Gunter
Boris:

Thank you for this. All the RPAD links now appear to be dead. However, the
Reference Card is still available in the CONTRIBUTED link on CRAN, as I
said.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA


> 
> Thanks for the resources, Berton, but unfortunately, that 
> rpad link fails, 
> and I still do not know where to get leaps or regsubsets functions. 
> Sincerely, Boris.
> -- 
> Hello, dear R team. Please help the newbie. My R is not 
> finding leaps or 
> regsubsets functions. What should I do? Any name changes or 
> library loading 
> issues?
> -
> Boris Garbuzov
> E-mail: [EMAIL PROTECTED]
> ICQ:  146995300
> MSN: [EMAIL PROTECTED]
> Residence: 3007 Hamilton Hall,  University Drive, Burnaby 
> BC, V5A 1S6, 
> Canada



[R] R Reference Card and other help (especially useful for Newbies)

2006-08-01 Thread Berton Gunter

Hi all: 

  
Newbies (and others!) may find useful the R Reference Card made available by
Tom Short and Rpad at http://www.rpad.org/Rpad/Rpad-refcard.pdf or through 
the "Contributed" link on CRAN (where some other reference cards are also 
linked). It categorizes and organizes a bunch of R's basic, most used 
functions so that they can be easily found. For example, paste() is under 
the "Strings" heading and expand.grid() is under "Data Creation." For 
newbies struggling to find the right R function as well as veterans who 
can't quite remember the function name, it's very handy. 

Also don't forget R's other Help facilities: 

help.search("keyword or phrase") to search the **installed** man pages 

RSiteSearch("keyword or phrase") to search the CRAN website via Jonathan
Baron's search engine. This can also be done directly from CRAN by following
the "search" link there.

And, occasionally, find()/apropos() to search the ** attached** packages for
functions using regexp's. 

Though R certainly can be intimidating, please **do** try these measures
first before posting questions to the list. And please **do** read the other
basic R reference materials. Better and faster answers can often be found
this way.

  
-- Bert Gunter 
Genentech Non-Clinical Statistics 
South San Francisco, CA 
  
"The business of the statistician is to catalyze the scientific learning 
process."  - George E. P. Box 



Re: [R] PCA with not non-negative definite covariance

2006-07-26 Thread Berton Gunter
Not sure what "completely analogous" means; MDS is nonlinear, PCA is linear.

In any case, the bottom line is that if you have high dimensional data with
"many" missing values, you cannot know what the multivariate distribution
looks like -- and you need a **lot** of data with many variables to usefully
characterize it anyway. So you must either make some assumptions about what
the distribution could be (including imputation methodology) or use any of
the many exploratory techniques available to learn what you can.
Thermodynamics holds -- you can't get something for nothing (you can't fool
Mother Nature).

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Quin Wills
> Sent: Wednesday, July 26, 2006 8:44 AM
> To: [EMAIL PROTECTED]
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] PCA with not non-negative definite covariance
> 
> Thanks.
> 
> I suppose that another option could be just to use classical
> multi-dimensional scaling. By my understanding this is (if based on
> Euclidian measure) completely analogous to PCA, and because it's based
> explicitly on distances, I could easily exclude the variables 
> with NA's on a
> pairwise basis when calculating the distances.
> 
> Quin
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: 25 July 2006 09:24 AM
> To: Quin Wills
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] PCA with not non-negative definite covariance
> 
> Hi , hi all,
> 
> > Am I correct to understand from the previous discussions on 
> this topic (a
> > few years back) that if I have a matrix with missing values 
> my PCA options
> > seem dismal if:
> > (1) I don’t want to impute the missing values.
> > (2) I don’t want to completely remove cases with missing values.
> > (3) I do cov() with use = "pairwise.complete.obs", as this produces
> > negative eigenvalues (which it has in my case!).
> 
> (4) Maybe you can use the Non-linear Iterative Partial Least Squares (NIPALS)
> algorithm (intensively used in chemometrics). S. Dray proposes a version of
> this procedure at http://pbil.univ-lyon1.fr/R/additifs.html.
> 
> 
> Hope this help :)
> 
> 
> Pierre
> 
> 
> 
> --
> 
> This message was sent from IMP (Internet Messaging Program) webmail.
> 



Re: [R] grouping by consecutive integers: Correction

2006-07-24 Thread Berton Gunter
Sorry, all. My previous post was mixed up. Here's the corrected version:

sequences <- function(x,incr = 1)
{
ix <- which(abs(diff(c(FALSE,diff(x) == incr))) ==1)
if(length(ix)%%2)c(ix,length(x))
else ix
}



-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Kevin J Emerson
> Sent: Monday, July 24, 2006 9:20 AM
> To: Niels Vestergaard Jensen
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] grouping by consecutive integers
> 
> Let me clarify one thing that I don't think I made clear in my posting.
> I am looking for the max, min and median of the indices, not of the
> time series frequency counts.  I am looking to find the max, min, and
> median time of peaks in a time series, so I am looking for the
> information concerning that.
> 
> so mostly my question is how to extract the information of 
> max, min, and
> median of sequential numbers in a vector.  I will reword my original
> posting below.
> 
> > > Hello R-helpers!
> > >
> > > I have a question concerning extracting sequence 
> information from a
> > > vector.  I have a vector (representing the bins of a time 
> series where
> > > the frequency of occurrences is greater than some 
> threshold) where I
> > > would like to extract the min, median and max of each group of
> > > consecutive numbers in the index vector..
> > >
> > > For Example:
> > >
> > > tmp <- 
> c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
> > >
> > > I would like to have the max,min,median of the following groups:
> > >
> > > 24,25 - max = 25, min = 24 median = 24.5
> > > 29 max=min=median = 29
> > > 35,36,37,38,39,40,41,42,43,44,45,46,47, max = 47, min = 35, etc...
> > > 68,69,70,71
> > >
> > > I would like to be able to perform this for many time series so an
> > > automated process would be nice.  I am hoping to use this 
> as a peak
> > > detection protocol.
> > >
> > > Any advice would be greatly appreciated,
> > > Kevin
> > >
> > > -
> > > -
> > > Kevin J Emerson
> > > Center for Ecology and Evolutionary Biology
> > > 1210 University of Oregon
> > > Eugene, OR 97403
> > > USA
> > > [EMAIL PROTECTED]
> > >
> 



Re: [R] unique, but keep LAST occurence

2006-07-24 Thread Berton Gunter
Try:

 largestDF <- DF[nrow(DF)- which(!duplicated(rev(DF$t)))+1,]

You can then sort this however you like in the usual way. Row names will be
preserved.
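
Applied to the example data from the question, followed by the sort:

```r
DF <- data.frame(t = c(1, 2, 3, 1, 4, 5, 1, 2, 3),
                 x = c(0, 1, 2, 3, 4, 5, 6, 7, 8))
# keep, for each t, the row with the largest index
largestDF <- DF[nrow(DF) - which(!duplicated(rev(DF$t))) + 1, ]
largestDF <- largestDF[order(largestDF$t), ]  # t: 1 2 3 4 5, x: 6 7 8 4 5
```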

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> [EMAIL PROTECTED]
> Sent: Monday, July 24, 2006 10:00 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] unique, but keep LAST occurence
> 
> ?unique says
> 
> Value:
> 
>  An object of the same type of 'x'. but if an element is equal to
>  one with a smaller index, it is removed.
> 
> However, I need to keep the one with the LARGEST index.
> Can someone please show me the light? 
> I thought about reversing the row order twice, but I couldn't 
> get it to work right
> 
> (My data frame has 125000 rows and 7 columns, 
> and I'm 'uniqueing' on column #1 (chron) only, although the 
> class of the column may not matter.)
> 
> Say, e.g., 
> > DF <- data.frame(t = c(1,2,3,1,4,5,1,2,3), x = c(0,1,2,3,4,5,6,7,8))
> 
> I would like the result to be (sorted as well)
>  t x
>  1 6
>  2 7
>  3 8
>  4 4
>  5 5
> 
> If I got the original rownames, that would be a bonus (for debugging.)
> 
> > R.version
>_ 
> platform   i386-pc-mingw32   
> arch   i386  
> os mingw32   
> system i386, mingw32 
> status   
> major  2 
> minor  3.1   
> year   2006  
> month  06
> day01
> svn rev38247 
> language   R 
> version.string Version 2.3.1 (2006-06-01)
> 
> Thanks for any hints!
> David
> 
> David L. Reiner
> Rho Trading Securities, LLC
> Chicago  IL  60605
> 312-362-4963
> 



Re: [R] grouping by consecutive integers

2006-07-24 Thread Berton Gunter
As you do not seem to have received what you consider to be a satisfactory
reply, here is a function that I **think** does what you want:

sequences <- function(x,incr = 1)
{
ix <- which(abs(diff(c(FALSE,diff(x) == 1))) ==incr)
if(length(ix)%%2)c(ix,length(x))
else ix
}

This function gives successive pairs of first and last values of sequences
of increasing values within x that differ by incr. You can then process
these pairs however you like either to summarize 
statistics on the indices and/or the values of the sequences.

Examples:
> sequences(c(1:5,50,3:7))
[1]  1  5  7 11
> sequences(c(10,1:5,50,3:7))
[1]  2  6  8 12
> sequences(c(1:5,50,3:7,10))
[1]  1  5  7 11
> sequences(c(10,1:5,50,3:7,10))
[1]  2  6  8 12

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Kevin J Emerson
> Sent: Monday, July 24, 2006 9:20 AM
> To: Niels Vestergaard Jensen
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] grouping by consecutive integers
> 
> Let me clarify one thing that I don't think I made clear in my posting.
> I am looking for the max, min and median of the indices, not of the
> time series frequency counts.  I am looking to find the max, min, and
> median time of peaks in a time series, so I am looking for the
> information concerning that.
> 
> so mostly my question is how to extract the information of 
> max, min, and
> median of sequential numbers in a vector.  I will reword my original
> posting below.
> 
> > > Hello R-helpers!
> > >
> > > I have a question concerning extracting sequence 
> information from a
> > > vector.  I have a vector (representing the bins of a time 
> series where
> > > the frequency of occurrences is greater than some 
> threshold) where I
> > > would like to extract the min, median and max of each group of
> > > consecutive numbers in the index vector..
> > >
> > > For Example:
> > >
> > > tmp <- 
> c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
> > >
> > > I would like to have the max,min,median of the following groups:
> > >
> > > 24,25 - max = 25, min = 24 median = 24.5
> > > 29 max=min=median = 29
> > > 35,36,37,38,39,40,41,42,43,44,45,46,47, max = 47, min = 35, etc...
> > > 68,69,70,71
> > >
> > > I would like to be able to perform this for many time series so an
> > > automated process would be nice.  I am hoping to use this 
> as a peak
> > > detection protocol.
> > >
> > > Any advice would be greatly appreciated,
> > > Kevin
> > >
> > > -
> > > -
> > > Kevin J Emerson
> > > Center for Ecology and Evolutionary Biology
> > > 1210 University of Oregon
> > > Eugene, OR 97403
> > > USA
> > > [EMAIL PROTECTED]
> > >
> 



Re: [R] intersect of list elements

2006-07-21 Thread Berton Gunter
FAQ 7.21.

But there are perhaps slicker ways.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Georg Otto
> Sent: Friday, July 21, 2006 10:39 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] intersect of list elements
> 
> 
> Hi,
> 
> i have a list of several vectors, for example:
> 
> > vectorlist
> $vector.a.1
> [1] "a" "b" "c"
> 
> $vector.a.2
> [1] "a" "b" "d"
> 
> $vector.b.1
> [1] "e" "f" "g"
> 
> 
> I can use intersect to find elements that appear in $vector.a.1 and
> $vector.a.2:
> 
> > intersect(vectorlist[[1]], vectorlist[[2]])
> [1] "a" "b"
> 
> 
> I would like to use grep to get the vectors by their names matching an
> expression and to find the intersects between those vectors. For the
> first step:
> 
> > vectorlist[grep ("vector.a", names(vectorlist))]
> $vector.a.1
> [1] "a" "b" "c"
> 
> $vector.a.2
> [1] "a" "b" "d"
> 
> 
> Unfortunately, I can not pass the two vectors as argument to 
> intersect:
> 
> > intersect(vectorlist[grep ("vector.a", names(vectorlist))])
> Error in unique(y[match(x, y, 0)]) : argument "y" is missing, 
> with no default
> 
> I am running R Version 2.3.1 (2006-06-01) 
> 
> 
> Could somone help me to solve this?
> 
> Cheers,
> 
> Georg
> 



Re: [R] seeking robust test for equality of variances w/ observation weights

2006-07-21 Thread Berton Gunter
You can always bootstrap any robust spread measure (e.g. mad or higher
efficiency versions from robustbase or other packages).
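
A bare-bones sketch of that bootstrap (my own illustration with simulated data; observation weights could enter through sample()'s prob argument):

```r
set.seed(1)
x <- rnorm(100, sd = 1)   # two simulated groups with unequal spread
y <- rnorm(100, sd = 2)
# bootstrap distribution of the ratio of robust spreads (mad)
ratio <- replicate(2000, mad(sample(x, replace = TRUE)) /
                         mad(sample(y, replace = TRUE)))
ci <- quantile(ratio, c(0.025, 0.975))  # percentile CI for the mad ratio
```

If the interval excludes 1, the spreads differ at roughly the 5% level.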

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Alexis Diamond
> Sent: Friday, July 21, 2006 9:56 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] seeking robust test for equality of variances w/ 
> observation weights
> 
> Hello R community,
> 
> I am looking for a robust test for equality of variances that can take
> observation weights.
> I realize I can do the F-test with weighted variances, but 
> I've read that
> this test is not very robust.
> 
> So I thought about maybe adding a "weights" argument to John 
> Fox's code for
> the Levene Test (in the "car" library, "levene.test"),
> substituting his "median" function for a " weighted.mean" and 
> also including
> the observation weights in his "lm" run--
> after all, Levene's original test used the mean, not the median.
> 
> I asked John about it and he doesn't know what the properties of this
> weighted Levene test would be.
> Does anyone have any thoughts or suggestions, or know of a 
> robust weighted
> hypothesis test for equality of variances?
> 
> Thank you in advance for any advice you can provide,
> 
> Alexis Diamond
> [EMAIL PROTECTED]
> 
> 



[R] Job Openings: Nonclinical statistics Positions at Genentech, South San Francisco, CA

2006-07-20 Thread Berton Gunter

To all:

Genentech has immediate openings for two MS/PhD statisticians in its
preclinical/nonclinical statistics group in South San Francisco, CA. As the
name indicates, this group provides statistical services to all of
Genentech's R&D, manufacturing, and marketing activities **except** clinical
trials. In order not to abuse the good offices of this newslist, if you are
interested in more details about these positions, please send inquiries
directly to me. Please do **not** cc this list. 

Genentech is an equal opportunity employer.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box



Re: [R] Fitting a distribution to peaks in histogram

2006-07-19 Thread Berton Gunter
With this much data, I think it makes more sense to fit a nonparametric
density estimate. ?density does this via a kernel density procedure, but
RSiteSearch('nonparametric density') will find many alternatives. The ash
and mclust packages are two that come to mind, but there are certainly
others.

Of course, if you must have a parametric fit, then you'll have to fit a
mixture of some sort.  But when both the number of components and individual
distributions are to be estimated, this is a nontrivial problem, as one runs
into identifiability issues and corresponding convergence problems. V&R's
discussion of density estimation in MASS has some useful things to say about
these issues, and Ripley's book, "PATTERN RECOGNITION AND NEURAL NETWORKS"
has even more. As both sources indicate, there's a large literature on this
issue and much software.

Cheers,
Bert Gunter
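
Bert's pointer to density() can be sketched on simulated data (invented values; a two-component normal mixture standing in for the multi-peak histogram):

```r
set.seed(1)
# invented example: a two-component mixture that would show two histogram peaks
x <- c(rnorm(10000, mean = 0, sd = 1), rnorm(5000, mean = 5, sd = 0.8))

# kernel density estimate from base R; the bandwidth is chosen automatically,
# but as noted above the result is sensitive to that choice
d <- density(x)

hist(x, breaks = 50, freq = FALSE, main = "Histogram with KDE overlay")
lines(d, lwd = 2)
```

Varying the bw argument of density() shows the bandwidth sensitivity discussed in MASS.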
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of hadley wickham
> Sent: Wednesday, July 19, 2006 9:21 AM
> To: Ulrik Stervbo
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Fitting a distribution to peaks in histogram
> 
> > I would like to fit a distribution to each of the peaks in 
> a histogram, such
> > as this: 
> http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-B
> cl-xL.png
> 
> As a first shot, I'd try fitting a mixture of gamma distributions (say
> 3), plus a constant term for the highest bin.  You could do this using
> ML.  If the number of peaks is truly unknown, this will be a little
> trickier but still possible and you could use the LRT to chose between
> them.
> 
> > Integrate the area between each two peaks, using the means 
> and widths of the
> > distributions fitted to the two peaks. I will be using the integrate
> > function
> 
> Why do you want to do this?
> 
> >
> > The histogram is based on approximately 15000 events, which 
> makes Mclust and
> > pam (which both delivers the information I need) less useful.
> 
> If you have unbinned data, it would be better (more precise/powerful)
> to use that.
> 
> Regards,
> 
> Hadley
> 



Re: [R] FW: Large datasets in R

2006-07-18 Thread Berton Gunter
Or, more succinctly, "Pinard's Law":

The demands of ever more data always exceed the capabilities of ever better
hardware.

;-D

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
  

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of François Pinard
> Sent: Tuesday, July 18, 2006 3:56 PM
> To: Thomas Lumley
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] FW: Large datasets in R
> 
> [Thomas Lumley]
> 
> >People have used R in this way, storing data in a database 
> and reading it 
> >as required. There are also some efforts to provide 
> facilities to support 
> >this sort of programming (such as the current project funded 
> by Google 
> >Summer of Code:  
> http://tolstoy.newcastle.edu.au/R/devel/06/05/5525.html). 
> 
> Interesting project indeed!  However, if R requires more 
> swapping 
> because arrays do not all fit in physical memory, crudely replacing 
> swapping with database accesses is not necessarily going to buy
> a drastic speed improvement: the paging gets done in user 
> space instead 
> of being done in the kernel.
> 
> Long ago, while working on CDC mainframes, astonishing at the 
> time but 
> tiny by nowadays standards, there was a program able to invert or do 
> simplexes on very big matrices.  I do not remember the name of the 
> program, and never studied it but superficially (I was in computer 
> support for researchers, but not a researcher myself).  The 
> program was 
> documented as being extremely careful at organising accesses 
> to rows and 
> columns (or parts thereof) in such a way that real memory was 
> best used.
> In other words, at the core of this program was a paging system very 
> specialised and cooperative with the problems meant to be solved.
> 
> However, the source of this program was just plain huge 
> (let's say from 
> memory, about three or four times the size of the optimizing FORTRAN 
> compiler, which I already knew better as an impressive algorithmic 
> undertaking).  So, good or wrong, the prejudice stuck solidly 
> in me at 
> the time, if nothing else, that handling big arrays the right way, 
> speed-wise, ought to be very difficult.
> 
> >One reason there isn't more of this is that relying on 
> Moore's Law has 
> >worked very well over the years.
> 
> On the other hand, the computational needs for scientific 
> problems grow 
> fairly quickly to the size of our ability to solve them.  Let me take
> weather forecasting for example.  3-D geographical grids are 
> never fine 
> enough for the resolution meteorologists would like to get, 
> and the time 
> required for each prediction step grows very rapidly, to increase 
> precision by not so much.  By merely tuning a few parameters, these 
> people may easily pump nearly all the available cycles out the 
> supercomputers given to them, and they do so without hesitation.  
> Moore's Law will never succeed at calming their starving hunger! :-).
> 
> -- 
> François Pinard   http://pinard.progiciels-bpi.ca
> 



Re: [R] 10^x instead 10EX on plot axes. How?

2006-07-10 Thread Berton Gunter
You can always draw the axes by hand.

?par with axes = FALSE
?axis
?plotmath for mathematical notation in R (for exponents)
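
A sketch of that recipe for Tom's example (assuming the default graphics device):

```r
vec <- c(1, 10, 100, 1000)
# suppress the default axes, which would use 1E+3-style labels
plot(vec, vec, log = "xy", xaxt = "n", yaxt = "n")

ticks <- 0:3
# label the ticks as plotmath expressions: 10^0, 10^1, 10^2, 10^3
axis(1, at = 10^ticks, labels = parse(text = paste("10^", ticks, sep = "")))
axis(2, at = 10^ticks, labels = parse(text = paste("10^", ticks, sep = "")))
```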

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
> Sent: Monday, July 10, 2006 9:43 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] 10^x instead 10EX on plot axes. How?
> 
> Hi,
> 
> I'm drawing a very simple plot with both axes logarithmic 
> (default base 10).
> Example:
> vec=c(1,10,100,1000,1,10,100,1000)
> plot(vec,vec,log="xy")
> 
> The axes on the plot now show the technical notation like 
> 1E+3, but I would prefer the notation 10^3, i.e. 
> with the exponent (here 3) superscript (raised).
> Any help very much appreciated!
> 
> Best Regards 
>  Tom
> -- 
> 
> 
> "Feel free" - 10 GB Mailbox, 100 FreeSMS/Monat ...
> 



Re: [R] Robustness of linear mixed models

2006-06-27 Thread Berton Gunter
Below...

> > Hello,
> >
> > with 4 different linear mixed models (continuous dependent) 
> I find that my
> > residuals do not follow the normality assumption 
> (significant Shapiro-Wilk
> > with values equal/higher than 0.976; sample sizes 750 or 
> 1200). I find,
> > instead, that my residuals are really well fitted by a t 
> distribution with
> > dofs' ranging, in the different datasets, from 5 to 12.
> >
> > Should this be considered such a severe violation of the normality
> > assumption as to make model-based inferences invalid?
> 
> For some aspects, yes.  Given that R provides you with the 
> means to fit 
> robust linear models, why not use them and find out if they make a 
> difference to the aspects you are interested in?
> 
> -- 
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
> 

Or do your inferences in a way that does not depend on normality, perhaps
via (careful to honor the multilevel sampling assumptions) bootstrapping?

Cautions apply. 

First, linear mixed modeling is actually a nonlinear modeling technique, as
is robust linear fitting, so the process may be sensitive to initial values.
I believe this was pointed out to me by Professor Ripley, though in a
different context. I would appreciate any more informed comments and
qualifications about this.

Second, both the normal theory inference and bootstrapping are asymptotic
and therefore approximate.  I believe this was the point Prof. Ripley was
making when he said "For **some** aspects..." Comparing results under
various assumptions is always a good idea to check sensitivity to those sets
of assumptions, though it may emphasize the fact that choice of the "right"
analysis may be a complex and application and data specific issue. 

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA



Re: [R] lmer and mixed effects logistic regression

2006-06-26 Thread Berton Gunter
> Rick Bilonick wrote:

> > I guess the moral is before you do any computations you have to make
> > sure the procedure makes sense for the data.
> > 

Is this a candidate for the fortunes package? (An oxymoron: a profound but
obvious comment.)




-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA



Re: [R] numeric variables converted to character when recoding missing values

2006-06-23 Thread Berton Gunter
Please read section 2.5 of "An Introduction to R". Numerical missing values
are assigned as NA:

x[x==999]<-NA

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
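
Applied to Juan Pablo's whole data frame, one way to recode every numeric column at once while leaving non-numeric columns untouched (a sketch):

```r
df <- data.frame(a = c(999, 1, 999, 2), b = LETTERS[1:4])

# recode 999 -> NA column by column, but only in numeric columns, so the
# character/factor column b is never touched and a stays numeric
num <- sapply(df, is.numeric)
df[num] <- lapply(df[num], function(x) replace(x, x == 999, NA))

is.numeric(df$a)  # TRUE: the column type is preserved
```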
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Juan 
> Pablo Lewinger
> Sent: Friday, June 23, 2006 3:00 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] numeric variables converted to character when 
> recoding missing values
> 
> Dear R helpers,
> 
> I have a data frame where missing values for numeric 
> variables are coded as
> 999. I want to recode those as NAs. The following only 
> partially succeeds
> because numeric variables are converted to character in the process:
> 
> df <- data.frame(a=c(999,1,999,2), b=LETTERS[1:4])
> is.na(df[2,1]) <- TRUE
> df
> 
> a b
> 1 999 A
> 2  NA B
> 3 999 C
> 4   2 D
> 
> is.numeric(df$a)
> [1] TRUE
> 
> 
> is.na(df[!is.na(df) & df==999]) <- TRUE
> df
>      a b
> 1 <NA> A
> 2    1 B
> 3 <NA> C
> 4    2 D
> 
> is.character(df$a)
> [1] TRUE
> 
> My question is how to do the recoding while avoiding this 
> undesirable side
> effect. I'm using R 2.2.1 (yes, I know 2.3.1 is available but 
> don't want to
> switch mid project). I'd appreciate any help.
> 
> Further details:
> 
> platform i386-pc-mingw32
> arch i386   
> os   mingw32
> system   i386, mingw32  
> status  
> major2  
> minor2.1
> year 2005   
> month12 
> day  20 
> svn rev  36812  
> language R  
> 
> 
> 
> Juan Pablo Lewinger
> Department of Preventive Medicine 
> Keck School of Medicine 
> University of Southern California
> 



Re: [R] PowerPoint

2006-06-23 Thread Berton Gunter

I've always assumed that this was a rendering problem in the MS application,
as the reappearance of the missing lines on re-sizing shows that that the
necessary information **is** in the imported .wmf file, right?

-- Bert 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Sundar 
> Dorai-Raj
> Sent: Friday, June 23, 2006 7:55 AM
> To: Johannes Ranke
> Cc: r-help@stat.math.ethz.ch; Marc Bernard
> Subject: Re: [R] PowerPoint
> 
> Hi, all,
> 
> (Sorry to highjack the thread, but I think the OP should also 
> know this)
> 
> One of the plots Marc mentions is xyplot. Has anybody else on 
> this list 
> had a problem with lattice and win.metafile (or Ctrl-W in the 
> R graphics 
> device)? I will sometimes import wmf files (or Ctrl-V) with lattice 
> graphics into powerpoint and notice some of the border lines are 
> missing. I can re-size the plot to make the lines reappear 
> but have to 
> find just the right size to make it look right. This seems to be a 
> problem with PPT, XLS, and Word. I never have this problem with 
> traditional graphics (e.g. plot.default, etc.).
> 
> I'm using Windows XP Pro with R-2.3.1 and lattice-0.13.8, though I've 
> also experienced the problem on earlier versions of R and earlier 
> versions of lattice.
> 
> Thanks,
> 
> --sundar
> 
> Johannes Ranke wrote:
> > Dear Bernard,
> > 
> > if you use MS Powerpoint, it seems likely to me that you 
> are using the
> > Windows version of R. Are you aware of the fact, that you can just
> > right-click on any graph and copy it to the clipboard (copy 
> as metafile
> > or similar).
> > 
> > That way you get a vectorized version of the graph, which 
> you can nicely
> > paste into Powerpoint and edit.
> > 
> > Johannes
> > 
> > * Marc Bernard <[EMAIL PROTECTED]> [060623 13:40]:
> > 
> >>Dear All,
> >>   
> >>  I am looking for the best way to use graphs from R (like 
> xyplot, curve ...)   for a presentation with powerpoint. I 
> used to save my plot as pdf and after to copy them as image 
> in powerpoint but the quality is not optimal by so doing.
> >>   
> >>  Another completely independent question is the following: 
> when I use "main"  in the  xyplot, the main title is very 
> close to my plot, i.e. no line separates the main and the 
> plot. I would like my title to be well distinguished from the plots.
> >>   
> >>  I would be grateful for any improvements...
> >>   
> >>  Many thanks,
> >>   
> >>  Bernard,
> >>   
> >>
> >>
> >>
> >>-
> >>
> >>
> > 
> >
> 



[R] High breakdown/efficiency statistics -- was RE: Rosner's test

2006-06-22 Thread Berton Gunter
Many thanks for this Martin. There now are several packages with what appear
to be overlapping functions (or at least algorithms). Besides those you
mentioned, "robust" and "roblm" are at least two others. Any recommendations
about how or whether to choose among these for us enthusiastic but
non-expert users?


Cheers,
Bert
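
As a small illustration of the robust-fitting route Martin mentions, rlm() from the recommended MASS package shrugs off a gross outlier that drags ordinary least squares (a sketch with made-up data):

```r
library(MASS)  # rlm(): M-estimation based robust regression

set.seed(42)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30, sd = 0.3)
y[30] <- 50  # plant one gross outlier

coef(lm(y ~ x))   # least-squares slope is pulled toward the outlier
coef(rlm(y ~ x))  # robust slope stays close to the true value 0.5
```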
 

> -Original Message-
> From: Martin Maechler [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, June 22, 2006 2:04 AM
> To: Berton Gunter
> Cc: 'Robert Powell'; r-help@stat.math.ethz.ch
> Subject: Re: [R] Rosner's test
> 
> >>>>> "BertG" == Berton Gunter <[EMAIL PROTECTED]>
> >>>>> on Tue, 13 Jun 2006 14:34:48 -0700 writes:
> 
> BertG> RSiteSearch('Rosner') ?RSiteSearch or search directly
> BertG> from CRAN.
> 
> BertG> Incidentally, I'll repeat what I've said
> BertG> before. Don't do outlier tests.  They're
> BertG> dangerous. Use robust methods instead.
> 
> Yes, yes, yes!!!
> 
> Note that  rlm() or cov.rob()  from recommended package MASS
> will most probably be sufficient for your needs.
> 
> For slightly newer methodology, look at package 'robustbase', or
> also 'rrcov'.
> 
> Martin Maechler, ETH Zurich
> 
> BertG> -- Bert Gunter Genentech Non-Clinical Statistics
> BertG> South San Francisco, CA
>  
> BertG> "The business of the statistician is to catalyze the
> BertG> scientific learning process."  - George E. P. Box
>  
>  
>



[R] R Reference Card and other help (especially useful for Newbies)

2006-06-21 Thread Berton Gunter
 
Hi all:

Happy summer solstice to all northern hemispherics (and winter solstice to
the southerners).
 
Newbies (and others!) may find useful the R Reference Card made available by
Tom Short and Rpad at http://www.rpad.org/Rpad/Rpad-refcard.pdf  or through
the "Contributed" link on CRAN (where some other reference cards are also
linked). It categorizes and organizes a bunch of R's basic, most used
functions so that they can be easily found. For example, paste() is under
the "Strings" heading and expand.grid() is under "Data Creation." For
newbies struggling to find the right R function as well as veterans who
can't quite remember the function name, it's very handy.

Also don't forget R's other Help facilities:

help.search("keyword or phrase") to search the **installed man pages**

RSiteSearch("keyword or phrase") to search the CRAN website via Jonathan
Baron's search engine. This can also be done directly from CRAN by following
the "search" link there.

And, occasionally, find()/apropos() to search **the attached session
packages** for functions using regexps.

Though R certainly can be intimidating, please **do** try these measures
first before posting questions to the list. And please **do** read the other
basic R reference materials. Better and faster answers can usually be found
this way.
 
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box



Re: [R] any function for monotone nonparametric regression?

2006-06-16 Thread Berton Gunter
Why wonder? Why not use R's search facilities to find out for yourself?

RSiteSearch('monotonic regression')

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Cunningham Kerry
> Sent: Friday, June 16, 2006 12:16 PM
> To: R-help@stat.math.ethz.ch
> Subject: [R] any function for monotone nonparametric regression?
> 
> I am wondering if there is any package in R that can fit a 
> nonparametric regression model with monotone constraints on 
> the fitted results.
> 
>   
> -
> 
> 
> 



Re: [R] number of iterations exceeded maximum of 50

2006-06-16 Thread Berton Gunter
My goodness! This is **NOT** a reproducible example. You need to give us the
exact data you fitted to reproduce your results/diagnose your problem.


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
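
For the archives: the maxiter setting only takes effect if the nls.control() object is actually passed to nls() via its control argument. A sketch with simulated data (a1 and b1 below are invented starting values):

```r
set.seed(1)
# simulated power-law data y = a * x^b with a little multiplicative noise
x <- 1:50
y <- 3 * x^0.7 * exp(rnorm(50, sd = 0.05))
a1 <- 1
b1 <- 1

fit <- nls(y ~ a * x^b,
           start   = list(a = a1, b = b1),
           control = nls.control(maxiter = 500))  # passed INTO nls()
coef(fit)
```

Calling nls.control() on its own line, without handing the result to nls(), changes nothing, which may be what happened here.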
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun
> Sent: Friday, June 16, 2006 9:41 AM
> To: Uwe Ligges
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] number of iterations exceeded maximum of 50
> 
> Sorry, I thought it was a straightforward question in 
> which I was stuck.
> 
> I used nls() to estimate a and b in this function:
> 
> nls(y ~ a*x^b, start = list(a = a1, b = b1))
> 
> It seems the start values I gave were not able to reach convergence, 
> and it gave the message: number of iterations exceeded maximum of 
> 50. Then I put nls.control(maxiter = 50, tol = 1e-05, 
> minFactor = 1/1024) in nls(...) and changed the argument to 
> maxiter = 500. But it worked the same way and reported: 
> number of iterations exceeded maximum of 50. I have totally 
> no idea how to set this maxiter parameter.
> 
> Thanks for any information!
> 
> Leaf
> 
> 
> >  Hi all,
> >  
> >  I found R site search did not work for me these days.
> >  
> >  When I was doing nls(), there was an error "number of iterations
> >  exceeded maximum of 50". I set the number in nls.control, which is
> >  supposed to control the number of iterations, but it didn't work
> >  well. Could anybody with this experience tell me how to fix it?
> >  Thanks in advance!
> 
> We cannot make suggestions unless you tell us what you tried yourself.
> If possible, please give a reproducible example.
> 
> Uwe Ligges
> 
> >  Leaf
>



Re: [R] flipping a plot vertically?

2006-06-15 Thread Berton Gunter
> 
> Any idea how to get the x axis numbers to go 
> along the top instead of the bottom?
> 

Use xaxt = 'n' in your plot call (?par for details) to suppress plotting of
the axis and then add the axis via a call to axis().
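
A minimal sketch (in axis(), side 1 is the bottom and side 3 the top):

```r
x <- 1:10
y <- 10:1

plot(x, y, xaxt = "n")  # draw the plot with no x axis
axis(3)                 # add the x axis along the top (side 3)
```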

If you do a lot of plotting, you may wish to purchase a copy of Murrell's R
GRAPHICS or V&R's MASS. At the very least, do read the relevant sections of
an Introduction to R and the Reference Manual, as I believe this sort of
thing is covered there.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Tim Brown
> Sent: Thursday, June 15, 2006 11:18 AM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] flipping a plot vertically?
> 
> That works great.  Thanks.
> 
> Any idea how to get the x axis numbers to go 
> along the top instead of the bottom?
> 
> tim
> 
> At 11:46 AM 6/15/2006, you wrote:
> >[Tim Brown]
> >
> >>This seems like an obvious question but I can't 
> >>find the answer in the "par" help document --- 
> >>I'd like to make a plot where the 0,0 point is 
> >>in the top left of the screen rather than 
> >>bottom left... .  [...] Any suggestions?
> >
> >You might retry your plot, adding an ylim=c(HIGHEST, LOWEST) 
> argument,
> >that is, listing the maximum before the minimum.  For example:
> >
> >   plot(1:10, ylim=c(10, 1))
> >
> >--
> >François Pinard   http://pinard.progiciels-bpi.ca
> 



Re: [R] Rosner's test

2006-06-13 Thread Berton Gunter
RSiteSearch('Rosner')

?RSiteSearch  
 or search directly from CRAN.

Incidentally, I'll repeat what I've said before. Don't do outlier tests.
They're dangerous. Use robust methods instead.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Robert Powell
> Sent: Tuesday, June 13, 2006 1:47 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Rosner's test
> 
> My second request of the day, sorry to be such a bother.
> 
> Can you tell me whether Rosner's test for outliers is available in  
> any of the R packages and, if so, which one? I've tried but I can't  
> find it.
> 
> Thank you very much,
> 
> Bob Powell
> 



Re: [R] Re-binning histogram data

2006-06-09 Thread Berton Gunter
Charles: 

To be fair ... both histograms and densityplots are nonparametric density
estimators whose appearance and effectiveness are dependent on various
parameters. Neither are immune from misleading due to a poor choice of the
parameters. For histograms they are the bin boundaries; for kde's and
friends it is some version of bandwidth.

-- Bert
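
On Justin's actual re-binning question, the one-second counts can be summed into coarser bins directly, with no need to re-expand the raw events; a sketch with invented one-second counts:

```r
set.seed(7)
# invented example: event counts in one-second bins over one hour
sec    <- 0:3599
counts <- rpois(3600, lambda = 2)

# assign each second to a 10-minute (600 s) bin and sum within bins
bin      <- sec %/% 600
rebinned <- tapply(counts, bin, sum)

length(rebinned)              # six totals, one per 10-minute interval
sum(rebinned) == sum(counts)  # no counts lost
```

For irregular bins, cut(sec, breaks = ...) gives the grouping factor instead of the integer division.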
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Charles Annis, P.E.
> Sent: Thursday, June 08, 2006 7:17 PM
> To: 'Justin Ashmall'; r-help@stat.math.ethz.ch
> Subject: Re: [R] Re-binning histogram data
> 
> Concerning the several comments on your note relating to 
> histograms, an
> informative and entertaining illustration, using Java, of how your
> subjective assessment of the data can change with different histograms
> constructed from the same data, is provided by R. Webster 
> West, recently
> with the Department of Statistics at the University of South 
> Carolina, but
> as of May 2006 with the Department of Statistics at Texas A & 
> M University,
> http://www.stat.sc.edu/~west/javahtml/Histogram.html  and
> http://www.stat.tamu.edu/~west/ 
> 
> 
> Charles Annis, P.E.
> 
> [EMAIL PROTECTED]
> phone: 561-352-9699
> eFax:  614-455-3265
> http://www.StatisticalEngineering.com
>  
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Justin Ashmall
> Sent: Thursday, June 08, 2006 5:46 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Re-binning histogram data
> 
> Hi,
> 
> Short Version:
> Is there a function to re-bin a histogram to new, broader bins?
> 
> Long version: I'm trying to create a histogram, however my 
> input-data is 
> itself in the form of a fine-grained histogram, i.e. numbers 
> of counts 
> in regular one-second bins. I want to produce a histogram of, say, 
> 10-minute bins (though possibly irregular bins also).
> 
> I suppose I could re-create a data set as expected by the 
> hist() function 
> (i.e. if time t=3600 has 6 counts, add six entries of 3600 to a list) 
> however this seems neither elegant nor efficient (though I'd 
> be pleased to 
> be mistaken!). I could then re-create a histogram as normal.
> 
> I'm guessing there's a better solution, however! Apologies if 
> this is a basic 
> question - I'm rather new to R and trying to get up to speed.
> 
> Regards,
> 
> Justin
> 
>



Re: [R] apologies if you already received this?

2006-06-08 Thread Berton Gunter
?readline

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
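
A sketch of the wait() idea with readline(); in a non-interactive session readline() returns immediately, so the pause is guarded:

```r
# block until the user presses return (interactive sessions only)
wait <- function() {
  invisible(readline("press return to continue"))
}

for (i in 1:3) {
  plot(rnorm(50), main = paste("plot", i))
  if (interactive()) wait()  # no-op when the script runs in batch mode
}
```

An alternative worth knowing: par(ask = TRUE) makes the graphics device itself prompt before each new page.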

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> [EMAIL PROTECTED]
> Sent: Thursday, June 08, 2006 9:47 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] apologies if you already received this?
> 
> I am accessing my email account remotely and it
> seems to be acting strangely, so I am not sure
> if this R question was received. I apologize if it was
> and thanks for any help you can provide.
> 
> -
> 
> 
> Hi Everyone : As I mentioned earlier, I am taking a lot
> of Splus code and turning into R and I've run into
> another stumbling block that I have not been
> able to figure out.
> 
> I did plotting in a loop when I was using Splus on unix
> and the way I made the plots stop so I could
> look at them as they got plotted (there are hundreds
> if not thousands getting plotted sequentially ) 
> on the screen was by using the unix() command.
> 
> Basically, I wrote a function called wait()
> 
> 
> wait<-function()
> {
> cat("press return to continue")
> unix("read stuff")
> }
> 
> and this worked nicely because I then
> did source("program name") at the Splus prompt and
> a plot was created on the screen  and then
> the wait() function was right under the plotting code
> in the program so that you had to hit the return key to go to 
> the next plot.
> 
> I am trying to do the equivalent on R 2.2.0/Windows XP
> I did a ?unix in R and it came back with system() and
> said unix was deprecated so I replaced unix("read stuff") 
> with system("read stuff") but all i get is a warning "read 
> not found" and
> it flies through the successive plots and i can't see them.
> 
> Thanks for any help on this. It's much appreciated.
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Re-binning histogram data

2006-06-08 Thread Berton Gunter
I would argue that histograms are outdated relics and that density plots
(whatever your favorite flavor is) should **always** be used instead these
days.

In this vein, I would appreciate critical rejoinders (public or private) to
the following proposition: Given modern computer power and software like R
on multi-GHz machines, statistical and graphical relics of the pre-computer
era (like histograms, low resolution printer-type plots, and perhaps even
method of moments EMS calculations) should be abandoned in favor of superior
but perhaps computation-intensive alternatives (like density plots, high
resolution plots, and likelihood or resampling or Bayes based methods). 

NB: Please -- no pleadings that new methods would be mystifying to the
non-cognoscenti. Following that to its logical conclusion would mean that
we'd all have to give up our TV remotes and cell phones, and what kind of
world would that be?! :-)
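By way of illustration (a hypothetical bimodal sample), the density-plot alternative is a one-liner on top of, or instead of, the histogram:

```r
set.seed(1)
x <- c(rnorm(200), rnorm(100, mean = 3))    # a bimodal sample
hist(x, freq = FALSE, main = "histogram with density overlay")
lines(density(x), lwd = 2)                  # kernel density estimate
plot(density(x), main = "density plot alone")
```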

-- Bert Gunter

  

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Petr Pikal
> Sent: Thursday, June 08, 2006 6:17 AM
> To: Justin Ashmall; r-help@stat.math.ethz.ch
> Subject: Re: [R] Re-binning histogram data
> 
> 
> 
> On 8 Jun 2006 at 11:35, Justin Ashmall wrote:
> 
> Date sent:Thu, 8 Jun 2006 11:35:46 +0100 (BST)
> From: Justin Ashmall <[EMAIL PROTECTED]>
> To:   Petr Pikal <[EMAIL PROTECTED]>
> Copies to:r-help@stat.math.ethz.ch
> Subject:  Re: [R] Re-binning histogram data
> 
> > 
> > Thanks for the reply Petr,
> > 
> > It looks to me that truehist() needs a vector of data just like
> > hist()? Whereas I have histogram-style input data? Am I missing
> > something?
> 
> Well, maybe you could use barplot. Or as you suggested recreate the 
> original vector and call hist or truehist with other bins.
> 
> > hhh<-hist(rnorm(1000))
> > barplot(tapply(hhh$counts, c(rep(1:7,each=2),7), sum))
> > tapply(hhh$mids, c(rep(1:7,each=2),7), mean)
> 1 2 3 4 5 6 7 
> -3.00 -2.00 -1.00  0.00  1.00  2.00  3.25 
> > hhh1<-rep(hhh$mids,hhh$counts)
> > plot(hhh, freq=F)
> > lines(density(hhh1))
> >
> 
> HTH
> Petr
> 
> 
> 
> 
> 
> 
> > 
> > Cheers,
> > 
> > Justin
> > 
> > 
> > 
> > On Thu, 8 Jun 2006, Petr Pikal wrote:
> > 
> > > Hi
> > >
> > > try truehist from MASS package and look for argument breaks or h.
> > >
> > > HTH
> > > Petr
> > >
> > >
> > >
> > >
> > > On 8 Jun 2006 at 10:46, Justin Ashmall wrote:
> > >
> > > Date sent:Thu, 8 Jun 2006 10:46:19 +0100 (BST)
> > > From: Justin Ashmall <[EMAIL PROTECTED]>
> > > To:   r-help@stat.math.ethz.ch
> > > Subject:  [R] Re-binning histogram data
> > >
> > >> Hi,
> > >>
> > >> Short Version:
> > >> Is there a function to re-bin a histogram to new, broader bins?
> > >>
> > >> Long version: I'm trying to create a histogram, however my
> > >> input-data is itself in the form of a fine-grained 
> histogram, i.e.
> > >> numbers of counts in regular one-second bins. I want to produce a
> > >> histogram of, say, 10-minute bins (though possibly irregular bins
> > >> also).
> > >>
> > >> I suppose I could re-create a data set as expected by the hist()
> > >> function (i.e. if time t=3600 has 6 counts, add six 
> entries of 3600
> > >> to a list) however this seems neither elegant nor 
> efficient (though
> > >> I'd be pleased to be mistaken!). I could then re-create 
> a histogram
> > >> as normal.
> > >>
> > >> I guessing there's a better solution however! Apologies 
> if this is
> > >> a basic question - I'm rather new to R and trying to get up to
> > >> speed.
> > >>
> > >> Regards,
> > >>
> > >> Justin
> > >>
> > >> __
> > >> R-help@stat.math.ethz.ch mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide!
> > >> http://www.R-project.org/posting-guide.html
> > >
> > > Petr Pikal
> > > [EMAIL PROTECTED]
> > >
> > >
> > 
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> 
> Petr Pikal
> [EMAIL PROTECTED]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Accessing lme source code

2006-06-06 Thread Berton Gunter
1. You need to learn about S3 methods. ?UseMethod will tell you what you
need to know.

2. methods("lme") will tell you the available methods.

3. nlme:::lme.formula will give you the code. 
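The dispatch mechanism behind point 1 can be seen with a toy generic (the `foo.*` names are invented for illustration; `lme()` works the same way):

```r
## A generic whose body is just UseMethod(), exactly as lme()'s is:
foo <- function(x, ...) UseMethod("foo")
foo.default <- function(x, ...) "default method"
foo.formula <- function(x, ...) "formula method"

methods("foo")   # lists the available methods, as methods("lme") does
foo(y ~ x)       # a formula argument dispatches to foo.formula
foo(1)           # anything else falls through to foo.default
```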

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Yang, Richard
> Sent: Tuesday, June 06, 2006 11:03 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Accessing lme source code
> 
> Dear all;
> 
>   This is an FAQ. I tried to access the lme source script so I can step
> into it to debug the problems resulting from a lme() call. I used
> getAnywhere("lme") or nlme:::lme, both produced only the function
> definition and "UseMethod("lme"). 
> 
>   Any idea how to list the source code?
> 
>   TIA,
> 
> Richard Yang
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Constrained regression

2006-06-05 Thread Berton Gunter
If you haven't already done so, please make use of R's search capabilities
before posting.

help.search('constrained regression')
RSiteSearch('constrained regression') ## also available through CRAN's
search functionality.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Bogod
> Sent: Monday, June 05, 2006 1:01 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Constrained regression
> 
> Hi,
> I would like to run a constrained OLS, with the following constraints:
> 1. sum of coefficients must be 1
> 2. all the coefficients have to be positive.
> 
> Is there an eas way to specify that in R
> 
> Thank you,
>  Daniel Bogod
> [EMAIL PROTECTED]
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] lm() variance covariance matrix of coefficients.

2006-06-02 Thread Berton Gunter
or more simply and better,

vcov(lm.object)

?vcov

Note R's philosophy: use the available extractors to get the key features of
an object rather than indexing into it. This is safer, as it does not depend
on the object's particular structure/implementation, which can change. This is
the difference between "private" and "public" views of an object in OO lingo.
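A minimal sketch of the extractor approach, using the `cars` data set that ships with R:

```r
fit <- lm(dist ~ speed, data = cars)  # cars ships with R
V  <- vcov(fit)               # variance-covariance matrix of coefficients
se <- sqrt(diag(V))           # standard errors of the coefficients
```

The same numbers appear in the `Std. Error` column of `coef(summary(fit))`.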

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Rolf Turner
> Sent: Friday, June 02, 2006 2:25 PM
> To: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
> Subject: Re: [R] lm() variance covariance matrix of coefficients.
> 
> 
> summary(object)$cov.unscaled
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] FW: How to create a new package?

2006-06-01 Thread Berton Gunter
Quicker and dirtier is to simply save a workspace  with your desired
functions and attach() it automatically (e.g. via .First or otherwise) at
startup (?Startup for details and options). Of course none of the tools and
benefits of package management are available, but then I did say quicker and
dirtier.
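A sketch of that quick-and-dirty route; the function and file names are invented for illustration:

```r
## A throwaway utility function saved into its own image file:
cv <- function(x) sd(x) / mean(x)
f  <- file.path(tempdir(), "myfuns.RData")
save(cv, file = f)

## In a fresh session (or after rm()), attach the image instead of
## source()-ing it -- the saved functions appear on the search path:
rm(cv)
attach(f)
cv(1:10)
```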

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Ramon 
> Diaz-Uriarte
> Sent: Thursday, June 01, 2006 5:43 AM
> To: r-help@stat.math.ethz.ch
> Cc: Rita Sousa
> Subject: Re: [R] FW: How to create a new package?
> 
> Dear Rita,
> 
> Do you want a package just for yourself, or something useful 
> for others, with 
> docs, etc? I think the rest of the answers in this thread 
> will help you 
> create a "full fledged" package. See also the detailed explanation 
> in "Writing R extensions".
> 
> If you just want something quick and dirty that allows you to 
> use a bunch of 
> functions without using "source" (and thus cluttering your 
> global workspace), 
> is easy to move around, etc, you just need a directory 
> structure such as:
> 
> SignS2/
> SignS2/R/
> SignS2/R/SignS2.R
> SignS2/DESCRIPTION
> SignS2/Changes
> 
> (Change SignS2 for the name of your package).
> 
> This has no documentation whatsoever. You can get rid of the 
> "changes" file, 
> but I put it there to keep track of changes.
> 
> Run R CMD check against the directory (of course, you'll get 
> warnings about 
> missing documentation), and then R CMD build.
> 
> Best,
> 
> R.
> 
> 
> On Thursday 01 June 2006 13:23, michael watson (IAH-C) wrote:
> > ?package.skeleton
> >
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of 
> Gabor Grothendieck
> > Sent: 01 June 2006 12:20
> > To: Rita Sousa
> > Cc: r-help@stat.math.ethz.ch
> > Subject: Re: [R] FW: How to create a new package?
> >
> > The minimum is to create a DESCRIPTION file, plus R and man 
> directories
> > containing R code and .Rd files respectively. It might help 
> to run  Rcmd
> > CHECK mypkg  before installation and fix any problems it finds.
> >
> > Googling for   creating R package   will locate some tutorials.
> >
> > On 6/1/06, Rita Sousa <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > >
> > >
> > > I'm a group of functions and I would like to create a 
> package for load in
> > > R. I have created a directory named INE and a directory 
> below that named
> > > R, for the files of R functions. A have created the files 
> DESCRIPTION and
> > > INDEX in the INE directory. The installation from local 
> zip files, in the
> > > R 2.3.0, results but to load the package I get an error like:
> > >
> > >
> > >
> > > 'INE' is not a valid package -- installed < 2.0.0?
> > >
> > >
> > >
> > > I think that is necessary create a Meta directory with package.rds
> > > file, but I don't know make it! I have read the manual 'Writing R
> > > Extensions - 1. Creating R packages' but I don't understand the
> > > procedure...
> > >
> > > Can I create it automatically?
> > >
> > >
> > >
> > > Could you help me with this?
> > >
> > >
> > >
> > > Thanks,
> > >
> > > ---
> > > Rita Sousa
> > > DME - ME: Departamento de Metodologia Estatística - Métodos
> > > Estatísticos INE - DRP: Instituto Nacional de Estatística 
> - Delegação
> > > Regional do Porto
> > > Tel.: 22 6072016 (Extensão: 4116)
> > > ---
> > >
> > >
> > >
> > >
> > >[[alternative HTML version deleted]]
> > >
> > >
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> 
> -- 
> Ramón Díaz-Uriarte
> Bioinformatics 
> Centro Nacional de Investigaciones Oncológicas (CNIO)
> (Spanish National Cancer Center)
> Melchor Fernández Almagro, 3
> 28029 Madrid (Spain)
> Fax: +-34-91-224-6972
> Phone: +-34-91-224-6900
> 
> http://ligarto.org/rdiaz
> PGP KeyID: 0xE89B3462
> (http://ligarto.org/rdiaz/0xE89B3462.asc)
> 
> 
> 
> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en 
> s...{{dropped}}
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

___

Re: [R] Joining variables

2006-05-24 Thread Berton Gunter
What does "combination of both" mean, exactly? I can think of two
interpretations that have two different answers. If you give a small example
(as the posting guide suggests), it would certainly help me provide an
answer.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 

 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Guenther, Cameron
> Sent: Wednesday, May 24, 2006 11:46 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Joining variables
> 
> Hello,
> 
> If I have two variables that are factors or characters and I want to
> create a new variable that is the combination of both what 
> function can
> I use to accomplish this?
> 
> Ex.
> 
> Var1  Var2
> SA100055113   19851113
> 
> And I want
> 
> NewVar
> SA10005511319851113
> 
> Thanks in advance.
> 
> Cameron Guenther, Ph.D. 
> Associate Research Scientist
> FWC/FWRI, Marine Fisheries Research
> 100 8th Avenue S.E.
> St. Petersburg, FL 33701
> (727)896-8626 Ext. 4305
> [EMAIL PROTECTED]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R Reference Card (especially useful for Newbies)

2006-05-24 Thread Berton Gunter
 
Newbies (and others!) may find useful the R Reference Card made available by
Tom Short and Rpad at http://www.rpad.org/Rpad/Rpad-refcard.pdf  or through
the "Contributed" link on CRAN (where some other reference cards are also
linked). It categorizes and organizes a bunch of R's basic, most used
functions so that they can be easily found. For example, paste() is under
the "Strings" heading and expand.grid() is under "Data Creation." For
newbies struggling to find the right R function as well as veterans who
can't quite remember the function name, it's very handy.
 
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] everytime I download a new version of R, need I reinstall all packages?

2006-05-16 Thread Berton Gunter
> 
> I think you did not answer my question... I now upgraded my 
> main R program
> from 2.2.1 to 2.3.0 and I removed the 2.2.1 installation, but all the

Wait until after you have run update.packages() before removing your previous
installation. You can keep multiple versions of R installed simultaneously, so
this is no problem. That is:

1) Install new R version
2) Run update.packages() on old library version
3) Copy updated old library to new library location (or point new library
location to old)
4) Remove old R version (and its libraries if you copied them)

There are probably better ways to do this; perhaps this message will stimulate
others to post them.
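Roughly, in code (the old-library path is a hypothetical stand-in; substitute your own):

```r
## Run from the NEW R version.  The path below is a stand-in for the
## library directory of the OLD installation:
old.lib <- file.path(tempdir(), "old-library")
dir.create(old.lib, showWarnings = FALSE)

## Step 2 -- bring the old library up to date (needs network access):
# update.packages(lib.loc = old.lib, ask = FALSE)

## Step 3 alternative -- rather than copying, let the new R search the
## old library as well:
.libPaths(c(old.lib, .libPaths()))
```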


-- Bert

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Engel curve

2006-05-16 Thread Berton Gunter
RSiteSearch('Engel')

This search capability can also be used directly via a browser in CRAN.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Werner 
> Wernersen
> Sent: Tuesday, May 16, 2006 8:24 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Engel curve
> 
> Hi,
> 
> has anybody an example of an Engel curve analysis in R
> or does there exist a package to estimate and plot
> Engel curves from expenditure / income data in R?
> 
> Thanks a million for your hints,
>   Werner
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] anova statistics in lmer

2006-05-15 Thread Berton Gunter
Note the intrusion of modeling "philosophy" here: is it better to give an
exact but likely quite wrong answer, or to give an incomplete or no answer?
The latter seems to me to be an honest statement about the limits of one's
knowledge that the former obscures by a fog of exactitude. 

Obviously, others may disagree, but I applaud Doug Bates's forthright
approach.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
> Sent: Monday, May 15, 2006 9:15 AM
> To: Diego Vázquez; r-help@stat.math.ethz.ch
> Subject: Re: [R] anova statistics in lmer
> 
> The issue is not unresolved within lmer, but with the 
> statistical model itself. SAS gives you alternatives for the 
> ddf such as Kenward-Roger. But, as I have noted on the list 
> before, this makes the assumption that the ratio of the 
> variances follow an F distribution and that the only 
> remaining challenge is to then estimate the ddf. Then, one 
> can get all the p-values you want.
> 
> If you believe that is true, then the SAS options will give 
> you some statistics to use--not to say that they are correct, though. 
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Diego Vázquez
> > Sent: Monday, May 15, 2006 11:53 AM
> > To: r-help@stat.math.ethz.ch
> > Subject: [R] anova statistics in lmer
> > 
> > Dear list members,
> > 
> > I am new to R and to the R-help list. I am trying to perform 
> > a mixed-model analysis using the lmer() function. I have a 
> > problem with the output anova table when using the anova() 
> > function on the lmer output object: I only get the numerator 
> > d.f., the sum of squares and the mean squares, but not the 
> > denominator d.f., F statistics and P values.
> > Below is a sample output, following D. Bates' SASmixed 
> > example in his paper "Fitting linear mixed models in R" 
> > (R-News 5: 27-30).
> > 
> > By reading the R-help archive, I see that this problem has 
> > come up before (e.g., 
> > http://tolstoy.newcastle.edu.au/R/help/06/04/25013.html).
> > What I understand from the replies to this message is that 
> > this incomplete output results from some unresolved issues 
> > with lmer, and that it is currently not possible to use it to 
> > obtain full anova statistics. Is this correct? And is this 
> > still unresolved? If so, what is the best current alternative 
> > to conduct a mixed model analysis, other than going back to SAS?
> > 
> > I would greatly appreciate some help.
> > 
> > Diego
> > 
> > 
> > 
> > Example using SASmixed "HR" data (see D. Bates, "Fitting 
> > linear mixed models in R", R-News 5: 27-30)
> > 
> > > data("HR",package="SASmixed")
> > > library(lme4)
> > Loading required package: Matrix
> > Loading required package: lattice
> > 
> > Attaching package: 'lattice'
> > 
> > 
> > The following object(s) are masked from package:Matrix :
> > 
> >  qqmath
> > 
> > > (fm1<-lmer(HR~baseHR+Time*Drug+(1|Patient),HR))
> > Linear mixed-effects model fit by REML
> > Formula: HR ~ baseHR + Time * Drug + (1 | Patient)
> >   Data: HR
> >   AIC  BIClogLik MLdeviance REMLdeviance
> >  788.6769 810.9768 -386.3384   791.8952 772.6769
> > Random effects:
> >  Groups   NameVariance Std.Dev.
> >  Patient  (Intercept) 44.541   6.6739
> >  Residual 29.780   5.4571
> > number of obs: 120, groups: Patient, 24
> > 
> > Fixed effects:
> >  Estimate Std. Error t value
> > (Intercept)  33.962099.93059  3.4199
> > baseHR0.588190.11846  4.9653
> > Time-10.698352.42079 -4.4194
> > Drugb 3.380133.78372  0.8933
> > Drugp-3.778243.80176 -0.9938
> > Time:Drugb3.511893.42352  1.0258
> > Time:Drugp7.501313.42352  2.1911
> > 
> > Correlation of Fixed Effects:
> >(Intr) baseHR Time   Drugb  Drugp  Tm:Drgb
> > baseHR -0.963
> > Time   -0.090  0.000
> > Drugb  -0.114 -0.078  0.237
> > Drugp  -0.068 -0.125  0.236  0.504
> > Time:Drugb  0.064  0.000 -0.707 -0.335 -0.167 Time:Drugp  
> > 0.064  0.000 -0.707 -0.167 -0.333  0.500
> > > anova(fm1)
> > Analysis of Variance Table
> >   Df Sum Sq Mean Sq
> > baseHR 1 745.99  745.99
> > Time   1 752.86  752.86
> > Drug   2  86.80   43.40
> > Time:Drug  2 143.17   71.58
> > 
> > 
> > --
> > Diego Vázquez
> > Instituto Argentino de Investigaciones de las Zonas Áridas 
> > Centro Regional de Investigaciones Científicas y Tecnológicas 
> > CC 507, (5500) Mendoza, Argentina 
> > http://www.cricyt.edu.ar/interactio/dvazquez/
> > 
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http:/

Re: [R] bitwise addition

2006-05-12 Thread Berton Gunter
Not clear from your message what you want, but maybe try:

expand.grid(c(0,1),c(0,1),c(0,1))
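Assuming the goal is all 2^3 zero/one patterns, a sketch (the column names are invented):

```r
## All 2^3 rows of 0/1 values:
g <- expand.grid(b1 = 0:1, b2 = 0:1, b3 = 0:1)
nrow(g)   # 8 combinations
## expand.grid() varies its FIRST argument fastest; reverse the columns
## for the usual binary-counting order (leftmost digit slowest):
g[, 3:1]
```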

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
  

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Nameeta Lobo
> Sent: Friday, May 12, 2006 9:42 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] bitwise addition
> 
> 
> 
> Hello all again,
> 
> I want to do bitwise addition in R. I am trying to generate a matrix
> 
> 0001
> 0010
> 
> 
> 
> 
> I know the other ways of generating this matrix but I need to 
> look at bitwise
> addition. 
> 
> Any suggestions???
> 
> thanks a lot
> 
> Nameeta
> 
> 
> 
> -
> This email is intended only for the use of the individual\...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Can't there be a cd command?

2006-05-10 Thread Berton Gunter

...another fortunes package candidate? I especially liked the sections
beginning "R is a 4 wheel drive SUV...", but a lot of it is great IMHO. Well
said! Bestimmt!

-- Bert

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Greg Snow
> Sent: Wednesday, May 10, 2006 1:37 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Can't there be a cd command?
> 
> When talking about user friendliness of computer software I 
> like the analogy of cars vs. busses:
> 
> Busses are very easy to use, you just need to know which bus 
> to get on, where to get on, and where to get off (and you 
> need to pay your fare).  Cars on the other hand require much 
> more work, you need to have some type of map or directions 
> (even if the map is in your head), you need to put gas in 
> every now and then, you need to know the rules of the road 
> (have some type of drivers licence).  The big advantage of 
> the car is that it can take you a bunch of places that the 
> bus does not go and it is quicker for some trips that would 
> require transferring between busses.
> 
> Using this analogy programs like SPSS are busses, easy to use 
> for the standard things, but very frustrating if you want to 
> do something that is not already preprogrammed.
> 
> R is a 4-wheel drive SUV (though environmentally friendly) 
> with a bike on the back, a kayak on top, good walking and 
> running shoes in the passenger seat, and mountain climbing and 
> spelunking gear in the back.
> 
> R can take you anywhere you want to go if you take time to 
> learn how to use the equipment, but that is going to take 
> longer than learning where the bus stops are in SPSS.
> 
> Now there are tools like Rcmdr that help get you started 
> (maybe a gps unit in the R suv above), but if we make R too 
> user friendly then we limit what can be done with it.
> 
> I think the volume of mail in R-help is partly due to lack of 
> friendliness, but a lot of it is due to the flexibility as 
> well, if it did less, there would be less to learn and ask 
> questions about.  I for one prefer to do a little more work 
> learning the program in exchange for being able to do a lot 
> more with it.
> 
> 
> To mangle a famous Einstein quote:  "Statistical packages 
> should be made as user friendly as possible, but no friendlier."
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> (801) 408-8111
>  
> 
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Manuel 
> López-Ibáñez
> Sent: Wednesday, May 10, 2006 2:19 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Can't there be a cd command?
> 
> Jan T. Kim wrote:
> > 
> > That's an idea I like very much too -- much better than the 
> currently 
> > popular idea of "protecting" users from the "unfriendliness" of 
> > programming, anyway...
> > 
> 
> It is just my opinion that the amount of mail in R-help 
> speaks volumes about the current "friendliness" [1], or lack 
> thereof, of R. Perhaps I am just the only one who thinks this way...
> 
> [1] http://en.wikipedia.org/wiki/Usability
> 
>   
> __
> LLama Gratis a cualquier PC del Mundo. 
> Llamadas a fijos y móviles desde 1 céntimo por minuto. 
> http://es.voice.yahoo.com
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] command completion?

2006-05-10 Thread Berton Gunter
Is the following a fortunes package candidate?


> >Others need to run under ESS.
> 
> While this is a good things for Emacs lovers, the requirement 
> is rather 
> unwelcome for pagans!  :-)
> 
> -- 
> François Pinard   http://pinard.progiciels-bpi.ca
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 


:-)

-- Bert

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] importing a list

2006-05-10 Thread Berton Gunter
?dump  ?source

But do you really need to save the fitted object as a txt file?
Saving/loading it in native format (?save  ?load) and then using ?update
would seem to be more straightforward.
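A minimal sketch of the save/load/update route (the data set and file name are chosen for illustration):

```r
## Fit once, round-trip through R's native format, then modify:
fit <- lm(dist ~ speed, data = cars)
f <- file.path(tempdir(), "fit.RData")
save(fit, file = f)            # native-format save ...
load(f)                        # ... restores the object unchanged

fit2 <- update(fit, . ~ . + I(speed^2))   # change the predictors
```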

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> [EMAIL PROTECTED]
> Sent: Wednesday, May 10, 2006 8:24 AM
> To: R-help@stat.math.ethz.ch
> Subject: [R] importing a list
> 
> Hi, all.
> I'm trying to automate some regression operations in R leaving the 
> possibility to modify the predictors in the regression.
> 
> For example, I have saved in a list the results and then 
> exported as a txt 
> file, in which we can modify the predictors, putting for example 
> lm(y~x^2) instead of having lm(y~x) as in the original model.
> 
> Now, I need to import in R the txt file as a list to evaluate 
> the model. 
> Is that possible?
> 
> I played around
> with source() and file() but can't figure it out.
> 
> Thanks.
> Grazia
> 
> 
> M. Grazia Pittau, Ph.D.
> Post-Doctoral Research Fellow
> Department of Statistics
> Columbia University
> 1255 Amsterdam Avenue
> New York, NY  10027
> 
> [EMAIL PROTECTED]
> Phone: 212.851.2160
> Fax: 212.851.2164
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Can't there be a cd command?

2006-05-09 Thread Berton Gunter
You do not say, but if you are on Windows see the R for Windows FAQ 2.14
where getwd() is explicitly mentioned. setwd() is on the same man page.

Also the R for Windows File menu has a "Change dir ..." entry. So I think R
core has already taken care of this, at least on Windows.
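For completeness, the working-directory functions in action; the `cd` alias at the end is, of course, not part of base R:

```r
old <- getwd()       # query the working directory
setwd(tempdir())     # change it ...
setwd(old)           # ... and change back

## a one-line 'cd' for fingers that expect one:
cd <- setwd
cd(tempdir()); cd(old)
```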

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Issac Trotts
> Sent: Tuesday, May 09, 2006 2:51 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Can't there be a cd command?
> 
> R is quite a powerful environment.  Here's a small way it 
> could be even better.
> 
> I wanted to change the working directory, so I tried the obvious thing
> 
> > cd("foo")
> Error: couldn't find function "cd"
> 
> Then I looked for `directory' in the FAQ but found nothing.  A search
> for directory in the introduction also turned up nothing.
> 
> A Google search for "gnu R change directory" brought up a link to the
> Windows FAQ, and there was the answer: setwd.  Oddly enough, setwd is
> not mentioned in the general FAQ, the introduction, or the language
> definition.
> 
> Hopefully someone can add a mention of setwd to the general FAQ or the
> intro.  Even better would be to have a cd() command, since that's what
> almost every beginner will try first.
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Non repetitive permutations/combinations of elements

2006-05-08 Thread Berton Gunter


> 
> That will result in a data frame, rather than a matrix:
> 

Ah, indeed. I suspect that Nameeta doesn't care about the distinction, but
you're certainly right. I think the reason for the extra step was worth
explicitly mentioning, as many who are new to R are hazy about the
distinction (they're all just "tables").
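A short illustration of that distinction (a sketch using the ±1 design from the thread):

```r
## expand.grid() returns a data frame; as.matrix() coerces it
g <- expand.grid(rep(list(c(-1, 1)), 3))
class(g)        # "data.frame": columns could hold mixed types
m <- as.matrix(g)
is.matrix(m)    # TRUE
is.numeric(m)   # TRUE: a matrix has a single storage mode
```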

Cheers,
Bert



Re: [R] Non repetitive permutations/combinations of elements

2006-05-08 Thread Berton Gunter
expand.grid(rep(list(c(-1, 1)), 4)) suffices, I believe.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
  

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Marc 
> Schwartz (via MN)
> Sent: Monday, May 08, 2006 2:50 PM
> To: Nameeta Lobo
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Non repetitive permutations/combinations of elements
> 
> On Mon, 2006-05-08 at 16:32 -0500, Nameeta Lobo wrote:
> > Hello all,
> > 
> > I am trying to create a matrix of 1s and -1s without any 
> repetitions for a
> > specified number of columns.
> > e.g. 1s and -1s for 3 columns can be done uniquely in 2^3 ways.
> > -1 -1 -1
> > -1 -1  1
> > -1  1 -1
> > -1  1  1
> >  1 -1 -1
> >  1 -1  1
> >  1  1 -1
> >  1  1  1
> > and for 4 columns in 2^4 ways and so on.
> > 
> > I finally used the function combn([0 1],3) that I found at 
> the following link
> > 
> http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=7147&objectType=FILE
> > written by Jos van der Geest in Matlab which generated the above.
> > 
> > 
> > How can I do this is R? I have looked at permn and combn in 
> the combinat library
> > and permutations and combinations in the gtools library and 
> I am still confused
> > as to how to get it to work.
> > 
> > Any suggestions will be truly appreciated.
> > 
> > Thank you
> > 
> > Nameeta
> > 
> 
> With just two elements in the source vector, it may be easiest to just
> use expand.grid() and coerce the result to a matrix:
> 
> > as.matrix(expand.grid(rep(list(c(-1, 1)), 3)))
>   Var1 Var2 Var3
> 1   -1   -1   -1
> 2    1   -1   -1
> 3   -1    1   -1
> 4    1    1   -1
> 5   -1   -1    1
> 6    1   -1    1
> 7   -1    1    1
> 8    1    1    1
> 
> Just adjust the final value of '3' to the number of columns that you
> wish to have:
> 
> > as.matrix(expand.grid(rep(list(c(-1, 1)), 4)))
>Var1 Var2 Var3 Var4
> 1    -1   -1   -1   -1
> 2     1   -1   -1   -1
> 3    -1    1   -1   -1
> 4     1    1   -1   -1
> 5    -1   -1    1   -1
> 6     1   -1    1   -1
> 7    -1    1    1   -1
> 8     1    1    1   -1
> 9    -1   -1   -1    1
> 10    1   -1   -1    1
> 11   -1    1   -1    1
> 12    1    1   -1    1
> 13   -1   -1    1    1
> 14    1   -1    1    1
> 15   -1    1    1    1
> 16    1    1    1    1
> 
> 
> HTH,
> 
> Marc Schwartz
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] OT: DOE - experiments for teaching

2006-05-05 Thread Berton Gunter
I've had fun and luck with the apparatus described in my little paper:

THROUGH A FUNNEL SLOWLY WITH BALL BEARING AND INSIGHT TO TEACH EXPERIMENTAL
DESIGN 
The American Statistician, 47, 4 p. 265-269 (1993)

We continue to use this in our industrial training.

I also would strongly second Spencer's remarks re the difficulty of helping
students see the big picture. For some reason, viewing experimentation as
part of an overall learning process/strategy does not seem to be part of
most scientist's or engineer's formal education. I suppose if you look at
typical science or engineering labs where the goal is to come to a
predetermined conclusion, it's not hard to see why. But we don't need to get
into that imbroglio here.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves
> Sent: Friday, May 05, 2006 3:59 PM
> To: Thomas Kaliwe
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] OT: DOE - experiments for teaching
> 
> I fully endorse Richard Heiberger's recommendation of 
> the Bill Hunter 
> articles on teaching experimental design.  For a 
> college-level semester 
> D0E class, I had students do experiments in groups.  I found 
> it wise to 
> have them do a preliminary presentation with a discussion of the 
> experimental design plus their protocol for managing all the 
> details of 
> test materials, data collection, etc., then a final report with the 
> results.  Many students did fine, but some were clearly 
> clueless about 
> the whole process, which indicated a need for some adjustment 
> in what I 
> taught or in some individual assistance.
> 
> If this is just a few hours or a 1-day thing, you 
> might consider 
> "http://www.prodsyse.com/exped2b.pdf".
> 
> hope this helps.
> Spencer Graves
> 
> Thomas Kaliwe wrote:
> > Hi,
> >  
> > I'm sorry for this not being related to R but I think this is a good
> > place to ask. I'm looking for DOE examples (experiments) 
> that can be done
> > at home or in class, such as Paper Helicopter, Paper Towel, 
> > etc. I'm
> > thankful for any comment.
> >  
> > Thomas
> > 
> > [[alternative HTML version deleted]]
> > 
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> >
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] How to a handle an error in a loop

2006-05-05 Thread Berton Gunter
?try

as in

result <- try(some R expression...)
if (inherits(result, "try-error")) ...do something
else ...do something else


Hope this allows you to get to heaven
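A sketch of the pattern applied to the poster's situation; risky() here is a stand-in for tdt(), and dat for the columns of PGWide (both hypothetical):

```r
## Stand-in for a function that fails on some inputs
risky <- function(x) if (any(x > 10)) stop("bad locus") else mean(x)

dat <- list(a = 1:5, b = c(2, 50), c = 3:7)

## Wrap the call itself in try() inside an anonymous function;
## note that lapply(dat, try(risky)) would NOT work
res <- lapply(dat, function(col) try(risky(col), silent = TRUE))

## Drop the failures, keep the successes
ok <- !vapply(res, inherits, logical(1), what = "try-error")
res[ok]
```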

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Farrel 
> Buchinsky
> Sent: Friday, May 05, 2006 3:54 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] How to a handle an error in a loop
> 
> I am about one step away from heaven on earth. I think only one step!
> I am using dgc.genetics to run a TDT test on thousands of 
> genetic loci. I 
> have learnt (through the help of others on this mailing list) 
> to send the 
> complex output to useful data frames which in turn allow me 
> to look at the 
> big picture and screen the thousands of loci.
> 
> Resultdt<-lapply(PGWide[,240:290], tdt)
> the above would do 51 loci at a time. I want to do 6000 all 
> in one shot.
> the only problem is that about once every 100 variables in 
> PGWide there is a 
> locus whose data trips up the tdt function. And when it does, 
> it short 
> circuits the entire loop. Nothing gets written  to Resultdt.
> 
> > Resultdt<-lapply(PGWide[,240:389], tdt)
> Error in rep.default(1, nrow(U)) : rep() incorrect type for 
> second argument
> In addition: Warning messages:
> 1: 1 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, 
> as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> 2: 1 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, 
> as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> 3: 4 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, 
> as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> 
> 
> It is very laborious working out which column it is that 
> caused the error.
> But by narrowing down the range of the looping, one can do 
> it. So eventually 
> I got to:
> 
> > tdt(PGWide[,243])
> Error in rep.default(1, nrow(U)) : rep() incorrect type for 
> second argument
> In addition: Warning message:
> 4 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, 
> as.allele.pair = TRUE, 
> allow.ambiguous = (parent ==
> 
> Do I have to grapple with the code inside the tdt function in 
> order to 
> handle the error (disappointing because I only half 
> understand the R code in 
> it) or is there an easy way to instruct R to simply move onto 
> the next one. 
> I saw something about try() but it got me nowhere. I tried
> Resultdt<-lapply(PGWide[,385:389],try(tdt))
> 
> 
> -- 
> Farrel Buchinsky, MD
> Pediatric Otolaryngologist
> Allegheny General Hospital
> Pittsburgh, PA
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] Adding elements in an array where I have missing data.

2006-05-02 Thread Berton Gunter
That is curious, as I thought when I checked it that evalq did what I said
by default. Apparently not. However, to continue in the vein of complex
solutions for simple problems, either

explicitly using local() or specifying a local environment as an argument

 evalq({a[is.na(a)]<-0;a},env=new.env())+b 

does it. Either must be used to protect against argument evaluation, which
is the point of interest.

The solutions you suggested do this implicitly by creating local function
environments in which the assignment occurs of course.

This also means that my arcane explanation of evaluation is wrong: the
evaluator does not by default do the assignment in a local environment as I
stated, but follows the pointers (enclosures) -- of course. Local evaluation
must be explicitly forced. 
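A quick top-level check of that point (a sketch with the same a and b as in the thread):

```r
a <- c(2, NA, 3)
b <- c(3, 4, 5)

## Evaluate in a throwaway environment: the replacement happens
## on a local copy, so the global 'a' is untouched
res <- evalq({ a[is.na(a)] <- 0; a }, new.env()) + b
res   # 5 4 8
a     # still 2 NA 3

## local() expresses the same intent more directly
res2 <- local({ a[is.na(a)] <- 0; a }) + b
```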

-- Bert
 
 

> -Original Message-
> From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, May 02, 2006 11:14 AM
> To: Berton Gunter
> Cc: John Kane; R R-help
> Subject: Re: [R] Adding elements in an array where I have 
> missing data.
> 
> But the evalq solution does change a.
> 
> > a <- c(2, NA, 3)
> > b <- c(3,4, 5)
> > evalq({a[is.na(a)]<-0;a})+b
> [1] 5 4 8
> > a
> [1] 2 0 3
> 
> If evalq were changed to local then it would not change a:
> 
> > a <- c(2, NA, 3)
> > b <- c(3,4, 5)
> > local({a[is.na(a)]<-0;a})+b
> [1] 5 4 8
> > a
> [1]  2 NA  3
> 
> Also the replace, ifelse and mapply solutions do not change a.
> 
> 
> On 5/2/06, Berton Gunter <[EMAIL PROTECTED]> wrote:
> > Below.
> >
> > > -Original Message-
> > > From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, May 02, 2006 10:42 AM
> > > To: Berton Gunter
> > > Cc: John Kane; R R-help
> > > Subject: Re: [R] Adding elements in an array where I have
> > > missing data.
> > >
> > > On 5/2/06, Berton Gunter <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > Here are a few alternatives:
> > > > >
> > > > > replace(a, is.na(a), 0) + b
> > > > >
> > > > > ifelse(is.na(a), 0, a) + b
> > > > >
> > > > > mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
> > > > >
> > > >
> > > > Well, Gabor, if you want to get fancy...
> > > >
> > > > evalq({a[is.na(a)]<-0;a})+b
> > > >
> > >
> > > Note that the evalq can be omitted:
> > >
> > >{ a[is.na(a)] <- 0; a } + b
> > >
> >
> > No it can't. The idea is **not** to change the original a.
> >
> > -- Bert
> >
> >
>



Re: [R] Adding elements in an array where I have missing data.

2006-05-02 Thread Berton Gunter
Below.

> -Original Message-
> From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, May 02, 2006 10:42 AM
> To: Berton Gunter
> Cc: John Kane; R R-help
> Subject: Re: [R] Adding elements in an array where I have 
> missing data.
> 
> On 5/2/06, Berton Gunter <[EMAIL PROTECTED]> wrote:
> > >
> > > Here are a few alternatives:
> > >
> > > replace(a, is.na(a), 0) + b
> > >
> > > ifelse(is.na(a), 0, a) + b
> > >
> > > mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
> > >
> >
> > Well, Gabor, if you want to get fancy...
> >
> > evalq({a[is.na(a)]<-0;a})+b
> >
> 
> Note that the evalq can be omitted:
> 
>{ a[is.na(a)] <- 0; a } + b
> 

No it can't. The idea is **not** to change the original a.

-- Bert



Re: [R] Adding elements in an array where I have missing data.

2006-05-02 Thread Berton Gunter
> 
> 
> --- Berton Gunter <[EMAIL PROTECTED]> wrote:
> 
> > > 
> > > Here are a few alternatives:
> > > 
> > > replace(a, is.na(a), 0) + b
> > > 
> > > ifelse(is.na(a), 0, a) + b
> > > 
> > > mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
> > > 
> > 
> > Well, Gabor, if you want to get fancy...
> > 
> > evalq({a[is.na(a)]<-0;a})+b
> 
> It's going into my tips file but what does it mean??
> Thanks
> 
> 
Note 1: The following is probably more arcane than most R users would care
to be bothered with. You have been forewarned.

Note 2: I am far from an expert on this so I would appreciate public
correction of any errors in the following.

Well, "what it means" is "explained" in the man page for evalq, but to
understand it you have to understand expression evaluation in R (or, really,
in any computer language). Basically, my understanding is as follows: when R
sees a series of characters like

a + b

it goes through roughly the following steps to figure out what to do (the
situation is actually more complicated because of method dispatch, but I'll
ignore this):

1) R creates a parse tree -- equivalently, a list -- with root "+" and 2
leaves, a and b. 

2) R now by default needs to evaluate the symbols "a" and "b" (as names, not
character strings). It uses its lexical scoping procedures to do this. That
is, it uses lexical scoping to decide where to look up the name-value pairs
whose names are a and b. See the R FAQ 3.3.1, ?environment, or the R
Language Definition Manual for more on this (also V & R's S PROGRAMMING has
a nice discussion of this).  

3) It now substitutes the values for a and b into the parse tree (or issues
an error message if none can be found, etc.). This is what is meant by "the
arguments are evaluated before being passed to the evaluator." 

4) This parse tree is now passed to the evaluator which adds the values (or,
in general, calls the appropriate method, I think -- I'm fuzzy on exactly
how method dispatch occurs here) and returns the result to R.

So how does this apply to the above? Well, the stuff in the curly braces is
an "expression" that ordinarily would be parsed and evaluated and its value
substituted into the left node of the "+" parse tree (the overall expression
[] + b ). As part of this evaluation, 0 would be substituted for the
missings in a and the changed a would be saved in the Global environment
(or, more generally, whatever the enclosing environment is).  However, evalq
protects it's argument from that evaluation, so that the whole expresseion
is passed **as an expression** -- e.g. an unevaluated parse tree -- to the
left node of the "+" parse tree. The right node symbol, b, would be
evaluated, since it's not so protected. This entire parse tree with "+" at
its root node is then passed to the evaluator for evaluation. It is
evaluated there locally -- that is, the changes in x are made only on a
local copy of x in the evaluator, not on the x in the global environment --
and the resulting value returned **without having changed x.**

HTH.

Cheers,
Bert



Re: [R] Adding elements in an array where I have missing data.

2006-05-02 Thread Berton Gunter
> 
> Here are a few alternatives:
> 
> replace(a, is.na(a), 0) + b
> 
> ifelse(is.na(a), 0, a) + b
> 
> mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
> 

Well, Gabor, if you want to get fancy...

evalq({a[is.na(a)]<-0;a})+b

(and variants...)

Cheers,
Bert



Re: [R] Looking for an unequal variances equivalent of the KruskalWallis nonparametric one way ANOVA

2006-04-27 Thread Berton Gunter
Why not bootstrap or simulate (e.g. permutation test)? Sounds like exactly
the sort of situation for which it's designed.
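For instance, a bare-bones two-sample permutation test might look like this (a sketch with made-up data, assuming exchangeability of the groups under the null):

```r
set.seed(42)
x <- rexp(20)            # skewed, unequal-variance samples (made up)
y <- 2 * rexp(30) + 0.5
obs <- mean(x) - mean(y)

## Repeatedly reshuffle group labels and recompute the statistic
pooled <- c(x, y)
nx <- length(x)
perm <- replicate(4999, {
  idx <- sample(length(pooled), nx)
  mean(pooled[idx]) - mean(pooled[-idx])
})

## Two-sided permutation p-value (with the usual +1 correction)
p.value <- (sum(abs(perm) >= abs(obs)) + 1) / (length(perm) + 1)
p.value
```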

-- Bert
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
> Sent: Thursday, April 27, 2006 8:39 AM
> To: Mike Waters
> Cc: R-help@stat.math.ethz.ch
> Subject: Re: [R] Looking for an unequal variances equivalent 
> of the KruskalWallis nonparametric one way ANOVA
> 
> "Mike Waters" <[EMAIL PROTECTED]> writes:
> 
> > Well fellow R users, I throw myself on your mercy. Help me, 
> the unworthy,
> > satisfy my employer, the ungrateful. My feeble ramblings follow...
> > 
> > I've searched R-Help, the R Website and done a GOOGLE 
> without success for a
> > one way ANOVA procedure to analyse data that are both 
> non-normal in nature
> > and which exhibit unequal variances and unequal sample 
> sizes across the 4
> > treatment levels. My particular concern is to be able to 
> discrimintate
> > between the 4 different treatments (as per the Tukey HSD in 
> happier times).
> > 
> > To be precise, the data exhibit negative skew and 
> platykurtosis and I was
> > unable to obtain a sensible transformation to normalise 
> them (obviously
> > trying subtracting the value from range maximum plus one in 
> this process).
> > Hence, the usual Welch variance-weighted one way ANOVA 
> needs to be replaced
> > by a nonparametric alternative, Kruskal-Wallis being ruled 
> out for obvious
> > reasons. I have read that, if the treatment with the fewest 
> sample numbers
> > has the smallest variance (true here) the parametric tests 
> are conservative
> > and safe to use, but I would like to do this 'by the book'.
> 
> What are the sample sizes like? Which assumptions are you willing to
> make _under the null hypothesis_?  
> 
> If it makes sense to compare means (even if nonnormal), then a
> Welch-type procedure might suffice if the DF are large.
> 
> pairwise.wilcox.test() might also be a viable alternative, with a
> suitably p-adjustment. This would make sense if you believe that the
> relevant null for comparison between any two treatments is that they
> have identical distributions. (With only four groups, I'd be inclined
> to use the Bonferroni adjustment, since it is known to be
> conservative, but not badly so.)
> 
> -- 
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  
> (+45) 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: 
> (+45) 35327907
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] Were to find appropriate functions for a given task in R

2006-04-26 Thread Berton Gunter
1. Check out the R reference card at
http://www.rpad.org/Rpad/Rpad-refcard.pdf . There are also several others
available from the CRAN website.

2. Check out TINN-R, a Windows text/R code editor that integrates the above
and provides function "tips" inline to give the syntax of many R functions.

3. ?help.search

4. ?RSiteSearch (or search CRAN directly using Jon Baron's search engine).

These do not eliminate the problem, but hopefully mitigate it. Given that
there are several thousand R functions spread among hundreds of packages at
at least three separate repositories (CRAN, BioConductor, and Omegahat),
it's clearly a nontrivial issue. But that's why Google and other search
services are multibillion dollar companies.

HTH

Cheers,
Bert
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Albert Sorribas
> Sent: Wednesday, April 26, 2006 8:32 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Were to find appropriate functions for a given task in R
> 
> This is a generic request concerning where to look to find
> appropriate information on a precise procedure in R.
> I'm using R for teaching introductory statistics and my students are
> learning how to deal with it. However, I find it difficult to locate
> some of the procedures. For instance, for basic crosstabulation, it is
> obvious that such basic functions as table, ftable, and prop.table can be
> used. But there is a CrossTable function that is very useful. This is
> hidden in gmodels and gregmisc, as far as I've been able to 
> explore the
> packages. However, there is no way (unless I sit down to r-help for
> hours) to be sure if there is some other place in which a very useful
> function is hidden for table manipulation (for instance 
> controlling for
> other variables). This is only an example. But there are many 
> more. Where
> to look for CI for proportions? I can find it but it is not easy.
> 
> I understand R is more appropriate for difficult statistical 
> procedures
> (glm and similar), BUT students need to start somewhere.
> 
> My specific claim is about the need for a sort of guide in which the
> different procedures could be classified (and some 
> redundancies could be
> deleted, by the way). Is there something similar around? Any project
> working on this? Any clue for?
> 
> If not, I would suggest starting some kind of easy reference based on
> the problem to solve. This could indicate where to look. The other day I
> found in package vcd that a function exists for testing the
> goodness-of-fit of a sample to binomial and other distributions, but
> this was VERY difficult to locate. 
> 
> Anyway, as usual, any indication will be very useful 
> (especially for my
> students!!!)
> 
> 
> 
> Albert Sorribas
> Professor of Statistics and Operational Research
> Departament de Ciències Mèdiques Bàsiques
> Universitat de Lleida
> Montserrat Roig 2
> 25008-Lleida (Espanya)
> web.udl.es/Biomath/Group
>  
> 
> 
>   [[alternative HTML version deleted]]
> 
>



Re: [R] regression modeling

2006-04-25 Thread Berton Gunter
May I offer a perhaps contrary perspective on this.

Statistical **theory** tells us that the precision of estimates improves as
sample size increases. However, in practice, this is not always the case.
The reason is that it can take time to collect that extra data, and things
change over time. So the very definition of what one is measuring, the
measurement technology by which it is measured (think about estimating tumor
size or disease incidence or underemployment, for example), the presence or
absence of known or unknown large systematic effects, and so forth may
change in unknown ways. This defeats, or at least complicates, the
fundamental assumption that one is sampling from a (fixed) population or
stable (e.g. homogeneous, stationary) process, so it's no wonder that all
statistical bets are off. Of course, sometimes the necessary information to
account for these issues is present, and appropriate (but often complex)
statistical analyses can be performed. But not always.

Thus, I am suspicious, cynical even, about those who advocate collecting
"all the data" and subjecting the whole vast heterogeneous mess to arcane
and ever more computer intensive (and adjustable parameter ridden) "data
mining" algorithms to "detect trends" or "discover knowledge." To me, it
sounds like a prescription for "turning on all the equipment and waiting to
see what happens" in the science lab instead of performing careful,
well-designed experiments.

I realize, of course, that there are many perfectly legitimate areas of
scientific research, from geophysics to evolutionary biology to sociology,
where one cannot (easily) perform planned experiments. But my point is that
good science demands that in all circumstances, and especially when one
accumulates and attempts to aggregate data taken over spans of time and
space, one needs to beware of oversimplification, including statistical
oversimplification. So interrogate the measurement, be skeptical of
stability, expect inconsistency. While "all models are wrong but some are
useful" (George Box), the second law tells us that entropy still rules.

(Needless to say, public or private contrary views are welcome).

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi
> Sent: Tuesday, April 25, 2006 12:10 PM
> To: bogdan romocea
> Cc: r-help
> Subject: Re: [R] regression modeling
> 
> I believe it is not a question only related to regression 
> modeling. The
> correlation between the sample size and confidence of 
> prediction in data
> mining is not as clear as in the traditional statistical approach.  My 
> concern is not in
> that theoretical discussion but more practical, looking for a 
> good algorithm
> when the response variable is continuous and a large dataset is involved.
> 
> On 4/25/06, bogdan romocea <[EMAIL PROTECTED]> wrote:
> >
> > There is an aspect, worthy of careful consideration, you 
> don't seem to
> > be aware of. I'll ask the question for you: How does the
> > explanatory/predictive potential of a dataset vary as the 
> dataset gets
> > larger and larger?
> >
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi
> > > Sent: Monday, April 24, 2006 12:45 PM
> > > To: r-help
> > > Subject: [R] regression modeling
> > >
> > > Hi, there:
> > > I am looking for a regression modeling (like regression
> > > trees) approach for
> > > a large-scale industry dataset. Any suggestion on a package
> > > from R or from
> > > other sources which has a decent accuracy and scalability? Any
> > > recommendation from experience is highly appreciated.
> > >
> > > Thanks,
> > >
> > > Weiwei
> > >
> > > --
> > > Weiwei Shi, Ph.D
> > >
> > > "Did you always know?"
> > > "No, I did not. But I believed..."
> > > ---Matrix III
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> >
> 
> 
> 
> --
> Weiwei Shi, Ph.D
> 
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] how to control the data type

2006-04-21 Thread Berton Gunter
print(rnorm(1),digits=3)

The key is understanding the difference between the value and what is
printed automatically with the default number of digits given by the
options() value currently in effect.

?options  for "digits"
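The difference in one small example:

```r
x <- 0.12345678
round(x, 3)           # a new value: 0.123
print(x, digits = 3)  # same value, shorter display
x                     # full precision is still there
```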

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Kane
> Sent: Friday, April 21, 2006 3:59 PM
> To: zhongmiao wang; r-help@stat.math.ethz.ch
> Subject: Re: [R] how to control the data type
> 
> Have a look at  ?round for one way. 
> 
> - Original Message 
> From: zhongmiao wang <[EMAIL PROTECTED]>
> To: r-help@stat.math.ethz.ch
> Sent: Thursday, April 20, 2006 11:00:25 PM
> Subject: [R] how to control the data type
> 
> Hello:
> I am generating a random number with rnorm(1). The generated number
> has 8 decimals. I don't want so many decimals. How to control the
> number of decimals in R?
> Thanks!
> 
> Zhongmiao Wang
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] smooth the ecdf plots

2006-04-20 Thread Berton Gunter
Better idea: Compare directly. ?qqplot
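With the two vectors from the question (day.hos2 and day.hos3, copied from the post below), the direct comparison is one call (sketch):

```r
day.hos2 <- c(6,4,6,6,4,6,5,4,7,5,6,6,8,6,17,9,8,4,6,3,5,8,7,12,5,10,6,4,6,
              13,7,6,6,25,4,9,96,6,6,6,6,9,4,5,5,4,10,5,7,6)
day.hos3 <- c(5,6,7,6,4,5,6,6,6,6,19,7,5,9,8,8,7,5,6,20,40,5,8,7,7,5,6,13,
              11,9,4,6,9,16,6,7,6)

## Quantile-quantile plot of one sample against the other;
## points near the line y = x indicate similar distributions
qqplot(day.hos2, day.hos3)
abline(0, 1, lty = 2)
```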

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Linda Lei
> Sent: Thursday, April 20, 2006 10:08 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] smooth the ecdf plots
> 
> Hi All,
> 
>  
> 
> I have codes as follows to get the ecdf plots:
> 
>  
> 
>  >
> day.hos2<-c(6,4,6,6,4,6,5,4,7,5,6,6,8,6,17,9,8,4,6,3,5,8,7,12,
> 5,10,6,4,6
> ,13,7,6,6,25,4,9,96,6,6,6,6,9,4,5,5,4,10,5,7,6)
> 
>  
> 
>  >
> day.hos3<-c(5,6,7,6,4,5,6,6,6,6,19,7,5,9,8,8,7,5,6,20,40,5,8,7
> ,7,5,6,13,
> 11,9,4,6,9,16,6,7,6)
> 
>  
> 
>  > f<-ecdf(day.hos2)
> 
>  
> 
>  > plot(f,col.p='red',col.h='red')
> 
>  
> 
>  > g<-ecdf(day.hos3)
> 
>  
> 
> > lines(g,lty=2)
> 
>  
> 
> But in order to compare the two ecdf plots. I want to smooth the ecdf
> plot, make it like a continuous distribution curve. Could you please
> 
> help me with it? I try to find some arguments in "plot" but not
> successful.
> 
>  
> 
> Thank you!
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] 3D pie

2006-04-19 Thread Berton Gunter
For more comments on this sort of thing, google on "chartjunk."

-- Bert

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Martin Maechler
> Sent: Tuesday, April 18, 2006 11:55 PM
> To: COMTE Guillaume
> Cc: r-help@stat.math.ethz.ch; Patrick Burns
> Subject: Re: [R] 3D pie
> 
> > "PatBurns" == Patrick Burns <[EMAIL PROTECTED]>
> > on Tue, 18 Apr 2006 19:09:25 +0100 writes:
> 
> PatBurns> You can see my opinion of 3D piecharts at
> PatBurns> 
> http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html
> 
> PatBurns> Patrick Burns [EMAIL PROTECTED] +44 (0)20
> PatBurns> 8525 0696 http://www.burns-stat.com (home of S
> PatBurns> Poetry and "A Guide for the Unwilling S User")
> 
> Indeed!
> Or:
>If you really want to commit the crime of producing 3D pies,
>then please do not abuse a beautiful software like R,
>but stay with poor man's Excel!
> 
> Martin Maechler, ETH Zurich
> 
> 
> PatBurns> COMTE Guillaume wrote:
> 
> >> Hi all,
> >> 
> >> 
> >> 
> >> Is there a way to draw 3D pie with R (like excel does)?
> >> 
> >> 
> >> 
> >> I know how to do it in 2D, just by using
> >> pie(something)...
> >> 
> >> 
> >> 
> >> I know it isn't the best way to represent data, but
> >> people are sometimes more interested by the look and feel
> >> than by the accuracy of the results...
> >> 
> >> 
> >> 
> >> If there is no way, have you another suggestion ? (i
> >> already use dotchart instead of pie)
> >> 
> >> 
> >> 
> >> Thks to all of you.
> >> 
> >> COMTE Guillaume
> >> 
> >> 
> >> 
> 
> PatBurns> __
> PatBurns> R-help@stat.math.ethz.ch mailing list
> PatBurns> https://stat.ethz.ch/mailman/listinfo/r-help
> PatBurns> PLEASE do read the posting guide!
> PatBurns> http://www.R-project.org/posting-guide.html
> 



[R] R Reference Card (especially useful for Newbies)

2006-04-18 Thread Berton Gunter
 

Newbies (and others!) may find useful the R Reference Card made available by
Tom Short and Rpad at http://www.rpad.org/Rpad/Rpad-refcard.pdf  or through
the "Contributed" link on CRAN (where some other reference cards are also
linked). It categorizes and organizes a bunch of R's basic, most used
functions so that they can be easily found. For example, paste() is under
the "Strings" heading and expand.grid() is under "Data Creation." For
newbies struggling to find the right R function as well as veterans who
can't quite remember the function name, it's very handy.
 
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box




Re: [R] Questions on formula in princomp

2006-04-17 Thread Berton Gunter
As always, please read the Help file, in particular the details on encoding
of factors. It's dense, but it explains everything.
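The encoding detail at issue can also be checked directly at the console (a
minimal sketch, not taken from the help file itself):

```r
## A factor used as a subscript is treated as its integer codes,
## which follow the sorted order of the levels (here X=1, Y=2, Z=3).
groups <- factor(c("Z", "Z", "X", "Y"))
as.integer(groups)       # 3 3 1 2
c(10, 20, 30)[groups]    # 30 30 10 20, i.e. c(10,20,30)[as.integer(groups)]
```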

-- Bert
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Sasha Pustota
> Sent: Friday, April 14, 2006 9:08 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Questions on formula in princomp
> 
> jim holtman <[EMAIL PROTECTED]> wrote:
> > does this explain it?
> >
> > > groups <- factor(c(rep("Z",5),rep("X",5),rep("Y",5)))
> > >
> > > groups
> >  [1] Z Z Z Z Z X X X X X Y Y Y Y Y
> > Levels: X Y Z
> > > as.integer(groups)
> >
> >  [1] 3 3 3 3 3 1 1 1 1 1 2 2 2 2 2
> >  > c(1,2,3)[groups]
> >  [1] 3 3 3 3 3 1 1 1 1 1 2 2 2 2 2
> 
> I did notice the lexicographical ordering of Z,X,Y. I don't understand
> the meaning of subscripting c(1,2,3) by a factor. I understand
> subscripting by an integer, or by a single item as in associative
> arrays. Or does "[]" have a different meaning here?
> 



Re: [R] Questions on formula in princomp

2006-04-14 Thread Berton Gunter
The object returned by princomp() can include a $scores component, the
matrix of component scores. You can plot these as usual, using any plotting
characters you like via the 'pch' parameter of plot():

e.g.
## groups is a factor giving the group for each observation; assuming
## three groups:
myscores <- princomp(..., scores = TRUE)$scores
plot(myscores[, 1:2], pch = c('s', 'c', 'v')[groups])

Of course, this is not quite a biplot, but it's close.
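Filled in with the iris data used earlier in the thread, the idea looks like
this (a sketch; the grouping letters are chosen arbitrarily):

```r
## Plot the first two principal-component scores, one letter per species
ir <- rbind(iris3[, , 1], iris3[, , 2], iris3[, , 3])
groups <- factor(rep(c("s", "c", "v"), each = 50))
pc <- princomp(log(ir), scores = TRUE)
plot(pc$scores[, 1:2], pch = as.character(groups))
```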

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Sasha Pustota
> Sent: Friday, April 14, 2006 3:35 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Questions on formula in princomp
> 
> Ok, that was just my wishful thinking.
> 
> Is there a way to plot repeated labels that identify groups, e.g.
> factor(c(rep("s",50),rep("c",50),rep("v",50)))
> 
> instead of 1--150 row indices, using something like
> biplot(princomp(lir)) ?
> 
> 
> Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> > Just use model.frame to examine what is passed:
> >
> > > ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
> > > lir <- data.frame(log(ir))
> > > names(lir) <- c("a","b","c","d")
> > > lir[1,1] <- NA
> > > mf <- model.frame(~., lir,na.action=na.omit)
> > > head(mf)
> >          a        b         c          d
> > 2 1.589235 1.098612 0.3364722 -1.6094379
> > > head(lir)
> >          a        b         c          d
> > 1       NA 1.252763 0.3364722 -1.6094379
> 



Re: [R] how to count the columns of a data.frame

2006-04-14 Thread Berton Gunter
> 
> Hi,
>   I would like to count the columns of a data.frame. I know 
> how to count the rows, but not the columns.

...

If you knew how to count the rows, you would have known about nrow, which
shares its man page with ncol. Also, help.search('number of rows') would
have immediately given you your answer. So please do your homework before
posting in future by using R's extensive built-in documentation.
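(For completeness, a minimal illustration:)

```r
## nrow() and ncol() count rows and columns; dim() returns both at once
df <- data.frame(a = 1:3, b = 4:6)
nrow(df)   # 3
ncol(df)   # 2
dim(df)    # 3 2
```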

-- Bert Gunter



Re: [R] Plotting positions in qqnorm?

2006-04-13 Thread Berton Gunter
Spencer:

I seem to remember that Jim Filliben did some work on this. Try checking the
references in this:

J. J. Filliben (1975) 
The Probability Plot Correlation Coefficient Test for Normality
Technometrics, Vol. 17, No. 1, pp. 111-117  

My experience agrees with yours: if sample sizes are small enough for it to
make a difference, then sample sizes are too small to say much useful about
the distribution anyway. Heresy: I gave up using normal and half normal
plots for screening designs years ago, as they never told me more (nor less)
than dot plots.
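The effect of the constant is easy to inspect directly (a sketch; the
formula is the one documented for ppoints()):

```r
## Plotting positions (1:n - a)/(n + 1 - 2a) for a few choices of 'a'
n <- 8
for (a in c(0, 0.3, 0.3175, 3/8, 0.5)) {
  pp <- ((1:n) - a) / (n + 1 - 2 * a)
  cat(sprintf("a = %.4f: positions range %.4f .. %.4f\n", a, pp[1], pp[n]))
}
## qqnorm() itself uses ppoints(), i.e. a = 3/8 for n <= 10, else a = 1/2
```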

Cheers,
Bert

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves
Sent: Thursday, April 13, 2006 1:21 PM
To: R help list
Subject: [R] Plotting positions in qqnorm?

  Do you know of a reference that discusses alternative choices for 
plotting positions for a normal probability plot?  The documentation for 
qqnorm says it calls ppoints, which returns qnorm((1:n - a)/(n + 1 - 2*a)) 
with "a" = ifelse(n <= 10, 3/8, 1/2).  The help pages for qqnorm and 
ppoints just refer to Becker, Chambers and Wilks (1988) The New S 
Language (Wadsworth & Brooks/Cole), and I couldn't find any discussion 
of this.

  I seem to recall that this was discussed in 1960 or earlier in a 
paper by Anscombe, but I can't find a reference and I wonder if someone 
might suggest something else.  I've been asked to comment on specialized 
software that allows the user to select "a" = +/-0.5, 0, 0.3, and 0.3175 
(but not 0.375 = 3/8, curiously).

  I'd also be interested in any examples of real data sets where the 
choice of "a" actually made a difference.  When I've had so few data 
points that the choice for "a" might make a difference, a normal 
probability plot was not very informative, anyway, and I get more 
information from a simple dot plot.  If your experience is different, 
I'd like to know.

  Thanks,
  Spencer Graves




Re: [R] factor analysis backwards

2006-04-12 Thread Berton Gunter
RSiteSearch("simulate specified covariance") will bring you to mvrnorm() in
MASS. Please try to use R's built-in search capabilities first before
posting. I realize that keywords can be hard to guess, but you may find that
when the hits are **not** what you want, you need to refine your question
more, as described in the posting guide (have you read it?)
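A minimal sketch of the mvrnorm() approach (the target covariance here is
chosen arbitrarily for illustration):

```r
## Simulate data whose population covariance is a specified matrix
library(MASS)
Sigma <- matrix(c(1, 0.7, 0.7, 2), nrow = 2)   # target covariance
set.seed(1)
X <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma)
round(cov(X), 2)   # close to Sigma for large n
```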

-- Bert Gunter   

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Stefan Premke
Sent: Wednesday, April 12, 2006 6:52 AM
To: r-help@stat.math.ethz.ch
Subject: [R] factor analysis backwards

Hello!
How can I run a factor analysis backwards, i.e. obtain an arbitrary
covariance matrix from a set of generated random variables whose
correlations are near zero? Or, the same question put more briefly: how
can I generate random variables that have a specified correlation pattern?
I would like to be able to do this to generate arbitrary data structures
for simulation purposes.

sincerely
stefan




Re: [R] finding common elements in a list

2006-04-08 Thread Berton Gunter
The original post is ambiguous: do you want to find the intersection or do
you want to find whether a prespecified set is in the intersection? Patrick
provided you an answer to the latter while you provided an answer to the
former. Actually, I thought using table as you did (mod the need for no
replicates) was clever. A more direct but I think considerably slower
approach would be to use intersect() in a loop: 

inall <- intersect(foo[[1]], foo[[2]])
for (i in seq(3, length(foo))) inall <- intersect(inall, foo[[i]])  # assumes length(foo) >= 3

I suspect you already thought of this and rejected it. Other than
transparency, I think the only advantage it has is that it will work for
something other than lists of numerics, e.g. it will work for lists of
factors, which the table() solution would not.
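The same loop can also be written as a fold over the list (equivalent in
effect, just more compact):

```r
## Fold intersect() across all components of the list
foo <- list(x = 1:10, y = 2:11, z = 1:3)
Reduce(intersect, foo)   # 2 3
```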

-- Bert

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Patrick Burns
Sent: Friday, April 07, 2006 12:04 PM
To: Andy Bunn
Cc: R-Help
Subject: Re: [R] finding common elements in a list

Here is one solution:

 > all(unlist(lapply(foo, function(x) c(2,3) %in% x)))
[1] TRUE

This doesn't have the restriction of assuming that the components
of the list have unique elements, as the original solution does.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Andy Bunn wrote:

>Suppose I have a list where I want to extract only the elements that occur
>in every component. For instance in the list foo I want to know that the
>numbers 2 and 3 occur in every component. The solution I have seems
>unnecessarily clunky. TIA, Andy
>
>foo <- list(x = 1:10, y=2:11, z=1:3)
>bar <-unlist(foo)
>bartab <- table(bar)
>as.numeric(names(bartab)[bartab==length(foo)])
>




Re: [R] pros and cons of "robust regression"? (i.e. rlm vs lm)

2006-04-06 Thread Berton Gunter
Spencer:

Your comment reinforces Andy's point, which is that purported outliers must
not be ignored but need to be clearly identified and examined. For reasons
that you well understand, robust regression methods are better for this in
the linear-models context than standard least squares. However, as I
understand it, the problem in the case you describe is **not** that the
outliers weren't identified and examined, but that they were, and were
dismissed as metrological anomalies. I would say this was an error of
scientific (mis)judgment, not of data-analytical methodology.
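As a concrete illustration of "identify, don't discard" (a sketch using
MASS::rlm on simulated data, not an example from the case discussed):

```r
## rlm() downweights suspect points rather than silently dropping them,
## so the final IWLS weights flag observations worth examining.
library(MASS)
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)
y[19] <- y[19] + 30              # plant one gross outlier
fit <- rlm(y ~ x)
which(fit$w < 0.5)               # points to inspect, not to delete
```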

Anyway, this has taken us too far afield, so no more on-list comments from
me.

-- Bert 
 

> -Original Message-
> From: Spencer Graves [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, April 06, 2006 10:30 AM
> To: Berton Gunter
> Cc: 'Liaw, Andy'; 'r user'; 'rhelp'
> Subject: Re: [R] pros and cons of "robust regression"? (i.e. 
> rlm vs lm)
> 
> A great example of the hazards of automatic outlier 
> rejection is the 
> story of how the hole in the ozone layer in the southern 
> hemisphere was 
> discovered.  Outliers were dutifully entered into the data base but 
> discounted as probable metrology problems, which also plagued the 
> investigation.  As the percentage of outliers became excessive, 
> investigators ultimately became convinced that many of the "outliers" 
> were not metrology problems but real physical problems.
> 
> For a recent discussion of this, see Maureen Christie 
> (2004) "11. 
> Data Collection and the Ozone Hole:  Too much of a good thing?" 
> Proceedings of the International Commission on History of Meteorology 
> 1.1, pp. 99-105 
> (www.meteohistory.org/2004proceedings1.1/pdfs/11christie.pdf).
> 
> spencer graves
> p.s.  I understand that Australia now has one of the world's highest 
> rates of skin cancer, which has contributed to a major change 
> in outdoor 
> styles of dress there.


