Re: [R] Too many warnings when updating R

2007-09-10 Thread Peter Dalgaard
A Lenzo wrote:
 Hello friends,

 I loaded R 2.4.1 onto a Fedora Core 6 Linux box (taking all defaults).  Then
 I ran these commands from within R:

 options(CRAN="http://cran.stat.ucla.edu")
 install.packages(CRAN.packages()[,1])

 As a new user of R, I was shocked when I finished loading R and discovered
 the following message:

 There were 50 or more warnings (use warnings() to see the first 50)

   
Let me get this straight: You install last year's R on last year's 
Fedora, then install over 1000 unspecified packages and you are shocked 
that you get warnings?

 In addition to this, I saw errors such as this one:

 ERROR: lazy loading failed for package 'PerformanceAnalytics'

 What is this lazy loading?  More importantly, do I have to worry about all
 these warnings?  I am intimidated by the idea that I have to go back and fix
 each and every one in order to have a clean R update.  Shouldn't the update
 with CRAN just work?  Or is there something really important that I am
 missing?
   
Well, you need to know what you're doing. At the very least, notice what 
the warnings say and decide whether they point to real trouble or are 
just what they say they are: warnings. If you are worried about 
investigating all the packages, maybe install only what you really need first.
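For instance, a more selective approach could look like the sketch below (the package names are purely illustrative, not a recommendation):

```r
## Install only the packages you actually need, rather than all of CRAN.
install.packages(c("MASS", "lattice"))   # example names only

## Afterwards, inspect the warnings to judge whether they matter:
warnings()
```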

And no, you can't expect a repository like CRAN to keep track of all 
versions of R on all versions of all OS's. In each individual case, a 
human maintainer is responsible for fixing problems and he/she may or 
may not be around to fix issues.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] MLE Function

2007-09-10 Thread Peter Dalgaard
Terence Broderick wrote:
 I am just trying to teach myself how to use the mle function in R because it 
 is much better than what is provided in MATLAB. I am following tutorial 
 material from the internet; however, it gives the following errors. Does 
 anybody know what is happening to cause such errors, or does anybody know any 
 better tutorial material on this particular subject?
   
   
 x.gam <- rgamma(200, rate=0.5, shape=3.5)
 x <- x.gam
 library(stats4)
 ll <- function(lambda,alfa){n <- 200; x <- x.gam
 -n*alfa*log(lambda)+n*log(gamma(alfa))-9alfa-1)*sum(log(x))+lambda*sum(x)}
 
 Error: syntax error, unexpected SYMBOL, expecting '\n' or ';' or '}' in
 "ll <- function(lambda,alfa){n <- 200; x <- x.gam
 -n*alfa*log(lambda)+n*log(gamma(alfa))-9alfa"
   
 ll <- function(lambda,alfa){n <- 200; x <- x.gam
 -n*alfa*log(lambda)+n*log(gamma(alfa))-(alfa-1)*sum(log(x))+lambda*sum(x)}
 est <- mle(minuslog=ll, start=list(lambda=2, alfa=1))
 
 Error in optim(start, f, method = method, hessian = TRUE, ...) : 
 objective function in optim evaluates to length 200 not 1


   
Er, not what I get. Did your version have that linefeed after x <- x.gam?
If not, then you'll get your negative log-likelihood added to x.gam,
and the resulting likelihood becomes a vector of length 200 instead of
a scalar.

In general, the first piece of advice for mle() is to check that the 
likelihood function really is what it should be. Otherwise there is no 
telling what the result might mean...

Secondly, watch out for parameter constraints. With your function, it 
very easily happens that alfa tries to go negative in which case the 
gamma function in the likelihood will do crazy things.
A common trick in such cases is to reparametrize by log-parameters, i.e.

ll <- function(lambda, alfa){n <- 200; x <- x.gam
-n*alfa*log(lambda)+n*lgamma(alfa)-(alfa-1)*sum(log(x))+lambda*sum(x)}

ll2 <- function(llam, lalf) ll(exp(llam), exp(lalf))
est <- mle(minuslog=ll2, start=list(llam=log(2), lalf=log(1)))

par(mfrow=c(2,1))
plot(profile(est))

Notice, incidentally, the use of lgamma rather than log(gamma(.)); the 
latter is prone to overflow.

In fact, you could also write this likelihood directly as

-sum(dgamma(x, rate=lambda, shape=alfa, log=T))
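Putting the pieces together, a minimal self-contained sketch of that last approach might be (seed and sample size are arbitrary):

```r
library(stats4)

set.seed(1)  # arbitrary, for reproducibility of the simulated data only
x <- rgamma(200, rate = 0.5, shape = 3.5)

## Scalar negative log-likelihood via dgamma(log=TRUE).
nll <- function(lambda, alfa)
    -sum(dgamma(x, rate = lambda, shape = alfa, log = TRUE))

## May warn if the optimizer wanders into negative parameter values;
## the log-reparametrization shown above avoids that.
est <- mle(minuslogl = nll, start = list(lambda = 2, alfa = 1))
summary(est)
```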





 audaces fortuna iuvat

 -

   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   




Re: [R] Lisp-like primitives in R

2007-09-08 Thread Peter Dalgaard
François Pinard wrote:
 [Roland Rau]
   
 [François Pinard]
 

   
 I wonder what happened, for R to hide the underlying Scheme so fully, 
 at least at the level of the surface language (despite there are 
 hints).  
   

   
 To further foster portability, we chose to write R in ANSI C
 

 Yes, of course.  Scheme is also (often) implemented in C.  I meant that 
 R might have implemented a Scheme engine (or part of a Scheme engine, 
 extended with appropriate data types) with a surface language (nearly 
 the S language) which is purposely not Scheme, but could have been.

 If the gap is not extreme, one could dare dreaming that the Scheme 
 engine in R be completed, and Scheme offered as an alternate extension 
 language.  If you allow me to continue dreaming awake -- they told me 
 they will let me free as long as I do not get dangerous! :-) -- part 
 of the interest lies in the fact there are excellent Scheme compilers.  
 If we could only find or devise some kind of marriage between a mature 
 Scheme and R, so to speed up the non-vectorisable parts of R scripts...

   
Well, depending on what you want, this is either trivial or 
impossible... The internal storage of R is still pretty much equivalent 
to Scheme's. E.g. try this:

  r2scheme <- function(e) if (!is.recursive(e))
      deparse(e) else c("(", unlist(lapply(as.list(e), r2scheme)), ")")
  paste(r2scheme(quote(for(i in 1:4) print(i))), collapse=" ")
[1] "( for i ( : 1 4 ) ( print i ) )"

and a parser that parses a similar language to R internal format is not 
a very hard exercise (some care needed in places). However, replacing 
the front-end is not going to make anything faster, and the evaluation 
engine in R does a couple of tricks which are not done in Scheme, 
notably lazy evaluation and other forms of non-local evaluation, which 
drive optimizers crazy. Look up the writings of Luke Tierney on the 
matter to learn more.

 If we are lucky and one of the original authors reads this thread they 
 might explain the situation further and better [...].
 

 In r-devel, maybe!  We would be lucky if the authors really had time to 
 read r-help. :-)

   




Re: [R] ploting missing data

2007-09-07 Thread Peter Dalgaard
Markus Schmidberger wrote:
 Hello,

 I have this kind of dataframe and have to plot it.

 data <- data.frame(sw = c(1,2,3,4,5,6,7,8,9,10,11,12,15),
  zehn =
 c(33.44,20.67,18.20,18.19,17.89,19.65,20.05,19.87,20.55,22.53,NA,NA,NA),
  zwanzig =
 c(61.42,NA,26.60,23.28,NA,24.90,24.47,24.53,26.41,28.26,NA,29.80,35.49),
  fuenfzig =
 c(162.51,66.08,49.55,43.40,NA,37.77,35.53,36.46,37.25,37.66,NA,42.29,47.80)
 )

 The plot should have lines:
 lines(fuenfzig~sw, data=data)
 lines(zwanzig~sw, data=data)

 But now I have holes in my lines for the missing values (NA). How to 
 plot the lines without the holes?
 The missing values should be interpolated, or the left and right points 
 directly connected. The function approx interpolates the whole dataset; 
 that's not my goal!
 Is there no plotting function to do this directly?

   
Just get rid of the NAs:

lines(fuenfzig~sw, data=data, subset=!is.na(fuenfzig))
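In full, the idea is just to set up the plot region once and then draw each series without its NAs; a minimal sketch:

```r
## Empty frame sized to all three series, then lines without the NA holes.
plot(fuenfzig ~ sw, data = data, type = "n",
     ylim = range(data[-1], na.rm = TRUE))   # cover all columns but sw
lines(fuenfzig ~ sw, data = data, subset = !is.na(fuenfzig))
lines(zwanzig  ~ sw, data = data, subset = !is.na(zwanzig))
lines(zehn     ~ sw, data = data, subset = !is.na(zehn))
```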



Re: [R] kendall test

2007-09-06 Thread Peter Dalgaard
Stefan Grosse wrote:
 On Thursday 06 September 2007 09:48:22 elyakhlifi mustapha wrote:

 em  I thought that there was a function which does the Kendall test in R;
 em  I typed apropos("kendall") at the console and didn't find anything.
 em  Can you tell me how to run the Kendall test?

 ?cor.test

 btw.: rseek.org is a very good help for such questions
Interesting site! However, I don't see that leading to cor.test. Rather, it 
points to the Kendall package, which seems like a bit of overkill. 
However, help.search("kendall") gets you there immediately.
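For completeness, the Kendall variant is just a matter of the method argument; a tiny sketch with made-up data:

```r
x <- c(1, 2, 3, 4, 5)   # toy data, purely illustrative
y <- c(2, 1, 4, 3, 5)
cor.test(x, y, method = "kendall")   # Kendall's tau test
```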





Re: [R] order intervals in a data.frame

2007-09-06 Thread Peter Dalgaard
Paul Smith wrote:
 On 9/6/07, João Fadista [EMAIL PROTECTED] wrote:
   
 I would like to know how I can order a data.frame by increasing 
 dat$Interval (dat$Interval is a factor). There is an example below.

 Original data.frame:

 
 dat
   
  Interval   Number_reads
   0-100 685
200-300 744
100-2001082
300-4004213


 
 Desired_dat:
   
  Interval   Number_reads
   0-100  685
 100-200   1082
 200-300 744
 300-400   4213
 

 What about

 Desired_dat <- dat[match(dat$Interval, sort(dat$Interval)),]
   
dat[order(dat$Interval),]

would be more to the point, but it is a bit fortuitous that it works at
all (split the first group at 50 and you'll see).

This (or at least something like it) should sort according to left
endpoints:

o <- order(as.numeric(sub("-.*", "", dat$Interval)))
dat[o,]
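Self-contained, with the example data reconstructed for illustration:

```r
dat <- data.frame(Interval = factor(c("0-100", "200-300", "100-200", "300-400")),
                  Number_reads = c(685, 744, 1082, 4213))

## Sort by the numeric left endpoint of each interval label.
o <- order(as.numeric(sub("-.*", "", dat$Interval)))
dat[o, ]
```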

 ?

 Paul

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   




Re: [R] read.table

2007-09-06 Thread Peter Dalgaard
Ingo Holz wrote:
 Hi,

  I want to read an ASCII file using the function read.table.
  With 'skip' and 'nrows' I can select the rows to read from this file.

  Is there a way to select columns (in the selected rows)?
   
Yes, use the colClasses argument.
(I won't rewrite the help page here; I expect that you can read it once
you know where to look.)
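As a pointer to the relevant feature: colClasses accepts "NULL" to skip a column entirely, so a call might look like this (file name and layout are invented):

```r
## Read 100 rows starting after the first 10, keeping only columns 1 and 3
## of a hypothetical 4-column file; "NULL" drops a column unread.
tmp <- read.table("data.txt", skip = 10, nrows = 100,
                  colClasses = c("numeric", "NULL", "numeric", "NULL"))
```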



Re: [R] problems in read.table

2007-09-06 Thread Peter Dalgaard
[EMAIL PROTECTED] wrote:
 Dear R-users,

 I have encountered the following problem every now and then. But I was 
 dealing with a very small dataset before, so it wasn't a problem (I 
 just edited the dataset in Openoffice speadsheet). This time I have to 
 deal with many large datasets containing commuting flow data. I would 
 appreciate it if anyone could give me a hint or clue to get out of this 
 problem.

 I have a .dat file called 1081.dat: 1001 means Birmingham, AL.

 I imported this .dat file using read.table like
 tmp <- read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat', header=T)

 Then I got this error message:
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
 line 9499 did not have 209 elements

 Since I got an error message saying other rows did not have 209 
 elements, I added skip=c(205,9499,9294), hoping that R would take 
 care of this problem. But I got a similar error message:
 tmp <- read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat', header=T, skip=c(205,9499,9294))
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
 line 9294 did not have 209 elements
 In addition: Warning message:
 the condition has length > 1 and only the first element will be used 
 in: if (skip > 0) readLines(file, skip)

 Is there any way to let a R code to automatically skip problematic 
 rows? Thank you very much!

   
'skip' is the NUMBER of rows to skip before reading; it has to be a single 
number.

You can use 'fill' and 'flush' to read lines with too few or too many 
elements, but it might be better to investigate the cause of the 
problem. What is in those lines? Quote and comment characters are 
common culprits.
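One way to locate the offending lines before deciding what to do, sketched with the file name from the error message:

```r
## count.fields() reports how many fields scan() sees on each line,
## making deviations from the expected 209 easy to spot.
n <- count.fields("CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat")
which(n != 209)   # line numbers of the problematic rows
table(n)          # how many lines have each field count
```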

 Taka

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   




Re: [R] How to extract part of an array?

2007-09-05 Thread Peter Dalgaard
Lauri Nikkinen wrote:
 Hi,

 How can I extract part of an array? I would like to extract the table
 "Supported" from this array. If this is not possible, how do I convert the
 array to a list? I'm sorry this is not a reproducible example.

   
 spl <- tapply(temp$var1, list(temp$var2, temp$var3, temp$var3), mean)
 spl
 
 , , Supported

           07       08
 A    68.38710 71.48387
 B    21.67742 20.83871
 C    55.74194 61.12903
 ALL  26.19816 27.39631

 , , Not_supported

           07       08
 A         NA 82.38710
 B         NA 24.00000
 C         NA 68.77419
 ALL       NA 29.97984

   
How about spl[, , "Supported"]?

-p
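As for the convert-to-list part of the question, one possible sketch is to slice along the third dimension:

```r
## Split a 3-d array into a named list of 2-d slices along dimension 3.
slices <- setNames(lapply(seq_len(dim(spl)[3]), function(k) spl[, , k]),
                   dimnames(spl)[[3]])
slices[["Supported"]]   # same as spl[, , "Supported"]
```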



Re: [R] Table and ftable

2007-09-04 Thread Peter Dalgaard
David Barron wrote:
 There might be simpler ways, but you can certainly do this with the
 reshape package, like this:

 library(reshape)
 dta <- read.table("clipboard", header=TRUE)

   sic level area
 1   a   211  2.4
 2   b   311  2.3
 3   b   322  0.2
 4   b   322  0.5
 5   c   100  3.0
 6   c   100  1.5
 7   c   242  1.5
 8   d   222  0.2
   

 mlt.dta <- melt(dta)
 cst.dta <- cast(mlt.dta, sic ~ level, sum)

   sic 100 211 222 242 311 322
 1   a  NA 2.4  NA  NA  NA  NA
 2   b  NA  NA  NA  NA 2.3 0.7
 3   c 4.5  NA  NA 1.5  NA  NA
 4   d  NA  NA 0.2  NA  NA  NA

 Then just replace the NAs with 0s.

   
tapply() will do this too:

 with(d,tapply(area,list(sic,level), sum))
  100 211 222 242 311 322
a  NA 2.4  NA  NA  NA  NA
b  NA  NA  NA  NA 2.3 0.7
c 4.5  NA  NA 1.5  NA  NA
d  NA  NA 0.2  NA  NA  NA

This has the same awkwardness of giving NA for empty cells, and there is
no easy way to circumvent it, since the FUN of tapply is simply not
called for such cells.
Replacing NA by zero is a bit dangerous (albeit not in the present case)
since you can get an NA cell for more than one reason. A more careful
approach is like this:

 with(d, {t1 <- tapply(area, list(sic, level), sum)
  t2 <- table(sic, level)
  t1[t2 == 0] <- 0
  t1} )
  100 211 222 242 311 322
a 0.0 2.4 0.0 0.0 0.0 0.0
b 0.0 0.0 0.0 0.0 2.3 0.7
c 4.5 0.0 0.0 1.5 0.0 0.0
d 0.0 0.0 0.2 0.0 0.0 0.0
   

 HTH.

 David Barron
 On 9/4/07, Giulia Bennati [EMAIL PROTECTED] wrote:
   
 Dear list members,
 I have a little question: I have my data organized as follows

 sic  level  area
 a    211    2.4
 b    311    2.3
 b    322    0.2
 b    322    0.5
 c    100    3.0
 c    100    1.5
 c    242    1.5
 d    222    0.2

 where levels and sics are factors. I'm trying to obtain a matrix like this:

       level
 sic   211  311  322  100  242  222
 a     2.4    0    0    0    0    0
 b       0  2.3  0.7    0    0    0
 c       0    0    0  4.5  1.5    0
 d       0    0    0    0    0  0.2

 I tried with the table function as
 table(sic, level), but I obtained only a contingency table.
 Have you any suggestions?
 Thank you very much,
 Giulia

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 


   




Re: [R] Confusion using functions to access the function call stack example section

2007-09-04 Thread Peter Dalgaard
Leeds, Mark (IED) wrote:
 I was going through the example below which is taken from the example
 section in the R documentation for accessing the function call stack.
 I am confused and I have 3 questions that I was hoping someone could
 answer.

 1) why is y equal to zero even though the call was done with gg(3)
   
There are multiple nested calls to gg, and y is counted down. You're not 
calling ggg when y > 0, and that's what does the printing.
 2) what does "parents are 0,1,2,0,4,5,6,7" mean? I understand what a
 parent frame is, but how do the numbers relate to this
 particular example? Why is the current frame # 8?
   
How did you get that?? Did you miss the part where it said that the 
example gives different results when run by example()? I get

  gg(3)
current frame is 5
parents are 0 1 2 3 4
function() {
cat("current frame is", sys.nframe(), "\n")
cat("parents are", sys.parents(), "\n")
print(sys.function(0)) # ggg
print(sys.function(2)) # gg
}
<environment: 0x8bb8e10>
function(y) {
ggg <- function() {
cat("current frame is", sys.nframe(), "\n")
cat("parents are", sys.parents(), "\n")
print(sys.function(0)) # ggg
print(sys.function(2)) # gg
}
if(y > 0) gg(y-1) else ggg()
}

which should make somewhat better sense. (My versions, 2.5.1 and 
pre-2.6.0 don't seem to print y either?) As a general matter, frames 
make a tree: two of them can have the same parent - e.g., this happens 
whenever an argument expression is being evaluated as part of evaluating 
a function call. Try, e.g.

 f <- function(x) {x; print(sys.status())}; f(f(1))


 3) it says that sys.function(2) should be gg but I would think that
 sys.function(1) would be gg since it's one up from where
 the call is being made.

   
There are multiple calls to gg()  so both could be true.
 Thanks a lot. If the answers are too complicated and someone knows of a
 good reference that goes into more details about
 the sys functions, that's appreciated also.
   
The best way is to just poke around with some simple examples until you 
get the hang of it. Possibly modify the examples you have already seen 
but print the entire sys.status().



Re: [R] Q: selecting a name when it is known as a string

2007-09-04 Thread Peter Dalgaard
D. R. Evans wrote:
 I am 100% certain that there is an easy way to do this, but after
 experimenting off and on for a couple of days, and searching everywhere I
 could think of, I haven't been able to find the trick.

 I have this piece of code:

 ...
   attach(d)

   if (ORDINATE == 'ds')
   { lo <- loess(percent ~ ncms * ds, d, control=loess.control(trace.hat =
 'approximate'))
 grid <- data.frame(expand.grid(ds=MINVAL:MAXVAL, ncms=MINCMS:MAXCMS))
 ...

 then there several almost-identical if statements for different values of
 ORDINATE. For example, the next if statement starts with:

 ...
   if (ORDINATE == 'dsl')
   { lo <- loess(percent ~ ncms * dsl, d, control=loess.control(trace.hat =
 'approximate'))
 grid <- data.frame(expand.grid(dsl=MINVAL:MAXVAL, ncms=MINCMS:MAXCMS))
 ...

 This is obviously pretty silly code (although of course it does work).

 I imagine that my question is obvious: given that I have a variable,
 ORDINATE, whose value is a string, how do I re-write statements such as the
 lo <- and grid <- statements above so that they use ORDINATE instead of
 the hard-coded names ds and dsl.

 I am almost sure (almost) that it has something to do with deparse(), but
 I couldn't find the right incantation, and the ?deparse() help left my head
 swimming.
   

myvar <- 12345
vname <- "myvar"
eval(substitute(X + 54321, list(X = as.name(vname))))

However, this does not work for argument names as in 
expand.grid(ds=.), so for that part you may need to patch up names 
afterwards.

It is (paraphrasing Thomas Lumley) often a good idea to reconsider the 
question if the answer involves this sort of trickery. Perhaps it is 
better handled by a loop or lapply over a list of variables?
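In this particular case, since ORDINATE names a variable in a model formula, another route (only a sketch; names mirror the original code) is to build the formula from the string:

```r
## Build the loess formula from the string held in ORDINATE.
fml <- as.formula(paste("percent ~ ncms *", ORDINATE))
lo  <- loess(fml, d, control = loess.control(trace.hat = "approximate"))

## For expand.grid(), patch up the column name afterwards.
grid <- expand.grid(MINVAL:MAXVAL, MINCMS:MAXCMS)
names(grid) <- c(ORDINATE, "ncms")
```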




Re: [R] sin(pi)?

2007-09-03 Thread Peter Dalgaard
Nguyen Dinh Nguyen wrote:
 Dear all,
 I found something strange when calculating sin of pi value
 sin(pi)
 [1] 1.224606e-16

  pi
 [1] 3.141593

  sin(3.141593)
 [1] -3.464102e-07

 Any help and comment should be appreciated. 
 Regards
 Nguyen
   
Well, sin(pi) is theoretically zero, so you are just seeing zero at two 
different levels of precision.

The built-in pi has more digits than it displays:

  pi
[1] 3.141593
  pi - 3.141593
[1] -3.464102e-07
  print(pi, digits=20)
[1] 3.141592653589793
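When such near-zero results need to be compared with zero, a tolerance-based comparison is the safe way:

```r
## Floating-point results near zero should not be compared with ==.
sin(pi) == 0                     # FALSE: exact comparison fails
isTRUE(all.equal(sin(pi), 0))    # TRUE: equal up to numerical tolerance
```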

 
 Nguyen Dinh Nguyen
 Garvan Institute of Medical Research
 Sydney, Australia

   
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   




Re: [R] size limitations in R

2007-09-01 Thread Peter Dalgaard
Daniel Lakeland wrote:
 On Fri, Aug 31, 2007 at 01:31:12PM +0100, Fabiano Vergari wrote:

   
 I am a SAS user currently evaluating R as a possible addition or
 even replacement for SAS. The difficulty I have come across straight
 away is R's apparent difficulty in handling relatively large data
 files. Whilst I would not expect it to handle datasets with millions
 of records, I still really need to be able to work with dataset with
 100,000+ records and 100+ variables. Yet, when reading a .csv file
 with 180,000 records and about 200 variables, the software virtually
 ground to a halt (I stopped it after 1 hour). Are there guidelines
 or maybe a limitations document anywhere that helps me assess the
 size
 

 180k records with 200 variables = 36 million entries, if they're
 numeric then they're doubles taking up 8 bytes, so 288 MB of RAM. This
 should be perfectly fine for R, as long as you have that much free
 RAM.

 However, the routines that read CSV and tabular delimited files are
 relatively inefficient for such large files.

 In order to handle large data files, it is better to use one of the
 database interfaces. My preference would be sqlite unless I already
 had the data on a mysql or other database server.

   
Yes. However, for an intermediate solution, notice that much of the 
inefficiency comes from storing data as character vectors before 
deciding what to do with them. Character vectors have an overhead of one 
SEXP per string stored, i.e. 20-28 bytes in addition to the actual 
string. There are options for telling the read routines explicitly that 
data are numeric/integer/logical: 'colClasses' for read.table(), 'what' 
for scan(). This will bypass the intermediate storage.
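For a file of the size described, that might be sketched as follows (the file name and the assumption that all 200 columns are numeric are invented):

```r
## read.table() with explicit column types skips the intermediate
## character storage; nrows is a further memory hint.
tmp <- read.table("big.csv", header = TRUE, sep = ",",
                  colClasses = rep("numeric", 200), nrows = 180000)

## The scan() equivalent: 'what' gives one template element per column.
tmp2 <- scan("big.csv", what = rep(list(0), 200), sep = ",", skip = 1)
```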
 the documentation for the packages RSQLite and SQLiteDF should be
 helpful, as well as the documentation for SQLite itself, which has a
 facility for efficiently importing CSV and similar files directly to a
 SQLite database.

 eg: http://netadmintools.com/art572.html



   




Re: [R] in cor.test, difference between exact=FALSE and exact=NULL

2007-09-01 Thread Peter Dalgaard
Andrew Yee wrote:
 Thanks for the clarification.  I should have recognized the difference
 between warning and error.
 But if I may take this a step further, shouldn't it then be exact=TRUE
 instead of exact=NULL?
 Thanks,
 Andrew
   

Nope. The two are equivalent for the Spearman test, but not for 
Kendall's tau. The logic in that case is that NULL implies exact 
testing if n < 50 and asymptotic otherwise. TRUE and FALSE enforce one 
or the other (if possible).
 On 8/31/07, Peter Dalgaard [EMAIL PROTECTED] wrote:
   
 Andrew Yee wrote:
 
 Pardon my ignorance, but is there a difference in cor.test between
 exact=FALSE and exact=NULL when method="spearman"?

 Take for example:

  x <- c(1,2,2,3,4,5)
  y <- c(1,2,2,10,11,12)
  cor.test(x, y, method="spearman", exact=NULL)

 This gives an error message,
  Warning message:  Cannot compute exact p-values with ties in:
  cor.test.default(x, y, method = "spearman", exact = NULL)

 However, when exact is changed to FALSE, this seems to run okay.

  cor.test(x, y, method="spearman", exact=FALSE)

 Question:  should this be exact = FALSE in the documentation and/or the
   
 code?
 
   
 No. The default is indeed NULL.

 This implies that calculation of exact p-values will be attempted, and
 when there are ties you get a warning (NB: not error) message.  Setting
 exact=FALSE, no attempt is made and no warning is given.
 
 Thanks,
 Andrew
 MGH Cancer Center

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
   
 guide.html
 
 and provide commented, minimal, self-contained, reproducible code.

   

 --
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45)
 35327907



 

   [[alternative HTML version deleted]]

   
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   




Re: [R] Comparing transform to with

2007-09-01 Thread Peter Dalgaard
Muenchen, Robert A (Bob) wrote:
 Hi All,

 I've been successfully using the with function for analyses and the
 transform function for multiple transformations. Then I thought, why not
 use with for both? I ran into problems  couldn't figure them out from
 help files or books. So I created a simplified version of what I'm
 doing:

 rm( list=ls() )
 x1 <- c(1,3,3)
 x2 <- c(3,2,1)
 x3 <- c(2,5,2)
 x4 <- c(5,6,9)
 myDF <- data.frame(x1, x2, x3, x4)
 rm(x1,x2,x3,x4)
 ls()
 myDF

 This creates two new variables just fine

 transform(myDF,
   sum1=x1+x2,
   sum2=x3+x4
 )

 This next code does not see sum1, so it appears that transform cannot
 see the variables that it creates. Would I need to transform new
 variables in a second pass?

 transform(myDF,
   sum1=x1+x2,
   sum2=x3+x4,
   total=sum1+sum2
 )

 Next I'm trying the same thing using with. It does not work, but
 also does not generate error messages, giving me the impression that I'm
 doing something truly idiotic:

 with(myDF, {
   sum1 <- x1+x2
   sum2 <- x3+x4
   total <- sum1+sum2
 } )
 myDF
 ls()

 Then I thought, perhaps one of the advantages of transform is that it
 works on the left side of the equation without using a longer name like
 myDF$sum1. with probably doesn't do that, so I use the longer form
 below. It also does not work and generates no error messages. 

 # Try it again, writing vars to myDF explicitly.
 # It generates no errors, and no results.
 with(myDF, {
   myDF$sum1 <- x1+x2
   myDF$sum2 <- x3+x4
   myDF$total <- myDF$sum1+myDF$sum2
 } )
 myDF
 ls()

 I would appreciate some advice about the relative roles of these two
 functions and why my attempts with with() have failed.
   
Yes, transform() calculates all its new values, then assigns to the 
given names. This is expedient, but it has the drawback that new 
variables are not usable inside the expressions. A possible alternative 
implementation would be equivalent to a series of nested calls to 
transform, which of course you could also do manually:

transform(
  transform(myDF,
 sum1=x1+x2,
 sum2=x3+x4
  ),
  total=sum1+sum2
)

The problem with with() on data frames and lists is that it, like the 
eval family of functions, _converts_ the object to an environment, and 
then evaluates the expression in the converted environment. The 
environment is temporary, so assignments to it get lost. The current 
development sources have a new (experimental) function within() which is 
like with(), but stores any modified variables back. (This is very 
recent and may or may not make it into 2.6.0.)
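A minimal sketch of the distinction, reconstructing the data frame from the question (this assumes a version of R in which within() is available):

```r
## with() discards assignments; within() writes modified variables
## back into a copy of the data frame
myDF <- data.frame(x1 = c(1, 3, 3), x2 = c(3, 2, 1),
                   x3 = c(2, 5, 2), x4 = c(5, 6, 9))
myDF2 <- within(myDF, {
  sum1  <- x1 + x2
  sum2  <- x3 + x4
  total <- sum1 + sum2   # later expressions can use the new variables
})
myDF2$total   # 11 16 15
```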



Re: [R] in cor.test, difference between exact=FALSE and exact=NULL

2007-08-31 Thread Peter Dalgaard
Andrew Yee wrote:
 Pardon my ignorance, but is there a difference in cor.test between
 exact=FALSE and exact=NULL when method="spearman"?

 Take for example:

 x <- c(1,2,2,3,4,5)
 y <- c(1,2,2,10,11,12)
 cor.test(x, y, method="spearman", exact=NULL)

 This gives an error message,
 Warning message:  Cannot compute exact p-values with ties in:
 cor.test.default(x, y, method = "spearman", exact = NULL)

 However, when exact is changed to FALSE, this seems to run okay.

 cor.test(x, y, method="spearman", exact=FALSE)

 Question:  should this be exact = FALSE in the documentation and/or the code?

   
No. The default is indeed NULL.

This implies that calculation of exact p-values will be attempted, and
when there are ties you get a warning (NB: not an error) message. With
exact=FALSE, no attempt is made and no warning is given.
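A sketch using the poster's data: the two calls agree on the approximate p-value, and only the warning differs.

```r
x <- c(1, 2, 2, 3, 4, 5)
y <- c(1, 2, 2, 10, 11, 12)
## exact = NULL: exact computation attempted, warns because of ties,
## then falls back to the asymptotic approximation
r1 <- suppressWarnings(cor.test(x, y, method = "spearman", exact = NULL))
## exact = FALSE: asymptotic approximation from the start, silently
r2 <- cor.test(x, y, method = "spearman", exact = FALSE)
identical(r1$p.value, r2$p.value)   # TRUE
```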
 Thanks,
 Andrew
 MGH Cancer Center

   




Re: [R] Time conversion problems

2007-08-22 Thread Peter Dalgaard
[EMAIL PROTECTED] wrote:
 Hi there

 I have precipitation data from 2004 to 2006 in varying resolutions (10 to 
 20min intervals) with time in seconds from the beginning of the year (cumulative) 
 and a second variable as year.

 I applied follwing code to convert the time into a date:

 times <- strptime("2004-01-01", "%Y-%m-%d", tz="GMT") + precipitation$time1

 everything went well, except that every year the seconds counter starts at 
 zero, therefore I now have three 2004 series instead of going further from 04 
 to 05 etc.

 I tried to sum the last seconds-values of 2004 to the first of 2005 with an 
 if command like:

 if (year >= 2005) time2 = time1 + 632489;  (seconds)

 but it doesn't work.

 thanks for a solution
   
Can't you just do strptime(paste(year, "01-01", sep="-"), ...)? (or use 
ISOdatetime(year, 1, 1, 0, 0, 0, tz="GMT"))
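A sketch of the second form, with made-up `year` and `time1` vectors standing in for the poster's columns:

```r
year  <- c(2004, 2004, 2005)   # hypothetical data
time1 <- c(0, 600, 0)          # seconds since Jan 1 of the same year
## one origin per observation, so the per-year counter reset is harmless
times <- ISOdatetime(year, 1, 1, 0, 0, 0, tz = "GMT") + time1
format(times, "%Y-%m-%d %H:%M", tz = "GMT")
# "2004-01-01 00:00" "2004-01-01 00:10" "2005-01-01 00:00"
```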



Re: [R] Sphericity test in R for repeated measures ANOVA

2007-08-20 Thread Peter Dalgaard
Orou Gaoue wrote:
 Hi,
 Is there a way to do a sphericity test in R for repeated measures ANOVA
 (using aov or lme)? I can't find anything about it in the help.
 Thanks

 Orou
   
There is for lm() with multivariate response (mauchly.test).

For lme(), you can compare models with a corSymm correlation structure 
to ones with corCompSymm. This is a similar test, but not quite the same.

For aov() it doesn't really make sense, partly because the 
"repeatedness" is ambiguous in some models, partly because aov's 
internal algorithms rely strongly on orthogonality with respect to a 
particular covariance structure. If you relax assumptions, orthogonality 
no longer holds.
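A sketch of the multivariate-lm route on simulated data (the repeated measures go in the columns of the response matrix; the data here are made up):

```r
set.seed(1)
Y <- matrix(rnorm(60), nrow = 20, ncol = 3)   # 20 subjects x 3 occasions
fit <- lm(Y ~ 1)                              # mlm: multivariate response
## Mauchly's test of sphericity for the contrasts orthogonal to the mean
mauchly.test(fit, X = ~1)
```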



Re: [R] Question about unicode characters in tcltk

2007-08-18 Thread Peter Dalgaard
R Help wrote:
 hello list,

 Can someone help me figure out why the following code doesn't work?
 I'm trying to put both Greek letters and subscripts into a tcltk menu.
  The code creates all the mu's, and the 1 and 2 subscripts, but it
 won't create the 0.  Is there a certain set of characters that R won't
 recognize the unicode for?  Or am I entering the \u2080 incorrectly?

 library(tcltk)
 m <- tktoplevel()
 frame1 <- tkframe(m)
 frame2 <- tkframe(m)
 frame3 <- tkframe(m)
 entry1 <- tkentry(frame1,width=5,bg='white')
 entry2 <- tkentry(frame2,width=5,bg='white')
 entry3 <- tkentry(frame3,width=5,bg='white')

 tkpack(tklabel(frame1,text='\u03bc\u2080'),side='left')
 tkpack(tklabel(frame2,text='\u03bc\u2081'),side='left')
 tkpack(tklabel(frame3,text='\u03bc\u2082'),side='left')

 tkpack(frame1,entry1,side='top')
 tkpack(frame2,entry2,side='top')
 tkpack(frame3,entry3,side='top')

 thanks
 -- Sam

   
Which OS was this? I can reproduce the issue on SuSE, but NOT Fedora 7.



Re: [R] Mann-Whitney U

2007-08-15 Thread Peter Dalgaard
Lucke, Joseph F wrote:
 R and SPSS are using different but equivalent statistics.  R is using
 the rank sum of group1 adjusted for the mean rank. SPSS is using the
 rank sum of group2 adjusted for the mean rank. 

   
Close: It is the _minimum_ possible rank sum that is getting subtracted. 
If everyone in group1 is less than everyone in group2, R's W statistic  
will be zero. Other way around in SPSS.
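A quick check of that convention, on two tiny made-up samples with complete separation:

```r
g1 <- 1:3                      # every value below every value of g2
g2 <- 4:6
wilcox.test(g1, g2)$statistic  # W = 0: the minimum possible rank sum
wilcox.test(g2, g1)$statistic  # W = 9 = n1*n2: the maximum
```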

 Example.
   
 G1=group1
 G2=group2[-length(group2)] #get rid of the NA
 n1=length(G1) #n1=28
 n2=length(G2) #n2=27
 
 # convert to ranks
   
 W=rank(c(G1,G2))
 R1=W[1:n1] #put the ranks back into the groups
 R2=W[n1+1:n2]
 
 #Get the sum of the ranks for each group
   
 W1=sum(R1)
 W2=sum(R2)
 
 #Adjust for mean rank for group 1
   
 W1-n1*(n1+1)/2
 
 [1] 405.5
 #Adjust for mean rank for group 2
   
 W2-n2*(n2+1)/2
 
 [1] 350.5

 W1-n1*(n1+1)/2 gives R's result; W2-n2*(n2+1)/2 gives SPSS's result.

 Ties throw a wrench in the works.  R uses a continuity correction by
 default, SPSS does not.
 Taking out the continuity correction,
   
 wilcox.test(G1,G2,correct=FALSE)
 

 Wilcoxon rank sum test

 data:  G1 and G2 
 W = 405.5, p-value = 0.6433
 alternative hypothesis: true location shift is not equal to 0 

 Warning message:
 cannot compute exact p-value with ties in: wilcox.test.default(G1, G2,
 correct = FALSE) 

 This p-value is the same as SPSS's.


 Consult a serious non-parametrics text.  I used
 Lehmann, E. L., Nonparametrics: Statistical methods based on ranks.
 1975. Holden-Day. San Francisco, CA.


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Natalie O'Toole
 Sent: Wednesday, August 15, 2007 1:07 PM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] Mann-Whitney U

 Hi,

 I do want to use the Mann-Whitney test which ranks my data and then uses
 those ranks rather than the actual data.

 Here is the R code i am using:

  group1 <-
 c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,
 2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)

  group2 <-
 c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,
 2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)

  result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95,
 na.action)

 paired = FALSE so that the Wilcoxon rank sum test which is equivalent to
 the Mann-Whitney test is used (my samples are NOT paired).
 conf.level = 0.95 to specify the confidence level na.action is used
 because i have a NA value (i suspect i am not using na.action in the
 correct manner)

 When i use this code i get the following error message:

 Error in arg == choices : comparison (1) is possible only for atomic and
 list types

 When i use this code:

  group1 <-
 c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,
 2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)

  group2 <-
 c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,
 2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)

  result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)

 I get the following result:

   Wilcoxon rank sum test with continuity correction

 data:  group1 and group2
 W = 405.5, p-value = 0.6494
 alternative hypothesis: true location shift is not equal to 0 

 Warning message:
 cannot compute exact p-value with ties in: wilcox.test.default(group1,
 group2, paired = FALSE, conf.level = 0.95) 

 The W value here is 405.5 with a p-value of 0.6494


 in SPSS, i am ranking my data and then performing a Mann-Whitney U by
 selecting analyze -> non-parametric tests -> 2 independent samples, and
 then checking off the Mann-Whitney U test.

 For the Mann-Whitney test in SPSS i am gettting the following results:

 Mann-Whitney U = 350.5
  2- tailed p value = 0.643

 I think maybe the discrepancy has to do with the specification of the NA
 values in R, but i'm not sure.


 If anyone has any suggestions, please let me know!

 I hope i have provided enough information to convey my problem.

 Thank-you, 

 Nat
 __


 Natalie,

 It's best to provide at least a sample of your data.  Your field names 
 suggest 
 that your data might be collected in units of mm^2 or some similar 
 measurement of area.  Why do you want to use Mann-Whitney, which will
 rank 

 your data and then use those ranks rather than your actual data?  Unless

 your 
 sample is quite small, why not use a two sample t-test?  Also,are your 
 samples paired?  If they aren't, did you use the paired = FALSE
 option?

 JWDougherty




Re: [R] R 2.5.1 configure problem

2007-08-14 Thread Peter Dalgaard
Andreas Hey wrote:
 Hi,

  

 I have follwoing problem:

  

 I will install R-2.5.1 on a Linux Maschine (64Bit) 

Which? (CPU and OS, please. There are about four likely possibilities,
half a dozen less likely ones...)

 and I will use the R-GUI
 JGR (Jaguar)

  

 SO I make following steps:

  

 ./configure --with-gnu-ld --enable-R-shlib VAR=fPIC VAR=TCLTK_LIBS

   
What are those options supposed to be good for??? You appear to be
setting VAR twice, and I don't recall VAR as anything used by configure.
And it is unlikely that a Linux system would use anything but GNU ld by
default. Did configure terminate successfully??? What did the output
summary say?

 make 

  

 I become following error messages:

   
(that's not how to translate "Ich bekomme...")
  

 .
   
Something must have gone before this! A linker error perhaps?
 Entering directory `/root/R-2.5.1/tests'

 make[2]: Entering directory `/root/R-2.5.1/tests'

 make[3]: Entering directory `/root/R-2.5.1/tests/Examples'

 make[4]: Entering directory `/root/R-2.5.1/tests/Examples'

 make[4]: `Makedeps' is up to date.

 make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'

 make[4]: Entering directory `/root/R-2.5.1/tests/Examples'

 make[4]: *** No rule to make target `../../lib/libR.so', needed by
 `base-Ex.Rout'.  Stop.
   
Did you really run make? This looks like make check output.

 make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'

 make[3]: *** [test-Examples-Base] Error 2

 make[3]: Leaving directory `/root/R-2.5.1/tests/Examples'

 make[2]: *** [test-Examples] Error 2

 make[2]: Leaving directory `/root/R-2.5.1/tests'

 make[1]: *** [test-all-basics] Error 1

 make[1]: Leaving directory `/root/R-2.5.1/tests'

 make: *** [check] Error 2

  

 Make check all - I become following messages:

  

 Entering directory `/root/R-2.5.1/tests'

 make[2]: Entering directory `/root/R-2.5.1/tests'

 make[3]: Entering directory `/root/R-2.5.1/tests/Examples'

 make[4]: Entering directory `/root/R-2.5.1/tests/Examples'

 make[4]: `Makedeps' is up to date.

 make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'

 make[4]: Entering directory `/root/R-2.5.1/tests/Examples'

 make[4]: *** No rule to make target `../../lib/libR.so', needed by
 `base-Ex.Rout'.  Stop.

 make[4]: Leaving directory `/root/R-2.5.1/tests/Examples'

 make[3]: *** [test-Examples-Base] Error 2

 make[3]: Leaving directory `/root/R-2.5.1/tests/Examples'

 make[2]: *** [test-Examples] Error 2

 make[2]: Leaving directory `/root/R-2.5.1/tests'

 make[1]: *** [test-all-basics] Error 1

 make[1]: Leaving directory `/root/R-2.5.1/tests'

 make: *** [check] Error 2

  

  

 Can you help me?

  

 With best regards 

  

 Andreas Hey

  

 _

 Tel:030/2093-1463

 Email:  mailto:[EMAIL PROTECTED] [EMAIL PROTECTED]

  






Re: [R] Question about unicode characters in tcltk

2007-08-14 Thread Peter Dalgaard
R Help wrote:
 hello list,

 Can someone help me figure out why the following code doesn't work?
 I'm trying to put both Greek letters and subscripts into a tcltk menu.
  The code creates all the mu's, and the 1 and 2 subscripts, but it
 won't create the 0.  Is there a certain set of characters that R won't
 recognize the unicode for?  Or am I entering the \u2080 incorrectly?

 library(tcltk)
 m <- tktoplevel()
 frame1 <- tkframe(m)
 frame2 <- tkframe(m)
 frame3 <- tkframe(m)
 entry1 <- tkentry(frame1,width=5,bg='white')
 entry2 <- tkentry(frame2,width=5,bg='white')
 entry3 <- tkentry(frame3,width=5,bg='white')

 tkpack(tklabel(frame1,text='\u03bc\u2080'),side='left')
 tkpack(tklabel(frame2,text='\u03bc\u2081'),side='left')
 tkpack(tklabel(frame3,text='\u03bc\u2082'),side='left')

 tkpack(frame1,entry1,side='top')
 tkpack(frame2,entry2,side='top')
 tkpack(frame3,entry3,side='top')

   
Odd, but I think not an R issue. I get weirdness in wish too. Try this

% toplevel .a
.a
% label .a.b -text \u03bc\u2080 -font {Roman -10}
.a.b
% pack .a.b
% .a.b configure
{-activebackground
[]
{-text text Text {} μ₀} {-textvariable textVariable Variable {} {}}
{-underline underline Underline -1 -1} {-width width Width 0 0}
{-wraplength wrapLength WrapLength 0 0}
% .a.b configure -font {Helvetica -12 bold} # the default, now shows \u2080
% .a.b configure -font {Roman -10} # back to Roman, *still* shows \u2080

???!!!



Re: [R] memory allocation glitches

2007-08-14 Thread Peter Dalgaard
Ben Bolker wrote:
 (not sure whether this is better for R-devel or R-help ...)

   
Hardcore debugging is usually better off in R-devel. I'm leaving it in 
R-help though.

  I am currently trying to debug someone else's package (they're
 not available at the moment, and I would like it to work *now*),
 which among other things allocates memory for a persistent
 buffer that gets used by various functions.

  The first symptoms of a problem were that some things just
 didn't work under Windows but were (apparently) fine on Linux.
 I don't have all the development tools installed for Windows, so
 I started messing around under Linux, adding Rprintf() statements
 to the main code.

  Once I did that, strange pointer-error-like inconsistencies started
 appearing -- e.g., the properties of some of the persistent variables
 would change if I did debug(function).  I'm wondering if anyone
 has any tips on how to tackle this -- figure out how to use valgrind?
 Do straight source-level debugging (R -d gdb etc.) and look for
 obvious problems?  The package uses malloc/realloc rather than
 Calloc/Realloc -- does it make sense to go through the code
 replacing these all and see if that fixes the problem?
   
Valgrind is a good idea to try, and as I recall the basic 
incantations are not too hard to work out (now exactly where is it that 
we wrote them down?). It only catches certain error types though, mostly 
use of uninitialized data and reads/writes off the ends of allocated 
blocks of memory.

If that doesn't catch it, you get to play with R -d gdb. However, my 
experience is that line-by-line tracing is usually a dead end, unless 
you have the trouble spot pretty well narrowed down.

Apart from that, my usual procedure would be

1) find a minimal script reproducing the issue and hang onto it. Or at 
least as small as you can get it without losing the bug. Notice that any 
change to either the script or R itself may allow the bug to run away 
and hide somewhere else.

2) if memory corruption is involved, run under gdb, set a hardware 
watchpoint on the relevant location (this gets a little tricky sometimes 
because it might be outside the initial address space, in which case you 
need to somehow run the code for a while, break to gdb, and then set the 
watchpoint).

3) It is not unlikely that the watchpoint triggers several thousand 
times before the relevant one. You can conditionalize it; a nice trick 
is to use the gc_count.



Re: [R] Mann-Whitney U

2007-08-14 Thread Peter Dalgaard
Prof Brian Ripley wrote:
 On Tue, 14 Aug 2007, Natalie O'Toole wrote:

   
 Hi,

 Could someone please tell me how to perform a Mann-Whitney U test on a
 dataset with 2 groups where one group has more data values than another?

 I have split up my 2 groups into 2 columns in my .txt file i'm using with
 R. Here is the code i have so far...

 group1 <- c(LeafArea2)
 group2 <- c(LeafArea1)
 wilcox.test(group1, group2)

 This code works for datasets with the same number of data values in each
 column, but not when there is a different number of data values in one
 column than another column of data.
 

 There is an example of that scenario on the help page for wilcox.test, so 
 it does 'work'.  What exactly went wrong for you?

   
 Is the solution that i have to have a null value in the data column with
 the fewer data values?

 I'm testing for significant diferences between the 2 groups, and the
 result i'm getting in R with the uneven values is different from what i'm
 getting in SPSS.
 

 We need a worked example.  As the help page says, definitions do differ. 
 If you can provide a reproducible example in R and the output from SPSS we 
 may be able to tell you how to relate that to what you see in R.

 [...]

   
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

 As it says, we really need such code (and the output you get) to be able 
 to help you.

   
Also, two variables of different length in two columns is not a good 
idea. If you read in things in parallel columns, it would usually imply 
paired data. If one column is shorter, you may be reading different data 
than you think. Check e.g. the sleep data for a better format.
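For reference, the sleep data illustrate the long format: one value column plus a grouping column, so unequal group sizes would pose no problem.

```r
head(sleep)    # one row per observation: a value column and a group column
## the formula interface splits the values by group for you
wilcox.test(extra ~ group, data = sleep)
```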



Re: [R] Combining two ANOVA outputs of different lengths

2007-08-10 Thread Peter Dalgaard
Christoph Scherber wrote:
 Dear R users,

 I have been trying to combine two anova outputs into one single table 
 (for later publication). The outputs are of different length, and share 
 only some common explanatory variables.

 Using merge() or melt() (from the reshape package) did not work out.

 Here are the model outputs and what I would like to have:

 anova(model1)
              numDF denDF  F-value p-value
 (Intercept)      1    74 0.063446  0.8018
 days             1    74 6.613997  0.0121
 logdiv           1    74 1.587983  0.2116
 leg              1    74 4.425843  0.0388

 anova(model2)
               numDF denDF   F-value p-value
 (Intercept)       1    73 165.94569  <.0001
 funcgr            1    73   7.91999  0.0063
 grass             1    73  42.16909  <.0001
 leg               1    73   4.72108  0.0330
 funcgr:grass      1    73   8.49068  0.0047

 #merge(anova(model1),anova(model2),...)

               F-value 1   p-value 1   F-value 2   p-value 2
 (Intercept)   0.063446    0.8018      165.94569   <.0001
 days          6.613997    0.0121      NA          NA
 logdiv        1.587983    0.2116      NA          NA
 leg           4.425843    0.0388      4.72108     0.0330
 funcgr        NA          NA          7.91999     0.0063
 grass         NA          NA          42.16909    <.0001
 funcgr:grass  NA          NA          8.49068     0.0047


 I would be glad if someone would have an idea of how to do this in 
 principle.
   
The main problems are that the merge key is the rownames and that you 
want to keep entries that are missing in one of the analyses. There are 
ways to deal with that:

  example(anova.lm)
.
  merge(anova(fit2), anova(fit4), by=0, all=T)
  Row.names Df.x  Sum Sq.x Mean Sq.x F value.x    Pr(>F).x Df.y  Sum Sq.y
1      ddpi   NA        NA        NA        NA          NA    1  63.05403
2       dpi   NA        NA        NA        NA          NA    1  12.40095
3     pop15    1 204.11757 204.11757 13.211166 0.000687868    1 204.11757
4     pop75    1  53.34271  53.34271  3.452517 0.069425385    1  53.34271
5 Residuals   47 726.16797  15.45038        NA          NA   45 650.71300
  Mean Sq.y  F value.y     Pr(>F).y
1  63.05403  4.3604959 0.0424711387
2  12.40095  0.8575863 0.3593550848
3 204.11757 14.1157322 0.0004921955
4  53.34271  3.6889104 0.0611254598
5  14.46029         NA           NA



Presumably, you can take it from here.



Re: [R] Rcmdr window border lost

2007-08-09 Thread Peter Dalgaard
Andy Weller wrote:
 OK, I tried completely removing and reinstalling R, but this has not 
 worked - I am still missing window borders for Rcmdr. I am certain that 
 everything is installed correctly and that all dependencies are met - 
 there must be something trivial I am missing?!

 Thanks in advance, Andy

 Andy Weller wrote:
   
 Dear all,

 I have recently lost my Rcmdr window borders (all my other programs have 
 borders)! I am unsure of what I have done, although I have recently 
 update.packages() in R... How can I reclaim them?

 I am using:
 Ubuntu Linux (Feisty)
 R version 2.5.1
 R Commander Version 1.3-5

 
This sort of behaviour is usually the fault of the window manager, not 
R/Rcmdr/tcltk. It's the WM's job to supply the various window 
decorations on a new window, so either it never got told that there was 
a window, or it somehow got into a confused state. Did you try 
restarting the WM (i.e., log out/in or reboot)? And which WM are we 
talking about?

Same combination works fine on Fedora 7, except for a load of messages 
saying

Warning: X11 protocol error: BadWindow (invalid Window parameter)


 I have deleted the folder: /usr/local/lib/R/site-library/Rcmdr and 
 reinstalled Rcmdr with: install.packages("Rcmdr", dep=TRUE)

 This has not solved my problem though.

 Maybe I need to reinstall something else as well?

 Thanks in advance, Andy
 





Re: [R] Need Help: Installing/Using xtable package

2007-08-09 Thread Peter Dalgaard
M. Jankowski wrote:
 Hi all,

 Let me know if I need to ask this question of the bioconductor group.
 I used the bioconductor utility to install this package and also the
 CRAN package.install function.

 My computer crashed a week ago. Today I reinstalled all my
 bioconductor/R packages. One of my scripts is giving me the following
 error:

 in my script I set:
 library(xtable)
 print.xtable(

 and receive this error:
 Error : could not find function print.xtable

 This is a new error and I cannot find the source.
   
Looks like the current xtable is no longer exporting its print methods. 
Why were you calling print.xtable explicitly in the first place?
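If the method is no longer exported, calling through the generic still works (a sketch; this assumes the xtable package is installed):

```r
library(xtable)
tab <- xtable(head(mtcars))
print(tab)   # dispatches to the package's print method for class "xtable"
```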



Re: [R] Invert Likert-Scale Values

2007-08-05 Thread Peter Dalgaard
(Ted Harding) wrote:
 On 04-Aug-07 22:02:33, William Revelle wrote:
   
 Alexis and John,

 To reverse a Likert-like item, subtract the item from the maximum 
 acceptable value + the minimum acceptable value.
 That is, if
 x <- 1:8
 xreverse <- 9 - x

 Bill
 

 A few of us have suggested this, but Alexis's welcome for the
 recode() suggestion indicates that by the time he gets round to
 this his Likert scale values have already become levels of a factor.

 Levels 1, 2, ... of a factor may look like integers, but they're
 not; and R will not let you do arithmetic on them:

 > x <- factor(c(1,1,1,2,2,2))
 > x
 [1] 1 1 1 2 2 2
 Levels: 1 2
 > y <- (3 - x)
 Warning message: 
 "-" not meaningful for factors in: Ops.factor(3, x) 
 > y
 [1] NA NA NA NA NA NA

 However, you can turn them back into integers, reverse, and then
 turn the results back into a factor:

 > y <- factor(3 - as.integer(x))
 > y
 [1] 2 2 2 1 1 1
 Levels: 1 2

 So, even for factors, the insight underlying our suggestion of "-"
 is still valid! :)
   
Er, wouldn't   y <- factor(x, levels=2:1, labels=1:2)   be more to the point?
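Continuing Ted's two-level example, that one-step remapping gives the same reversal:

```r
x <- factor(c(1, 1, 1, 2, 2, 2))
## remap in one step: old level 2 -> label 1, old level 1 -> label 2
y <- factor(x, levels = 2:1, labels = 1:2)
as.character(y)   # "2" "2" "2" "1" "1" "1"
```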



Re: [R] lme and aov

2007-08-03 Thread Peter Dalgaard
Gang Chen wrote:
 I have a mixed balanced ANOVA design with a between-subject factor  
 (Grp) and a within-subject factor (Rsp). When I tried the following  
 two commands which I thought are equivalent,

   fit.lme <- lme(Beta ~ Grp*Rsp, random = ~1|Subj, Model);
   fit.aov <- aov(Beta ~ Rsp*Grp+Error(Subj/Rsp)+Grp, Model);
   
 I got totally different results. What did I do wrong?

   
Except for not telling us what your data are and what you mean by 
"totally different"?

One model has a random interaction between Subj and Rsp, the other does 
not. This may make a difference, unless the interaction term is aliased 
with the residual error.

If your data are unbalanced, aov is not guaranteed to give meaningful 
results.

-pd



Re: [R] lme and aov

2007-08-03 Thread Peter Dalgaard
Gang Chen wrote:
 Thanks a lot for clarification! I just started to learn programming in 
 R for a week, and wanted to try a simple mixed design of balanced 
 ANOVA with a between-subject factor
 (Grp) and a within-subject factor (Rsp), but I'm not sure whether I'm 
 modeling the data correctly with either of the command lines.

 Here is the result. Any help would be highly appreciated.

 fit.lme <- lme(Beta ~ Grp*Rsp, random = ~1|Subj, Model);
  summary(fit.lme)
 Linear mixed-effects model fit by REML
 Data: Model
        AIC      BIC    logLik
    233.732 251.9454 -108.8660

 Random effects:
 Formula: ~1 | Subj
 (Intercept)  Residual
 StdDev:1.800246 0.3779612

 Fixed effects: Beta ~ Grp * Rsp
                  Value Std.Error DF    t-value p-value
 (Intercept)  1.1551502 0.5101839 36  2.2641837  0.0297
 GrpB-1.1561248 0.7215090 36 -1.6023706  0.1178
 GrpC-1.2345321 0.7215090 36 -1.7110417  0.0957
 RspB-0.0563077 0.1482486 36 -0.3798196  0.7063
 GrpB:RspB   -0.3739339 0.2096551 36 -1.7835665  0.0829
 GrpC:RspB0.3452539 0.2096551 36  1.6467705  0.1083
 Correlation:
   (Intr) GrpB   GrpC   RspB   GrB:RB
 GrpB  -0.707
 GrpC  -0.707  0.500
 RspB  -0.145  0.103  0.103
 GrpB:RspB  0.103 -0.145 -0.073 -0.707
 GrpC:RspB  0.103 -0.073 -0.145 -0.707  0.500

 Standardized Within-Group Residuals:
 Min  Q1 Med  Q3 Max
 -1.72266114 -0.41242552  0.02994094  0.41348767  1.72323563

 Number of Observations: 78
 Number of Groups: 39

 fit.aov <- aov(Beta ~ Rsp*Grp+Error(Subj/Rsp)+Grp, Model);
  fit.aov

 Call:
 aov(formula = Beta ~ Rsp * Grp + Error(Subj/Rsp) + Grp, data = Model)

 Grand Mean: 0.3253307

 Stratum 1: Subj

 Terms:
  Grp
 Sum of Squares  5.191404
 Deg. of Freedom1

 1 out of 2 effects not estimable
 Estimated effects are balanced

 Stratum 2: Subj:Rsp

 Terms:
  Rsp
 Sum of Squares  7.060585e-05
 Deg. of Freedom1

 2 out of 3 effects not estimable
 Estimated effects are balanced

 Stratum 3: Within

 Terms:
                      Rsp      Grp  Rsp:Grp Residuals
 Sum of Squares   0.33428 36.96518  1.50105 227.49594
 Deg. of Freedom        1        2        2        70

 Residual standard error: 1.802760
 Estimated effects may be unbalanced

This looks odd.  It is a standard split-plot layout, right? 3 groups of 
13 subjects, each measured with two kinds of Rsp = 3x13x2 = 78 
observations.

In that case you shouldn't see the same effect allocated to multiple 
error strata. I suspect you forgot to declare Subj as factor.

Also: summary(fit.aov) is nicer, and anova(fit.lme) should be informative.
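A self-contained sketch with simulated values standing in for the poster's Model (3 groups x 13 subjects x 2 responses, with Subj declared as a factor up front; the variable names are the poster's, the data are made up):

```r
set.seed(42)
Model <- expand.grid(Rsp  = factor(c("A", "B")),
                     Subj = factor(1:39))
Model$Grp  <- factor(rep(c("A", "B", "C"), each = 26))  # 13 subjects per group
Model$Beta <- rnorm(78)                                 # fake response
fit.aov <- aov(Beta ~ Rsp * Grp + Error(Subj/Rsp), data = Model)
summary(fit.aov)   # Grp tested in the Subj stratum; Rsp, Rsp:Grp in Subj:Rsp
```

With Subj a factor, each effect appears in exactly one error stratum, as expected for a balanced split-plot design.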



Re: [R] - round() strange behaviour

2007-08-02 Thread Peter Dalgaard
Monica Pisica wrote:
 Hi,
  
 I am getting some strange results using round - it seems that it depends if 
 the number before the decimal point is odd or even 
  
 For example:
  
   
  round(1.5)
 [1] 2
  round(2.5)
 [1] 2
 
 While I would expect round(2.5) to be 3 and not 2.
  
 Do you have any explanation for that?
  
   
http://www.google.com/search?q=round+to+even
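The behaviour the search points to is easy to verify at the console; R follows the IEC 60559 "round half to even" rule:

```r
## Ties at .5 go to the nearest even integer ("banker's rounding"):
round(c(0.5, 1.5, 2.5, 3.5))
# [1] 0 2 2 4
```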



Re: [R] the large dataset problem

2007-07-31 Thread Peter Dalgaard
(Ted Harding) wrote:
 On 30-Jul-07 11:40:47, Eric Doviak wrote:
   
 [...]
 

 Sympathies for the constraints you are operating in!

   
 The Introduction to R manual suggests modifying input files with
 Perl. Any tips on how to get started? Would Perl Data Language (PDL) be
 a good choice?  http://pdl.perl.org/index_en.html
 

 I've not used SIPP files, but it seems that they are available in
 delimited format, including CSV.

 For extracting a subset of fields (especially when large datasets may
 stretch RAM resources) I would use awk rather than perl, since it
 is a much lighter program, transparent to code for, efficient, and
 it will do that job.

 On a Linux/Unix system (see below), say I wanted to extract fields
 1, 1000, 1275,  , 5678 from a CSV file. Then the 'awk' line
 that would do it would look like

 awk '
  BEGIN{FS=","}{print $(1) "," $(1000) "," $(1275) "," ... "," $(5678)}
 ' < sippfile.csv > newdata.csv

 Awk reads one line at a time, and does with it what you tell it to do.
   


Yes, but notice that there are also options within R. If you use a 
carefully constructed colClasses= argument to 
read.table()/read.csv()/etc or what= argument to scan(), you don't get 
more columns than you ask for. The basic trick is to use NULL for each 
of the columns that you do NOT want, and preferably numeric or 
character or whatever for those that you want (NA lets read.table do 
its usual trickery of guessing type from contents). However...
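As a sketch of the colClasses trick (the file name and column count are made up for illustration):

```r
## Keep only columns 1 and 3 of a wide CSV. "NULL" drops a column at
## read time, so the unwanted ones never occupy memory.
cc <- rep("NULL", 5000)       # assumed total number of columns
cc[c(1, 3)] <- "numeric"      # the columns we actually want
dat <- read.csv("bigfile.csv", colClasses = cc)   # placeholder file name
```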
   
 I wrote a script which loads large datasets a few lines at a time,
 writes the dozen or so variables of interest to a CSV file, removes
 the loaded data and then (via a for loop) loads the next few lines
  I managed to get it to work with one of the SIPP core files,
 but it's SLOW.
 

 See above ...

   
Looking at the actual data files and data dictionaries (we're talking 
about http://www.bls.census.gov/sipp_ftp.html, right?), it looks like 
SIPP files are in a fixed-width format, which suggests that you might 
want  to employ read.fwf().  If you want to get really smart about it, 
extract the 'D' fields from the dictionary files

Try this

 dict <- readLines("ftp://www.sipp.census.gov/pub/sipp/2004/l04puw1d.txt")
 D.lines <- grep("^D ", dict)
 vdict <- read.table(con <- textConnection(dict[D.lines])); close(con)
 head(vdict)

a little bit of further fiddling and you have the list of field widths 
and variable names to feed to read.fwf(). Just subset the name list and 
set the field width negative for those variables that you wish to skip. 
Extracting value labels from the V fields looks like it could be done, 
but requires more thinking, especially where they straddle multiple 
lines (but hey, it's your job, not mine...)
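The "further fiddling" might look roughly like this; the column positions of the name and width fields in vdict, the variable names, and the data file name are all assumptions to be checked against head(vdict):

```r
## Hypothetical: suppose vdict column 4 holds variable names and
## column 5 the field widths. A negative width makes read.fwf()
## skip that field entirely.
widths <- vdict[[5]]
keep   <- vdict[[4]] %in% c("SSUID", "THTOTINC")   # example variable names
widths[!keep] <- -widths[!keep]
sipp <- read.fwf("l04puw1.dat", widths = widths,   # placeholder data file
                 col.names = vdict[[4]][keep])
```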

-Peter D.



Re: [R] manipulating arrays

2007-07-30 Thread Peter Dalgaard
Henrique Dallazuanna wrote:
 Hi, I don't know if is the more elegant way, but:

 X <- c(1,2,3,4,5)
 X <- c(X[1], 0, X[2:5])
   
append(X, 0, 1)



Re: [R] About infinite value

2007-07-23 Thread Peter Dalgaard
arigado wrote:
 Hi everyone

 I have a problem about infinite.
 If I type 10^308, R shows 1e+308
 When I type 10^309, R shows Inf
 So, we know if a value is larger than 1.XXXe+308, R will show Inf.
 How can I make a value like 10^400, typed into R, display as
 1e+400 instead of Inf?

   
1. You can't, due to the computer representation of floating point numbers.

2. Package brobdingnag lets you do it anyway.
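A sketch of option 2, assuming the brobdingnag package is installed; it stores numbers on the log scale, so the printed form is exp-based rather than a literal 1e+400:

```r
library(brobdingnag)
x <- as.brob(10)^400   # representable because only log(x) is stored
x                      # prints on the exp() scale, roughly exp(921), i.e. ~1e+400
```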

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] summary of linear fixed effects model is different than the HSAUR book

2007-07-22 Thread Peter Dalgaard
Christopher W. Ryan wrote:
 But on page 169, summary() is shown to produce additional columns in the
 fixed effects section, namely degrees of freedom and the P-value (with
 significance stars).

 How can I produce that output?  Am I doing something wrong?  Has lme4
 changed?

   
The latter.  To make a long story short, the author got so fed up with 
the reliability of the DF heuristics that he decided to remove them 
altogether.



Re: [R] RAM, swap, Error: cannot allocate vector of size, Linux:

2007-07-19 Thread Peter Dalgaard
Feldman, Maximilian Jeffrey wrote:
 Dear Community,

 I am very new to the world of Linux and R and I have stumbled upon a problem 
 that I cannot seem to resolve on my own. Here is the relevant background:

 I am working on a 64-bit Linux Fedora Core 6 OS. I using R version 2.5.1. I 
 have 3.8 Gb of RAM and 1.9 Gb of swap. As I see it, there are no restraints 
 on the amount of memory that R can use imposed by this particular OS build. 
 When I type in the 'ulimit' command at the command line the response is 
 'unlimited'.

 Here is the problem:

 I have uploaded and normalized 48 ATH1 microarray slides using the justRMA 
 function.

   
 library(affy)
 

   
  setwd("/Data/cel")
 

   
  Data <- justRMA()
 

 The next step in my analysis is to calculate a distance matrix for my dataset 
 using bioDist package. This is where I get my error.

   
 library(bioDist)
 

   
  x <- cor.dist(exprs(Data))
 

 Error: cannot allocate vector of size 3.9 Gb

 I used the following function to examine my memory limitations:

   
 mem.limits()
 

 nsize vsize 

 NA NA 

 I believe this means there isn't any specified limit to the amount of memory 
 R can allocate to my task. I realize I only have 3.8 Gb of RAM but I would 
 expect that R would use my 1.9 Gb of swap. 
   
It does, if swap works at all on your machine. However, the error 
message relates to the object that R fails to create, not to the total 
memory usage. I.e. this might very well be the _second_ object of size 
3.9Gb that you are trying to fit into 5.7Gb of memory.  You could try 
increasing the swap space (the expedient, although perhaps not 
efficient, way is to find a file system with a few tens of Gb to spare 
and create a large swapfile on it.)
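The expedient route sketched above might look like this on Linux (run as root; the path and size are placeholders for whatever filesystem has room):

```shell
# Create and enable a 20 GB swapfile; /scratch/swapfile is a made-up path.
dd if=/dev/zero of=/scratch/swapfile bs=1M count=20480
chmod 600 /scratch/swapfile
mkswap /scratch/swapfile
swapon /scratch/swapfile
free -m    # check that the extra swap shows up
```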
 Does R not use my swap space? Can I explicitly tell R to use my swap space 
 for large tasks such as this? 

 I was not able to find any information regarding this particular issue in the 
 R Linux manual, Linux FAQ, or on previous listserv threads. Many of the users 
 who had similar questions resolved their problems in a different manner.

 Thanks to anyone who thinks they can provide assistance!

 Max 

 Graduate Student

 Molecular Plant Sciences

 Washington State University


   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] Strange warning in summary.lm

2007-07-19 Thread Peter Dalgaard
ONKELINX, Thierry wrote:
 The problem also exists in a clean workspace. But I've found the
 troublemaker. I had set options(OutDec = ","). Resetting this to
 options(OutDec = ".") solved the problem.

 Thanks,

 Thierry
   
Oups. That sounds like there's a bug somewhere. Can you cook up a
minimal example which shows the behaviour?




Re: [R] Strange warning in summary.lm

2007-07-19 Thread Peter Dalgaard
Prof Brian Ripley wrote:
 On Thu, 19 Jul 2007, Peter Dalgaard wrote:

   
 ONKELINX, Thierry wrote:
 
 The problem also exists in a clean workspace. But I've found the
 troublemaker. I had set options(OutDec = ","). Resetting this to
 options(OutDec = ".") solved the problem.

 Thanks,

 Thierry

   
 Oups. That sounds like there's a bug somewhere. Can you cook up a
 minimal example which shows the behaviour?
 

 Any use of summary.lm will do it (e.g. example(lm)).  The problem is in 
 printCoefmat, at

 x0 - (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

 and yes, it looks like an infelicity to me.

   
Ick. Any better ideas than

printsAs0 <- scan(con <- textConnection(Cf[okP]), dec = getOption("OutDec")) == 0
close(con)
x0 <- (xm[okP] == 0) != printsAs0

?




Re: [R] tapply

2007-07-19 Thread Peter Dalgaard
sigalit mangut-leiba wrote:
 I'm sorry for the unfocused questions, i'm new here...
 the output should be:
 class  aps_mean
 1      NA
 2      11.5
 3      8

 the mean aps of every class, when every id count *once*,  for example: class
 2, mean= (11+12)/2=11.5
 hope it's clearer.
   
Much... Get the first record for each individual from (e.g.)

icul.redux <- subset(icul, !duplicated(id))

then use tapply as before using variables from icul.redux. Or in one go

with(
  subset(icul, !duplicated(id)),
  tapply(aps, class, mean, na.rm=TRUE)
)





Re: [R] R equivalent to Matlab's Bayes net toolbox

2007-07-18 Thread Peter Dalgaard
On Wed, 2007-07-18 at 03:52 +, Jose wrote:

 The thing that I don't understand in the gR page is why there are so many 
 different packages and why they are not very integrated:

You have to understand the gR project for that. It started from a number
of completely separate pieces of software within the general field of
graphical models, and tried to bring people together and make the
existing pieces of software accessible from R. Given that the active
core of the group was really just a handful of people with limited R
programming experience (much of the original code was written in
dialects of Pascal/Delphi), the project must be said to have had some
success. However, the most pronounced effect has been to bring those old
codes out into the open; seamless integration is still quite a way off.



Re: [R] Sorting data frame by a string variable

2007-07-17 Thread Peter Dalgaard
Dimitri Liakhovitski wrote:
 I have a data frame MyData with 2 variables.
 One of the variables (String) contains a string of letters.
 How can I resort MyData by MyData$String (alphabetically) and then
 save the output as a sorted data file?

 I tried:

 o <- order(MyData$String)
 SortedData <- rbind(MyData$String[o], MyData$Value[o])
 write.table(SortedData, file="Sorted.txt", sep="\t", quote=F, row.names=F)


 However, all strings get replaced with digits (1 for the first string,
 2 for the second string etc.). How can I keep the strings instead of
 digits?
   

Why on earth are you trying to rbind() things together?

Anything wrong with

SortedData <- MyData[o,]
write.table(SortedData,...whatever...)

?
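A self-contained version of the suggestion, with toy data:

```r
## Row-subsetting with order() keeps strings as strings; rbind() was
## coercing everything to a common (numeric/factor-code) type.
MyData <- data.frame(String = c("pear", "apple", "fig"),
                     Value  = c(2, 1, 3),
                     stringsAsFactors = FALSE)
o <- order(MyData$String)
SortedData <- MyData[o, ]
write.table(SortedData, file = "Sorted.txt", sep = "\t",
            quote = FALSE, row.names = FALSE)
```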

 Thank you!
 Dimitri

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] Hmisc variable labels as vector?

2007-07-16 Thread Peter Dalgaard
Steve Powell wrote:
 Dear members
 I have imported an SPSS data file using Hmisc. 
 So  label(mydata[[1]]) gives me the first variable label
 Just wondering how I can access all the variable labels as a vector?
 Something like label(mydata[[1:3]]) but that doesn't work.
   
Something like sapply(mydata, label) should work.




Re: [R] table function

2007-07-16 Thread Peter Dalgaard
sigalit mangut-leiba wrote:
 Hello all, 

 I want to use the table function, but for every id I have a different no.
 of rows (obs.).

 If I write:

  

 table(x$class, x$infec)

  

 I don't get the right frequencies because I should count every id once, if
 id 1 has 20 observations It should count as one.

 can I use unique func. here?

 Hope it's clear.
   
Almost. I assume that class and infec are constant over id?  (If people
change infection status during the trial, you have a more complex problem).

You could then use unique() like this

with(unique(x[c("id", "class", "infec")]), table(class, infec))

but I'd prefer using duplicated() as in

with(subset(x, !duplicated(id)), table(class, infec))

(notice that the latter tabulates the first record for each id, whereas
the former will count ids multiple times if they change class or infec).




Re: [R] The $ operator and vectors

2007-07-13 Thread Peter Dalgaard
Gustaf Rydevik wrote:
 Hi all,

 I've run into a slightly illogical (to me) behaviour with the $
 subsetting function.

 consider:
   
 Test
 
   A B
 1 1 Q
 2 2 R

   
 Test$A
 
 [1] 1 2

   
  vector <- "A"
  Test$vector
 
 NULL

   
 Test$A
 
 [1] 1 2

   
 Test[,vector]
 
 [1] 1 2


 Is there a reason for the $ operator not evaluating the vector before 
 executing?
   
Yes, the evaluation rule for $ is like that

Notice that it also didn't go looking for an object called A when you
said Test$A.
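The difference is easy to see side by side; `[[` evaluates its argument, `$` does not:

```r
Test <- data.frame(A = 1:2, B = c("Q", "R"))
v <- "A"
Test$v      # NULL: $ looks for a column literally named "v"
Test[[v]]   # 1 2:  [[ evaluates v to "A" first
```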





Re: [R] filling a list faster

2007-07-13 Thread Peter Dalgaard
Balazs Torma wrote:
 Thank you all for your answers!

 The problem is that I don't know the length of the list in advance!  
 And hoped for a convenience structure which reallocates once the  
 preallocated list (or matrix) becomes full.
   
That's not massively hard to do yourself, is it?
As in

if (i > N) { l <- c(l, vector("list", N)); N <- N*2 }

i.e.
 N <- 1; l <- vector("list", N)
 system.time(for(i in (1:1e5)) { if (i > N) {l <- c(l,
vector("list", N)); N <- N*2} ; l[[i]] <- c(i,i+1,i)})
   user  system elapsed
  1.508   0.012   1.520
 l[(i+1):N] <- NULL




Re: [R] charset in graphics

2007-07-13 Thread Peter Dalgaard
Donatas G. wrote:
 How do I make Lithuanian characters display correctly in R graphics? 

 Instead of the special characters for Lithuanian language I get question 
 marks...

 I use Ubuntu Feisty, the locale is utf-8 ...

 Do I need to specify somewhere the locale for R, or - default font for the 
 graphics?
   
You mean as in

 plot(0, main="\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d")

plot(0, main=tolower("\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d"))

?

This works fine for me on OpenSUSE 10.2, so I don't think the issue is
in R. More likely, this has to do with X11 fonts (Unicode is handled via
a rather complicated mechanism involving virtual fonts). Postscript/PDF
is a bit more difficult. See ?postscript and the reference to
Murrell+Ripley's R News article inside.

The correct incantation seems to be

postscript(font="URWHelvetica", encoding="ISOLatin7")
plot(0, main=tolower("\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d"))
dev.off()




Re: [R] charset in graphics

2007-07-13 Thread Peter Dalgaard
Prof Brian Ripley wrote:
 On Fri, 13 Jul 2007, Peter Dalgaard wrote:
   
 The correct incantation seems to be

 postscript(font="URWHelvetica", encoding="ISOLatin7")
 plot(0, main=tolower("\u104\u116\u0118\u012e\u0172\u016a\u010c\u0160\u017d"))
 dev.off()
 

 The encoding should happen automagically in a Lithuanian UTF-8 locale, and 
 does for me.  But suitable fonts (e.g. URW ones) are needed.
   
OK, I sort of suspected that, although it wasn't entirely clear to me
whether autoconversion would cover cases like en_LT.utf8, if that even
exists. Still, the explicit (portable?) way of doing it is probably
worth knowing too (there could be a few pitfalls with scripts getting
run outside their usual domain).





Re: [R] Correlation matrix

2007-07-13 Thread Peter Dalgaard
Caskenette, Amanda wrote:
 I have a model with 5 parameters that I am optimising where the (best)
 value of the objective function is negative. I would like to use the
 Hessian matrix (from genoud and/or optim functions)  to construct  the
 covariance and correlation matrices.

   This is the code that I am using:

   est <- out$par                     # Parameter estimates
   H   <- out$hessian                 # Hessian
   V   <- solve(H)                    # Covariance matrix
   s   <- sqrt(abs(diag(V)))          # Vector of standard deviations
   cor <- V/(s %o% s)                 # Correlation coefficient matrix
   ci  <- est + qnorm(0.975)*s %o% c(-1,1)  # 95% CI's

 However I am getting values that are greater than 1 (1.05, 2.34, etc)
 for the correlation matrix. Might this be due to the fact that the
 out$val is negative?

   

Not by itself (just add a large enough constant to the objective
function and the value becomes positive without changing the Hessian).

More likely, you have not actually found the minimum (Hessian not
positive definite), or there is a code error.

Print out and review the following items:

H, eigen(H), V, s, s%o%s

and see if that makes you any wiser (why are you taking abs(diag(V))?
Negative elements?)
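A sketch of those checks, assuming out came from optim(..., hessian = TRUE):

```r
H <- out$hessian
eigen(H)$values   # all eigenvalues should be positive at a true minimum
V <- solve(H)     # covariance matrix of the parameter estimates
cov2cor(V)        # built-in conversion; entries must lie in [-1, 1]
```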




Re: [R] Compute rank within factor groups

2007-07-12 Thread Peter Dalgaard
Ken Williams wrote:
 Hi,

 I have a data.frame which is ordered by score, and has a factor column:

   Browse[1]> wc[c("report", "score")]
   report score
   9 ADEA  0.96
   8 ADEA  0.90
   11 Asylum_FED9  0.86
   3 ADEA  0.75
   14 Asylum_FED9  0.60
   5 ADEA  0.56
   13 Asylum_FED9  0.51
   16 Asylum_FED9  0.51
   2 ADEA  0.42
   7 ADEA  0.31
   17 Asylum_FED9  0.27
   1 ADEA  0.17
   4 ADEA  0.17
   6 ADEA  0.12
   10ADEA  0.11
   12 Asylum_FED9  0.10
   15 Asylum_FED9  0.09
   18 Asylum_FED9  0.07
   Browse[1] 

 I need to add a column indicating rank within each factor group, which I
 currently accomplish like so:

   wc$rank <- 0
   for(report in as.character(unique(wc$report))) {
 wc[wc$report==report,]$rank <- 1:sum(wc$report==report)
   }

 I have to wonder whether there's a better way, something that gets rid of
 the for() loop using tapply() or by() or similar.  But I haven't come up
 with anything.

 I've tried these:

   by(wc, wc$report, FUN=function(pr){pr$rank <- 1:nrow(pr)})

   by(wc, wc$report, FUN=function(pr){wc[wc$report %in% pr$report,]$rank <-
 1:nrow(pr)})

 But in both cases the effect of the assignment is lost, there's no $rank
 column generated for wc.

 Any suggestions?
   
There's a little known and somewhat unfortunately named function called 
ave() which does just that sort of thing.

  ave(wc$score, wc$report, FUN=rank)
 [1] 10.0  9.0  8.0  8.0  7.0  7.0  5.5  5.5  6.0  5.0  4.0  3.5  3.5  2.0  1.0
[16]  3.0  2.0  1.0
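A self-contained illustration of ave() on a toy version of the problem; rank(-x) gives rank 1 to the highest score, matching the descending order in the post:

```r
wc <- data.frame(report = c("A", "A", "B", "A", "B"),
                 score  = c(0.90, 0.70, 0.80, 0.50, 0.20))
wc$rank <- ave(wc$score, wc$report, FUN = function(x) rank(-x))
wc   # per-group ranks 1..n, no for() loop needed
```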



Re: [R] inquiry about anova and ancova

2007-07-11 Thread Peter Dalgaard
Anderson, Mary-Jane wrote:
 Dear R users,
I have a rather knotty analysis problem and I was hoping that
 someone on this list would be able to help. I was advised to try this list
 by a colleague who uses R but it is a statistical inquiry not about how to
 use R.
  In brief I have a 3x2 anova, 2 tasks under 3 conditions, within subjects. I
 also took a variety of personality measures that might influence the results
 under the different conditions. I had thought that an ancova would be the
 best test, but it might be the case that this would not work with a within
 subjects design. I have not found anything that explicitly states whether or
 not it would, but all the examples I have read are between subjects design.
  I also thought of investigating a manova, but it is not really the case
 that I have more than one DV, it is the same DV in 6 different combinations
 of task and condition. 
  There were 4 personality measures and I wanted to look at the degree to
 which they affected the task/ condition interaction. 
 I have explained this briefly here, but I can of course provide more
 details to anyone who can advise me further with this.
   
This sounds like a job for a Multivariate Linear Model (assuming that
you have complete data for each subject or are prepared to throw away
subjects with missing values).

This lets you decompose the response into mean, effects of task and
condition, and the interaction effect. Each component can then be
separately tested for effect of predictors, using multivariate tests, or
F tests under sphericity assumptions.

Have a look at example(anova.mlm); this mostly looks at cases where
effects are tested against zero, but the last example involves a (bogus)
between subject factor f.




Re: [R] type III ANOVA for a nested linear model

2007-07-11 Thread Peter Dalgaard
Carsten Jaeger wrote:
 Hello Peter,

 thanks for your help. I'm quite sure that I specified the right model.
 Factor C is indeed nested within factor A. I think you were confused by
 the numbering of C (1..11), and it is easier to understand when I code
 it as you suggested (1,2,3 within each level of A, as in mydata1 [see
 below]). However, it does not matter which numbering I choose for
 carrying the analysis, as

 anova(lm(resp ~ A * B + (C %in% A), mydata))
 anova(lm(resp ~ A * B + (C %in% A), mydata1))

 both give the same results (as at least I had expected because of the
 nesting).

 However, I found that Anova() from the car package only accepts the
 second version. So,

 Anova(lm(resp ~ A * B + (C %in% A), mydata)) does not work (giving an
 error) but
 Anova(lm(resp ~ A * B + (C %in% A), mydata1)) does.

 This behaviour is rather confusing, or is there anything I'm missing?

   
You're not listening to what I told you!

A term C %in% A  (or A/C) is not a _specification_ that C is nested in
A, it is a _directive_ to include the terms A and C:A. Now, C:A involves
a term for each combination of A and C, of which many are empty if C is
strictly coarser than A. This may well be what is confusing Anova().

In fact, with this (c(1:3,6:11)) coding of C, A:C is completely
equivalent to C, but if you look at summary(lm()) you will see a lot
of NA coefficients in the A:C case. If you use resp ~ A*B+C, then you
still get a couple of missing coefficients in the C terms because of
collinearity with the A terms. (Notice that this is one case where the
order inside the model formula will matter; C+A*B is not the same.)

Whether you'd want C as a random factor is a different matter. It is
often the natural model if C is subject and A is group. Let's assume
that this is the case: In an ordinary linear model, you can test whether
you can remove C (or A:C) , which implies that all subjects in the same
group have the same level of the response. In your case, the hypothesis
is accepted, but the F statistic is around 3 (on (6, 6) DF) , which
suggests that there might be some variation of subjects within groups.
In a mixed-effects model, you assume that this variation exists and
therefore you use the SSD for C as the denominator when testing A, which
is arguably safer than pooling it with the somewhat smaller residual SSD.
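The two competing analyses can be sketched like this, assuming A, B and C in the data listed below are converted to factors first:

```r
## Assumed setup: mydata as posted, with grouping variables as factors.
mydata$A <- factor(mydata$A)
mydata$B <- factor(mydata$B)
mydata$C <- factor(mydata$C)

## Fixed-effects view: C (within A) is tested against the residual.
anova(lm(resp ~ A * B + C %in% A, data = mydata))

## Random-effects view: the C stratum supplies the denominator for A.
summary(aov(resp ~ A * B + Error(C), data = mydata))
```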

 Thanks for your help again, 

 Carsten


 R> mydata
    A B  C  resp
 1  1 1  1 34.12
 2  1 1  2 32.45
 3  1 1  3 44.55
 4  1 2  1 20.88
 5  1 2  2 22.32
 6  1 2  3 27.71
 7  2 1  6 38.20
 8  2 1  7 31.62
 9  2 1  8 38.71
 10 2 2  6 18.93
 11 2 2  7 20.57
 12 2 2  8 31.55
 13 3 1  9 40.81
 14 3 1 10 42.23
 15 3 1 11 41.26
 16 3 2  9 28.41
 17 3 2 10 24.07
 18 3 2 11 21.16

 R> mydata1
    A B  C  resp
 1  1 1  1 34.12
 2  1 1  2 32.45
 3  1 1  3 44.55
 4  1 2  1 20.88
 5  1 2  2 22.32
 6  1 2  3 27.71
 7  2 1  1 38.20
 8  2 1  2 31.62
 9  2 1  3 38.71
 10 2 2  1 18.93
 11 2 2  2 20.57
 12 2 2  3 31.55
 13 3 1  1 40.81
 14 3 1  2 42.23
 15 3 1  3 41.26
 16 3 2  1 28.41
 17 3 2  2 24.07
 18 3 2  3 21.16


 On Tue, 2007-07-10 at 13:54 +0200, Peter Dalgaard wrote:
   
 Carsten Jaeger wrote:
 
 Hello,

 is it possible to obtain type III sums of squares for a nested model as
 in the following:

 lmod - lm(resp ~ A * B + (C %in% A), mydata))

 I have tried

 library(car)
 Anova(lmod, type="III")

 but this gives me an error (and I also understand from the documentation
 of Anova as well as from a previous request
 (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/64477.html) that it is
 not possible to specify nested models with car's Anova).

 anova(lmod) works, of course.

 My data (given below) is balanced so I expect the results to be similar
 for both type I and type III sums of squares. But are they *exactly* the
 same? The editor of the journal which I'm sending my manuscript to
 requests what he calls conventional type III tests and I'm not sure if I
 can convince him to accept my type I analysis.
   
 In balanced designs, type I-IV SSD's are all identical. However, I don't 
 think the model does what I think you think it does. 

 Notice that nesting is used with two different meanings; in R it would be 
 that the codings of C only make sense within levels of A - e.g. if they 
 were numbered 1:3 within each group, but with C==1 when A==1 having nothing 
 to do with C==1 when A==2.  SAS does something... er... else.

 What I think you want is a model where C is a random terms so that main 
 effects of A can be tested, like in

 
 summary(aov(resp ~ A * B + Error(C), dd))

Re: [R] make error R-5.1 on sun solaris

2007-07-11 Thread Peter Dalgaard
Dan Powers wrote:
 I hope this is enough information to determine the problem. Thanks in
 advance for any help.

 Configure goes ok (I think)

 ./configure --prefix=$HOME --without-iconv


 R is now configured for sparc-sun-solaris2.9

   Source directory:  .
   Installation directory:/home/dpowers

   C compiler:gcc  -g -O2
   Fortran 77 compiler:   f95  -g

   C++ compiler:  g++  -g -O2
   Fortran 90/95 compiler:f95 -g
   Obj-C compiler: -g -O2

   Interfaces supported:  X11
   External libraries:readline
   Additional capabilities:   NLS
   Options enabled:   shared BLAS, R profiling, Java

   Recommended packages:  yes

 Make ends after the gcc..

 make
 .
 .
 .

 gcc -I. -I../../src/include -I../../src/include -I/usr/openwin/include
 -I/usr/local/include -DHAVE_CONFIG_H   -g -O2 -c system.c -o system.o
 system.c: In function `Rf_initialize_R':
 system.c:144: parse error before `char'
 system.c:216: `localedir' undeclared (first use in this function)
 system.c:216: (Each undeclared identifier is reported only once
 system.c:216: for each function it appears in.)
 *** Error code 1
 make: Fatal error: Command failed for target `system.o'
 Current working directory /home/dpowers/R-2.5.1/src/unix
 *** Error code 1
 make: Fatal error: Command failed for target `R'
 Current working directory /home/dpowers/R-2.5.1/src/unix
 *** Error code 1
 make: Fatal error: Command failed for target `R'
 Current working directory /home/dpowers/R-2.5.1/src
 *** Error code 1
 make: Fatal error: Command failed for target `R'


 I have tried setting localedir directly in configure options, but get the
 same error.

 Any ideas?

   
Hmm, which version of gcc is this? The problem seems to be around line 
144 which reads

140 Rstart Rp = rstart;
141 cmdlines[0] = '\0';
142 
143 #ifdef ENABLE_NLS
144 char localedir[PATH_MAX+20];
145 #endif
146 
 147 #if defined(HAVE_SYS_RESOURCE_H) && defined(HAVE_GETRLIMIT)
148 {
149 struct rlimit rlim;


I seem to remember that it used to be non-kosher to mix declarations 
and ordinary code like that, but the current compiler doesn't seem to 
care (I do have #define ENABLE_NLS 1 in Rconfig.h, as I assume you do 
too). Could you perhaps try moving line 141 down below #endif?



 Thanks,
 Dan
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Daniel A. Powers, Ph.D.
 Department of Sociology
 University of Texas at Austin
 1 University Station A1700
 Austin, TX  78712-0118
 phone: 512-232-6335
 fax:   512-471-1748
 [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Repeated Measure different results to spss

2007-07-10 Thread Peter Dalgaard
mb2 wrote:
 Hi, 

 I have some problems with my repeated measures analysis. When I compute it
 with SPSS I get different results than with R. Probably I am doing something
 wrong in R. 
 I have two groups (1,2) both having to solve a task under two conditions
 (1,2). That is one between subject factor (group) and one within subject
 factor (task). I tried the following:
  
  aov(Score ~ factor(Group) * factor(Task) + Error(Id))
  aov(Score ~ factor(Group) * factor(Task))
 but it leads to different results than SPSS. I definitely miss some point
 here.

   
Did you mean Error(factor(Id)) ?

With that modification, things look sane. Can't vouch for SPSS...

(As a general matter, I prefer to do the factor conversions up front,
rather than inside model formulas.)
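A self-contained sketch of that advice, with made-up data in the structure described (Id, Group, Task and Score are the names from the post; the numbers are simulated):

```r
## Simulated data: 2 groups (between subjects), 2 tasks (within),
## ten subjects each measured under both tasks.
set.seed(1)
dd <- expand.grid(Id = 1:10, Task = 1:2)
dd$Group <- ifelse(dd$Id <= 5, 1, 2)
dd$Score <- rnorm(nrow(dd), mean = 10 + 2 * dd$Task)

## Convert to factors up front, then fit with a per-subject error stratum
dd$Id    <- factor(dd$Id)
dd$Group <- factor(dd$Group)
dd$Task  <- factor(dd$Task)

fit <- aov(Score ~ Group * Task + Error(Id), data = dd)
summary(fit)   # Group tested in the Id stratum; Task, Group:Task within
```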


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] type III ANOVA for a nested linear model

2007-07-10 Thread Peter Dalgaard
Carsten Jaeger wrote:
 Hello,

 is it possible to obtain type III sums of squares for a nested model as
 in the following:

 lmod <- lm(resp ~ A * B + (C %in% A), mydata)

 I have tried

 library(car)
 Anova(lmod, type="III")

 but this gives me an error (and I also understand from the documentation
 of Anova as well as from a previous request
 (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/64477.html) that it is
 not possible to specify nested models with car's Anova).

 anova(lmod) works, of course.

 My data (given below) is balanced so I expect the results to be similar
 for both type I and type III sums of squares. But are they *exactly* the
 same? The editor of the journal which I'm sending my manuscript to
 requests what he calls conventional type III tests and I'm not sure if I
 can convince him to accept my type I analysis.

In balanced designs, type I-IV SSDs are all identical. However, I don't think 
the model does what I think you think it does. 

Notice that nesting is used with two different meanings; in R it would be that 
the coding of C only makes sense within levels of A - e.g. if they were 
numbered 1:3 within each group, but with C==1 when A==1 having nothing to do 
with C==1 when A==2.  SAS does something, er, else...

What I think you want is a model where C is a random term so that main effects 
of A can be tested, as in

 summary(aov(resp ~ A * B + Error(C), dd))

Error: C
          Df  Sum Sq Mean Sq F value Pr(>F)
A          2  33.123  16.562  0.4981 0.6308
Residuals  6 199.501  33.250

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)
B          1 915.21  915.21 83.7846 9.57e-05 ***
A:B        2  16.13    8.07  0.7384   0.5168
Residuals  6  65.54   10.92
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


(This is essentially the same structure as Martin Bleichner had earlier today, 
also @web.de. What is this? an epidemic? ;-))


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Building R on Interix 6.0

2007-07-10 Thread Peter Dalgaard
 checking the maximum length of command line arguments... 262144
 checking command to parse /bin/nm -B output from gcc object... ok
 checking for objdir... .libs
 checking for ranlib... (cached) ranlib
 checking for strip... strip
 checking if gcc static flag  works... yes
 checking if gcc supports -fno-rtti -fno-exceptions... no
 checking for gcc option to produce PIC... -fPIC
 checking if gcc PIC flag -fPIC works... yes
 checking if gcc supports -c -o file.o... yes
 checking whether the gcc linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld)
 supports shared libraries... no
 checking dynamic linker characteristics... no
 checking how to hardcode library paths into programs... immediate
 checking whether stripping libraries is possible... yes
 checking if libtool supports shared libraries... no
 checking whether to build shared libraries... no
 checking whether to build static libraries... yes
 configure: creating libtool
 appending configuration tag CXX to libtool
 checking for ld used by g++... /opt/gcc.3.3/i586-pc-interix3/bin/ld
 checking if the linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld) is GNU
 ld... yes
 checking whether the g++ linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld)
 supports shared libraries... no
 sed: 1: s/\*/\\\*/g: invalid command code 
 checking for g++ option to produce PIC... -fPIC
 checking if g++ PIC flag -fPIC works... yes
 checking if g++ supports -c -o file.o... yes
 checking whether the g++ linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld)
 supports shared libraries... no
 checking dynamic linker characteristics... no
 checking how to hardcode library paths into programs... immediate
 checking whether stripping libraries is possible... yes
 appending configuration tag F77 to libtool
 checking if libtool supports shared libraries... no
 checking whether to build shared libraries... no
 checking whether to build static libraries... yes
 checking for g77 option to produce PIC... -fPIC
 checking if g77 PIC flag -fPIC works... yes
 checking if g77 supports -c -o file.o... yes
 checking whether the g77 linker (/opt/gcc.3.3/i586-pc-interix3/bin/ld)
 supports shared libraries... no
 checking dynamic linker characteristics... no
 checking how to hardcode library paths into programs... immediate
 checking whether stripping libraries is possible... yes
 ./configure: : bad substitution

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using the function unique(), but asking it to ignore a column of a data.frame

2007-07-09 Thread Peter Dalgaard
Andrew Yee wrote:
 Thanks.  But in this specific case, I would like the output to include
 all three columns, including the ignored column (in this case, I'd
 like it to ignore column a).
   
df[!duplicated(df[, c("a", "c")]), ]

or perhaps

df[!duplicated(df[-2]),]
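A runnable version with the sample data from the question (quotes restored); which columns to pass to duplicated() depends on which column is ignored — here b and c are compared, i.e. column a is the one left out of the comparison:

```r
a <- c(1, 1, 5)
b <- c(3, 2, 3)
c <- c(5, 1, 5)   # reusing 'c' as a name works, but is best avoided
df <- data.frame(a = a, b = b, c = c)

## Rows 1 and 3 agree on (b, c), so only row 1 of that pair is kept,
## and all three columns survive in the result:
kept <- df[!duplicated(df[, c("b", "c")]), ]
kept
#   a b c
# 1 1 3 5
# 2 1 2 1
```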
 Thanks,
 Andrew

 On 7/9/07, hadley wickham [EMAIL PROTECTED] wrote:
   
 On 7/9/07, Andrew Yee [EMAIL PROTECTED] wrote:
 
 Take for example the following data.frame:

 a <- c(1, 1, 5)
 b <- c(3, 2, 3)
 c <- c(5, 1, 5)
 sample.data.frame <- data.frame(a = a, b = b, c = c)

 I'd like to be able to use unique(sample.data.frame), but have
 unique() ignore column a when determining the unique elements.

 However, I figured that this would be setting for incomparables=, but
 it appears that this functionality hasn't been incorporated.  Is
 there a work around for this, i.e. to be able to get unique to only
 look at selected columns of a data frame?
   
 unique(df[, c("a", "c")]) ?

 Hadley

 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA: Does a Between-Subjects Factor belong in the Error Term?

2007-07-09 Thread Peter Dalgaard
Alex Baugh wrote:
 I am executing a Repeated Measures Analysis of Variance with 1 DV (LOCOMOTOR
 RESPONSE),  2 Within-Subjects Factors (AGE, ACOUSTIC CONDITION), and 1
 Between-Subjects Factor (SEX).

 Does anyone know whether the between-subjects factor (SEX) belongs in the
 Error Term of the aov or not? And if it does belong, where in the Error Term
 does it go? The 3 possible scenarios are listed below:



 e.g.,

 1. Omit Sex from the Error Term:

   
 My.aov = aov(Locomotor.Response~(Age*AcousticCond*Sex) + Error
 
 (Subject/(Timepoint*Acx.Cond)), data=locomotor.tab)

   note: Placing SEX outside the double parentheses of the Error Term has the
 same statistical outcome effect as omitting it all together from the Error
 Term (as shown above in #1).



 2.  Include SEX inside the Error Term (inside Double parentheses):

   
 My.aov = aov(Locomotor.Response~(Age*AcousticCond*Sex) + Error
 
 (Subject/(Timepoint*Acx.Cond+Sex)), data=locomotor.tab)



 3.  Include SEX inside the Error Term (inside Single parentheses):


   
 My.aov = aov(Locomotor.Response~(Age*AcousticCond*Sex) + Error
 
 (Subject/(Timepoint*Acx.Cond)+Sex), data=locomotor.tab)

 note: Placing SEX inside the single parentheses (as shown above in #3)
 generates no main effect of Sex. Thus, I'm fairly confident that option #3
 is incorrect.



 Scenarios 1,2, and 3 yield different results in the aov summary.

   
You don't generally want terms with systematic effects to appear as 
error terms also, so 3 is wrong.

In 2 you basically have a random effect of sex within subject, which is 
nonsensical since the subjects presumably have only one sex each. This 
presumably generates an error stratum with 0 DF, which may well be harmless.

That leaves 1 as the likely solution.

You'll probably do yourself a favour if you learn to expand error terms, 
a/b == a + a:b, etc.; that's considerably more constructive than trying 
to think in terms of whether things are inside or outside parentheses.
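The expansion can be verified directly from R's formula machinery (Subject, Timepoint and Acx.Cond are the names used above):

```r
## a/b expands to a + a:b
tl <- attr(terms(y ~ a/b), "term.labels")
tl
# [1] "a"   "a:b"

## Subject/(Timepoint*Acx.Cond) expands to Subject plus its
## interactions with each within-subject term (4 terms in all)
tl2 <- attr(terms(y ~ Subject/(Timepoint * Acx.Cond)), "term.labels")
length(tl2)
# [1] 4
```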


 Thanks for your help!

 Alex







__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does the r-distribution function work

2007-07-06 Thread Peter Dalgaard
pieter claassen wrote:
 I am trying to understand what the rbinom function does.

 Here is some sample code. Are both invocations of bfunc effectively
 doing the same thing, or am I missing the point?

   
There are some newbie issues with your code (you are extending a on 
every iteration, and your bfunc is just rbinom with the parameters in a 
different order), but basically, yes: They are conceptually the same. 
Both give 1 independent binomial samples.

In fact, if you reset the random number generator in between, they also 
give the same results (this is an implementation issue and not obviously 
guaranteed for any distribution) . Here's an example with smaller values 
than 1 and 30.

  set.seed(123)
  rbinom(10,1,.5)
 [1] 0 1 0 1 1 0 1 1 1 0

  set.seed(123)
  for (i in 1:10) print(rbinom(1,1,.5))
[1] 0
[1] 1
[1] 0
[1] 1
[1] 1
[1] 0
[1] 1
[1] 1
[1] 1
[1] 0

  set.seed(123)
  replicate(10, rbinom(1,1,.5))
 [1] 0 1 0 1 1 0 1 1 1 0


 Thanks,
 Pieter

 bfunc - function(n1,p1,sims) {
 c <- rbinom(sims, n1, p1)
 c
 }

 a=c()
 b=c()
 p1=.5
 for (i in 1:1){
 a[i]=bfunc(30,p1,1)
 }
 b=bfunc(30,p1,1)

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] t.test

2007-07-06 Thread Peter Dalgaard
matthew wrote:
 Hi, how can I solve a problem without the function t.test???

 for example:
 x <- c(1, 3, 5, 7)
 y <- c(2, 4, 6)
 t.test(x, y, alternative = "less", paired = FALSE, var.equal = TRUE, conf.level = 0.95)


   
Homework?

Hints: Take out your statistics textbook and look up the formulas for
the two-sample t.
You'll probably (there can be some variation depending on the book) find
that you need to compute

- difference of means
- sd for each group
- pooled sd
- s.e. of differences of means

all of which you can do easily in R, once you have the formulas. Then
calculate the t statistic and the corresponding p value, either using a
table or R's function for the t distribution.
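For the record, the by-hand computation for the example data above looks like this (standard pooled-variance formulas; base R only — with these particular numbers the means coincide, so t = 0 and the one-sided p-value is 0.5):

```r
x <- c(1, 3, 5, 7)
y <- c(2, 4, 6)
nx <- length(x); ny <- length(y)

md  <- mean(x) - mean(y)                              # difference of means
sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) /
       (nx + ny - 2)                                  # pooled variance
se  <- sqrt(sp2 * (1/nx + 1/ny))                      # s.e. of the difference
tstat <- md / se
pval  <- pt(tstat, df = nx + ny - 2)                  # one-sided "less"

## Agrees with the built-in test:
ref <- t.test(x, y, alternative = "less", var.equal = TRUE)
stopifnot(all.equal(unname(ref$statistic), tstat),
          all.equal(ref$p.value, pval))
```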

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Warning message: cannot create HTML package index

2007-07-06 Thread Peter Dalgaard
Leo wrote:
 On 06/07/2007, Prof Brian Ripley wrote:
   
 On Fri, 6 Jul 2007, Leo wrote:

 
 I have set R_LIBS=~/R_lib as I don't have root access.

 The following message shows up every time after installing a package:

  ..
  The downloaded packages are in
 /tmp/RtmpBoIPoz/downloaded_packages
  Warning message:
  cannot create HTML package index in: tools:::unix.packages.html(.Library)

 Any ideas?
   
 It is a correct warning.  What is the problem with being warned?

 R tries to maintain an HTML page of installed packages, but you don't have 
 permission to update it.
 

 Where is that HTML page located on a GNU/Linux system?

 Is it possible to maintain a user HTML page of installed packages?

 Thanks,
   
This confuses me a bit too. I had gotten used to the warning without
thinking about it. It tries to update $RHOME/doc/html/packages.html,
which starts like this:

.
<p><h3>Packages in the standard library</h3>
.

However, if I run help.start, I get

 help.start()
Making links in per-session dir ...
If 'firefox' is already running, it is *not* restarted, and you must
switch to its window.
Otherwise, be patient ...

and then it opens (say)
  file:///tmp/RtmpXyp5Cg/.R/doc/html/index.html
which has a link to
  file:///tmp/RtmpXyp5Cg/.R/doc/html/packages.html

which looks like this


<p><h3>Packages in /home/bs/pd/Rlibrary</h3>

<p><h3>Packages in /usr/lib64/R/library</h3>

I.e. it is autogenerated by help.start and doesn't even look at the file
in $RHOME. So what puzzles me is

(a) why we maintain $RHOME/doc/html/packages.html at all

One argument could be that this is browseable for everyone on a system,
even without starting R. But then

(b) why do we even try updating it when packages are installed in a
private location?

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Me again, about the horrible documentation of tcltk

2007-07-05 Thread Peter Dalgaard
Alberto Monteiro wrote:
 How on Earth can I know what are the arguments of any of the functions of 
 the tcl/tk package? I tried hard to find, using all search engines 
 available, looking deep into keywords of R, python's tkinter and tcl/tk, but 
 nowhere did I find anything remotely resembling help.

 For example, what are the possible arguments to tkgetOpenFile?

 I know that this works:

 library(tcltk)
 filename <- tclvalue(tkgetOpenFile(
   filetypes = "{{Porn Files} {.jpg}} {{All files} {*}}"))
 if (filename != "") cat("Selected file:", filename, "\n")

 but, besides filetypes, what are the other arguments to
 tkgetOpenFile? I would like to force the files to be sorted by
 time, with most recent files coming first (and no, the purpose is
 not to use for porn files).

   
man n tk_getOpenFile

or if you are not on Unix/Linux, find it online with Google
 Alberto Monteiro

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lookups in R

2007-07-04 Thread Peter Dalgaard
mfrumin wrote:
 Hey all; I'm a beginner++ user of R, trying to use it to do some processing
 of data sets of over 1M rows, and running into a snafu.  Imagine that my
 input is a huge table of transactions, each linked to a specific user id.  As
 I run through the transactions, I need to update a separate table for the
 users, but I am finding that the traditional ways of doing a table lookup
 are way too slow to support this kind of operation.

 i.e:

 for(i in 1:100) {
userid = transactions$userid[i];
amt = transactions$amounts[i];
users[users$id == userid,'amt'] = users[users$id == userid,'amt'] + amt;
 }

 I assume this is a linear lookup through the users table (in which there are
 10's of thousands of rows), when really what I need is O(constant time), or
 at worst O(log(# users)).

 is there any way to manage a list of ID's (be they numeric, string, etc) and
 have them efficiently mapped to some other table index?

 I see the CRAN package for SQLite hashes, but that seems to be going a bit
 too far.
   
Sometimes you need a bit of lateral thinking. I suspect that you could 
do it like this:

tbl <- with(transactions, tapply(amount, userid, sum))
users$amt - users$amt + tbl[users$id]

one catch is that there could be users with no transactions, in which 
case you may need to replace userid by factor(userid, levels=users$id). 
None of this is tested, of course.
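Here is the idea in runnable form on a toy version of the problem (user ids and amounts made up; the factor() trick handles users without transactions):

```r
## Toy user and transaction tables
users <- data.frame(id = c("u1", "u2", "u3"), amt = c(0, 0, 0),
                    stringsAsFactors = FALSE)
transactions <- data.frame(userid = c("u1", "u3", "u1"),
                           amount = c(10, 5, 2.5),
                           stringsAsFactors = FALSE)

## One vectorized pass: sum per user, with factor() supplying a slot
## for users that have no transactions (their sum comes out NA)
tbl <- with(transactions,
            tapply(amount, factor(userid, levels = users$id), sum))
tbl[is.na(tbl)] <- 0

users$amt <- users$amt + unname(tbl[users$id])
users
#   id  amt
# 1 u1 12.5
# 2 u2  0.0
# 3 u3  5.0
```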

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lookups in R

2007-07-04 Thread Peter Dalgaard
Michael Frumin wrote:
 I wish it were that simple.  Unfortunately the logic I have to do on 
 each transaction is substantially more complicated, and involves 
 referencing the existing values of the user table through a number of 
 conditions.

 Any other thoughts on how to get better-than-linear lookup time?  
 Is there a recommended binary search/sort (i.e. B-tree) module that 
 I could use to maintain my own index?
   
The point remains: To do anything efficient in R, you need to get rid of 
that for loop and use something vectorized. Notice that you can expand 
values from the user table into the transaction table by indexing with 
transactions$userid, or you can use a merge operation.

 thanks,
 mike

 Peter Dalgaard wrote:
   
 mfrumin wrote:
 
 Hey all; I'm a beginner++ user of R, trying to use it to do some 
 processing
 of data sets of over 1M rows, and running into a snafu.  Imagine that my
 input is a huge table of transactions, each linked to a specific user 
 id.  As
 I run through the transactions, I need to update a separate table for 
 the
 users, but I am finding that the traditional ways of doing a table 
 lookup
 are way too slow to support this kind of operation.

 i.e:

 for(i in 1:100) {
userid = transactions$userid[i];
amt = transactions$amounts[i];
users[users$id == userid,'amt'] = users[users$id == userid,'amt'] + amt;
 }

 I assume this is a linear lookup through the users table (in which 
 there are
 10's of thousands of rows), when really what I need is O(constant 
 time), or
 at worst O(log(# users)).

 is there any way to manage a list of ID's (be they numeric, string, 
 etc) and
 have them efficiently mapped to some other table index?

 I see the CRAN package for SQLite hashes, but that seems to be going 
 a bit
 too far.
   
   
 Sometimes you need a bit of lateral thinking. I suspect that you could 
 do it like this:

 tbl <- with(transactions, tapply(amount, userid, sum))
 users$amt - users$amt + tbl[users$id]

 one catch is that there could be users with no transactions, in which 
 case you may need to replace userid by factor(userid, 
 levels=users$id). None of this is tested, of course.
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] focus to tkwindow after a PDF window pop up

2007-07-02 Thread Peter Dalgaard
Hao Liu wrote:
 Dear All:

 I currently have a Tk window start an acroread window. However, when the 
 acroread window is open, I can't get back to the Tk window unless I 
 close acroread.

 I invoked the acroread window using: system(paste("acroread ", file, sep=""))

 anything I can do to make them both available to users?
   
Tell system() not to _wait_ for command to complete.
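In code, that is the wait argument (a sketch; acroread and the file path are from the original post, and whether the viewer actually starts depends on the system):

```r
file <- "report.pdf"  # hypothetical path, stands in for the poster's file

## wait = FALSE makes system() return immediately instead of blocking
## until the viewer exits, so the Tk window stays responsive
system(paste("acroread", shQuote(file)), wait = FALSE)
```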

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] compute time span in months between two dates

2007-07-02 Thread Peter Dalgaard
Aydemir, Zava (FID) wrote:
 Hi,
  
 I am just starting to play with R. What is the recommended manner for
 calculating time spans between 2 dates? In particular, should I be using
 the chron or the date package (so far I just found how to calculate
 a timespan in terms of days)?
  
 Thanks
   
I'd recommend something along these lines:

d1 <- "11/03-1959"
d2 <- "2/7-2007"
f <- "%d/%m-%Y"
as.numeric(as.Date(d2, f) - as.Date(d1, f), units = "days")

(The format in f needs to be adjusted to the actual format, of course. 
For some formats, it can be omitted altogether).
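Since the original question asked about months: difftime has no "months" unit (months vary in length), but a whole-calendar-month count can be sketched via POSIXlt components (same example dates as above; this ignores the day-of-month):

```r
f  <- "%d/%m-%Y"
d1 <- as.Date("11/03-1959", f)
d2 <- as.Date("2/7-2007", f)

## Day count, as in the reply:
days <- as.numeric(d2 - d1, units = "days")

## Calendar months elapsed (POSIXlt: $year is years since 1900,
## $mon is 0-based month), day-of-month ignored:
p1 <- as.POSIXlt(d1)
p2 <- as.POSIXlt(d2)
months <- 12 * (p2$year - p1$year) + (p2$mon - p1$mon)
months
# [1] 580
```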

  
 Zava
 

 This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] exaustive subgrouping or combination

2007-06-29 Thread Peter Dalgaard
David Duffy wrote:
 Waverley [EMAIL PROTECTED] asked:

 Dear Colleagues,

 I am looking for a package or previously implemented R code to subgroup and
 exhaustively divide a vector of sequence values into 2 groups.

 -- 
 Waverley @ Palo Alto
 

 Google "[R] Generating all possible partitions" and you will find some R code
 from 2002 or so.

   
In 2002 this wasn't yet in R. These days, help(combn) is more to the 
point:

 mn <- sort(zapsmall(combn(sleep$extra, 10, mean)))
 plot(unique(mn),table(mn))
 abline(v=mean(sleep$extra[1:10]))

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison: glm() vs. bigglm()

2007-06-29 Thread Peter Dalgaard
Benilton Carvalho wrote:
 Hi,

 Until now, I thought that the results of glm() and bigglm() would  
 coincide. Probably a naive assumption?

 Anyways, I've been using bigglm() on some datasets I have available.  
 One of the sets has 15M observations.

 I have 3 continuous predictors (A, B, C) and a binary outcome (Y).  
 And tried the following:

 m1 - bigglm(Y~A+B+C, family=binomial(), data=dataset1, chunksize=10e6)
 m2 - bigglm(Y~A*B+C, family=binomial(), data=dataset1, chunksize=10e6)
 imp - m1$deviance-m2$deviance

 To my surprise, imp was negative.

 I then tried the same models, using glm() instead... and as I  
 expected, imp was positive.

 I also noticed differences on the coefficients estimated by glm() and  
 bigglm() - small differences, though, and CIs for the coefficients (a  
 given coefficient compared across methods) overlap.

 Are such incongruities expected? What can I use to check for  
 convergence with bigglm(), as this might be one plausible cause for a  
 negative difference on the deviances?
   
It doesn't sound right, but I cannot reproduce your problem on a similar
sized problem (it pretty much killed my machine...). Some observations:

A: You do realize that you are only using 1.5 chunks? (15M vs. 10e6
chunksize)

B: Deviance changes are O(1) under the null hypothesis but the deviances
themselves are O(N). In a smaller variant (N=1e5), I got

 m1$deviance
[1] 138626.4
 m2$deviance
[1] 138626.4
 m2$deviance - m1$deviance
[1] -0.05865785

This does leave some scope for roundoff to creep in. You may want to
play with a lower setting of tol=...

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] logistic regression and dummy variable coding

2007-06-29 Thread Peter Dalgaard
Li, Bingshan wrote:
 Hi Frank,
  
 I do not quite get you. What do you mean by simulation and speed issues? I do 
 not see why they have to be considered in logistic regression.
  
   
Exactly. So don't use techniques that are only needed when such issues 
do have to be considered.

 Thanks.
  
 Bingshan

 

 From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED]
 Sent: Fri 6/29/2007 7:40 AM
 To: Li, Bingshan
 Cc: Seyed Reza Jafarzadeh; r-help@stat.math.ethz.ch
 Subject: Re: [R] logistic regression and dummy variable coding



 Bingshan Li wrote:
   
 Hi All,

 Now it works. Thanks for all your answers and the explanations are 
 very clear.

 Bingshan
 

 But note that you are not using R correctly unless you are doing a
 simulation and have some special speed issues.  Let the model functions
 do all this for you.

 Frank

   
 On Jun 28, 2007, at 7:44 PM, Seyed Reza Jafarzadeh wrote:

 
 NewVar <- relevel(factor(OldVar), ref = "b")
 should create a dummy variable, and change the reference category 
 for the model.

 Reza


 On 6/28/07, Bingshan Li [EMAIL PROTECTED] wrote:
   
 Hello everyone,

 I have a variable with several categories and I want to convert this
 into dummy variables and do logistic regression on it. I used
 model.matrix to create dummy variables but it always picked the
 smallest one as the reference. For example,

 model.matrix(~.,data=as.data.frame(letters[1:5]))

 will code 'a' as '0 0 0 0'. But I want to code another category as
 reference, say 'b'. How to do it in R using model.matrix? Is there
 other way to do it if model.matrix  has no such functionality?

 Thanks!



 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 


 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University



   [[alternative HTML version deleted]]





Re: [R] Dominant eigenvector displayed as third (Marco Visser)

2007-06-29 Thread Peter Dalgaard
Marco Visser wrote:
 Dear R users  Experts,

 This is just a curiosity: I was wondering why the dominant eigenvector and
 eigenvalue of the following matrix are given as the third. I guess this could
 complicate automatic selection procedures.

  0 0 0 0 0 5
  1 0 0 0 0 0
  0 1 0 0 0 0
  0 0 1 0 0 0
  0 0 0 1 0 0
  0 0 0 0 1 0

 Please copy  paste the following into R;

 a <- c(0,0,0,0,0,5,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0)
 mat <- matrix(a, ncol = 6, byrow = TRUE)
 eigen(mat)

 The matrix is a population matrix for a plant pathogen (Powell et al 2005).

 Basically I would really like to know why this happens so I will know if it 
 can occur 
 again. 

 Thanks for any comments,

 Marco Visser


 Comment: In Matlab the dominant eigenvector and eigenvalue
 of the described matrix are given as the sixth. Again, no idea why.
   


I get

  eigen(mat)$values
[1] -0.65383+1.132467i -0.65383-1.132467i  0.65383+1.132467i  0.65383-1.132467i
[5] -1.30766+0.000000i  1.30766+0.000000i
  Mod(eigen(mat)$values)
[1] 1.307660 1.307660 1.307660 1.307660 1.307660 1.307660

So all the eigenvalues are equal in modulus. What makes you think one of 
them is dominant?
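This can be verified numerically: the matrix is a companion-type matrix with characteristic polynomial lambda^6 = 5, so all six eigenvalues are sixth roots of 5 and share the modulus 5^(1/6):

```r
a <- c(0,0,0,0,0,5, 1,0,0,0,0,0, 0,1,0,0,0,0,
       0,0,1,0,0,0, 0,0,0,1,0,0, 0,0,0,0,1,0)
mat <- matrix(a, ncol = 6, byrow = TRUE)
# every eigenvalue is a sixth root of 5, so the moduli are all 5^(1/6)
mods <- Mod(eigen(mat)$values)
all(abs(mods - 5^(1/6)) < 1e-8)   # TRUE: no single dominant eigenvalue
```

The ordering of equal-modulus eigenvalues is then an arbitrary artifact of the underlying LAPACK routine, which is why R and Matlab disagree on the position.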



Re: [R] exaustive subgrouping or combination

2007-06-29 Thread Peter Dalgaard
David Duffy wrote:
 On Fri, 29 Jun 2007, Peter Dalgaard wrote:

   
 David Duffy wrote:
 
 Waverley [EMAIL PROTECTED] asked:

 Dear Colleagues,

 I am looking for a package or previously implemented R code to subgroup and
 exhaustively divide a vector (sequence) into 2 groups.

 -- 
 Waverley @ Palo Alto

 
 Google "[R] Generating all possible partitions" and you will find some R
 code from 2002 or so.


   
 In 2002 this wasn't already in R. These days, help(combn) is more to the 
 point:

 mn <- sort(zapsmall(combn(sleep$extra, 10, mean)))
 plot(unique(mn), table(mn))
 abline(v = mean(sleep$extra[1:10]))

 

 As I read it, the original query is about partitioning the set eg
 ((1 2) 3) ((1 3) 2) (1 (2 3)).

   
Yes, and

  combn(3,2)
     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    3

gives you the first group of each of the three partitions.
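A sketch of how those columns translate into the full partitions (each column is the first group; the complement is the second):

```r
x <- 1:3
# each column of combn(3, 2) is the first group of a 2-vs-1 partition
firsts <- combn(x, 2, simplify = FALSE)
partitions <- lapply(firsts, function(g) list(g, setdiff(x, g)))
length(partitions)   # 3, matching ((1 2) 3), ((1 3) 2), ((2 3) 1)
```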



[R] R 2.5.1 is released

2007-06-28 Thread Peter Dalgaard
I've rolled up R-2.5.1.tar.gz a short while ago. This is a maintenance
release and fixes a number of mostly minor bugs and platform issues. See 
the full list of changes below.

You can get it (in a short while) from

http://cran.r-project.org/src/base/R-2/R-2.5.1.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you. Binaries
for various platforms will appear in due course.
 
For the R Core Team

Peter Dalgaard

These are the md5sums for the freshly created files, in case you wish
to check that they are uncorrupted:

a8efde35b940278de19730d326f58449  AUTHORS
eb723b61539feef013de476e68b5c50a  COPYING
a6f89e2100d9b6cdffcea4f398e37343  COPYING.LIB
24ad9647e525609bce11f6f6ff9eac2d  FAQ
70447ae7f2c35233d3065b004aa4f331  INSTALL
f04bdfaf8b021d046b8040c8d21dad41  NEWS
88bbd6781faedc788a1cbd434194480c  ONEWS
4f004de59e24a52d0f500063b4603bcb  OONEWS
162f6d5a1bd7c60fd652145e050f3f3c  R-2.5.1.tar.gz
162f6d5a1bd7c60fd652145e050f3f3c  R-latest.tar.gz
433182754c05c2cf7a04ad0da474a1d0  README
020479f381d5f9038dcb18708997f5da  RESOURCES
4eaf8a3e428694523edc16feb0140206  THANKS


Here is the relevant bit of the NEWS file:

CHANGES IN R VERSION 2.5.1

NEW FEATURES

o   density(1:20, bw = "SJ") now works, as bw.SJ() now tries a larger
search interval than the default (lower, upper) if it does not
find a solution within the latter.

o   The output of library() (no arguments) is now sorted by library
trees in the order of .libPaths() and not alphabetically.

o   R_LIBS_USER and R_LIBS_SITE feature possible expansion of
specifiers for R version specific information as part of the
startup process.

o   C-level warning calls now print a more informative context,
as C-level errors have for a while.

o   There is a new option rl_word_breaks to control the way the
input line is tokenized in the readline-based terminal
interface for object- and file-name completion.
This allows it to be tuned for people who use their space bar
vs those who do not.  The default now allows filename-completion
with +-* in the filenames.

o   If the srcfile argument to parse() is not NULL, it will be added
to the result as a srcfile attribute.

o   It is no longer possible to interrupt lazy-loading (which was
only at all likely when lazy-loading environments), which
would leave the object being loaded in an unusable state.
This is a temporary measure: error-recovery when evaluating
promises will be tackled more comprehensively in 2.6.0.

INSTALLATION

o   'make check' will work with --without-iconv, to accommodate
building on AIX where the system iconv conflicts with
libiconv and is not compatible with R's requirements.

o   There is support for 'DESTDIR': see the R-admin manual.

o   The texinfo manuals are now converted to HTML with a style
sheet: in recent versions of makeinfo the markup such as @file
was being lost in the HTML rendering.

o   The use of inlining has been tweaked to avoid warnings from
gcc >= 4.2.0 when compiling in C99 mode (which is the default
from configure).

BUG FIXES

o   as.dendrogram() failed on objects of class dendrogram.

o   plot(type = "s") (or "S") with many (hundreds of thousands)
of points could overflow the stack.  (PR#9629)

o   Coercing an S4 classed object to matrix (or other basic class)
failed to unset the S4 bit.

o   The 'useS4' argument of print.default() had been broken by an
unrelated change prior to 2.4.1.  This allowed print() and
show() to bounce badly constructed S4 objects between
themselves indefinitely.

o   Prediction of the seasonal component in HoltWinters() was one
step out at one point in the calculations.

decompose() incorrectly computed the 'random' component for a
multiplicative fit.

o   Wildcards work again in unlink() on Unix-alikes (they did not
in 2.5.0).

o   When qr() used pivoting, the coefficient names in qr.coef() were
not pivoted to match.  (PR#9623)

o   UseMethod() could crash R if the first argument was not a
character string.

o   R and Rscript on Unix-alikes were not accepting spaces in -e
arguments (even if quoted).

o   Hexadecimal integer constants (e.g. 0x10L) were not being parsed
correctly on platforms where the C function atof did not
accept hexadecimal prefixes (as required by C99, but not
implemented in MinGW as used by R on Windows).  (PR#9648)

o   libRlapack.dylib on Mac OS X had no version information and
sometimes an invalid identification name.

o   Rd conversion of \usage treated '\\' as a single backslash in
all but latex: it now acts consistently with the other
verbatim-like environments (it was never 'verbatim

Re: [R] : regular expressions: escaping a dot

2007-06-28 Thread Peter Dalgaard
Prof Brian Ripley wrote:


 This is explained in ?regexp (in the See Also of ?regexpr):

   Patterns are described here as they would be printed by 'cat': _do
   remember that backslashes need to be doubled when entering R
   character strings from the keyboard_.

 and in the R FAQ and 

   
Hmm, that's not actually correct, is it? Perhaps this is better:

...entering R character string literals (i.e., between quote symbols).

The counterexample would be

 > readLines()
\abc
[1] "\\abc"

(of course it is more important to get people to read the documentation
at all...)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] aov and lme differ with interaction in oats example of MASS?

2007-06-28 Thread Peter Dalgaard
Karl Knoblick wrote:
 Dear R-Community!

 The example oats in MASS (2nd edition, 10.3, p. 309) is calculated for aov
 and lme without an interaction term, and the results are the same.
 But I have problems reproducing the aov example with interaction in MASS
 (10.2, p. 301) with lme. Here is the script:

 library(MASS)
 library(nlme)
 options(contrasts = c("contr.treatment", "contr.poly"))
 # aov: Y ~ N + V
 oats.aov <- aov(Y ~ N + V + Error(B/V), data = oats, qr = T)
 summary(oats.aov)
 # now lme
 oats.lme <- lme(Y ~ N + V, random = ~1 | B/V, data = oats)
 anova(oats.lme, type = "m") # Ok!
 # aov: Y ~ N * V + Error(B/V)
 oats.aov2 <- aov(Y ~ N * V + Error(B/V), data = oats, qr = T)
 summary(oats.aov2)
 # now lme - my trial!
 oats.lme2 <- lme(Y ~ N * V, random = ~1 | B/V, data = oats)
 anova(oats.lme2, type = "m")
 # differences!!! (except of interaction term)

 My questions:
 1) Is there a possibility to reproduce the result of aov with interaction 
 using lme?
  2) If not, which result of the above is the correct one for the oats 
 example? 
   

The issue is that you are using marginal tests, which do strange things
when contrasts are not coded appropriately, and treatment contrasts in
particular are not. Switch to e.g. contr.helmert and the results become
similar. Marginal tests of main effects in the presence of an interaction
are not necessarily a good idea, and they have been debated here and
elsewhere a number of times before. People don't agree entirely, but the
dividing line is essentially whether it is uniformly or just mostly a
bad idea. It is essentially the discussion of type III SS.

 fortune("type III")

Some of us feel that type III sum of squares and so-called ls-means are
statistical nonsense which should have been left in SAS.
   -- Brian D. Ripley
  s-news (May 1999)


 Thanks a lot!
 Karl



   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Repeat if

2007-06-28 Thread Peter Dalgaard
Birgit Lemcke wrote:
 Thanks that was really a quick answer.

 It works but I get this warning message anyway:

 1: kein nicht-fehlendes Argument für min; gebe Inf zurück (no non-missing
 argument for min; returning Inf)
 2: kein nicht-fehlendes Argument für max; gebe -Inf zurück

 what does this mean?

   

Same as this

 max(c(NA, NA), na.rm=T)
[1] -Inf
Warning message:
no non-missing arguments to max; returning -Inf

which is related to the issues of empty sum(), prod(), any(), and all()
in that it allows a consistent concatenation rule:

max(c(x1,x2)) == max(max(x1), max(x2))
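The concatenation rule can be checked directly; here x2 is an empty vector, for which max() returns -Inf (with the warning quoted above), so the identity still holds:

```r
x1 <- c(1, 5)
x2 <- numeric(0)   # empty: max(x2) is -Inf, with a warning
suppressWarnings(
  identical(max(c(x1, x2)), max(max(x1), max(x2)))   # TRUE
)
```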

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] ANOVA non-sphericity test and corrections (eg, Greenhouse-Geisser)

2007-06-25 Thread Peter Dalgaard
DarrenWeber wrote:
 I'm an experimental psychologist and when I run ANOVA analysis in
 SPSS, I normally ask for a test of non-sphericity (Box's M-test).  I
 also ask for output of the corrections for non-sphericity, such as
 Greenhouse-Geisser and Huynh-Feldt.  These tests and correction factors
 are commonly used in the journals for experimental and other
 psychology reports.  I have been switching from SPSS to R for over a
 year now, but I realize now that I don't have the non-sphericity test
 and correction factors.
   
This can be done using anova.mlm() and mauchly.test(), which work on 
"mlm" objects, i.e., lm() output where the response is a matrix. There 
is no theory, to my knowledge, to support it for general aov() models, 
the catch being that you need to have a within-subject covariance matrix.



Re: [R] Source code for rlogis

2007-06-25 Thread Peter Dalgaard
Anup Nandialath wrote:
 Dear friends,

 I was trying to read the source code for rlogis but ran into a roadblock. It 
 shows

 [[1]]
 function (n, location = 0, scale = 1) 
 .Internal(rlogis(n, location, scale))
 <environment: namespace:stats>

 Is it possible to access the source code for the same.

   
Yes, but as it is .Internal, you have to look in the (C code) sources 
for R itself. You can access them either by getting the source files for 
R and unpacking them somewhere on your computer, or by browsing e.g. 
https://svn.R-project.org/R/tags/R-2-5-0 or 
https://svn.r-project.org/R/branches/R-2-5-branch. Specifically, 
src/nmath/rlogis.c.



Re: [R] FW: Suse RPM installation problem

2007-06-22 Thread Peter Dalgaard
Stephen Henderson wrote:
 Thanks for your help

 As you suggested I do indeed have a 64bit version called exactly the same

 PC5-140:/home/rmgzshd # rpm -qf /usr/lib/libpng12.so.0
 libpng-32bit-1.2.8-19.5
 PC5-140:/home/rmgzshd # rpm -qf /usr/lib64/libpng12.so.0
 libpng-1.2.8-19.5

 SO how do I tell rpm to find this and not the 32bit file? Or do I need to 
 edit something in the rpm file?

 Thanks

   
Odd... Do you actually _have_  /usr/lib64/libpng12.so.0 (whereis didn't
seem to find it) --- as opposed to rpm -qf telling you which package
contains the file? If not, try (re)installing libpng, possibly with
--force.


 -Original Message-
 From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
 Sent: Thu 6/21/2007 6:34 PM
 To: Stephen Henderson
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] FW: Suse RPM installation problem
  
 Stephen Henderson wrote:
   
 Hello 

 I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However
 I am failing a dependency for  libpng12.so.0 straight away



 PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
 error: Failed dependencies:
 libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

 I do seem to have this file

 PC5-140:/home/rmgzshd # whereis libpng12.so.0
 libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so 

 but presuming that it is not the 64bit version mentioned I went looking
 for a 64 bit version but could not find it through google.

 However, reading the Installation manual I noted that libpng is mentioned
 in the context of a source build. I therefore downloaded libpng-1.2.18
 (v1.2.8 or later is specified in the manual) and successfully compiled
 this. This did not however help with my problem.

 Any suggestions?

   
 
 I have

 viggo:~/rpm -qf /usr/lib/libpng12.so.0
 libpng-32bit-1.2.12-25
 viggo:~/rpm -qf /usr/lib64/libpng12.so.0
 libpng-1.2.12-25
 viggo:~/rpm -q R-base
 R-base-2.5.0-2.1


   
 Thanks
 Stephen Henderson
  

 **
 This email and any files transmitted with it are confidentia...{{dropped}}

   
 



 **
 This email and any files transmitted with it are confidentia...{{dropped}}

   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Replace number with month

2007-06-21 Thread Peter Dalgaard
Don MacQueen wrote:
 You can get the names using

month.name[MM]


 And it may be necessary to use

  factor(month.name[MM], levels=month.name[1:12])

 to get them to show up in the correct order in the barchart.
   

You're crossing the creek to fetch water there, and getting yourself
soaked in the process... (by an unnecessary conversion to character
which is subject to alphabetical sorting)

I think the canonical way is

factor(MM, levels=1:12, labels=month.name)

(and the levels=1:12 may not even be necessary when all 12 months are
present)
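A small sketch with hypothetical month numbers MM, showing that the resulting factor sorts in calendar order rather than alphabetically:

```r
MM <- c(3, 1, 12, 1)   # hypothetical month numbers
m <- factor(MM, levels = 1:12, labels = month.name)
levels(m)[1:3]         # "January" "February" "March" -- calendar order
table(m)["January"]    # 2
```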

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] FW: Suse RPM installation problem

2007-06-21 Thread Peter Dalgaard
Stephen Henderson wrote:
 Hello 

 I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However
 I am failing a dependency for  libpng12.so.0 straight away



 PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
 error: Failed dependencies:
 libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

 I do seem to have this file

 PC5-140:/home/rmgzshd # whereis libpng12.so.0
 libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so 

 but presuming that it is not the 64bit version mentioned I went looking
 for a 64 bit version but could not find it through google.

 However, reading the Installation manual I noted that libpng is mentioned
 in the context of a source build. I therefore downloaded libpng-1.2.18
 (v1.2.8 or later is specified in the manual) and successfully compiled
 this. This did not however help with my problem.

 Any suggestions?

   
I have

viggo:~/rpm -qf /usr/lib/libpng12.so.0
libpng-32bit-1.2.12-25
viggo:~/rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.12-25
viggo:~/rpm -q R-base
R-base-2.5.0-2.1


 Thanks
 Stephen Henderson
  

 **
 This email and any files transmitted with it are confidentia...{{dropped}}





Re: [R] anova on data means

2007-06-21 Thread Peter Dalgaard
Ronaldo Reis Junior wrote:
 On Thursday, 21 June 2007 16:56, Thomas Miller wrote:
   
 I am transitioning from SAS to R and am struggling with a relatively simple
 analysis.  Have tried Venables and Ripley and other guides but can't find a
 solution.

 I have an experiment with 12 tanks.  Each tank holds 10 fish.  The 12 tanks
 have randomly assigned one of 4 food treatments - S(tarve), L(ow), M(edium)
 and H(igh).  There are 3 reps of each treatment.  I collect data on size of
 each fish at the end of the experiment.  So my data looks like

 Tank  Trt   Fish   Size
 1  S 1  3.4
 1  S 2  3.6
 
 1  S10  3.5
 2  L 1  3.4
 
 12M 10  2.1

 To do the correct test of hypothesis using anova, I need to calculate the
 tank means and use those in the anova.  I have tried using the tapply() and
 by() functions, but when I do so I lose the treatment level because it
 is categorical.  I have used
 Meandat <- tapply(Size, list(Tank, Trt), mean)

 But that doesn't give me a dataframe that I can then use to do the actual
 aov analysis.  So what is the most efficient way to accomplish the analysis

 Thanks

 Tom Miller
 

 Tom,

 try the aggregate function. Something like this:

 meandat <- aggregate(Size, list(Tank, Trt), mean)
   

Why not just include an error term for Tank in the model?

summary(aov(Size~Trt+Error(Tank)))
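On balanced data like this, the Error(Tank) stratum test for Trt agrees with a one-way anova on the tank means. A simulated sketch (all data below are made up for illustration):

```r
set.seed(1)
dat <- data.frame(
  Tank = factor(rep(1:12, each = 10)),                    # 12 tanks, 10 fish each
  Trt  = factor(rep(rep(c("S", "L", "M", "H"), 3), each = 10)),  # 3 reps per treatment
  Size = rnorm(120)
)
# Route 1: tank means, then one-way anova
means <- aggregate(Size ~ Tank + Trt, data = dat, FUN = mean)
fit1  <- aov(Size ~ Trt, data = means)
# Route 2: fish-level data with a Tank error stratum
fit2  <- aov(Size ~ Trt + Error(Tank), data = dat)
F1 <- summary(fit1)[[1]][["F value"]][1]
F2 <- summary(fit2)[["Error: Tank"]][[1]][["F value"]][1]
all.equal(F1, F2)   # TRUE: same F test for Trt
```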

 
 Inte
 Ronaldo
 --
   
 Prof. Ronaldo Reis Júnior
 
 |  .''`. UNIMONTES/Depto. Biologia Geral/Lab. de Ecologia
 | : :'  : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
 | `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
 |   `- Fone: (38) 3229-8187 | [EMAIL PROTECTED] | [EMAIL PROTECTED]
 | http://www.ppgcb.unimontes.br/ | ICQ#: 5692561 | LinuxUser#: 205366





Re: [R] Got Unexpected ELSE error

2007-06-20 Thread Peter Dalgaard
Shiazy Fuzzy wrote:
 Dear R-users,

 I have a problem with the IF-ELSE syntax.
 Please look at the folllowing code and tell me what's wrong:

 a <- TRUE
 if ( a )
 {
 cat("TRUE", "\n")
 }
 else
 {
 cat("FALSE", "\n")
 }

 If I try to execute with R I get:
  Error: syntax error, unexpected ELSE in "else"
 The strange thing is that neither cat instruction is executed!!
   
For some odd reason this is not actually a FAQ...

It is an anomaly of the R (and S) language (or maybe a necessary
consequence of its interactive usage) that it tries to complete parsing
of expressions as soon as possible, so 

2 + 2
+ 5

prints 4 and then 5, whereas

2 + 2 +
5

prints 9.  Similarly, when encountered on the command line,

if (foo) bar

will result in the value of bar if foo is TRUE and otherwise NULL. A
subsequent

else baz

will be interpreted as a new expression, which is invalid because it
starts with else. To avoid this effect you can either move the else
to the end of the previous line, or put braces around the whole if
construct. I.e.

if (foo) {
bar
} else {
baz
}

or

if (foo) bar else baz

or

{
if (foo)
 bar
else
 baz
}

should all work.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] How to compute Wilk's Lambda

2007-06-19 Thread Peter Dalgaard
Dietrich Trenkler wrote:
 Dear helpeRs,

 the following data set comes from Johnson/Wichern: Applied Multivariate
 Statistical Analysis, 6th ed, pp. 304-306.

 X <- structure(c(9, 6, 9, 3, 2, 7), .Dim = as.integer(c(3, 2)))
 Y <- structure(c(0, 2, 4, 0), .Dim = as.integer(c(2, 2)))
 Z <- structure(c(3, 1, 2, 8, 9, 7), .Dim = as.integer(c(3, 2)))

 I would like to compute Wilks' Lambda in R, which I know is 0.0385. How
 can I do that? I tried

 U <- rbind(X, Y, Z)
 m <- manova(U ~ rep(1:3, c(3, 2, 3)))
 summary(m, test = "Wilks")

 which gives


                        Df  Wilks approx F num Df den Df  Pr(>F)
  rep(1:3, c(3, 2, 3))   1  0.162   12.930      2      5 0.01057 *
  Residuals              6
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 I suppose the argument rep(1:3, c(3, 2, 3)) in manova() is not appropriate.

   
Exactly. If intended as a grouping, you need to turn it into a factor:

  m <- manova(U ~ factor(rep(1:3, c(3, 2, 3))))
  summary(m, test = "Wilks")
                              Df  Wilks approx F num Df den Df   Pr(>F)
factor(rep(1:3, c(3, 2, 3)))   2 0.0385   8.1989      4      8 0.006234 **
Residuals                      5
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or, for that matter:

  anova(lm(U ~ factor(rep(1:3, c(3, 2, 3)))), test = "Wilks")
Analysis of Variance Table

                              Df Wilks approx F num Df den Df   Pr(>F)
(Intercept)                    1 0.048   39.766      2      4 0.002293 **
factor(rep(1:3, c(3, 2, 3)))   2 0.038    8.199      4      8 0.006234 **
Residuals                      5
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 Any help is very much appreciated.

 Dietrich   





Re: [R] how to obtain the OR and 95%CI with 1 SD change of a continue variable

2007-06-18 Thread Peter Dalgaard
felix wrote:
 Dear all,

 How to obtain the odds ratio (OR) and 95% confidence interval (CI) with 
 1 standard deviation (SD) change of a continuous variable in logistic 
 regression?

 for example, to investigate the risk of obesity for stroke. I choose the 
 happening of stroke (positive) as the dependent variable, and waist 
 circumference as an independent variable. Then I want to obtain the OR 
 and 95% CI with 1 SD change of waist circumference. How?

 Any default package(s) or options in glm available now?

 if not, how to calculate them by hand?

   
Unless you want to do something advanced like factoring in the sampling
error of the SD (I don't think anyone bothers with that), probably the
easiest way is to scale() the predictor and look at the relevant line of
exp(confint(glm(.))). As in

(library(MASS); example(confint.glm))

 budworm.lg0 <- glm(SF ~ sex + scale(ldose), family = binomial)
 exp(confint(budworm.lg0))
Waiting for profiling to be done...
 2.5 % 97.5 %
(Intercept)  0.2652665  0.7203169
sexM 1.5208018  6.1747207
scale(ldose) 4.3399952 10.8983903

Or, if you insist on getting asymptotic Wald-statistic based intervals:

 exp(confint.default(budworm.lg0))
2.5 % 97.5 %
(Intercept)  0.269864  0.7294944
sexM 1.496808  6.0384756
scale(ldose) 4.220890 10.5546837

You can also get it from the coefficients of the unscaled analysis, like in

 budworm.lg0 <- glm(SF ~ sex + ldose, family = binomial)
 confint(budworm.lg0)
Waiting for profiling to be done...
 2.5 %97.5 %
(Intercept) -4.4582430 -2.613736
sexM 0.4192377  1.820464
ldose0.8229072  1.339086
 exp(confint(budworm.lg0)[3,]*sd(ldose))
Waiting for profiling to be done...
2.5 %97.5 %
 4.339995 10.898390
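The same idea on simulated data (all numbers below are made up for illustration): the OR per 1 SD is exp of the coefficient of the scaled predictor, and it matches exp(beta * sd(x)) from the unscaled fit.

```r
set.seed(42)
n <- 500
waist  <- rnorm(n, mean = 90, sd = 12)               # hypothetical predictor
stroke <- rbinom(n, 1, plogis(-1 + 0.05 * (waist - 90)))
fit  <- glm(stroke ~ scale(waist), family = binomial)
fit0 <- glm(stroke ~ waist,        family = binomial)
or_per_sd <- exp(coef(fit)["scale(waist)"])          # OR per 1 SD of waist
exp(confint.default(fit)["scale(waist)", ])          # Wald-type 95% CI, same scale
# agrees with rescaling the coefficient from the unscaled fit:
all.equal(unname(or_per_sd),
          unname(exp(coef(fit0)["waist"] * sd(waist))), tolerance = 1e-6)
```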


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] removing values from a vector, where both the value and its name are the same?

2007-06-15 Thread Peter Dalgaard
Patrick Burns wrote:
 In case it matters, the given solution has a problem if the
 data look like:

 x <- c(sum=77, test=99, sum=99)

 By the description all three elements should be kept, but
 the duplicated solution throws out the last element.  A more
 complicated solution is:

 unique(data.frame(x, names(x)))

 (and then put the vector back together again).

   
Yes, I was about to say the same.

x[!duplicated(cbind(x,names(x)))]

looks like it might cut the mustard.
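Checking with Patrick's example data: duplicated() on the values alone drops the final element, whereas pairing each value with its name keeps all three.

```r
x <- c(sum = 77, test = 99, sum = 99)
x[!duplicated(x)]                    # drops the last element (two left)
# duplicated() on a matrix works row-wise, so value/name pairs are compared
x[!duplicated(cbind(x, names(x)))]   # keeps all three elements
```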



Re: [R] Preserving dates in Excel.

2007-06-14 Thread Peter Dalgaard
Patnaik, Tirthankar wrote:
 Hi,
   Quick question: Say I have a date variable in a data frame or
 matrix, and I'd like to preserve the date format when using write.table.
 However, when I export the data, I get the generic number underlying the
 date, not the date per se, and a number such as 11323, 11324, etc are
 not meaningful in Excel. Is there any way I can preserve the format of a
 date on writing into a text-file?

   
Er, what exactly is the problem here?

  d <- data.frame(date = as.Date("2007-6-1") + 1:5, x = rnorm(5))
 d
date x
1 2007-06-02  0.7987635130
2 2007-06-03 -0.7381623316
3 2007-06-04 -1.3626708691
4 2007-06-05  0.0007668082
5 2007-06-06  0.6719088533
 write.table(d)
date x
1 2007-06-02 0.798763513018864
2 2007-06-03 -0.738162331606612
3 2007-06-04 -1.36267086906438
4 2007-06-05 0.000766808196322155
5 2007-06-06 0.671908853312511
 write.csv(d)
,date,x
1,2007-06-02,0.798763513018864
2,2007-06-03,-0.738162331606612
3,2007-06-04,-1.36267086906438
4,2007-06-05,0.000766808196322155
5,2007-06-06,0.671908853312511


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Problems with na.rm=T

2007-06-14 Thread Peter Dalgaard
Lucke, Joseph F wrote:
 Suddenly (e.g., yesterday) all my functions that have na.rm= as a
 parameter (e.g., mean(), sd(), range(), etc.) have been reporting
 warnings with na.rm=T. The message is: Warning message: the condition
 has length > 1 and only the first element will be used in: if (na.rm) x
 <- x[!is.na(x)].  This has never happened before.  I don't recall
 having done anything that might generate this message.  How do I fix
 this?
   

Rename the object that you suddenly called T...

(And notice that some people will advise you to use na.rm=TRUE to avoid
this)
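A minimal sketch of how shadowing T produces this kind of surprise (the shadowing value is hypothetical; na.rm = T simply resolves to whatever T currently holds):

```r
T <- 0                           # shadows the built-in alias for TRUE
mean(c(1, NA, 3), na.rm = T)     # NA: na.rm is now 0, i.e. FALSE
rm(T)                            # remove the shadowing object
mean(c(1, NA, 3), na.rm = TRUE)  # 2, as intended
```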

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Error: bad value ? what is that?

2007-06-14 Thread Peter Dalgaard
Jose Quesada wrote:
 Hi,

 I'm finding a very strange error.
 For no good reason my R console (Rgui.exe, R 2.5.0, under win XP) stops  
 producing anything meaningful, and just returns:
 Error: bad value
 to _whatever_ I enter. It starts doing this after a while, not immediately  
 when launched.

 I have to restart R when this happens.
 No idea why. I didn't change anything in the R config that I remember.

 Any thoughts?

 Thanks.

   
Hmm that message comes from deep down inside SETCAR() and friends. I 
can't see other reasons for it than memory corruption. Are you running 
some rogue C code? Is the machine flaky in other respects?



Re: [R] R Book Advice Needed

2007-06-13 Thread Peter Dalgaard
Roland Rau wrote:
 Hi,

 [EMAIL PROTECTED] wrote:
   
 I am new to using R and would appreciate some advice on
 which books to start with to get up to speed on using R.

 My Background:
 1-C# programmer.
 2-Programmed directly using IMSL (Now Visual Numerics).
 3- Used in past SPSS and Statistica.

 I put together a list but would like to pick the best of 
 and avoid redundancy.

 Any suggestions on these books would be helpful (i.e. too much overlap,
 poorly written, etc.?)

 Books:
 1-Analysis of Integrated and Co-integrated Time Series with R (Use R) -
 Bernhard Pfaff
 2-An Introduction to R - W. N. Venables
 3-Statistics: An Introduction using R - Michael J. Crawley
 4-R Graphics (Computer Science and Data Analysis) - Paul Murrell
 5-A Handbook of Statistical Analyses Using R - Brian S. Everitt
 6-Introductory Statistics with R - Peter Dalgaard
 7-Using R for Introductory Statistics - John Verzani
 8-Data Analysis and Graphics Using R - John Maindonald;
 9-Linear Models with R (Texts in Statistical Science) - Julian J.
 Faraway
 10-Analysis of Financial Time Series (Wiley Series in Probability and
 Statistics)2nd edition - Ruey S. Tsay
 

 as one other message says, it depends a lot on your ideas what you want 
 to do with R. And, I'd like to add, how familiar you are with statistics.
 One book I am missing in your list is Venables / Ripley: Modern Applied 
 Statistics with S. I can highly recommend it.
 If you are going to buy yourself only one book, then I would say: buy 
 Venables/Ripley


   
And given the programming background, also check out the other VR book,
S Programming. (This is about R too).




Re: [R] Rounding?

2007-06-11 Thread Peter Dalgaard
jim holtman wrote:
 your number 6.6501 is too large to fit in a floating point
 number.  It takes 56 bits and there are only 54 in a real number so the
 system sees it as 6.65 and does the rounding to an even digit; 6.6

 6.651 does fit into a real number (takes 54 bits) and this will
 now round to 6.7

   
Actually, a bit more insidious than that because 6.65 does not have an
exact binary representation. Hence

> round(66.5)
[1] 66
> round(6.65,1)
[1] 6.7
> round(0.665,2)
[1] 0.66

Notice that these are from Linux and differ from what you get on Windows.
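A small illustration of the representation issue (the last printed digits are platform-dependent):

```r
# 66.5 IS exactly representable, so round() applies round-half-to-even:
round(66.5)             # gives 66, not 67

# 6.65 is NOT exactly representable; the stored double is slightly off,
# so round(6.65, 1) and round(0.665, 2) depend on which side of the
# true value the stored number falls, and may differ across platforms
sprintf("%.17f", 6.65)  # prints the value actually stored
```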



Re: [R] Tools For Preparing Data For Analysis

2007-06-10 Thread Peter Dalgaard
Douglas Bates wrote:
 Frank Harrell indicated that it is possible to do a lot of difficult
 data transformation within R itself if you try hard enough but that
 sometimes means working against the S language and its whole object
 view to accomplish what you want and it can require knowledge of
 subtle aspects of the S language.
   
Actually, I think Frank's point was subtly different: It is *because* of 
the differences in view that it sometimes seems difficult to find the 
way to do something in R that  is apparently straightforward in SAS. 
I.e. the solutions exist and are often elegant, but may require some 
lateral thinking.

Case in point: Finding the first or the last observation for each 
subject when there are multiple records for each subject. The SAS way 
would be a datastep with IF-THEN-DELETE, and a RETAIN statement so that 
you can compare the subject ID with the one from the previous record, 
working with data that are sorted appropriately.

You can do the same thing in R with a for loop, but there are better 
ways, e.g.
subset(df, !duplicated(ID)), and subset(df, rev(!duplicated(rev(ID)))), or 
maybe
do.call(rbind, lapply(split(df, df$ID), head, 1)), resp. tail. Or 
something involving aggregate(). (The latter approaches generalize 
better to other within-subject functionals like cumulative doses, etc.)
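The duplicated() idioms can be illustrated with a tiny made-up data frame (a sketch, not from the original thread):

```r
# three subjects with 2, 3 and 1 records respectively
df <- data.frame(ID = c(1, 1, 2, 2, 2, 3), x = 1:6)

first <- subset(df, !duplicated(ID))             # first record per subject
last  <- subset(df, rev(!duplicated(rev(ID))))   # last record per subject

first$x   # 1 3 6
last$x    # 2 5 6
```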

The hardest cases that I know of are the ones where you need to turn one 
record into many, such as occurs in survival analysis with 
time-dependent, piecewise constant covariates. This may require 
transposing the problem, i.e. for each  interval you find out which 
subjects contribute and with what, whereas the SAS way would be a 
within-subject loop over intervals containing an OUTPUT statement.

Also, there are some really weird data formats, where e.g. the input 
format is different in different records. Back in the 80's, when 
punched-card input was still common, it was quite popular to have one 
card with background information on a patient plus several cards 
detailing visits, and you'd get a stack of cards containing both kinds. 
In R you would most likely split on the card type using grep() and then 
read the two kinds separately and merge() them later.
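A sketch of the grep()/merge() approach with an invented two-card format ("P" for patient cards, "V" for visit cards; read.table's text= argument is used for brevity):

```r
cards <- c("P 1 Alice",        # patient card: id, name
           "V 1 2007-01-01",   # visit card: id, date
           "V 1 2007-02-01",
           "P 2 Bob",
           "V 2 2007-03-01")

# split the deck on card type, read each kind separately
patients <- read.table(text = cards[grep("^P", cards)],
                       col.names = c("type", "id", "name"))
visits   <- read.table(text = cards[grep("^V", cards)],
                       col.names = c("type", "id", "date"))

# merge() them later: one row per visit, background info repeated
merged <- merge(patients[-1], visits[-1], by = "id")
```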



Re: [R] lme vs. SAS proc mixed. Point estimates and SEs are the same, DFs are different

2007-06-05 Thread Peter Dalgaard
John Sorkin wrote:
 R 2.3
 Windows XP

 I am trying to understand lme. My aim is to run a random effects regression 
 in which the intercept and jweek are random effects. I am comparing output 
 from SAS PROC MIXED with output from R. The point estimates and the SEs are 
 the same, however the DFs and the p values are different. I am clearly doing 
 something wrong in my R code. I would appreciate any suggestions of how I can 
 change the R code to get the same DFs as are provided by SAS.
   
This has been hashed over a number of times before. In short:

1) You're not necessarily doing anything wrong
2) SAS PROC MIXED is not necessarily doing it right
3) lme() is _definitely_ not doing it right in some cases
4) both work reasonably in large sample cases (but beware that this is 
not equivalent to having many observation points)

SAS has an implementation of the method by Kenward and Roger, which 
could be the most reliable general DF-calculation method around (I don't 
trust their Satterthwaite option, though). Getting this or equivalent 
into lme() has been on the wish list for a while, but it is not a 
trivial thing to do.

 SAS code:
 proc mixed data=lipids2;
   model ldl=jweek/solution;
   random int jweek/type=un subject=patient;
   where lastvisit ge 4;
 run;

 SAS output:
                Solution for Fixed Effects

                       Standard
 Effect     Estimate    Error      DF    t Value    Pr > |t|

 Intercept    113.48    7.4539     25      15.22      <.0001
 jweek       -1.7164    0.5153     24      -3.33      0.0028

          Type 3 Tests of Fixed Effects

            Num   Den
 Effect      DF    DF    F Value    Pr > F
 jweek        1    24      11.09    0.0028


 R code:
 LesNew3 <- groupedData(LDL~jweek | Patient, data=as.data.frame(LesData3), 
 FUN=mean)
 fit3 <- lme(LDL~jweek, data=LesNew3[LesNew3[,"lastvisit"]>=4,], 
 random=~1+jweek)
 summary(fit3) 

 R output:
 Random effects:
  Formula: ~1 + jweek | Patient
  Structure: General positive-definite, Log-Cholesky parametrization
  

 Fixed effects: LDL ~ jweek 
 Value Std.Error DF   t-value p-value
 (Intercept) 113.47957  7.453921 65 15.224144  0.
 jweek-1.71643  0.515361 65 -3.330535  0.0014

 John Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Confidentiality Statement:
 This email message, including any attachments, is for the so...{{dropped}}





Re: [R] lme vs. SAS proc mixed. Point estimates and SEs are the same, DFs are different

2007-06-05 Thread Peter Dalgaard
Peter Dalgaard wrote:
 John Sorkin wrote:
   
 R 2.3
 Windows XP

 I am trying to understand lme. My aim is to run a random effects regression 
 in which the intercept and jweek are random effects. I am comparing output 
 from SAS PROC MIXED with output from R. The point estimates and the SEs are 
 the same, however the DFs and the p values are different. I am clearly doing 
 something wrong in my R code. I would appreciate any suggestions of how I 
 can change the R code to get the same DFs as are provided by SAS.
   
 
 This has been hashed over a number of times before. In short:

 1) You're not necessarily doing anything wrong
 2) SAS PROC MIXED is not necessarily doing it right
 3) lme() is _definitely_ not doing it right in some cases
 4) both work reasonably in large sample cases (but beware that this is 
 not equivalent to having many observation points)

 SAS has an implementation of the method by Kenward and Rogers, which 
 could be the most reliable general DF-calculation method around (I don't 
 trust their Satterthwaite option, though). Getting this or equivalent 
 into lme() has been on the wish list for a while, but it is not a 
 trivial thing to do.
   

Forgot to say: All DF-based corrections are wrong if you have 
non-normally distributed data (they depend on the 3rd and 4th moment of 
the error distribution(s)), although they can be useful as warning signs 
even in those cases. I also forgot to point to the simulate.lme() 
function which can simulate the LR statistics directly.
   




Re: [R] Lines to plots with a for-loop

2007-06-05 Thread Peter Dalgaard
Saanisto, Taija wrote:
 Hello all,

 I'm plotting several graphs with a for-loop with a code:

 par(mfrow=c(3,4))

 for(i in levels(fHCGB$code)) with(subset(fHCGB,code==i),
 plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%"))


 With which I have no problems.. However I need to add lines to all of
 these 12 plots, but I cannot get it to work. I've tried for example

 par(mfrow=c(3,4))

 for(i in levels(fHCGB$code)) with(subset(fHCGB,code==i),
 plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%")
 points(fHCGB$limitVarC, type="b", col="green")))

 But run into errors. How can the lines be added?
   
The with() construct gets a little more complicated if you want to do 
more than one thing inside:

for(i in levels(fHCGB$code)) with(subset(fHCGB,code==i), {
  plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%")
  points(fHCGB$limitVarC, type="b", col="green")
})

or, since with() is really only needed for the plot()

for(i in levels(fHCGB$code)) {
  with(subset(fHCGB,code==i), 
 plot(pooledPlateIntra, type="b", ylim=ylim, xlab="code", ylab="CV%"))
  points(fHCGB$limitVarC, type="b", col="green")
}


(You might have used lines() rather than points() if you think of it as an 
added line, but that's a matter of taste since the two functions only differ in 
the default for type=.)

-p

 Taija Saanisto
 Biostatistician
 Quality assurance, Process Development
 PerkinElmer Life and Analytical Sciences / Wallac Oy
 Phone: +358-2-2678 741








Re: [R] R CMD BATCH command

2007-06-05 Thread Peter Dalgaard
Austin, Peter wrote:
 The version of R on our unix system has been updated to version 2.5.0.
 When I type the following command at the unix prompt:

 'R CMD BATCH filename'

 I receive the following error message:

 Error in Sys.unsetenv("R_BATCH") : 'Sys.unsetenv' is not available on
 this system

 Execution halted.

  

 'R CMD BATCH filename' used to work with the prior version of R that I
 had installed (version 2.2.0). Is there something that I need to modify
 for it to work now?

 Thanks,

 Peter

   
A similar problem was found on an old version of Solaris and discussed 
on this very list on May 14 (use the list archive and look for the 
thread started by Simon Penel). This could be similar to your problem 
(but you omitted to tell us what system you were on).



Re: [R] getting t.test to work with apply()

2007-06-04 Thread Peter Dalgaard
Petr Klasterecky wrote:
 Andrew Yee napsal(a):
   
 Hi, I'm interested in using apply() with t.test() on a data.frame.

 Specifically, I'd like to use apply() to do the following:

  t.test(raw.sample[1,alive],raw.sample[1,dead])
 t.test(raw.sample[2,alive],raw.sample[2,dead])
  t.test(raw.sample[3,alive],raw.sample[3,dead])
 etc.

 I tried the following,

 apply(raw.sample,1,function(x) t.test(raw.sample[,alive],raw.sample[,dead]))
 

 Two comments:
 1) apply() works on arrays. If your dataframe only has numeric values, 
 turn it (or its copy) to a matrix via as.matrix(). If it has mixed 
 variables, take only the numeric part for t-tests. The conversion is 
 made implicitly, but asking for it explicitly cannot hurt.
 2) the main problem - you are using a wrong argument to t.test

 The call should look like
 apply(as.matrix(raw.sample), 1, function(x){t.test(x[alive], x[dead])})

 assuming 'alive' and 'dead' are logical vectors of the same length as 'x'.

 Petr
   
Notice also that the other apply-style functions may give an easier
route to the goal:

lapply(1:N, function(i) t.test(raw.sample[i,alive],raw.sample[i,dead]))

or (maybe, depends on raw.sample being a data frame and alive/dead being
indexing vectors)

mapply(t.test, raw.sample[,alive], raw.sample[,dead])
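A self-contained sketch of the row-wise approach with simulated data (the dimensions and the alive/dead column indices are invented):

```r
set.seed(1)
raw.sample <- matrix(rnorm(60), nrow = 3)   # 3 variables x 20 samples
alive <- 1:10                               # hypothetical column indices
dead  <- 11:20

# one t.test per row; apply() returns a list here because each
# result is an "htest" object rather than a scalar
res <- apply(raw.sample, 1, function(x) t.test(x[alive], x[dead]))
pvals <- sapply(res, function(tt) tt$p.value)   # one p-value per row
```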

   
 but it gives me a list of identical results.


 Thanks,
 Andrew





Re: [R] getting t.test to work with apply()

2007-06-04 Thread Peter Dalgaard
Andrew Yee wrote:
 Thanks for everyone's suggestions.

 I did try

  results <- apply(raw.sample, 1, function(x) t.test(x[,alive], x[,dead]))

 However, I get:

 Error in x[, alive] : incorrect number of dimensions

 Full disclosure, raw.sample is a data.frame, and I am using alive and dead
 as indexing vectors.

 On the other hand, the lapply suggestion works better.

 results <- lapply(1:nrow(raw.sample), function(i) t.test(raw.sample
 [i,alive], raw.sample[i,dead]))

   
nrow()?

Oops, yes. I didn't notice that your data are transposed relative to the
usual cases-by-variables layout. 

So mapply() is not going to work unless you use
as.data.frame(t(raw.sample)) first.

 -pd
 Thanks,
 Andrew


   




Re: [R] recompile R using ActiveTcl

2007-06-03 Thread Peter Dalgaard
James Foadi wrote:
 Dear all,

 While running some code requiring the tcltk package I have realised that my 
 version of R was compiled with the Tcl/Tk libraries included in Fedora 6. It 
 would be better for me to use the ActiveTcl libraries (which I have 
 under /usr/local), and I'm aware that this probably means recompiling R with 
 the proper configuration variables.

 But...is it by any chance possible to just recompile the bit affected by 
 Tcl/Tk, like, for instance, to install tcltk with some environment variable 
 pointing at the right ActiveTcl library?

   
Maybe, but I don't think it is worth the trouble compared to a full 
rebuild. There are obstacles, e.g. that the Makefile in the packages is 
created from Makefile.in by the toplevel configure script. I.e., better 
to waste some computer resources than your own time.

 Many thanks for your suggestions and help.

 J




Re: [R] Subscript in axis label

2007-06-03 Thread Peter Dalgaard
Tobias Verbeke wrote:
 [EMAIL PROTECTED] wrote:
   
 Dear R help list members,

 I am experiencing difficulty in trying to generate a subscript '2' in an 
 axis label. Although I can get the '2' into a subscript using expression(), 
 R then forces me to leave at least one space between the '2' and the 
 following character. My label is supposed to read 'N2O concentration 
 (ppm)', and the space between the '2' and the 'O' makes it look rather 
 inelegant! My code is the following (the comments in it are there to stop 
 me forgetting what I have done, I am new to R):

 postscript(file=/Users/patrickmartin/Documents/York Innova 
 Precision/N2Oinnova.eps, horizontal=FALSE, onefile=FALSE, height=4, 
 width=5, pointsize=10)
 
 plot(n2o, lty=0, las=1, xlab="Time", ylab=expression(N[2]~"O 
 concentration (ppm)"))
 points(n2o, pch=16) # suppresses line but adds points
 dev.off() # turns postscript device off again
   

 Is this better

 plot(1:10, ylab = expression(paste(N[2], "O concentration (ppm)",
   sep = "")))
   

Or,

plot(1:10, ylab = expression(N[2]*O~"concentration (ppm)"))

(Because of the "~", you can even do away with expression(), but I 
think that would be overly sneaky.)



Re: [R] about lex/yacc

2007-05-23 Thread Peter Dalgaard
elyakhlifi mustapha wrote:
 hello,
 what about these functions lex/yacc which can parse and recognize a syntax?
 thanks
   
What about them? There are books, notably an O'Reilly one by D.Brown, as well 
as works on parser theory (Aho+Sethi+Ullman, e.g.).

(This is more than a bit off-topic for this list). 


