Re: [R] EM unsupervised clustering

2007-07-19 Thread Bettina Gruen
Federico,

you might also want to have a look at packages flexclust or flexmix, 
so you can take into account that you have binary data. The mclust 
package can be used to estimate mixtures of Gaussian distributions. 
flexclust implements kmeans-like algorithms, but you can specify a 
distance measure appropriate for binary data. flexmix allows latent 
class analysis with binary data using FLXMCmvbinary() for the component 
specific model.

Best,
Bettina


Federico Calboli wrote:
 Hi All,
 
 I have a  n x m matrix. The n rows are individuals, the m columns are 
 variables.
 
 The matrix is in itself a collection of 1s (if a variable is observed for an 
 individual), and 0s (is there is no observation).
 
 Something like:
 
   [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]101100
 [2,]101100
 [3,]101100
 [4,]010000
 [5,]101100
 [6,]010010
 
 
 I use kmeans to find 2 or 3 clusters in this matrix
 
 k2 = kmeans(data, 2, 1000)
 k3 = kmeans(data, 3, 1000)
 
 but I would like to use something a bit more refined, so I though about a EM 
 based clustering. I am using the Mclust() function from the mclust package, 
 but 
 I get the following (to me incomprehensible) error message:
 
 plot(Mclust(as.data.frame(data)), as.data.frame(data))
 Hit Return to see next plot:
 Hit Return to see next plot:
 Hit Return to see next plot:
 Error in 1:L : NA/NaN argument
 In addition: Warning messages:
 1: best model occurs at the min or max # of components considered in: 
 summary.mclustBIC(Bic, data, G = G, modelNames = modelNames)
 2: optimal number of clusters occurs at min choice in: 
 Mclust(as.data.frame(anc.st.mat))
 3: insufficient input for specified plot in: coordProj(data = data, 
 parameters = 
 x$parameters, z = x$z, what = classification,
 
 That's puzzling because the example given by ?Mclust is something like
 
 plot(Mclust(iris[,-5]), iris[,-5])
 
 which is pretty simple and dumbproof and works flawlessly...
 
 best,
 
 Federico


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R

2007-07-19 Thread Fluss
Hello!
I am using for logistic regression in survey data the svyglm procedure.
I wondered how does the strata effect estimates SE (in addition to the
weights given proportional to population size).
I know that for simple regression measurements of each strata is assumed to
have different variance.
But in a logistic model this is not the case.
Can anyone help me here?

Thank you
Ron

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RAM, swap, Error: cannot allocate vector of size, Linux:

2007-07-19 Thread Peter Dalgaard
Feldman, Maximilian Jeffrey wrote:
 Dear Community,

 I am very new to the world of Linux and R and I have stumbled upon a problem 
 that I cannot seem to resolve on my own. Here is the relevant background:

 I am working on a 64-bit Linux Fedora Core 6 OS. I using R version 2.5.1. I 
 have 3.8 Gb of RAM and 1.9 Gb of swap. As I see it, there are no restraints 
 on the amount of memory that R can use imposed by this particular OS build. 
 When I type in the 'ulimit' command at the command line the response is 
 'unlimited'.

 Here is the problem:

 I have uploaded and normalized 48 ATH1 microarray slides using the justRMA 
 function.

   
 library(affy)
 

   
 setwd(/Data/cel)
 

   
 Data-justRMA()
 

 The next step in my analysis is to calculate a distance matrix for my dataset 
 using bioDist package. This is where I get my error.

   
 library(bioDist)
 

   
 x-cor.dist(exprs(Data))
 

 Error: cannot allocate vector of size 3.9 Gb

 I used the following function to examine my memory limitations:

   
 mem.limits()
 

 nsize vsize 

 NA NA 

 I believe this means there isn't any specified limit to the amount of memory 
 R can allocate to my task. I realize I only have 3.8 Gb of RAM but I would 
 expect that R would use my 1.9 Gb of swap. 
   
It does, if swap works at all on your machine. However, the error 
message is relates to the object that R fails to create, not the total 
memory usage. I.e. this might very well be the _second_ object of size 
3.9Gb that you are trying to fit into 5.7Gb of memory.  You could try 
increasing the swap space (the expedient, although perhaps not 
efficient, way is to find a file system with a few tens of Gb to spare 
and create a large swapfile on it.)
 Does R not use my swap space? Can I explicitly tell R to use my swap space 
 for large tasks such as this? 

 I was not able to find any information regarding this particular issue in the 
 R Linux manual, Linux FAQ, or on previous listserv threads. Many of the users 
 who had similar questions resolved their problems in a different manner.

 Thanks to anyone who thinks they can provide assistance!

 Max 

 Graduate Student

 Molecular Plant Sciences

 Washington State University


   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Classification

2007-07-19 Thread Michal Kneifl
For all who sent help on topic Classification:

Thank you very much folks. 
I have got some inspiration how to solve this task.

Michael

- Original Message - 
From: Marc Schwartz [EMAIL PROTECTED]
To: Ing. Michal Kneifl, Ph.D. [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Wednesday, July 18, 2007 7:53 PM
Subject: Re: [R] Classification


 On Wed, 2007-07-18 at 19:36 +0200, Ing. Michal Kneifl, Ph.D. wrote:
 Hi,
 I am also a quite new user of R and would like to ask you for help:
 I have a data frame where all columns are numeric variables. My aim is  
 to convert one columnt in factors.
 Example:
 MD
 0.2
 0.1
 0.8
 0.3
 0.7
 0.6
 0.01
 0.2
 0.5
 1
 1
 
 
 I want to make classes:
 0-0.2 A
 0.21-0.4 B
 0.41-0.6 C
 . and so on
 
 So after classification I wil get:
 MD
 A
 A
 D
 B
 .
 .
 .
 and so on
 
 Please could you give an advice to a newbie?
 Thanks a lot in advance..
 
 Michael
 
 See ?cut
 
 You can then do something like:
 
 DF
 MD
 1  0.20
 2  0.10
 3  0.80
 4  0.30
 5  0.70
 6  0.60
 7  0.01
 8  0.20
 9  0.50
 10 1.00
 11 1.00
 
 
 cut(DF$MD, breaks = c(seq(0, 1, .2)), labels = LETTERS[1:5])
 [1] A A D B D C A A C E E
 Levels: A B C D E
 
 
 HTH,
 
 Marc Schwartz
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread Uwe Ligges


ONKELINX, Thierry wrote:
 Dear useRs,
 
 Lately I noticed a strange warning in the summary of a lm-object. Any
 idea what this warning is about? I'm using R 2.5.1 on Win XP pro.
 
 x - rnorm(100)
 y - rnorm(100)
 summary(lm(y~x))
 
 Call:
 lm(formula = y ~ x)
 
 Residuals:
  Min   1Q   Median   3Q  Max 
 -1,77809 -0,68438 -0,04409  0,63891  2,30863 
 
 Coefficients:
 Estimate Std. Error t value Pr(|t|)
 (Intercept) -0,002170,09244  -0,0230,981
 x0,013150,09628   0,1370,892
 
 Residual standard error: 0,9236 on 98 degrees of freedom
 Multiple R-Squared: 0.0001903,  Adjusted R-squared: -0.01001 
 F-statistic: 0.01866 on 1 and 98 DF,  p-value: 0,8916 
 
 Warning message:
 NAs introduced by coercion in: as.double.default(Cf[okP]) 

Probably you have an object in your workspace or another attached 
environment in the search path that conflicts with objects that are 
required to call summary(lm(...)). E.g. some lm... oder summary... function?

Best,
Uwe Ligges



   
 Thanks,
 
 Thierry
 
 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature
 and Forest
 Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
 methodology and quality assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 tel. + 32 54/436 185
 [EMAIL PROTECTED]
 www.inbo.be 
 
 Do not put your faith in what statistics say until you have carefully
 considered what they do not say.  ~William W. Watt
 A statistical analysis, properly conducted, is a delicate dissection of
 uncertainties, a surgery of suppositions. ~M.J.Moroney
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread ONKELINX, Thierry
The problem also exists in a clean workspace. But I've found the
troublemaker. I had set options(OutDec = ,). Resetting this to
options(OutDec = .) solved the problem.

Thanks,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

 -Oorspronkelijk bericht-
 Van: Uwe Ligges [mailto:[EMAIL PROTECTED] 
 Verzonden: donderdag 19 juli 2007 10:39
 Aan: ONKELINX, Thierry
 CC: r-help@stat.math.ethz.ch
 Onderwerp: Re: [R] Strange warning in summary.lm
 
 
 
 ONKELINX, Thierry wrote:
  Dear useRs,
  
  Lately I noticed a strange warning in the summary of a 
 lm-object. Any 
  idea what this warning is about? I'm using R 2.5.1 on Win XP pro.
  
  x - rnorm(100)
  y - rnorm(100)
  summary(lm(y~x))
  
  Call:
  lm(formula = y ~ x)
  
  Residuals:
   Min   1Q   Median   3Q  Max 
  -1,77809 -0,68438 -0,04409  0,63891  2,30863
  
  Coefficients:
  Estimate Std. Error t value Pr(|t|)
  (Intercept) -0,002170,09244  -0,0230,981
  x0,013150,09628   0,1370,892
  
  Residual standard error: 0,9236 on 98 degrees of freedom Multiple 
  R-Squared: 0.0001903,  Adjusted R-squared: -0.01001
  F-statistic: 0.01866 on 1 and 98 DF,  p-value: 0,8916
  
  Warning message:
  NAs introduced by coercion in: as.double.default(Cf[okP])
 
 Probably you have an object in your workspace or another 
 attached environment in the search path that conflicts with 
 objects that are required to call summary(lm(...)). E.g. some 
 lm... oder summary... function?
 
 Best,
 Uwe Ligges
 
 
 
  
  Thanks,
  
  Thierry
  
  
 --
  --
  
  ir. Thierry Onkelinx
  Instituut voor natuur- en bosonderzoek / Research Institute 
 for Nature 
  and Forest Cel biometrie, methodologie en kwaliteitszorg / Section 
  biometrics, methodology and quality assurance Gaverstraat 4 9500 
  Geraardsbergen Belgium tel. + 32 54/436 185 
 [EMAIL PROTECTED] 
  www.inbo.be
  
  Do not put your faith in what statistics say until you have 
 carefully 
  considered what they do not say.  ~William W. Watt A statistical 
  analysis, properly conducted, is a delicate dissection of 
  uncertainties, a surgery of suppositions. ~M.J.Moroney
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tapply

2007-07-19 Thread sigalit mangut-leiba
hello,
i want to compute the mean of a variable (aps) for every class
(1,2, and 3).
every id have a few obs., aps and class are constant over id.
like this:
id   aps class
1  11   2
1  11   2
1  11   2
1  11   2
1  11   2
2   83
2   83
2   83
3  12   2
3  12   2
.
.

i tried:

tapply(icu1$aps_st, icu1$hidclass, function(z) mean(unique(z)))

but it's counting every row and not every id.

thank you,

Sigalit.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there a facility in R similar to MatLab

2007-07-19 Thread Hans W. Borchers
Dylan Arena darena at stanford.edu writes:
 
 Hi,
 
 I'm trying to use R to get eigenvalues and eigenvectors of a matrix 
 whose elements are of the form (2 * lambda), -(lambda + mu), etc.  I'd 
 like R to treat this matrix as a numeric matrix without treating lambda 
 and mu as variable names but rather as some sort of atomic quantities 
 (and hence give eigenvectors in terms of mu and/or lambda).  MatLab and 
 Mathematica both do this, but I'm not sure whether R does.  Does anyone 
 have any ideas about how to do this?
 
 Please let me know,
 Dylan
 

You are asking for a Computer Algebra problem and thus you should apply a
Computer Algebra System (CAS) such as Maple, Mathematica, MuPAD, or the
free software Maxima.
By the way, Matlab does _not_ do this, it utilizes the Maple kernel in its
Symbolic Toolbox.
In R you can try the Ryacas package, but I personally would prefer one of
the more extended systems. Maxima with wxMaxima interface is quite nice,
and for MuPAD there is a non-expensive personal copy available.

Hans Werner Borchers

 __
 R-help at stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Subsetting dataframes

2007-07-19 Thread CG Pettersson
Dear all!

W2k, R 2.5.1

I am working with an ongoing malting barley variety evaluation within
Sweden. The structure is 25 cultivars tested each year at four sites, in
field trials with three replicates and 'lattice' structure (the replicates
are divided into five sub blocks in a structured way). As we are normally
keeping around 15 varieties from each year to the next, and take in 10 new
for next year, we have tested totally 72 different varieties during five
years.

I store the data in a field trial database, and generate text tables with
the subset of data I want and import the frame to R. I take in all
cultivars in R and use 'subset' to select what I want to look at. Using
lme{nlme} works with no problems to get mean results over the years, but
as I now have a number of years I want to analyse the general site x
cultivar relation. I am testing AMMI{agricolae} for this and it seems to
work except for the subsetting. This is what happens:

If I do the subsetting like this:

x62_samvar - subset(x62_5, cn %in%
c(Astoria,Barke,Christina,Makof, Prestige,Publican,Quench))

A test run with AMMI seems to work in the first part:

 AMMI(site, cn, rep, yield)

ANALYSIS AMMI:  yield
Class level information

ENV:  Hag Klb Bjt Ska
GEN:  Astoria Prestige Makof Christina Publican Quench
REP:  1 2 3

Number of observations:  240

model Y: yield  ~ ENV + REP%in%ENV + GEN + ENV:GEN

Analysis of Variance Table

Response: Y
   DfSum Sq   Mean Sq F valuePr(F)
ENV 3 120092418  40030806 90.0424 1.665e-06 ***
REP(ENV)8   3556620444578  0.5674  0.803923
GEN 5  21376142   4275228  5.4564 9.680e-05 ***
ENV:GEN15  28799807   1919987  2.4504  0.002555 **
Residuals 208 162973213783525
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Coeff var   Mean yield
13.08629 6764.098

After this something goes wrong, as AMMI finds a cultivar name not
selected in the subsetting. (The plotting might go wrong anyhow, but I
haven´t got that far yet):

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
object$xlevels) :
factor 'y' has new level(s) Arkadia


Looking at the dataframe using

 edit(x62_samvar)

only shows the selected lines, but using levels() gives another answer as

 levels(x62_samvar$cn)

gives back all 72 cultivar names used during the five years (starting with
Arcadia).

Where do I go wrong and how do I use subset in a proper way?

Thanks
/CG

-- 
CG Pettersson, PhD
Swedish University of Agricultural Sciences (SLU)
Dept. of Crop Production Ecology. Box 7043.
SE-750 07 Uppsala, Sweden
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mfrow is ignored by some plots

2007-07-19 Thread Jim Lemon
Hi all,
I was just informed that the plots in the radial.plot family in the 
plotrix package do not plot correctly when using mfrow or mfcol to 
subdivide the plot window. I found one related message, an answer from 
Deepayan Sarkar to a question about lattice graphics, but that shed no 
light on this problem.

If I invoke par(mfrow=c(2,2)) and run radial.plot a few times, the plots 
appears in the upper left corner. _Sometimes_ the second plot appears in 
the upper right corner once, but then returns to the upper left. If I 
invoke par(new=TRUE), the plots appear in the lower right corner.

For instance, the following just keeps writing the plots in the upper left.

par(mfrow=c(2,2))
radial.plot(rnorm(7),0:6,line.col=red,lwd=2)
radial.plot(rnorm(7),0:6,line.col=red,lwd=2)
radial.plot(rnorm(7),0:6,line.col=red,lwd=2)
radial.plot(rnorm(7),0:6,line.col=red,lwd=2)

If I then enter:

par(new=TRUE)

and repeat the plots, they all accumulate in the lower right. I think 
this may have something to do with the calls to points or lines or 
symbols within radial.plot, but I could find no information on this in 
the help for plot, par, or several other pages.

Any clues?

Jim

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] write.table linebreaks

2007-07-19 Thread Birgit Lemcke
Hello R users,

I am a newby using R  2.5.0 on a Apple Power Book G4 with Mac OS X  
10.4.10.

when I use the write.table function, I always get the output in Unix  
linebreaks that I have to change to McIntosh linebreaks to be able to  
Import the data in Excel 2004 for Mac.

Is there a possibility to do this automatically in R and respectively  
in the write.table function?

Thanks in advance.

Birgit


Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]






[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] fSeries GARCH(1,1)

2007-07-19 Thread livia

Hello all, I am trying to use the garchFit function in the fSeries Package
to fit a Garch(1,1) Model with t distribution. I am using the following
codes.

fit - garchFit(~garch(1,1),data,cond.dist=dstd)
fitted(fit)

I was expecting the fitted(fit) would return the fitted volatility, but the
result turns out to be a series of repeated same value. I tried to change
the distribution to normal, and the same thing happened.

Could anyone give me some advice? Many thanks.
-- 
View this message in context: 
http://www.nabble.com/fSeries-GARCH%281%2C1%29-tf4109574.html#a11686281
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mfrow is ignored by some plots

2007-07-19 Thread Prof Brian Ripley
I'd ask the plotrix maintainer to fix his code!

radial.plot saves and restores all the par() values, including mfrow and 
mfg. When you do par(mfrow=c(2,2)) the plot position is reset to the 
bottom right, and the next plot will advance to the top left (but 
par(new=TRUE) negates that).

Please restore only the par values you have changed, not all the 
no.readonly ones.  That values can be set does not make it a good idea to 
do so.

On Thu, 19 Jul 2007, Jim Lemon wrote:

 Hi all,
 I was just informed that the plots in the radial.plot family in the
 plotrix package do not plot correctly when using mfrow or mfcol to
 subdivide the plot window. I found one related message, an answer from
 Deepayan Sarkar to a question about lattice graphics, but that shed no
 light on this problem.

 If I invoke par(mfrow=c(2,2)) and run radial.plot a few times, the plots
 appears in the upper left corner. _Sometimes_ the second plot appears in
 the upper right corner once, but then returns to the upper left. If I
 invoke par(new=TRUE), the plots appear in the lower right corner.

 For instance, the following just keeps writing the plots in the upper left.

 par(mfrow=c(2,2))
 radial.plot(rnorm(7),0:6,line.col=red,lwd=2)
 radial.plot(rnorm(7),0:6,line.col=red,lwd=2)
 radial.plot(rnorm(7),0:6,line.col=red,lwd=2)
 radial.plot(rnorm(7),0:6,line.col=red,lwd=2)

 If I then enter:

 par(new=TRUE)

 and repeat the plots, they all accumulate in the lower right. I think
 this may have something to do with the calls to points or lines or
 symbols within radial.plot, but I could find no information on this in
 the help for plot, par, or several other pages.

 Any clues?

 Jim

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mfrow is ignored by some plots

2007-07-19 Thread Jim Lemon
Prof Brian Ripley wrote:
 I'd ask the plotrix maintainer to fix his code!
 
 radial.plot saves and restores all the par() values, including mfrow and 
 mfg. When you do par(mfrow=c(2,2)) the plot position is reset to the 
 bottom right, and the next plot will advance to the top left (but 
 par(new=TRUE) negates that).
 
 Please restore only the par values you have changed, not all the 
 no.readonly ones.  That values can be set does not make it a good idea 
 to do so.
 
Thank you Prof. Ripley, an erudite and functional answer as usual.

Jim

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and Copula

2007-07-19 Thread copula

I have installed fast all packages for copula,
I have installed package 'gnml' and the others from the Jim Lindsey's web
site and the other packages like 'repeted' which is necessary for
calculating copula.
 
but when I start with copula it write: 
'gausscop' function is not found. And also 'fitted.gausscop(z)' and
'residuals.gausscop(z)'.

also what I want to say is: when I have installed package 'repeted' from Jim
Lindsey's web site in R project, this what I have received: 
 
utils:::menuInstallLocal()
updating HTML package descriptions

what have I to do next to solve this problem and to continue with copula.


Thanks a lot for previous help!


 

  

gyadav wrote:
 
 
 hi meeryana,
 
 may be this time nobnody is responding. but dont worry you will get a lot 
 of help eventually, so always post a copy to the mailing list.
 The reason is there are a lot many newbies, although i am also not so old 
 enough, who have even the simplest questions but are hesitant to ask.
 the objective of the list is to help, and thus feel free to ask, stand and 
 contribute :-) i hope you will understand not today then tomorrow as you 
 will get associated with the mailing list more closely. Personally i have 
 found friends here. Further, never post a mail with a loose subject and 
 secondly try to maintain the thread i.e. reply to the mail which you want 
 to reply(for more details refer to the posting guide) :-) that would 
 really help. 
 
 Yes now regarding your question :-) 
 
 Step1 : List of all packages can be found at this link :-)
 http://cran.r-project.org/src/contrib/PACKAGES.html
 Step2: Click on the package you want to install :-) 
 http://cran.r-project.org/src/contrib/Descriptions/copula.html
 Step3: Then download the binary of your Operating system. If windows then 
 download corresponding zip file.
 for copula it is
  http://cran.r-project.org/bin/windows/contrib/r-release/copula_0.5-3.zip
 save zip file on your system
 Step4 Open your R rpogram
 Step5: Goto Packages - Install packages from Local Zip file
 Step6: Select your package zip file which you want to install
 Step7: Sit back and relax
 Step8: load the library using library(LibraryName) on R prompt
 
 There are alternate ways of installing the package directly from R prompt. 
 It didn't worked for me a long time back, so  i always adopt this method.
 Somebody on the list may help you in this regards :-)
 
 bye and learn
 Join and stand with Open Source and R community
 Cheers and Chiao, Welcome
 -gaurav
 
 
 
 dear Mr. Yadav,
 
 I want to thank for help, 
 and for that you are only who is willing to help,
 but I have one question:
 because I'm new with R project also, I think I should install a package 
 for copula.
 I have only installed R program.
 How should I install this package? And is it what I have also to do with 
 credit metrics, Value at Risk, matix and the other formulas, I mean 
 install packages.
 
 I hope that you have a little time for me and my problem, and I hope I'm 
 not disturbing you.
 thank you for all you can do for me 
 
 and 
 
 best regards,
 
 Mirjana
 
 
 
 
 gyadav wrote:
 
 
 hi
 
 see the code below i hope this will make your understanding of copulas 
 better
 this code plots two normal distribution and their joint distribution 
 N[0,2]  N[0,4]
 
 HTH
 
 ##code
 library(copula)
 ###copy in two parts in R#
 
 ##PART A##
 ## construct a bivariate distribution whose marginals
 ## are normal and Normal respectively, coupled
 ## together via a normal copula
 op - par(mfrow = c(2, 2), # 2 x 2 pictures on one plot
   pty = s)   # square plotting region,
# independent of device size
 
 x - mvdc(normalCopula(0.75), c(norm, norm),
 list(list(mean = 0, sd =2),list(mean = 0, sd =4)))
 x.samp - rmvdc(x, 1)
 par(mfrow=c(2,3))
 hist(x.samp[,1],xlab=Normal)
 hist(x.samp[,2],xlab=Normal)
 plot(x.samp[,2],x.samp[,1],pch=21,xlab=Normal,ylab=Normal)
 
 plot(dmvdc(x, x.samp))
 plot(pmvdc(x, x.samp))
 
 ## At end of plotting, reset to previous settings:
 
 
 ###PART B###
 par(op)
 for (i in seq(1:360)){
 persp(x, dmvdc, xlim = c(-4, 4), ylim=c(0, 1),theta=i)
 }
 
 
 Regards,
 
 Gaurav Yadav
 +++
 Assistant Manager, CCIL, Mumbai (India)
 Mob: +919821286118 Email: [EMAIL PROTECTED]
 Bhagavad Gita:  Man is made by his Belief, as He believes, so He is
 
 
 
 copula [EMAIL PROTECTED] 
 Sent by: [EMAIL PROTECTED]
 07/17/2007 12:53 PM
 
 To
 r-help@stat.math.ethz.ch
 cc
 
 Subject
 Re: [R] R and Copula
 
 
 
 
 
 
 
 it would be great when somebody will help me
 thanks
 
 
 copula wrote:
 
 hi,
 first I want to say that I'm new here, and new with copula and R.
 
 That is the reason why I'm writing, if somebody can help me. 
 
 I have to make an example of Copula. 
 On internet I've found this forum and that copula can 

[R] df manipulation

2007-07-19 Thread Nikola Markov
I have multicolumn data.frames with the first comumn giving ordinal 
observation index ( ex.: 1 4 7 9 11 13 etc). I would like to fill up the 
missing observations (i.e. 2 3 5 6 8 etc) with NAs.
Thank you

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write.table linebreaks

2007-07-19 Thread Prof Brian Ripley

What do you think the 'eol' argument to write.table is for?

I don't have a Mac to hand, but eol='\r' does this on Linux and Windows.

On Thu, 19 Jul 2007, Birgit Lemcke wrote:


Hello R users,

I am a newby using R  2.5.0 on a Apple Power Book G4 with Mac OS X
10.4.10.

when I use the write.table function, I always get the output in Unix
linebreaks that I have to change to McIntosh linebreaks to be able to
Import the data in Excel 2004 for Mac.

Is there a possibility to do this automatically in R and respectively
in the write.table function?

Thanks in advance.

Birgit


Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]






[[alternative HTML version deleted]]




--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error for TukeyHSD ?

2007-07-19 Thread Silvia Lipski
Hi,

one question about TukeyHSD in R:

it seems to work only for models without Error terms but not when an  
Error is specified:

Examples:

aov1-aov(MMN_ind~(GenCond*Lang)+Error(VP),data=cutie_all)
TukeyHSD(aov1,Lang,ordered=TRUE,conf.level = 0.95)
-- the German error  message tells me its not a method that can be  
used with TukeyHSD:
Fehler in TukeyHSD(aov1, ordered = TRUE, conf.level = 0.95) :
 keine anwendbare Methode für TukeyHSD

A model like this is no problem:
aov1-aov(MMN_ind~(GenCond*Lang,data=cutie_all)

I am very grateful for any help!
Silvia

-- 
Dr. Silvia C. Lipski
University of Konstanz, Department of Linguistics, Neurolinguistics
Fach D 185, 78457 Konstanz, Germany
Tel: +49 7531 88-2927
Fax: +49 7531 88-2741
Email: [EMAIL PROTECTED]
www.uni-konstanz.de/FuF/Philo/Sprachwiss/neuroling/Lipski.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread Peter Dalgaard
ONKELINX, Thierry wrote:
 The problem also exists in a clean workspace. But I've found the
 troublemaker. I had set options(OutDec = ,). Resetting this to
 options(OutDec = .) solved the problem.

 Thanks,

 Thierry
   
Oups. That sounds like there's a bug somewhere. Can you cook up a
minimal example which shows the behaviour?

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread ONKELINX, Thierry
Dear Peter,

Here's an example. Notice the warning in the last two lines of the summary with 
options(OutDec = ,). It's not present with options(OutDec = .).

Cheers,

Thierry

 x - runif(100)
 y - rnorm(100)
 options(OutDec = ,)
 summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
  Min1QMedian3Q   Max 
-2,389749 -0,607002  0,006969  0,689535  1,713197 

Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)  0,033970,17774   0,1910,849
x   -0,092190,29518  -0,3120,755

Residual standard error: 0,868 on 98 degrees of freedom
Multiple R-Squared: 0.0009943,  Adjusted R-squared: -0.0092 
F-statistic: 0.09754 on 1 and 98 DF,  p-value: 0,7555 

Warning message:
NAs introduced by coercion in: as.double.default(Cf[okP]) 
 options(OutDec = .)
 summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
  Min1QMedian3Q   Max 
-2.389749 -0.607002  0.006969  0.689535  1.713197 

Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)  0.033970.17774   0.1910.849
x   -0.092190.29518  -0.3120.755

Residual standard error: 0.868 on 98 degrees of freedom
Multiple R-Squared: 0.0009943,  Adjusted R-squared: -0.0092 
F-statistic: 0.09754 on 1 and 98 DF,  p-value: 0.7555 



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology 
and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully 
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of 
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

 -Oorspronkelijk bericht-
 Van: Peter Dalgaard [mailto:[EMAIL PROTECTED] 
 Verzonden: donderdag 19 juli 2007 13:37
 Aan: ONKELINX, Thierry
 CC: Uwe Ligges; r-help@stat.math.ethz.ch
 Onderwerp: Re: [R] Strange warning in summary.lm
 
 ONKELINX, Thierry wrote:
  The problem also exists in a clean workspace. But I've found the 
  troublemaker. I had set options(OutDec = ,). Resetting this to 
  options(OutDec = .) solved the problem.
 
  Thanks,
 
  Thierry

 Oups. That sounds like there's a bug somewhere. Can you cook 
 up a minimal example which shows the behaviour?
 
 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  
 (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: 
 (+45) 35327907
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread Prof Brian Ripley
On Thu, 19 Jul 2007, Peter Dalgaard wrote:

 ONKELINX, Thierry wrote:
 The problem also exists in a clean workspace. But I've found the
 troublemaker. I had set options(OutDec = ,). Resetting this to
 options(OutDec = .) solved the problem.

 Thanks,

 Thierry

 Oups. That sounds like there's a bug somewhere. Can you cook up a
 minimal example which shows the behaviour?

Any use of summary.lm will do it (e.g. example(lm)).  The problem is in 
printCoefmat, at

x0 - (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

and yes, it looks like an infelicity to me.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] test about distribution of data in a single population

2007-07-19 Thread João Fadista
Dear all,
 
I would like to know how can I test which are the intervals of my data that 
have significant less or more counts than the other intervals.
Example:
 
Interval[1:200][200:400][400:600][600:800] ... more 900 
hundred columns
Count   122875  
  
 
 
 

Thanks in advance,
Best regards

João Fadista
Ph.d. student



 UNIVERSITY OF AARHUS   
Faculty of Agricultural Sciences
Dept. of Genetics and Biotechnology 
Blichers Allé 20, P.O. BOX 50   
DK-8830 Tjele   

Phone:   +45 8999 1900  
Direct:  +45 8999 1900  
E-mail:  [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]   
Web: www.agrsci.org http://www.agrsci.org/


News and news media http://www.agrsci.org/navigation/nyheder_og_presse .

This email may contain information that is confidential. Any use or publication 
of this email without written permission from Faculty of Agricultural Sciences 
is not allowed. If you are not the intended recipient, please notify Faculty of 
Agricultural Sciences immediately and delete this email.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tapply

2007-07-19 Thread John Kane
I do not understand what you want.  If aps is constant
over each class then the mean for each class is equal
to any value of aps.  

Using your example you can do 

tapply(icu1$aps, icu1$d, mean)

but it does not give you anything new.  Can you
explain the problem a bit more? 


--- sigalit mangut-leiba [EMAIL PROTECTED] wrote:

 hello,
 i want to compute the mean of a variable (aps) for
 every class
 (1,2, and 3).
 every id have a few obs., aps and class are
 constant over id.
 like this:
 id   aps class
 1  11   2
 1  11   2
 1  11   2
 1  11   2
 1  11   2
 2   83
 2   83
 2   83
 3  12   2
 3  12   2
 .
 .
 
 i tried:
 
 tapply(icu1$aps_st, icu1$hidclass, function(z)
 mean(unique(z)))
 
 but it's counting every row and not every id.
 
 thank you,
 
 Sigalit.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multinomial logit estimation

2007-07-19 Thread Walter Paczkowski
Good morning,

I'd like to estimate a simple multinomial logit model in R (not a McFadden 
conditional logit).  For instance, I'd like to estimate the probability of 
someone having one of eight titles in a company with the independent variables 
being the company characteristics.  A binary logit is well documented.  What 
about the multinomial?

Thanks,

Walt Paczkowski

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error for TukeyHSD ?

2007-07-19 Thread Richard M. Heiberger
TukeyHSD, and glht in the multcomp package, are designed for
aov objects.  They do not have methods for aovlist objects.

The workaround is to download and install the HH package and look at the
maiz example.

library(HH)
?MMC

In that example I show how to get the same sequential sum of squares
without the Error() term as you would get with the Error term.

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread Peter Dalgaard
Prof Brian Ripley wrote:
 On Thu, 19 Jul 2007, Peter Dalgaard wrote:

   
 ONKELINX, Thierry wrote:
 
 The problem also exists in a clean workspace. But I've found the
 troublemaker. I had set options(OutDec = ,). Resetting this to
 options(OutDec = .) solved the problem.

 Thanks,

 Thierry

   
 Oups. That sounds like there's a bug somewhere. Can you cook up a
 minimal example which shows the behaviour?
 

 Any use of summary.lm will do it (e.g. example(lm)).  The problem is in 
 printCoefmat, at

 x0 - (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

 and yes, it looks like an infelicity to me.

   
Ick. Any better ideas than

printsAs0 - scan(con - textConnection(Cf[okP), dec=options(outDec)) ; 
close(con)
x0 - (xm[okP] == 0) != printsAs0 

?

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] df manipulation

2007-07-19 Thread Henrique Dallazuanna
Hi, see below:

 df
  index value
1 1 1
2 4 6
3 7 4
4 9 5
511 3
613 2

 foo - function(x){
+ index - ifelse(x %in% df$index, df$value[which(df$index %in% x)], NA)
+ return(index)
+ }

 df_ok - data.frame(index=1:13, value=sapply(1:13, foo))
 df_ok
   index value
1  1 1
2  2NA
3  3NA
4  4 6
5  5NA
6  6NA
7  7 4
8  8NA
9  9 5
1010NA
1111 3
1212NA
1313 2

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


On 19/07/07, Nikola Markov [EMAIL PROTECTED] wrote:
 I have multicolumn data.frames with the first comumn giving ordinal
 observation index ( ex.: 1 4 7 9 11 13 etc). I would like to fill up the
 missing observations (i.e. 2 3 5 6 8 etc) with NAs.
 Thank you

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread Uwe Ligges
Ah, in  printCoefmat() there is Cf which is at some point:

Cf

 Estimate  Std. Error t value  Pr(|t|)
(Intercept)  0,2542  0,1875   1,356 
x   -0,5346  0,3463  -1,544 

and

cat(Cf)
0,2542 -0,5346  0,1875  0,3463  1,356 -1,544

(i.e. values are character) and then
x0 - (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)
includes as.numeric(Cf[okP]) which won't work any more.



I think this design is not the best possible. Nevertheless, the real 
problem is in format(), because it does not use its default 
`decimal.mark = .' if options(OutDec = ,) has been set, and even 
ignores manual chosen values:


# nice:
  options(OutDec = .)
  format(0.1, decimal.mark = :) #  [1] 0:1

# bug:
  options(OutDec = ,)
  format(0.1, decimal.mark = :) #  [1] 0,1

Best,
Uwe







ONKELINX, Thierry wrote:
 Dear Peter,
 
 Here's an example. Notice the warning in the last two lines of the summary 
 with options(OutDec = ,). It's not present with options(OutDec = .).
 
 Cheers,
 
 Thierry
 
 x - runif(100)
 y - rnorm(100)
 options(OutDec = ,)
 summary(lm(y~x))
 
 Call:
 lm(formula = y ~ x)
 
 Residuals:
   Min1QMedian3Q   Max 
 -2,389749 -0,607002  0,006969  0,689535  1,713197 
 
 Coefficients:
 Estimate Std. Error t value Pr(|t|)
 (Intercept)  0,033970,17774   0,1910,849
 x   -0,092190,29518  -0,3120,755
 
 Residual standard error: 0,868 on 98 degrees of freedom
 Multiple R-Squared: 0.0009943,  Adjusted R-squared: -0.0092 
 F-statistic: 0.09754 on 1 and 98 DF,  p-value: 0,7555 
 
 Warning message:
 NAs introduced by coercion in: as.double.default(Cf[okP]) 
 options(OutDec = .)
 summary(lm(y~x))
 
 Call:
 lm(formula = y ~ x)
 
 Residuals:
   Min1QMedian3Q   Max 
 -2.389749 -0.607002  0.006969  0.689535  1.713197 
 
 Coefficients:
 Estimate Std. Error t value Pr(|t|)
 (Intercept)  0.033970.17774   0.1910.849
 x   -0.092190.29518  -0.3120.755
 
 Residual standard error: 0.868 on 98 degrees of freedom
 Multiple R-Squared: 0.0009943,  Adjusted R-squared: -0.0092 
 F-statistic: 0.09754 on 1 and 98 DF,  p-value: 0.7555 
 
 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
 Forest
 Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, 
 methodology and quality assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 tel. + 32 54/436 185
 [EMAIL PROTECTED]
 www.inbo.be 
 
 Do not put your faith in what statistics say until you have carefully 
 considered what they do not say.  ~William W. Watt
 A statistical analysis, properly conducted, is a delicate dissection of 
 uncertainties, a surgery of suppositions. ~M.J.Moroney
 
  
 
 -Oorspronkelijk bericht-
 Van: Peter Dalgaard [mailto:[EMAIL PROTECTED] 
 Verzonden: donderdag 19 juli 2007 13:37
 Aan: ONKELINX, Thierry
 CC: Uwe Ligges; r-help@stat.math.ethz.ch
 Onderwerp: Re: [R] Strange warning in summary.lm

 ONKELINX, Thierry wrote:
 The problem also exists in a clean workspace. But I've found the 
 troublemaker. I had set options(OutDec = ,). Resetting this to 
 options(OutDec = .) solved the problem.

 Thanks,

 Thierry
   
 Oups. That sounds like there's a bug somewhere. Can you cook 
 up a minimal example which shows the behaviour?

 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  
 (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: 
 (+45) 35327907



 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multinomial logit estimation

2007-07-19 Thread Chuck Cleland
Walter Paczkowski wrote:
 Good morning,
 
 I'd like to estimate a simple multinomial logit model in R (not a McFadden 
 conditional logit).  For instance, I'd like to estimate the probability of 
 someone having one of eight titles in a company with the independent 
 variables being the company characteristics.  A binary logit is well 
 documented.  What about the multinomial?
 
 Thanks,
 
 Walt Paczkowski
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

  Have you considered the results of

RSiteSearch(multinomial, restrict=function)

  which point to a number of relevant packages and functions?

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subsetting dataframes

2007-07-19 Thread Uwe Ligges


CG Pettersson wrote:
 Dear all!
 
 W2k, R 2.5.1
 
 I am working with an ongoing malting barley variety evaluation within
 Sweden. The structure is 25 cultivars tested each year at four sites, in
 field trials with three replicates and 'lattice' structure (the replicates
 are divided into five sub blocks in a structured way). As we are normally
 keeping around 15 varieties from each year to the next, and take in 10 new
 for next year, we have tested totally 72 different varieties during five
 years.
 
 I store the data in a field trial database, and generate text tables with
 the subset of data I want and import the frame to R. I take in all
 cultivars in R and use 'subset' to select what I want to look at. Using
 lme{nlme} works with no problems to get mean results over the years, but
 as I now have a number of years I want to analyse the general site x
 cultivar relation. I am testing AMMI{agricolae} for this and it seems to
 work except for the subsetting. This is what happens:
 
 If I do the subsetting like this:
 
 x62_samvar - subset(x62_5, cn %in%
 c(Astoria,Barke,Christina,Makof, Prestige,Publican,Quench))
 
 A test run with AMMI seems to work in the first part:
 
 AMMI(site, cn, rep, yield)
 
 ANALYSIS AMMI:  yield
 Class level information
 
 ENV:  Hag Klb Bjt Ska
 GEN:  Astoria Prestige Makof Christina Publican Quench
 REP:  1 2 3
 
 Number of observations:  240
 
 model Y: yield  ~ ENV + REP%in%ENV + GEN + ENV:GEN
 
 Analysis of Variance Table
 
 Response: Y
DfSum Sq   Mean Sq F valuePr(F)
 ENV 3 120092418  40030806 90.0424 1.665e-06 ***
 REP(ENV)8   3556620444578  0.5674  0.803923
 GEN 5  21376142   4275228  5.4564 9.680e-05 ***
 ENV:GEN15  28799807   1919987  2.4504  0.002555 **
 Residuals 208 162973213783525
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
 Coeff var   Mean yield
 13.08629 6764.098
 
 After this something goes wrong, as AMMI finds a cultivar name not
 selected in the subsetting. (The plotting might go wrong anyhow, but I
 haven´t got that far yet):
 
 Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
 object$xlevels) :
 factor 'y' has new level(s) Arkadia
 
 
 Looking at the dataframe using
 
 edit(x62_samvar)
 
 only shows the selected lines, but using levels() gives another answer as
 
 levels(x62_samvar$cn)
 
 gives back all 72 cultivar names used during the five years (starting with
 Arcadia).
 
 Where do I go wrong and how do I use subset in a proper way?


So you have to drop the levels you are excluding. Example:

   x - factor(letters[1:4])
   x
   x[1:2]
   x[1:2, drop=TRUE]


Uwe Ligges




 Thanks
 /CG


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error: evaluation nested too deeply when doing heatmap with binary distfunction

2007-07-19 Thread Uwe Ligges


zhihua li wrote:
 Hi netters,
 
 I have a matrix X of the size (1000,100). The values are from -3 to +3.  
 When I tried
 
 heatmap(X, 
 distfun=function(c),dist(c,method=bin),hclustfun=function(m),hclust(m,method=average))
  
 
 
 
 I got the error message: Error: evaluation nested too deeply: infinite 
 recursion / options(expressions=)?



So, does it help to increase the thresholds?
If not, please specify a easily reproducible example that helps us to 
investigate your problem.

Best,
Uwe Ligges




 However, if I used default parameters for distfunction:
 heatmap(X, hclustfun=function(m),hclust(m,method=average))
 there is no error messages at all.
 
 But the problem is that I have to use binary method in my disfunction. 
 How can I resolve the problem?
 
 Thanks a lot!
 
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multinomial logit estimation

2007-07-19 Thread bady
Hi all,

You can consult help of the functions multinom (library(MASS))
and vglm (library(VGAM)).

hope this help,

Pierre




Selon Walter Paczkowski [EMAIL PROTECTED]:

 Good morning,

 I'd like to estimate a simple multinomial logit model in R (not a McFadden
 conditional logit).  For instance, I'd like to estimate the probability of
 someone having one of eight titles in a company with the independent
 variables being the company characteristics.  A binary logit is well
 documented.  What about the multinomial?

 Thanks,

 Walt Paczkowski

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange warning in summary.lm

2007-07-19 Thread Prof Brian Ripley
On Thu, 19 Jul 2007, Peter Dalgaard wrote:

 Prof Brian Ripley wrote:
 On Thu, 19 Jul 2007, Peter Dalgaard wrote:


 ONKELINX, Thierry wrote:

 The problem also exists in a clean workspace. But I've found the
 troublemaker. I had set options(OutDec = ,). Resetting this to
 options(OutDec = .) solved the problem.

 Thanks,

 Thierry


 Oups. That sounds like there's a bug somewhere. Can you cook up a
 minimal example which shows the behaviour?


 Any use of summary.lm will do it (e.g. example(lm)).  The problem is in
 printCoefmat, at

 x0 - (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

 and yes, it looks like an infelicity to me.


 Ick. Any better ideas than

 printsAs0 - scan(con - textConnection(Cf[okP), dec=options(outDec)) ; 
 close(con)
 x0 - (xm[okP] == 0) != printsAs0

 ?

Yes, several.  That assumes that getOption(OutDec) (which is what you 
need) is a single char since scan() does.  If so, chartr() will do the job 
and more generally sub() would.  Alternatively, we could figure out what 
representations of zero are possible and grep for them.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] df manipulation

2007-07-19 Thread Uwe Ligges


Nikola Markov wrote:
 I have multicolumn data.frames with the first comumn giving ordinal 
 observation index ( ex.: 1 4 7 9 11 13 etc). I would like to fill up the 
 missing observations (i.e. 2 3 5 6 8 etc) with NAs.

Please specify reproducible examples, they make it much easier to help!


If you have:

df - data.frame(index = c(1, 4, 7), x1 = 1:3, x2 = 3:1)
df

Then you can say:

temp - data.frame(index = 1:max(df$index))
df - merge(temp, df, all.x = TRUE)
df

Uwe Ligges





 Thank you

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] maximum likelihood estimation

2007-07-19 Thread Ajay Shah
 I need to perform maximum likelihood estimation on R, but I am not sure
 which command to use. I searched on google, and found an example using the
 function mlogl, but I couldn't find the package on R. Is there such
 function? Or how should i perform my mle?

http://www.mayin.org/ajayshah/KB/R/documents/mle/mle.html might help.

-- 
Ajay Shah  http://www.mayin.org/ajayshah  
[EMAIL PROTECTED] http://ajayshahblog.blogspot.com
*(:-? - wizard who doesn't know the answer.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tapply

2007-07-19 Thread Henrique Dallazuanna
I also don't understand, but perhaps:

with(df, tapply(aps, list(class, id), mean))


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


On 19/07/07, sigalit mangut-leiba [EMAIL PROTECTED] wrote:
 hello,
 i want to compute the mean of a variable (aps) for every class
 (1,2, and 3).
 every id have a few obs., aps and class are constant over id.
 like this:
 id   aps class
 1  11   2
 1  11   2
 1  11   2
 1  11   2
 1  11   2
 2   83
 2   83
 2   83
 3  12   2
 3  12   2
 .
 .

 i tried:

 tapply(icu1$aps_st, icu1$hidclass, function(z) mean(unique(z)))

 but it's counting every row and not every id.

 thank you,

 Sigalit.

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can I test if there are statistical significance between different rows in R*C table?

2007-07-19 Thread zhijie zhang
Dear  friends,
  My R*C table is as follow:



better

good

bad

Goup1

16

71

37

Group2

0

4

61

Group3

1

6

57

   Can I test if there are statistical significant between Group1 and
Group2, Group2 and Group3, Group1 and Group2, taking into the multiple
comparisons?

The table can be set up using the following program:

a-matrix(data=c(16,71,37,0,4,61,1,6,57),nrow=3,byrow=TRUE)
Thanks very much.


-- 
With Kind Regards,

oooO:
(..):
:\.(:::Oooo::
::\_)::(..)::
:::)./:::
::(_/
:
[***]
Zhi Jie,Zhang ,PHD
Tel:86-21-54237149
Dept. of Epidemiology,School of Public Health,Fudan University
Address:No. 138 Yi Xue Yuan Road,Shanghai,China
Postcode:200032
Email:[EMAIL PROTECTED]
Website: www.statABC.com
[***]
oooO:
(..):
:\.(:::Oooo::
::\_)::(..)::
:::)./:::
::(_/
:

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (R) Using arguments for the empirical cumulative distribution function

2007-07-19 Thread squall44

Hi,

I have just started using R. Now I have the following problem:

I want to create an Empirical Cumulative Distribution Function and I only
came so far:

F10 - ecdf(x)
plot(F10, verticals= TRUE, do.p = TRUE, lwd=3)
x=c(1.6,1.8,2.4,2.7,2.9,3.3,3.4,3.4,4,5.2)

Now I'd like to use arguments such as xlabs and main but I don't know how to
integrate them.

I hope someone can help me, I am really stuck!

-- 
View this message in context: 
http://www.nabble.com/%28R%29-Using-arguments-for-the-empirical-cumulative-distribution-function-tf4111355.html#a11690150
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot centered line on barplot

2007-07-19 Thread [EMAIL PROTECTED]
Dear R user,

I need plot an histogram for the occurrence of a dataset, and then add a line 
corresponding to the freuqnecy of another similar dataset. in order to do this 
i used the function 
 hist_data1=hist(data1, breaks= seq(0,50,5), plot=FALSE)
 hist_data2=hist(data2, breaks= seq(0,50,5), plot=FALSE)

then I plotted the frequency

 barplot(hist_data1$density)
 lines(hist_data1$density)

but the line is shifted in respect to the center of the bars. how can I 
properly plot the line? another question. this is easy, how can I smooth the 
curve (not fit with loess of spline)?


tnx

--
Claudio

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subsetting dataframes

2007-07-19 Thread CG Pettersson
Thanks a lot.
But an ignorant R user, like me, needed the code example from Jim Holtman
posted outside the list earlier today to understand that:

x62_samvar$cn - x62_samvar$cn[,drop=TRUE]

was the way to code. Thank you both!

/CG


On Thu, July 19, 2007 3:01 pm, Uwe Ligges said:


 CG Pettersson wrote:
 Dear all!

 W2k, R 2.5.1

 I am working with an ongoing malting barley variety evaluation within
 Sweden. The structure is 25 cultivars tested each year at four sites, in

/snip


 Where do I go wrong and how do I use subset in a proper way?


 So you have to drop the levels you are excluding. Example:

x - factor(letters[1:4])
x
x[1:2]
x[1:2, drop=TRUE]


 Uwe Ligges




 Thanks
 /CG




-- 
CG Pettersson, PhD
Swedish University of Agricultural Sciences (SLU)
Dept. of Crop Production Ecology. Box 7043.
SE-750 07 Uppsala, Sweden
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error: evaluation nested too deeply when doing heatmap with binary distfunction

2007-07-19 Thread zhihua li
Sorry, that was a typo.  Actually there wasn't a comma after 'function(m)' 
in my expression.

So I'll try to increase the threshould to see if that works.

Thanks a lot!



From: jim holtman [EMAIL PROTECTED]
To: zhihua li [EMAIL PROTECTED]
Subject: Re: [R] Error: evaluation nested too deeply when doing heatmap 

with binary distfunction

Date: Thu, 19 Jul 2007 00:19:54 -0400

you seem to have a syntax error in your statement.  There appears to
be an extra commas after 'function(m),'.  I think is should be:

heatmap(X,
   
distfun=function(c)dist(c,method=bin),hclustfun=function(m)hclust(m,method=average))





On 7/18/07, zhihua li [EMAIL PROTECTED] wrote:

Hi netters,

I have a matrix X of the size (1000,100). The values are from -3 to 
+3.

When I tried

heatmap(X,
distfun=function(c),dist(c,method=bin),hclustfun=function(m),hclust(m,method=average))





I got the error message:
Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?

However, if I used default parameters for distfunction:
heatmap(X, hclustfun=function(m),hclust(m,method=average))
there is no error messages at all.

But the problem is that I have to use binary method in my 
disfunction. How

can I resolve the problem?

Thanks a lot!


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A small problem with as.timeSeries() function...

2007-07-19 Thread Levent TERLEMEZ
Dear Users,

I have a problem with as.timeSeries() function. When I am using this function, 
I can not assign my own date column. Is it possible to do this. For example, my 
data frame is like below,

Datex1  x2...
01/01/2005   2.31.5...
01/02/2005   1.22.2...

and so on...
For getReturns(), this data frame must be a time series so the result is same 
as what as.timeSeries() converts. So the problem is rownames began from 1970 
and with year, the date and month differ from mine time range. And also 
plotting is another problem. What is the way to fix this, or how can I convert 
getReturns()'s results to original data frame like above?

Thanks already for your hints.
With my best regards,

Levent TERLEMEZ

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dates() is a great date function in R

2007-07-19 Thread JB Kim
This is great.

I haven't seen as.Date function before, and was using as.date from 
library(date).
(note the lowercase 'd')

I have an alternative which might or might not be faster...
If the date is formatted mmdd (e.g. 20070719)

library(date)
  formatted - gsub(^(\\d{4})(\\d{2})(\\d{2})$, \\2-\\3-\\1, d$mmdd, 
perl=TRUE)
  d$dates - as.date(formatted)

Since as.date only accepts certain type of date formats, I had to use gsub to
reshuffle the date substrings around.

as.Date returns the objects of class Date, whereas as.date returns the 
objects of class date.
Not sure what the differences are, but a simple test below shows that as.date 
conversion is
slightly faster, given a character vector of 1 date entries.



FYI,

I ran a quick performance comparison test on a 64bit linux machine on 2.6.9 
kernel.
The test is very rudimentary, but hopefully useful...

I have two scripts:


##  as.date_test.R  
##

library(date)

d - read.table(/tmp/dates, as.is = TRUE, col.names = c(mmdd),  
colClasses = c(character))
formatted - gsub(^(\\d{4})(\\d{2})(\\d{2})$, \\2-\\3-\\1, d$mmdd, 
perl=TRUE)
d$dates - as.date(formatted)

print(nrow(d))
print(d$dates[1:3])

##


##  as.Date_test.R  
##

d - read.table(/tmp/dates, as.is = TRUE, col.names = c(mmdd), 
colClasses = c(character))

d$dates - as.Date(d$mmdd, format = %Y%m%d)

print(nrow(d))
print(d$dates[1:3])

##


Both scripts read in the same text file containing 1 date strings, and then 
convert them into 
appropriate date objects.


# 1 date records in a flat file
[EMAIL PROTECTED]$ wc -l /tmp/dates
   1 /tmp/dates

# just to illustrate what the dates look like
[EMAIL PROTECTED]$ head -2 /tmp/dates
19900817
19900820


# Running the test script 5 times each

[EMAIL PROTECTED]$ for i in 1 2 3 4 5; do time R --vanilla  as.date_test.R  
/dev/null; done

real0m1.29s
user0m1.23s
sys 0m0.05s

real0m1.28s
user0m1.23s
sys 0m0.06s

real0m1.28s
user0m1.22s
sys 0m0.06s

real0m1.29s
user0m1.22s
sys 0m0.06s

real0m1.28s
user0m1.21s
sys 0m0.07s

[EMAIL PROTECTED]$ for i in 1 2 3 4 5; do time R --vanilla  as.Date_test.R  
/dev/null; done

real0m1.65s
user0m0.99s
sys 0m0.64s

real0m1.64s
user0m0.98s
sys 0m0.66s

real0m1.63s
user0m0.98s
sys 0m0.65s

real0m1.64s
user0m1.00s
sys 0m0.64s

real0m1.64s
user0m0.98s
sys 0m0.65s



Notice that as.date conversion is silghtly faster than as.Date conversion, on 
average...

Just thought it was interesting to share.
(and thanks Mark Leeds for reference)

Regards,
JB

On 07/18/07 16:13:49, Gavin Simpson wrote:
 On Wed, 2007-07-18 at 12:14 -0700, Mr Natural wrote: 
  Proper calendar dates in R are great for plotting and calculating. 
  However for the non-wonks among us, they can be very frustrating.
  I have recently discussed the pains that people in my lab have had 
  with dates in R. Especially the frustration of bringing date data into R 
  from Excel, which we have to do a lot. 
 
 I've always found the following reasonably intuitive:
 
 Given the csv file that I've pasted in below, the following reads the
 csv file in, formats the dates and class Date and then draws a plot.
 
 I have dates in DD/MM/ format so year is not first - thus attesting
 to R not hating dates in this format ;-)
 
 ## read in csv data
 ## as.is = TRUE stops characters being converted to factors
 ## thus saving us an extra step to convert them back
 dat - read.csv(date_data.csv, as.is = TRUE)
 
 ## we convert to class Date
 ## format tells R how the dates are formatted in our character strings
 ## see ?strftime for the meaning and available codes
 dat$Date - as.Date(dat$Date, format = %d/%m/%Y)
 
 ## check this worked ok
 str(dat$Date)
 dat$Date
 
 ## see nicely formatted dates and not a drop of R-related hatred 
 ## but just about the most boring graph I could come up with
 plot(Data ~ Date, dat, type = l)
 
 And you can keep your Excel file formatted as dates as well - bonus!
 
 Oh, and before you get Martin'd, it is the chron *package*!
 
 HTH
 
 G
 
 CSV file I used, generated in OpenOffice.org, but I presume it stores
 Dates in the same way as Excel?:
 
 Data,Date
 1,01/01/2007
 2,02/01/2007
 3,03/01/2007
 4,04/01/2007
 5,05/01/2007
 6,06/01/2007
 7,07/01/2007
 8,08/01/2007
 9,09/01/2007
 10,10/01/2007
 11,11/01/2007
 10,12/01/2007
 9,13/01/2007
 8,14/01/2007
 7,15/01/2007
 6,16/01/2007
 5,17/01/2007
 4,18/01/2007
 3,19/01/2007
 2,20/01/2007
 1,21/01/2007
 1,22/01/2007
 2,23/01/2007
 3,24/01/2007
 
  Please find below a simple analgesic

[R] tukey or Neuman-Keuls

2007-07-19 Thread elyakhlifi mustapha
hello,
I need to groupe some means and I wonder how to do to get critical range table 
for tukey or for Neuman-Keuls
I think it's possible to do that with the qtukey() function?
thanks.


  
_ 

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R

2007-07-19 Thread Thomas Lumley
On Thu, 19 Jul 2007, Fluss wrote:
 Hello!
 I am using for logistic regression in survey data the svyglm procedure.
 I wondered how does the strata effect estimates SE (in addition to the
 weights given proportional to population size).
 I know that for simple regression measurements of each strata is assumed to
 have different variance.
 But in a logistic model this is not the case.

It is simpler (and more complicated) than that.  The survey package uses 
the same formula for nearly all designs and estimators, so it doesn't 
have to handle special cases like estimating separate stratum variances 
for stratified models.

For a population total the variance estimator is just the Horvitz-Thompson 
estimator.  Other estimators are defined by the estimating equations they 
solve, so the mean solves
   sum_i w_i(x_i-mu) = 0
and logistic regression solves
   sum_i w_ix_i(y_i-mu_i) = 0

We now compute the Horvitz-Thompson estimator for the sum of the 
estimating functions (V) and also the population total of the derivative 
of the estimating functions (D). The variance of the estimator is
   D^{-1}VD^{-1}


The standard reference for this in the survey literature seems to be
  Binder, David A. (1983).  On the variances of asymptotically
  normal estimators from complex surveys.  International Statistical
  Review, 51, 279-292.
which is in the References section of help(svyrecvar).

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-DB] RODBC on Oracle DB

2007-07-19 Thread [EMAIL PROTECTED]
Yes, it's a quesion of rights. My system administrator just confirme that.
And I also try with a small Oracle client, and same as R, no permission to read
tables. 
On R with RODBC I could list the tables but no permission to read or import 
tables.
But I get confused because I was using the same Oracle user with an MS Access
client, wich works perfectly even to importe table from Oracle DB.
My system admin told me that this Oracle user is in fact a synomyme or duplicata
from the real Oracle account. But why this very same account can import table 
via
the MS Access client ? This remains mystery to me.  
As soon as may sysadmin return from vacations, I will clear that out. 
grettings et thanks to everyone.
Eric R. 



On Mer Juil 18 17:50 , Marc Schwartz  sent:

I think that you are on to something there Don.

I just tried accessing a table from our Oracle server, which I do know
exists, but for which I do not have access permissions.

Using the following query in the Oracle Instant Client:

  select table_name from all_tables;

I can get a list of all tables on the server, which includes a table
called INCOMPATIBLE_USER_AGENTS, for which I do not have access
permissions.

When attempting to query the table in the Instant Client I get:

SQL select * from INCOMPATIBLE_USER_AGENTS;
select * from INCOMPATIBLE_USER_AGENTS
  *
ERROR at line 1:
ORA-00942: table or view does not exist


When running the same query from R using RODBC I get:

 sqlQuery(db, select * from INCOMPATIBLE_USER_AGENTS)
[1] [RODBC] ERROR: Could not SQLExecDirect
[2] 42S02 942 [Oracle][ODBC][Ora]ORA-00942: table or view does not exist\n


So it looks like permission issues may be possible here.  Eric,
definitely confirm with your SysAdmins that you have appropriate
permissions.

HTH,

Marc


On Wed, 2007-07-18 at 07:43 -0700, Don MacQueen wrote:
 I believe I have seen that error message from 
 Oracle when I tried to query a table for which I 
 did not have select privileges (and when I knew 
 for certain that the table existed). Ask your 
 database administrator about the table, and make 
 sure that you do have that privilege.
 
 What I am uncertain about is whether Oracle, when 
 asked to list tables, returns a list that 
 includes tables for which the user does not have 
 select privileges.
 
 -Don
 
 At 9:24 AM +0200 7/17/07, [EMAIL PROTECTED] wrote:
 essai 
 uid=osis_r,  pwd=12miss15 ,case=oracle)
 
   sqlTables(essai)$ORESTE
 
 ...
 
 1315ORESTE  S_PROFESSIONS_OLDTABLE
 1316ORESTE  S_PROVENANCESTABLE
 1317ORESTES_SEXESTABLE
 1318ORESTE S_SOUS_CLASSESTABLE
 1319ORESTE S_TYP_COLLEGESTABLE
 1320ORESTE S_TYP_ENSEIGNEMENTTABLE
 
 ...
 
   sqlQuery(essai, select * from S_TYP_COLLEGES)
 [1] [RODBC] ERROR: Could not SQLExecDirect   
 [2] 42S02 942 [Oracle][ODBC][Ora]ORA-00942: Table ou vue inexistante\n
 
 I have also tried the
 essai2 
 But with no succes.
 
 
 
 On Lun Juil 16 15:32 , Prof Brian Ripley [EMAIL PROTECTED] sent:
 
 The problem could be quoting, if Oracle is not standards-compliant.
 See the options in ?odbcConnect.
 
 If sqlQuery(essai, select * from S_TYP_COLLEGES) works, this is likely
 to be the problem.
 
 On Mon, 16 Jul 2007, [EMAIL PROTECTED] wrote:
 
 
 
   essai
   odbcGetInfo(essai)
 DBMS_Name DBMS_Ver  Driver_ODBC_Ver
  Oracle 09.00.0121  03.51
   Data_Source_Name  Driver_Name   Driver_Ver
 ORESTE_prodSQORA32.DLL 09.00.0101
  ODBC_Ver  Server_Name
  03.52.   weba
 
 
   sqlTables(essai)
 
   The result of this function is a liste of tables, one of them is called:
   S_TYP_COLLEGES.
 
 
   sqlFetch(essai,S_TYP_COLLEGES)
   [1] [RODBC] ERROR: Could not SQLExecDirect
   [2] 42S02 942 [Oracle][ODBC][Ora]ORA-00942: Table ou vue inexistante\n
 
   sqlFetch(essai, S_TYP_COLLEGES, colnames=TRUE, rownames=FALSE)
   [1] [RODBC] ERROR: Could not SQLExecDirect
   [2] 42S02 942 [Oracle][ODBC][Ora]ORA-00942: Table ou vue inexistante\n
   
   
What could be the problem here ?
Any help is welcome
Eric Röthlisberger, Neuchâtel
   


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory error with 64-bit R in linux

2007-07-19 Thread Paul Gilbert
You might try running top while R runs, to get a better idea of what is
happening. 64-bit R takes more memory than 32-bit (longer pointers) and
for a large problem I would say that 2GB RAM is a minimum if you want
any speed. Slowness is likely related to needing to use swap space. The
cannot allocate error is because you run out of both RAM and swap. If
you are close to finishing your calculation you may resolve things by
increasing swap, but don't expect it to be fast.

There is also a possibility that your user id is restricted, but I'm not
sure how that works anymore. It used to be controlled by ulimit, but
that does not seem to be the case in newer versions of Linux.

If you are still debugging your code, and there is some chance you are
just gobbling up memory endlessly until it runs out, then you can speed
things up (i.e. fail more quickly) by turning swap off. There are
debugging situations where this turns out to be useful.

HTH,
Paul

zhihua li wrote:
 Hi netters,
 
 I'm using the 64-bit R-2.5.0 on a x86-64 cpu, with an RAM of 2 GB.  The 
 operating system is SUSE 10.
 The system information is:  -uname -a
 Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007 
 x86_64 x86_64 x86_64 GNU/Linux
 
 I used heatmap to process a matrix of the dim [16000,100].  After 3 
 hours of desperating waiting, R told me:
 cannot allocate vector of size 896 MB.
 
 I know the matrix is very big, but since I have 2 GB of RAM and in a 
 64-bit system, there should be no problem to deal with a vector smaller 
 than 1 GB? (I was not running any other applications in my system)
 
 Does anyone know what's going on?  Is there a hardware limit where I 
 have to add more RAM, or is there some way to resolve it softwarely? 
 Also is it possible to speed up the computing (I don't wanna wait 
 another 3 hours to know I get another error message)
 
 Thank you in advance!
 
 _
 享用世界上最大的电子邮件系统― MSN Hotmail。  http://www.hotmail.com
 
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

La version française suit le texte anglais.



This email may contain privileged and/or confidential inform...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tapply

2007-07-19 Thread sigalit mangut-leiba
I'm sorry for the unfocused questions, i'm new here...
the output should be:
classaps_mean
1  na
2 11.5
3   8

the mean aps of every class, when every id count *once*,  for example: class
2, mean= (11+12)/2=11.5
hope it's clearer.
sigalit.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with heatmap - how to remove annoying X before numeric values?

2007-07-19 Thread Suzanne Matthews
Sorry, I just realized I didn't send this to the list! (See below)

Thanks for all the help! All is working fine now.

If anyone knows of a more straightforward way to change the Value string
for the Key, please let me know (just to satisfy my curiosity).  I got it to
work by modifying the source code (specifically, heatmap.2.R  in the
gplots package).

However, before, I didn't make the call
source(heatmap.2.R)

before I called
source(mysillyheatmap.R)

Making the additional call to  heatmap.2.R fixed everything.

Thanks again for all your help!



On 7/19/07, Suzanne Matthews [EMAIL PROTECTED] wrote:

 Thank you all for your prompt replies! The check.names=FALSE parameter
 fixed things entirely.

 One more question:

 Is there a straightforward way to modify the the Values string that
 labels the key to a user-defined value? For example, let's say I want to
 change Values to Silly Values. So far, what I have found that I need to
 do is actually go and change a static string in the source code. Is there a
 more direct way?

 Also, after I make the source code change (in gplots package, file:
 heatmap.2.R), how do I have R build from that? If I remember correctly, if
 I put the new heatmap.2.R in my directory, R is supposed to check for
 functions and such there before it goes and builds it from the main source
 code base (located at /usr/bin/R). I am a touch confused on which directory
 is my directory, where R will check first. I tried putting the modified
 heatmap.2.R file in the directory that my script is, and where I initially
 run R. But that didn't work!

 Is there anything that I should add to my R script that will force it to
 read from that from my directory? If not, which directory should I place
 this in?

 My OS is OS X, so I think Unix-based instructions will work.

 Thank you once again for your time and patience!

 Sincerely,
 Suzanne

 On 7/18/07, Moshe Olshansky [EMAIL PROTECTED] wrote:
 
  I was right saying that my solution was not the best
  possible!
 
  --- Prof Brian Ripley [EMAIL PROTECTED] wrote:
 
   read.table('temp.txt', check.names = FALSE)
  
   would be easier (and more general, since make.names
   can do more than
   prepend an 'X').
  
   On Wed, 18 Jul 2007, Moshe Olshansky wrote:
  
Hi Suzanne,
   
My solution (which I am sure is not the best)
   would
be:
   
heat - read.table('temp.txt')
heat
 X1905 X1910 X1950 X1992 X2011 X2020
Gnat   0.08  0.29  0.29  0.37  0.39  0.43
Snake  0.16  0.34  0.32  0.40  0.41  0.53
Bat0.40  0.54  0.52  0.60  0.60  0.63
Cat0.16  0.27  0.29  0.39  0.37  0.41
Dog0.43  0.54  0.52  0.61  0.60  0.62
Lynx   0.50  0.57  0.54  0.59  0.50  0.59
a-names(heat)
b-strsplit(a,split=X)
w-unlist(b)
w
[1]  1905  1910  1950 
1992  2011  2020
z - w[seq(2,length(w),by=2)]
z
[1] 1905 1910 1950 1992 2011 2020
names(heat) - z
heat
 1905 1910 1950 1992 2011 2020
Gnat  0.08 0.29 0.29 0.37 0.39 0.43
Snake 0.16 0.34 0.32 0.40 0.41 0.53
Bat   0.40 0.54 0.52 0.60 0.60 0.63
Cat   0.16 0.27 0.29 0.39 0.37 0.41
Dog   0.43 0.54 0.52 0.61 0.60 0.62
Lynx  0.50 0.57 0.54 0.59 0.50 0.59
   
   
Regards,
   
Moshe.
   
--- Suzanne Matthews
   [EMAIL PROTECTED]
wrote:
   
Hello All,
   
I have a simple question based on how things are
labeled on my heat map;
particularly, there is this annoying X that
appears before the numeric
value of all the labels of my columns.
   
Let's say I have the following silly data, stored
   in
temp.txt
19051910195019922011
   2020
Gnat0.080.290.290.370.39
   0.43
Snake   0.160.340.320.400.41
   0.53
Bat 0.400.540.520.60 0.60
   0.63
Cat 0.160.270.290.390.37
   0.41
Dog 0.430.540.520.610.60
   0.62
Lynx0.500.570.540.59 0.5
   0.59
   
I use the following commands to generate my
   heatmap:
heat - read.table('temp.txt')
x - as.matrix(heat)
   
heatmap.2(x, keysize=1.2, dendrogram=none,
trace=none, Colv = FALSE,
main = Silly Data, labCol=
NULL, margin=c(7,8))
   
This generates a very nice heatmap, but there is
   one
thing I have an issue
with: How do I get rid of the 'X' that seems to
   come
automatically before my
numeric column values? I just want those columns
   to
be labeled 1905, 1910,
1950, and so on. I cannot find anything in the
heatmap.2 documentation that
suggests how I should do this.
   
Thank you very much for your time, and patience
   in
reading this!
   
Sincerely,
Suzanne
   
   [[alternative HTML version deleted]]
   
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read 

Re: [R] tapply

2007-07-19 Thread Peter Dalgaard
sigalit mangut-leiba wrote:
 I'm sorry for the unfocused questions, i'm new here...
 the output should be:
 classaps_mean
 1  na
 2 11.5
 3   8

 the mean aps of every class, when every id count *once*,  for example: class
 2, mean= (11+12)/2=11.5
 hope it's clearer.
   
Much... Get the first record for each individual from (e.g.)

icul.redux - subset(icul, !duplicated(id))

then use tapply as before using variables from icul.redux. Or in one go

with(
  subset(icul, !duplicated(id)),
  tapply(aps, class, mean, na.rm=TRUE)
)


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot3d labels

2007-07-19 Thread Birgit Lemcke
Hello R users,

I am a newby using R  2.5.0 on a Apple Power Book G4 with Mac OS X
10.4.10.

Sorry that I ask again such stupid questions, but I haven´t found how  
to label the points created with plot3d (rgl).
Hope somebody can help me.

Thanks in advance.

Birgit


Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]






[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with heatmap - how to remove annoying X before numeric values?

2007-07-19 Thread Gabor Grothendieck
Try this.  It makes a copy of heatmap.2 whose scope is changed to
first look within heatmap.3 for functions like mtext.  We redefine mtext
to intercept the text argument and change it appropriately.  Then
we call our copy of heatmap.2.  With this there is no need to change
the source of heatmap.2.

heatmap.3 - function(...) {
environment(heatmap.2) - environment()
mtext - function(text, ...) {
if (text == Value) text - Silly Value
graphics::mtext(text, ...)
}
heatmap.2(...)
}

heatmap.3(as.matrix(heat),
keysize=1.2, dendrogram=none, trace=none, Colv = FALSE,
main = Silly Data, labCol= NULL, margin=c(7,8))




On 7/19/07, Suzanne Matthews [EMAIL PROTECTED] wrote:
 Sorry, I just realized I didn't send this to the list! (See below)

 Thanks for all the help! All is working fine now.

 If anyone knows of a more straightforward way to change the Value string
 for the Key, please let me know (just to satisfy my curiosity).  I got it to
 work by modifying the source code (specifically, heatmap.2.R  in the
 gplots package).

 However, before, I didn't make the call
 source(heatmap.2.R)

 before I called
 source(mysillyheatmap.R)

 Making the additional call to  heatmap.2.R fixed everything.

 Thanks again for all your help!



 On 7/19/07, Suzanne Matthews [EMAIL PROTECTED] wrote:
 
  Thank you all for your prompt replies! The check.names=FALSE parameter
  fixed things entirely.
 
  One more question:
 
  Is there a straightforward way to modify the the Values string that
  labels the key to a user-defined value? For example, let's say I want to
  change Values to Silly Values. So far, what I have found that I need to
  do is actually go and change a static string in the source code. Is there a
  more direct way?
 
  Also, after I make the source code change (in gplots package, file:
  heatmap.2.R), how do I have R build from that? If I remember correctly, if
  I put the new heatmap.2.R in my directory, R is supposed to check for
  functions and such there before it goes and builds it from the main source
  code base (located at /usr/bin/R). I am a touch confused on which directory
  is my directory, where R will check first. I tried putting the modified
  heatmap.2.R file in the directory that my script is, and where I initially
  run R. But that didn't work!
 
  Is there anything that I should add to my R script that will force it to
  read from that from my directory? If not, which directory should I place
  this in?
 
  My OS is OS X, so I think Unix-based instructions will work.
 
  Thank you once again for your time and patience!
 
  Sincerely,
  Suzanne
 
  On 7/18/07, Moshe Olshansky [EMAIL PROTECTED] wrote:
  
   I was right saying that my solution was not the best
   possible!
  
   --- Prof Brian Ripley [EMAIL PROTECTED] wrote:
  
read.table('temp.txt', check.names = FALSE)
   
would be easier (and more general, since make.names
can do more than
prepend an 'X').
   
On Wed, 18 Jul 2007, Moshe Olshansky wrote:
   
 Hi Suzanne,

 My solution (which I am sure is not the best)
would
 be:

 heat - read.table('temp.txt')
 heat
  X1905 X1910 X1950 X1992 X2011 X2020
 Gnat   0.08  0.29  0.29  0.37  0.39  0.43
 Snake  0.16  0.34  0.32  0.40  0.41  0.53
 Bat0.40  0.54  0.52  0.60  0.60  0.63
 Cat0.16  0.27  0.29  0.39  0.37  0.41
 Dog0.43  0.54  0.52  0.61  0.60  0.62
 Lynx   0.50  0.57  0.54  0.59  0.50  0.59
 a-names(heat)
 b-strsplit(a,split=X)
 w-unlist(b)
 w
 [1]  1905  1910  1950 
 1992  2011  2020
 z - w[seq(2,length(w),by=2)]
 z
 [1] 1905 1910 1950 1992 2011 2020
 names(heat) - z
 heat
  1905 1910 1950 1992 2011 2020
 Gnat  0.08 0.29 0.29 0.37 0.39 0.43
 Snake 0.16 0.34 0.32 0.40 0.41 0.53
 Bat   0.40 0.54 0.52 0.60 0.60 0.63
 Cat   0.16 0.27 0.29 0.39 0.37 0.41
 Dog   0.43 0.54 0.52 0.61 0.60 0.62
 Lynx  0.50 0.57 0.54 0.59 0.50 0.59


 Regards,

 Moshe.

 --- Suzanne Matthews
[EMAIL PROTECTED]
 wrote:

 Hello All,

 I have a simple question based on how things are
 labeled on my heat map;
 particularly, there is this annoying X that
 appears before the numeric
 value of all the labels of my columns.

 Let's say I have the following silly data, stored
in
 temp.txt
 19051910195019922011
2020
 Gnat0.080.290.290.370.39
0.43
 Snake   0.160.340.320.400.41
0.53
 Bat 0.400.540.520.60 0.60
0.63
 Cat 0.160.270.290.390.37
0.41
 Dog 0.430.540.520.610.60
0.62
 Lynx0.500.570.540.59 0.5
0.59

 I use the following commands to generate my
heatmap:
 heat - 

Re: [R] plot3d labels

2007-07-19 Thread jim holtman
The documentation has:

text3d(x, y = NULL, z = NULL, texts, adj = 0.5, justify, ...)

Do this do it for you?

On 7/19/07, Birgit Lemcke [EMAIL PROTECTED] wrote:
 Hello R users,

 I am a newby using R  2.5.0 on a Apple Power Book G4 with Mac OS X
 10.4.10.

 Sorry that I ask again such stupid questions, but I haven´t found how
 to label the points created with plot3d (rgl).
 Hope somebody can help me.

 Thanks in advance.

 Birgit


 Birgit Lemcke
 Institut für Systematische Botanik
 Zollikerstrasse 107
 CH-8008 Zürich
 Switzerland
 Ph: +41 (0)44 634 8351
 [EMAIL PROTECTED]






[[alternative HTML version deleted]]


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: Do GLM by groups

2007-07-19 Thread Hongmei Jia

Dear All,

I'm trying to do 'glm' analysis by groups just like in SAS you use by
variable.   I don't know how to do it in R, anyone can help with this?
i.e.

groupline  rep  value
1  1   1   0.2
1  1   2   0.3
1 1   3   0.23
1 2   1   0.2
1 2   2   0.3
1 2  3   0.23
2  1   1   0.2
2  1   2   0.3
2 1   3   0.23
2 2   1   0.2
2 2   2   0.3
2 2  3   0.23

in SAS we say:
model value=line rep;
by group;

How can I do this in R?

Thanks,

Hongmei Jia

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fw: Do GLM by groups

2007-07-19 Thread Benilton Carvalho
Check the following example for by():

  require(stats)
  attach(warpbreaks)
  by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x))

or just type:
example(by)

b


On Jul 19, 2007, at 1:18 PM, Hongmei Jia wrote:


 Dear All,

 I'm trying to do 'glm' analysis by groups just like in SAS you use by
 variable.   I don't know how to do it in R, anyone can help with  
 this?
 i.e.

 groupline  rep  value
 1  1   1   0.2
 1  1   2   0.3
 1 1   3   0.23
 1 2   1   0.2
 1 2   2   0.3
 1 2  3   0.23
 2  1   1   0.2
 2  1   2   0.3
 2 1   3   0.23
 2 2   1   0.2
 2 2   2   0.3
 2 2  3   0.23

 in SAS we say:
 model value=line rep;
 by group;

 How can I do this in R?

 Thanks,

 Hongmei Jia

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave on mac os x

2007-07-19 Thread marco.R.help marco.R.help
Dear all,

 is Sweave working on MAC ?
I installed R-2.5.1 but seems like Sweave is not coming with the
distribution as it comes on linux.
Do I need to install it separately ?
In that case is there a .dmg for mac ?

Thanks for the help!

Regards

Marco

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] linear interpolation of multiple random time series

2007-07-19 Thread Mike Lawrence
Hi all,

Looking for tips on how I might more optimally solve this. I have  
time series data (samples from a force sensor) that are not  
guaranteed to be sampled at the same time values across trials. ex.

trial   timex
1   1   1
1   5   4
1   7   9
1   12  20
2   1   0
2   3   5
2   9   10
2   13  14
2   19  22
2   24  32

Within each trial I'd like to use linear interpolation between each  
successive time sample to fill in intermediary timepoints and x- 
values, ex.

trial   timex
1   1   1
1   2   1.75
1   3   2.5
1   4   3.25
1   5   4
1   6   6.5
1   7   9
1   8   11.2
1   9   13.4
1   10  15.6
1   11  17.8
1   12  20
2   1   0
2   2   2.5
2   3   5
2   4   5.83
2   5   6.67
2   6   7.5
2   7   8.33
2   8   9.17
2   9   10
2   10  11
2   11  12
2   12  13
2   13  14
2   14  15.3
2   15  16.7
2   16  18
2   17  19.3
2   18  20.7
2   19  22
2   20  24
2   21  26
2   22  28
2   23  30
2   24  32


The solution I've coded (below) involves going through the original  
data frame line by line and is thus very slow (indeed, I had to  
resort to writing to file as with a large data set I started running  
into memory issues if I tried to create the new data frame in  
memory). Any suggestions on a faster way to achieve what I'm trying  
to do?

#assumes the first data frame above is stored as 'a'
arows = (length(a$x)-1)
write('', 'temp.txt')
for(i in 1:arows){
if(a$time[i+1]  a$time[i]){
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
append  
= T)
x1 = a$time[i]
x2 = a$time[i+1]
dx = x2-x1
if(dx != 1){
y1 = a$x[i]
y2 = a$x[i+1]
dy = y2-y1
slope = dy/dx
int = -slope*x1+y1
temp=a[i,]
for(j in (x1+1):(x2-1)){
temp$time = j
temp$x = slope*j+int
write.table(temp, 'temp.txt', row.names = F, 
col.names = F,  
append = T)
}
}
}else{
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
append  
= T)
}
}
i=i+1
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T)

b=read.table('temp.txt',skip=1)
names(b)=names(a)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave on mac os x

2007-07-19 Thread Thomas Adams
Marco,

Yes, absolutely sweave comes with the binary Mac OS X distribution; it 
works great for me.

Regards,
Tom

marco.R.help marco.R.help wrote:
 Dear all,

  is Sweave working on MAC ?
 I installed R-2.5.1 but seems like Sweave is not coming with the
 distribution as it comes on linux.
 Do I need to install it separately ?
 In that case is there a .dmg for mac ?

 Thanks for the help!

 Regards

 Marco

   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   


-- 
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  [EMAIL PROTECTED]

VOICE:  937-383-0528
FAX:937-383-0033

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] linear interpolation of multiple random time series

2007-07-19 Thread jim holtman
This should do it for you:

 x - read.table(textConnection(trial   timex
+ 1   1   1
+ 1   5   4
+ 1   7   9
+ 1   12  20
+ 2   1   0
+ 2   3   5
+ 2   9   10
+ 2   13  14
+ 2   19  22
+ 2   24  32), header=TRUE)
 # compute for each trial
 trial.list - lapply(split(x, x$trial), function(set){
+ .xval - seq(min(set$time), max(set$time))
+ .yval - approx(set$time, set$x, xout=.xval)$y
+ cbind(trial=set$trial[1], time=.xval, x=.yval)
+ })
 do.call('rbind', trial.list)
  trial time x
 [1,] 11  1.00
 [2,] 12  1.75
 [3,] 13  2.50
 [4,] 14  3.25
 [5,] 15  4.00
 [6,] 16  6.50
 [7,] 17  9.00
 [8,] 18 11.20
 [9,] 19 13.40
[10,] 1   10 15.60
[11,] 1   11 17.80
[12,] 1   12 20.00
[13,] 21  0.00
[14,] 22  2.50
[15,] 23  5.00
[16,] 24  5.83
[17,] 25  6.67
[18,] 26  7.50
[19,] 27  8.33
[20,] 28  9.17
[21,] 29 10.00
[22,] 2   10 11.00
[23,] 2   11 12.00
[24,] 2   12 13.00
[25,] 2   13 14.00
[26,] 2   14 15.33
[27,] 2   15 16.67
[28,] 2   16 18.00
[29,] 2   17 19.33
[30,] 2   18 20.67
[31,] 2   19 22.00
[32,] 2   20 24.00
[33,] 2   21 26.00
[34,] 2   22 28.00
[35,] 2   23 30.00
[36,] 2   24 32.00



On 7/19/07, Mike Lawrence [EMAIL PROTECTED] wrote:
 Hi all,

 Looking for tips on how I might more optimally solve this. I have
 time series data (samples from a force sensor) that are not
 guaranteed to be sampled at the same time values across trials. ex.

 trial   timex
 1   1   1
 1   5   4
 1   7   9
 1   12  20
 2   1   0
 2   3   5
 2   9   10
 2   13  14
 2   19  22
 2   24  32

 Within each trial I'd like to use linear interpolation between each
 successive time sample to fill in intermediary timepoints and x-
 values, ex.

 trial   timex
 1   1   1
 1   2   1.75
 1   3   2.5
 1   4   3.25
 1   5   4
 1   6   6.5
 1   7   9
 1   8   11.2
 1   9   13.4
 1   10  15.6
 1   11  17.8
 1   12  20
 2   1   0
 2   2   2.5
 2   3   5
 2   4   5.83
 2   5   6.67
 2   6   7.5
 2   7   8.33
 2   8   9.17
 2   9   10
 2   10  11
 2   11  12
 2   12  13
 2   13  14
 2   14  15.3
 2   15  16.7
 2   16  18
 2   17  19.3
 2   18  20.7
 2   19  22
 2   20  24
 2   21  26
 2   22  28
 2   23  30
 2   24  32


 The solution I've coded (below) involves going through the original
 data frame line by line and is thus very slow (indeed, I had to
 resort to writing to file as with a large data set I started running
 into memory issues if I tried to create the new data frame in
 memory). Any suggestions on a faster way to achieve what I'm trying
 to do?

 #assumes the first data frame above is stored as 'a'
 arows = (length(a$x)-1)
 write('', 'temp.txt')
 for(i in 1:arows){
if(a$time[i+1]  a$time[i]){
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
 append
 = T)
x1 = a$time[i]
x2 = a$time[i+1]
dx = x2-x1
if(dx != 1){
y1 = a$x[i]
y2 = a$x[i+1]
dy = y2-y1
slope = dy/dx
int = -slope*x1+y1
temp=a[i,]
for(j in (x1+1):(x2-1)){
temp$time = j
temp$x = slope*j+int
write.table(temp, 'temp.txt', row.names = F, 
 col.names = F,
 append = T)
}
}
}else{
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
 append
 = T)
}
 }
 i=i+1
 write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T)

 b=read.table('temp.txt',skip=1)
 names(b)=names(a)

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch 

Re: [R] linear interpolation of multiple random time series

2007-07-19 Thread Gabor Grothendieck
Thsi can be done compactly using the zoo package.

The first statement after library converts the rows for each trial into
a separate zoo object and then uses by to merge these into a
single zoo object with one column per trial.

The second statement converts it from zoo to ts which has
the effect of filling in all missing times.  na.approx
does the actual linear interpolation.

The result is an mts object (or you could use as.zoo to
convert that to a zoo object if you prefer).

library(zoo)
z - do.call(merge, by(a, a$trial, function(DF) zoo(DF$x, DF$time)))
na.approx(as.ts(z), na.rm = FALSE)


On 7/19/07, Mike Lawrence [EMAIL PROTECTED] wrote:
 Hi all,

 Looking for tips on how I might more optimally solve this. I have
 time series data (samples from a force sensor) that are not
 guaranteed to be sampled at the same time values across trials. ex.

 trial   timex
 1   1   1
 1   5   4
 1   7   9
 1   12  20
 2   1   0
 2   3   5
 2   9   10
 2   13  14
 2   19  22
 2   24  32

 Within each trial I'd like to use linear interpolation between each
 successive time sample to fill in intermediary timepoints and x-
 values, ex.

 trial   timex
 1   1   1
 1   2   1.75
 1   3   2.5
 1   4   3.25
 1   5   4
 1   6   6.5
 1   7   9
 1   8   11.2
 1   9   13.4
 1   10  15.6
 1   11  17.8
 1   12  20
 2   1   0
 2   2   2.5
 2   3   5
 2   4   5.83
 2   5   6.67
 2   6   7.5
 2   7   8.33
 2   8   9.17
 2   9   10
 2   10  11
 2   11  12
 2   12  13
 2   13  14
 2   14  15.3
 2   15  16.7
 2   16  18
 2   17  19.3
 2   18  20.7
 2   19  22
 2   20  24
 2   21  26
 2   22  28
 2   23  30
 2   24  32


 The solution I've coded (below) involves going through the original
 data frame line by line and is thus very slow (indeed, I had to
 resort to writing to file as with a large data set I started running
 into memory issues if I tried to create the new data frame in
 memory). Any suggestions on a faster way to achieve what I'm trying
 to do?

 #assumes the first data frame above is stored as 'a'
 arows = (length(a$x)-1)
 write('', 'temp.txt')
 for(i in 1:arows){
if(a$time[i+1]  a$time[i]){
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
 append
 = T)
x1 = a$time[i]
x2 = a$time[i+1]
dx = x2-x1
if(dx != 1){
y1 = a$x[i]
y2 = a$x[i+1]
dy = y2-y1
slope = dy/dx
int = -slope*x1+y1
temp=a[i,]
for(j in (x1+1):(x2-1)){
temp$time = j
temp$x = slope*j+int
write.table(temp, 'temp.txt', row.names = F, 
 col.names = F,
 append = T)
}
}
}else{
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
 append
 = T)
}
 }
 i=i+1
 write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T)

 b=read.table('temp.txt',skip=1)
 names(b)=names(a)

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with Dates

2007-07-19 Thread Alex Park
R 

I am taking an excel dataset and reading it into R using read.table.
(actually I am dumping the data into a .txt file first and then reading data
in to R).

Here is snippet:

 head(data); 
   Date  Price Open.Int. Comm.Long Comm.Short net.comm
1 15-Jan-86 673.25175645 65910  2842537485
2 31-Jan-86 677.00167350 54060  2712026940
3 14-Feb-86 680.25157985 37955  2542512530
4 28-Feb-86 691.75162775 49760  1603033730
5 14-Mar-86 706.50163495 54120  2799526125
6 31-Mar-86 709.75164120 54715  3039024325

The dataset runs from 1986 to 2007.

I want to be able to take subsets of my data based on date e.g. data between
2000 - 2005.

As it stands, I can't work with the dates as they are not in correct format.

I tried successfully converting the dates to just the year using:

transform(data, Yr = format(as.Date(as.character(Date),format = '%d-%b-%y'),
%y)))

This gives the following format:

   Date  Price Open.Int. Comm.Long Comm.Short net.comm Yr
1 15-Jan-86 673.25175645 65910  2842537485 86
2 31-Jan-86 677.00167350 54060  2712026940 86
3 14-Feb-86 680.25157985 37955  2542512530 86
4 28-Feb-86 691.75162775 49760  1603033730 86
5 14-Mar-86 706.50163495 54120  2799526125 86
6 31-Mar-86 709.75164120 54715  3039024325 86

I can subset for a single year e.g:

head(subset(df, Yr ==00)

But how can I subset for multiple periods e.g 00- 05? The following won't
work:

head(subset(df, Yr ==00  Yr==01)

or

head(subset(df, Yr = c(00,01,02,03)

I can't help but feeling that I am missing something and there is a simpler
route.

I leafed through R newletter 4.1 which deals with dates and times but it
seemed that strptime and POSIXct / POSIXlt are not what I need either.

Can anybody help me?

Regards


Alex

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: tapply

2007-07-19 Thread sigalit mangut-leiba
thank you!
sigalit.

-- Forwarded message --
From: sigalit mangut-leiba [EMAIL PROTECTED]
Date: Jul 19, 2007 7:03 PM
Subject: tapply
To: r-help r-help@stat.math.ethz.ch


I'm sorry for the unfocused questions, i'm new here...
the output should be:
classaps_mean
1  na
2 11.5
3   8

the mean aps of every class, when every id count *once*,  for example: class
2, mean= (11+12)/2=11.5
hope it's clearer.
sigalit.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and Copula

2007-07-19 Thread Julian Burgos
Hi Meeryana,

It seems you have not loaded the package.  To use a package that you 
already installed, do:

  library(copula)

I recommend you review the R documentation.  There are several good 
references and tutorial on the CRAN site:

http://cran.r-project.org/manuals.html
http://cran.r-project.org/other-docs.html

Any time you spend learning the basics will make your work more 
efficient. :)
Good luck,

Julian

Fisheries Acoustics Research Lab
School of Aquatic and Fishery Science

copula wrote:
 I have installed fast all packages for copula,
 I have installed package 'gnml' and the others from the Jim Lindsey's web
 site and the other packages like 'repeted' which is necessary for
 calculating copula.
  
 but when I start with copula it write: 
 'gausscop' function is not found. And also 'fitted.gausscop(z)' and
 'residuals.gausscop(z)'.

 also what I want to say is: when I have installed package 'repeted' from Jim
 Lindsey's web site in R project, this what I have received: 
  
 utils:::menuInstallLocal()
 updating HTML package descriptions

 what have I to do next to solve this problem and to continue with copula.


 Thanks a lot for previous help!


  

   

 gyadav wrote:
   
 hi meeryana,

 may be this time nobnody is responding. but dont worry you will get a lot 
 of help eventually, so always post a copy to the mailing list.
 The reason is there are a lot many newbies, although i am also not so old 
 enough, who have even the simplest questions but are hesitant to ask.
 the objective of the list is to help, and thus feel free to ask, stand and 
 contribute :-) i hope you will understand not today then tomorrow as you 
 will get associated with the mailing list more closely. Personally i have 
 found friends here. Further, never post a mail with a loose subject and 
 secondly try to maintain the thread i.e. reply to the mail which you want 
 to reply(for more details refer to the posting guide) :-) that would 
 really help. 

 Yes now regarding your question :-) 

 Step1 : List of all packages can be found at this link :-)
 http://cran.r-project.org/src/contrib/PACKAGES.html
 Step2: Click on the package you want to install :-) 
 http://cran.r-project.org/src/contrib/Descriptions/copula.html
 Step3: Then download the binary of your Operating system. If windows then 
 download corresponding zip file.
 for copula it is
  http://cran.r-project.org/bin/windows/contrib/r-release/copula_0.5-3.zip
 save zip file on your system
 Step4 Open your R rpogram
 Step5: Goto Packages - Install packages from Local Zip file
 Step6: Select your package zip file which you want to install
 Step7: Sit back and relax
 Step8: load the library using library(LibraryName) on R prompt

 There are alternate ways of installing the package directly from R prompt. 
 It didn't worked for me a long time back, so  i always adopt this method.
 Somebody on the list may help you in this regards :-)

 bye and learn
 Join and stand with Open Source and R community
 Cheers and Chiao, Welcome
 -gaurav



 dear Mr. Yadav,

 I want to thank for help, 
 and for that you are only who is willing to help,
 but I have one question:
 because I'm new with R project also, I think I should install a package 
 for copula.
 I have only installed R program.
 How should I install this package? And is it what I have also to do with 
 credit metrics, Value at Risk, matix and the other formulas, I mean 
 install packages.

 I hope that you have a little time for me and my problem, and I hope I'm 
 not disturbing you.
 thank you for all you can do for me 

 and 

 best regards,

 Mirjana




 gyadav wrote:
 
 hi

 see the code below i hope this will make your understanding of copulas 
 better
 this code plots two normal distribution and their joint distribution 
 N[0,2]  N[0,4]

 HTH

 ##code
 library(copula)
 ###copy in two parts in R#

 ##PART A##
 ## construct a bivariate distribution whose marginals
 ## are normal and Normal respectively, coupled
 ## together via a normal copula
 op - par(mfrow = c(2, 2), # 2 x 2 pictures on one plot
   pty = s)   # square plotting region,
# independent of device size

 x - mvdc(normalCopula(0.75), c(norm, norm),
 list(list(mean = 0, sd =2),list(mean = 0, sd =4)))
 x.samp - rmvdc(x, 1)
 par(mfrow=c(2,3))
 hist(x.samp[,1],xlab=Normal)
 hist(x.samp[,2],xlab=Normal)
 plot(x.samp[,2],x.samp[,1],pch=21,xlab=Normal,ylab=Normal)

 plot(dmvdc(x, x.samp))
 plot(pmvdc(x, x.samp))

 ## At end of plotting, reset to previous settings:


 ###PART B###
 par(op)
 for (i in seq(1:360)){
 persp(x, dmvdc, xlim = c(-4, 4), ylim=c(0, 1),theta=i)
 }


 Regards,

 Gaurav Yadav
 +++
 Assistant Manager, CCIL, Mumbai (India)
 Mob: +919821286118 Email: [EMAIL PROTECTED]
 Bhagavad Gita:  Man is made by his Belief, as 

Re: [R] linear interpolation of multiple random time series

2007-07-19 Thread Mike Lawrence
Jim Holtman's solution works great, and will try the zoo method just  
for fun as well.

Thanks to all of you :)

Mike

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://memetic.ca

Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Questions regarding R and fitting GARCH models

2007-07-19 Thread Jeroen van der Heide
Dear all,

I've recently switched from EViews to R with RMetrics/fSeries (newest
version of july 10) for my analysis because of the much bigger
flexibility it offers. So far my experiences had been great -prior I
had already worked extensively with S-Plus so was already kind of
familiar with the language- until I got to the fSeries package.

My problem with the documentation of fSeries is that it is pretty
sparse; therefore I don't whether I am doing something wrong or if my
problem is related to elementary statistics (though I hold an MSc in
Econometrics, it's been quite awhile :o). Next to that I have two
questions related to the syntaxis of the R language itself; I've been
searching for a good couple of hours but couldn't find the answers.
Hope you can help me out.

From the descriptive statistics of my series, I had determined that my
GARCH error term should follow a student's t distribution, preferrably
skewed; the sstdFit function returns nu=2.002 and xi=1.012. I know
that for this distribution 2 degrees of freedom means the second,
third and fourth moment are not defined; however, the QQ plot looks
good -had to use the skwt package though, because rsstd didn't work-
so I decided to give it a try.

1) That didn't go all to well. garchFit returns alpha1 and beta1 of
0.1 and 0.8 respectively (and beta1 has a standard error of NaN) -
which are not consistent at all with what EViews reported. How is this
possible? Is it because GARCH estimation with a student's t
distribution with 2 degrees of freedom is impossible? What are my
options in this case? Or do I need to use garchSpec, of which I have
absolutely *no* idea what its use is (i.e. the entire idea is to
obtain estimates for a given order of AR-GARCH, right)?

2) R/RGui crashes frequently - also if I just want to estimate a
GARCH(1,1) with a student's t distributed error term and 4 or 5
degrees of freedom. I think it just gets into an endless calculation
loop; is there a way to abort this?

3) Is there a way to extract the data from the garchFit objects? So if
fit is my object, how can I extract specific data like the
fit$coef[1] I can use with other objects? Ultimately, fSeries comes
with a really nice plotting function (i.e., plot(fit, which=2)) but I
want to extract the series it actually plots from there - is that
possible like the hist function?

Thanks a lot for your time.

Best regards,


Jeroen van der Heide

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Questions regarding R and fitting GARCH models

2007-07-19 Thread Dale Smith
As for (3) below, we are also interested in the xi parameter from the
gpdFit function from fExtremes. It seems the fit classes returned by the
associated functions don't have the full properties as finMetrics has
for its classes.

My solution was to modify the appropriate fMetrics code, build a new
package, and install it. I haven't completed that yet. Are there any
other things I can try?

Dale Smith
Vicis Capital, LLC
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeroen van der
Heide
Sent: Thursday, July 19, 2007 4:54 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Questions regarding R and fitting GARCH models

Dear all,

I've recently switched from EViews to R with RMetrics/fSeries (newest
version of july 10) for my analysis because of the much bigger
flexibility it offers. So far my experiences had been great -prior I
had already worked extensively with S-Plus so was already kind of
familiar with the language- until I got to the fSeries package.

My problem with the documentation of fSeries is that it is pretty
sparse; therefore I don't whether I am doing something wrong or if my
problem is related to elementary statistics (though I hold an MSc in
Econometrics, it's been quite awhile :o). Next to that I have two
questions related to the syntaxis of the R language itself; I've been
searching for a good couple of hours but couldn't find the answers.
Hope you can help me out.

From the descriptive statistics of my series, I had determined that my
GARCH error term should follow a student's t distribution, preferrably
skewed; the sstdFit function returns nu=2.002 and xi=1.012. I know
that for this distribution 2 degrees of freedom means the second,
third and fourth moment are not defined; however, the QQ plot looks
good -had to use the skwt package though, because rsstd didn't work-
so I decided to give it a try.

1) That didn't go all to well. garchFit returns alpha1 and beta1 of
0.1 and 0.8 respectively (and beta1 has a standard error of NaN) -
which are not consistent at all with what EViews reported. How is this
possible? Is it because GARCH estimation with a student's t
distribution with 2 degrees of freedom is impossible? What are my
options in this case? Or do I need to use garchSpec, of which I have
absolutely *no* idea what its use is (i.e. the entire idea is to
obtain estimates for a given order of AR-GARCH, right)?

2) R/RGui crashes frequently - also if I just want to estimate a
GARCH(1,1) with a student's t distributed error term and 4 or 5
degrees of freedom. I think it just gets into an endless calculation
loop; is there a way to abort this?

3) Is there a way to extract the data from the garchFit objects? So if
fit is my object, how can I extract specific data like the
fit$coef[1] I can use with other objects? Ultimately, fSeries comes
with a really nice plotting function (i.e., plot(fit, which=2)) but I
want to extract the series it actually plots from there - is that
possible like the hist function?

Thanks a lot for your time.

Best regards,


Jeroen van der Heide

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

All e-mail sent to or from this address will be received or otherwise recorded 
by Vicis Capital, LLC and is subject to archival, monitoring and/or review, by 
and/or disclosure to, someone other than the recipient.  This message is 
intended only for the use of the person(s) (intended recipient) to whom it is 
addressed.  It may contain information that is privileged and confidential.  If 
you are not the intended recipient, please contact the sender as soon as 
possible and delete the message without reading it or making a copy.  Any 
dissemination, distribution, copying, or other use of this message or any of 
its content by any person other than the intended recipient is strictly 
prohibited.  Vicis Capital, LLC only transacts business in states where it is 
properly registered or notice filed, or excluded or exempted from registration 
or notice filing requirements.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trend lines on a scatterplot matrix

2007-07-19 Thread Alan S Barnett
I'm using pairs() to generate a scatterplot matrix;

pairs(~ Fuzzy.gray.white.ratio+Fuzzy.gw.t.score+AgeWhenTested+signal_mean.noise,
data=datam,subset=status==control,main=Controls,
labels=c(G/W,Peak Separation,Age,S/N))


How can I add regression lines to the plots?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with Dates

2007-07-19 Thread jim holtman
Try some of the following:

head(subset(df, Yr %in% c(00,01,02,03)))

subset(df, (Yr = '00')  (Yr = '03'))  # same as above

subset(df, (Yr == '00') | (Yr == '01') | (Yr == '02') |(Yr == '03'))  # same


On 7/19/07, Alex Park [EMAIL PROTECTED] wrote:
 R

 I am taking an excel dataset and reading it into R using read.table.
 (actually I am dumping the data into a .txt file first and then reading data
 in to R).

 Here is snippet:

  head(data);
   Date  Price Open.Int. Comm.Long Comm.Short net.comm
 1 15-Jan-86 673.25175645 65910  2842537485
 2 31-Jan-86 677.00167350 54060  2712026940
 3 14-Feb-86 680.25157985 37955  2542512530
 4 28-Feb-86 691.75162775 49760  1603033730
 5 14-Mar-86 706.50163495 54120  2799526125
 6 31-Mar-86 709.75164120 54715  3039024325

 The dataset runs from 1986 to 2007.

 I want to be able to take subsets of my data based on date e.g. data between
 2000 - 2005.

 As it stands, I can't work with the dates as they are not in correct format.

 I tried successfully converting the dates to just the year using:

 transform(data, Yr = format(as.Date(as.character(Date),format = '%d-%b-%y'),
 %y)))

 This gives the following format:

   Date  Price Open.Int. Comm.Long Comm.Short net.comm Yr
 1 15-Jan-86 673.25175645 65910  2842537485 86
 2 31-Jan-86 677.00167350 54060  2712026940 86
 3 14-Feb-86 680.25157985 37955  2542512530 86
 4 28-Feb-86 691.75162775 49760  1603033730 86
 5 14-Mar-86 706.50163495 54120  2799526125 86
 6 31-Mar-86 709.75164120 54715  3039024325 86

 I can subset for a single year e.g:

 head(subset(df, Yr ==00)

 But how can I subset for multiple periods e.g 00- 05? The following won't
 work:

 head(subset(df, Yr ==00  Yr==01)

 or

 head(subset(df, Yr = c(00,01,02,03)

 I can't help but feeling that I am missing something and there is a simpler
 route.

 I leafed through R newletter 4.1 which deals with dates and times but it
 seemed that strptime and POSIXct / POSIXlt are not what I need either.

 Can anybody help me?

 Regards


 Alex

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] can I paste 'newline'?

2007-07-19 Thread runner

It is ok to bury a reg expression '\n' when using 'cat', but not 'paste'. 
e.g.

cat ('I need to move on to a new line', '\n', 'at here') # change line!
paste ('I need to move on to a new line', '\n', 'at here') # '\n' is just a
character as it is.

Is there a way around pasting '\n' ? Thanks a lot.
-- 
View this message in context: 
http://www.nabble.com/can-I-paste-%27newline%27--tf4114350.html#a11699845
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can I paste 'newline'?

2007-07-19 Thread Duncan Murdoch
On 19/07/2007 7:41 PM, runner wrote:
 It is ok to bury a reg expression '\n' when using 'cat', but not 'paste'. 
 e.g.
 
 cat ('I need to move on to a new line', '\n', 'at here') # change line!
 paste ('I need to move on to a new line', '\n', 'at here') # '\n' is just a
 character as it is.
 
 Is there a way around pasting '\n' ? Thanks a lot.

What do you want to get?  Do you want a two element vector?  Then use 
c().  Do you want a one element vector that prints on two lines?  Use 
either form, they both work (but you need to use cat() to do the display).

  x - paste ('I need to move on to a new line', '\n', 'at here')
  cat(x)
I need to move on to a new line
  at here

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can I paste 'newline'?

2007-07-19 Thread jim holtman
Notice the difference:

 cat ('I need to move on to a new line', '\n', 'at here') # change line!
I need to move on to a new line
 at here paste ('I need to move on to a new line', '\n', 'at here') #
'\n' is just a
[1] I need to move on to a new line \n at here
 cat(paste ('I need to move on to a new line', '\n', 'at here'))
I need to move on to a new line
 at here

 paste(a long string
+ with carriage
+ returns)
[1] a long string\nwith carriage\nreturns


 cat(paste(a long string
+ with carriage
+ returns))
a long string
with carriage
returns


paste is showing you the characters in the string; cat is acutally
outputting to a print device where '\n' is a line feed.

On 7/19/07, runner [EMAIL PROTECTED] wrote:

 It is ok to bury a reg expression '\n' when using 'cat', but not 'paste'.
 e.g.

 cat ('I need to move on to a new line', '\n', 'at here') # change line!
 paste ('I need to move on to a new line', '\n', 'at here') # '\n' is just a
 character as it is.

 Is there a way around pasting '\n' ? Thanks a lot.
 --
 View this message in context: 
 http://www.nabble.com/can-I-paste-%27newline%27--tf4114350.html#a11699845
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] exclude1 in summary.formula from Hmisc

2007-07-19 Thread david dav
Dear Users,
Still having troubles with sharing my results, I'm trying to display a
contingency table using summary.formula.
Computing works well but I'd like to display information on redundant
entries say, males AND females.
I wonder if this code is correct :

desqualjum - summary.formula(varqual1$X ~.,  data = subset( varqual1,
select = -X), method = reverse,  overall = T, test = T, exclude1 = FALSE)
where varqual1 is the data frame containing the variable sex.

I'm using R 2.5.1 and Hmisc 3.4-2 on a Mac (OSX.4, intel)  but I had the
same trouble with  former versions of R and the package.

Thank you for your help.
David

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] BOA (Bayesian Output Analysis)

2007-07-19 Thread Marcus Vinicius
Dear all,
May anyone help me to use this package for the R software?
My procedure have been:


 dir()
[1] line1.ind line1.out line1.txt line2.ind 
line2.out line2.txt regressao.odc


 boa.menu()

Bayesian Output Analysis Program (BOA)
Version 1.1.6 for i386, mingw32
Copyright (c) 2007 Brian J. Smith [EMAIL PROTECTED]

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License or any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

For a copy of the GNU General Public License write to the Free
Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
MA  02111-1307, USA, or visit their web site at
http://www.gnu.org/copyleft/gpl.html

NOTE: if the event of a menu system crash, type
boa.menu(recover = TRUE) to restart and recover your work.

BOA MAIN MENU
*

1: File 
2: Data 
3: Analysis 
4: Plot 
5: Options  
6: Window   

Seleção: 1

FILE MENU
=

1: Back
2: ---+
3: Import Data  |
4: Save Session   |
5: Load Session   |
6: Exit BOA   |
7: ---+

Seleção: 3

IMPORT DATA MENU


1: Back
2: ---+
3: CODA Output Files  |
4: Flat ASCII Files   |
5: Data Matrix Objects|
6: View Format Specifications |
7: Options... |
8: ---+

Seleção: 3

Enter index filename prefix without the .ind extension [Working Directory:
]
1: line1
Read 1 item

Enter output filename prefix without the .out extension [Default: line1]
1: line1
Read 1 item
*Warning: import failed
 Could not find '/line1.ind' or '/line1.out'.*

**

*What is happening??*

*Thanks a lot.*

*Marcus Vinicius*

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to determine/assign a numeric vector to Y in the cor.test function for spearman's correlations?

2007-07-19 Thread G.

Hello to all of you, R-expeRts!
I am trying to compute the cor.test for a matrix that i labelled mydata
according to mydata=read.csv...
then I converted my csv file into a matrix with the 
mydata=as.matrix(mydata)
NOW, I need to get the p-values from the correlations...
I can successfully get the spearman's correlation matrix with:
cor(mydata, method=s, use=pairwise,complete,obs)
but then if I try the 
cor.test(mydata, method=s, use=pairwise.complete.obs)
i always get an error message as follows:
the y argument is missing and does not have any default value assigned... 
(excuse my translation)
WHAT SHOULD I DO???
HOW CAN I DEFINE Y AS A NUMERIC VECTOR???
thanx guys, i just can't express myself as a computer would...
G :)
-- 
View this message in context: 
http://www.nabble.com/how-to-determine-assign-a-numeric-vector-to-%22Y%22-in-the-cor.test-function-for-spearman%27s-correlations--tf4114486.html#a11700314
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error: evaluation nested too deeply when doing heatmap with binary distfunction

2007-07-19 Thread zhihua li

Yes. After I increase the threshould to 1 it got through. Thanks a lot!



From: Uwe Ligges [EMAIL PROTECTED]
To: zhihua li [EMAIL PROTECTED]
CC: r-help@stat.math.ethz.ch
Subject: Re: [R] Error: evaluation nested too deeply when doing heatmap 

with binary distfunction

Date: Thu, 19 Jul 2007 15:18:29 +0200



zhihua li wrote:

Hi netters,

I have a matrix X of the size (1000,100). The values are from -3 to 
+3.  When I tried


heatmap(X, 
distfun=function(c),dist(c,method=bin),hclustfun=function(m),hclust(m,method=average))






I got the error message: Error: evaluation nested too deeply: 
infinite recursion / options(expressions=)?




So, does it help to increase the thresholds?
If not, please specify a easily reproducible example that helps us 
to investigate your problem.


Best,
Uwe Ligges





However, if I used default parameters for distfunction:
heatmap(X, hclustfun=function(m),hclust(m,method=average))
there is no error messages at all.

But the problem is that I have to use binary method in my 
disfunction. How can I resolve the problem?


Thanks a lot!




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


_
享用世界上最大的电子邮件系统― MSN Hotmail。  http://www.hotmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Dataframe of factors transform speed?

2007-07-19 Thread Latchezar Dimitrov
Hello,

This is a speed question. I have a dataframe genoT: 

 dim(genoT)
[1]   1002 238304

 str(genoT)
'data.frame':   1002 obs. of  238304 variables:
 $ SNP_A.4261647: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3
...
 $ SNP_A.4261610: Factor w/ 3 levels 0,1,2: 1 1 3 3 1 1 1 2 2 2
...
 $ SNP_A.4261601: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1
...
 $ SNP_A.4261704: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3
...
 $ SNP_A.4261563: Factor w/ 3 levels 0,1,2: 3 1 2 1 2 3 2 3 3 1
...
 $ SNP_A.4261554: Factor w/ 3 levels 0,1,2: 1 1 NA 1 NA 2 1 1 2 1
...
 $ SNP_A.4261666: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2
...
 $ SNP_A.4261634: Factor w/ 3 levels 0,1,2: 3 3 2 3 3 3 3 3 3 2
...
 $ SNP_A.4261656: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2
...
 $ SNP_A.4261637: Factor w/ 3 levels 0,1,2: 1 3 2 3 2 1 2 1 1 3
...
 $ SNP_A.4261597: Factor w/ 3 levels AA,AB,BB: 2 2 3 3 3 2 1 2 2 3
...
 $ SNP_A.4261659: Factor w/ 3 levels AA,AB,BB: 3 3 3 3 3 3 3 3 3 3
...
 $ SNP_A.4261594: Factor w/ 3 levels AA,AB,BB: 2 2 2 1 1 1 2 2 2 2
...
 $ SNP_A.4261698: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261538: Factor w/ 3 levels AA,AB,BB: 2 3 2 2 3 2 2 1 1 2
...
 $ SNP_A.4261621: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1
...
 $ SNP_A.4261553: Factor w/ 3 levels AA,AB,BB: 1 1 2 1 1 1 1 1 1 1
...
 $ SNP_A.4261528: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261579: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 2 1 1 1 2
...
 $ SNP_A.4261513: Factor w/ 3 levels AA,AB,BB: 2 1 2 2 2 NA 1 NA 2
1 ...
 $ SNP_A.4261532: Factor w/ 3 levels AA,AB,BB: 1 2 2 1 1 1 3 1 1 1
...
 $ SNP_A.4261600: Factor w/ 2 levels AB,BB: 2 2 2 2 2 2 2 2 2 2 ...
 $ SNP_A.4261706: Factor w/ 2 levels AA,BB: 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261575: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 2 2 1
...

Its columns are factors with different number of levels (from 1 to 3 -
that's what I got from read.table, i.e., it dropped missing levels). I
want to convert it to uniform factors with 3 levels. The 1st 10 rows
above show already converted columns and the rest are not yet converted.
Here's my attempt wich is a complete failure as speed:

 system.time(
+ for(j in 1:(10 )){ #-- this is to try 1st 10 cols and
measure the time, it otherwise is ncol(genoT) instead of 10

+gt-genoT[[j]]  #-- this is to avoid 2D indices
+for(l in 1:length([EMAIL PROTECTED])){
+  levels(gt)[l] - switch([EMAIL PROTECTED],AA=0,AB=1,BB=2)
#-- convert levels to 0,1, or 2
+  genoT[[j]]-factor(gt,levels=0:2)   #-- make a 3-level factor
and put it back
+}
+ }
+ )
[1] 785.085   4.358 789.454   0.000   0.000

789s for 10 columns only!

To me it seems like replacing 10 x 3 levels and then making a factor of
1002 element vector x 10 is a negligible amount of operations needed.

So, what's wrong with me? Any idea how to accelerate significantly the
transformation or (to go to the very beginning) to make read.table use a
fixed set of levels (AA,AB, and BB) and not to drop any (missing)
level?

R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit

The machine is with 32G RAM and AMD Opteron 285 (2.? GHz) so it's not
it.

Thank you very much for the help,

Latchezar Dimitrov,
Analyst/Programmer IV,
Wake Forest University School of Medicine,
Winston-Salem, North Carolina, USA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dataframe of factors transform speed?

2007-07-19 Thread Benilton Carvalho
it looks like that whatever method you used to genotype the 1002  
samples on the STY array gave you a transposed matrix of genotype  
calls. :-)

i'd use:

genoT = read.table(yourFile, stringsAsFactors = FALSE)

as a starting point... but I don't think that would be efficient (as  
you'd need to fix one column at a time - lapply).

i'd preprocess yourFile before trying to load it:

cat yourFile | sed -e 's/AA/1/g' | sed -e 's/AB/2/g' | sed -e 's/BB/3/ 
g'  outFile

and, now, in R:

genoT = read.table(outFile, header=TRUE)

b

On Jul 19, 2007, at 11:51 PM, Latchezar Dimitrov wrote:

 Hello,

 This is a speed question. I have a dataframe genoT:

 dim(genoT)
 [1]   1002 238304

 str(genoT)
 'data.frame':   1002 obs. of  238304 variables:
  $ SNP_A.4261647: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3
 ...
  $ SNP_A.4261610: Factor w/ 3 levels 0,1,2: 1 1 3 3 1 1 1 2 2 2
 ...
  $ SNP_A.4261601: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1
 ...
  $ SNP_A.4261704: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3
 ...
  $ SNP_A.4261563: Factor w/ 3 levels 0,1,2: 3 1 2 1 2 3 2 3 3 1
 ...
  $ SNP_A.4261554: Factor w/ 3 levels 0,1,2: 1 1 NA 1 NA 2 1 1  
 2 1
 ...
  $ SNP_A.4261666: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2
 ...
  $ SNP_A.4261634: Factor w/ 3 levels 0,1,2: 3 3 2 3 3 3 3 3 3 2
 ...
  $ SNP_A.4261656: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2
 ...
  $ SNP_A.4261637: Factor w/ 3 levels 0,1,2: 1 3 2 3 2 1 2 1 1 3
 ...
  $ SNP_A.4261597: Factor w/ 3 levels AA,AB,BB: 2 2 3 3 3 2 1  
 2 2 3
 ...
  $ SNP_A.4261659: Factor w/ 3 levels AA,AB,BB: 3 3 3 3 3 3 3  
 3 3 3
 ...
  $ SNP_A.4261594: Factor w/ 3 levels AA,AB,BB: 2 2 2 1 1 1 2  
 2 2 2
 ...
  $ SNP_A.4261698: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1  
 1 ...
  $ SNP_A.4261538: Factor w/ 3 levels AA,AB,BB: 2 3 2 2 3 2 2  
 1 1 2
 ...
  $ SNP_A.4261621: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1  
 1 1 1
 ...
  $ SNP_A.4261553: Factor w/ 3 levels AA,AB,BB: 1 1 2 1 1 1 1  
 1 1 1
 ...
  $ SNP_A.4261528: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1  
 1 ...
  $ SNP_A.4261579: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 2 1  
 1 1 2
 ...
  $ SNP_A.4261513: Factor w/ 3 levels AA,AB,BB: 2 1 2 2 2 NA 1  
 NA 2
 1 ...
  $ SNP_A.4261532: Factor w/ 3 levels AA,AB,BB: 1 2 2 1 1 1 3  
 1 1 1
 ...
  $ SNP_A.4261600: Factor w/ 2 levels AB,BB: 2 2 2 2 2 2 2 2 2  
 2 ...
  $ SNP_A.4261706: Factor w/ 2 levels AA,BB: 1 1 1 1 1 1 1 1 1  
 1 ...
  $ SNP_A.4261575: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1  
 2 2 1
 ...

 Its columns are factors with different number of levels (from 1 to 3 -
 that's what I got from read.table, i.e., it dropped missing levels). I
 want to convert it to uniform factors with 3 levels. The 1st 10 rows
 above show already converted columns and the rest are not yet  
 converted.
 Here's my attempt wich is a complete failure as speed:

 system.time(
 + for(j in 1:(10 )){ #-- this is to try 1st 10 cols and
 measure the time, it otherwise is ncol(genoT) instead of 10

 +gt-genoT[[j]]  #-- this is to avoid 2D indices
 +for(l in 1:length([EMAIL PROTECTED])){
 +  levels(gt)[l] - switch([EMAIL PROTECTED],AA=0,AB=1,BB=2)
 #-- convert levels to 0,1, or 2
 +  genoT[[j]]-factor(gt,levels=0:2)   #-- make a 3-level  
 factor
 and put it back
 +}
 + }
 + )
 [1] 785.085   4.358 789.454   0.000   0.000

 789s for 10 columns only!

 To me it seems like replacing 10 x 3 levels and then making a  
 factor of
 1002 element vector x 10 is a negligible amount of operations  
 needed.

 So, what's wrong with me? Any idea how to accelerate significantly the
 transformation or (to go to the very beginning) to make read.table  
 use a
 fixed set of levels (AA,AB, and BB) and not to drop any  
 (missing)
 level?

 R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit

 The machine is with 32G RAM and AMD Opteron 285 (2.? GHz) so it's not
 it.

 Thank you very much for the help,

 Latchezar Dimitrov,
 Analyst/Programmer IV,
 Wake Forest University School of Medicine,
 Winston-Salem, North Carolina, USA

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dataframe of factors transform speed?

2007-07-19 Thread jim holtman
Is this what you want?  It took 0.01 seconds to convert 20 rows of the
test data:

 # create some data (20 rows with 1000 columns)
 n - 20
 result - list()
 vals - c(AA, AB, BB)
 for (i in 1:n){
+ result[[as.character(i)]] - sample(vals,1000, replace=TRUE,
prob=c(9000,1,1))
+ }
 result.df - do.call('data.frame', result)


 str(result.df)
'data.frame':   1000 obs. of  20 variables:
 $ X1 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X2 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X3 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X4 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X5 : Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ...
 $ X6 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X7 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X8 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X9 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X10: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X11: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X12: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X13: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X14: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X15: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X16: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X17: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X18: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X19: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...
 $ X20: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ...

 # go through each row and convert the factors according to 'vals' above
 system.time({  # time to convert 20 rows
+ x - lapply(result.df, function(facts){
+ factor(match(as.character(facts), vals) - 1, levels=0:2)
+ })
+ result.df - do.call('data.frame', x)
+ })
   user  system elapsed
   0.010.000.01

 str(result.df)
'data.frame':   1000 obs. of  20 variables:
 $ X1 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X2 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X3 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X4 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X5 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X6 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X7 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X8 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X9 : Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X10: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X11: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X12: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X13: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X14: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X15: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X16: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X17: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X18: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X19: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...
 $ X20: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ...



On 7/19/07, Latchezar Dimitrov [EMAIL PROTECTED] wrote:
 Hello,

 This is a speed question. I have a dataframe genoT:

  dim(genoT)
 [1]   1002 238304

  str(genoT)
 'data.frame':   1002 obs. of  238304 variables:
  $ SNP_A.4261647: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3
 ...
  $ SNP_A.4261610: Factor w/ 3 levels 0,1,2: 1 1 3 3 1 1 1 2 2 2
 ...
  $ SNP_A.4261601: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1
 ...
  $ SNP_A.4261704: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3
 ...
  $ SNP_A.4261563: Factor w/ 3 levels 0,1,2: 3 1 2 1 2 3 2 3 3 1
 ...
  $ SNP_A.4261554: Factor w/ 3 levels 0,1,2: 1 1 NA 1 NA 2 1 1 2 1
 ...
  $ SNP_A.4261666: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2
 ...
  $ SNP_A.4261634: Factor w/ 3 levels 0,1,2: 3 3 2 3 3 3 3 3 3 2
 ...
  $ SNP_A.4261656: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2
 ...
  $ SNP_A.4261637: Factor w/ 3 levels 0,1,2: 1 3 2 3 2 1 2 1 1 3
 ...
  $ SNP_A.4261597: Factor w/ 3 levels AA,AB,BB: 2 2 3 3 3 2 1 2 2 3
 ...
  $ SNP_A.4261659: Factor w/ 3 levels AA,AB,BB: 3 3 3 3 3 3 3 3 3 3
 ...
  $ SNP_A.4261594: Factor w/ 3 levels AA,AB,BB: 2 2 2 1 1 1 2 2 2 2
 ...
  $ SNP_A.4261698: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ...
  $ SNP_A.4261538: Factor w/ 3 levels AA,AB,BB: 2 3 2 2 3 2 2 1 1 2
 ...
  $ SNP_A.4261621: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1
 ...
  $ SNP_A.4261553: Factor w/ 3 levels AA,AB,BB: 1 1 2 1 1 1 1 1 1 1
 ...
  $ SNP_A.4261528: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ...
  $ SNP_A.4261579: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 2 1 1 1 2
 ...
  $ SNP_A.4261513: Factor w/ 3 levels AA,AB,BB: 2 1 2 2 2 NA 1 NA 2
 1 ...
  $ SNP_A.4261532: Factor w/ 3 levels AA,AB,BB: 1 2 2 1 1 1 3 1 1 1
 ...
  $ SNP_A.4261600: Factor w/ 2 levels AB,BB: 2 2 2 2 2 2 2 2 2 2 ...
  $ SNP_A.4261706: Factor w/ 2 levels AA,BB: 1 1 1 1 1 1 1 1 1 1 ...
  $ SNP_A.4261575: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 2 2 1
 ...

 Its columns are factors with