Re: [R] Using 'cat' on data frame

2008-12-23 Thread Veslot Jacques
print(as.matrix(raw.count),quote=F)

cat(as.character(raw.count$Var1))
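
For context, a minimal sketch of why the original call printed "1 2 3"
(assuming raw.count comes from as.data.frame(table(...)), so Var1 is a
factor and cat() sees its underlying integer codes):

raw.count <- data.frame(Var1 = factor(c("AA", "AC", "AT")),
                        Freq = c(707, 14, 3))
cat(raw.count$Var1, "\n")                # factor -> integer codes: 1 2 3
cat(as.character(raw.count$Var1), "\n")  # the labels instead: AA AC AT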

Jacques VESLOT

CEMAGREF - UR Hydrobiologie

Route de Cézanne - CS 40061  
13182 AIX-EN-PROVENCE Cedex 5, France

Tél.   + 0033   04 42 66 99 76
fax    + 0033   04 42 66 99 34
email   jacques.ves...@cemagref.fr  


>-Original Message-
>From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>On Behalf
>Of Gundala Viswanath
>Sent: Wednesday, December 24, 2008 08:35
>To: r-h...@stat.math.ethz.ch
>Subject: [R] Using 'cat' on data frame
>
>Dear all,
>
>I have the following data frame:
>
>> raw.count
>  Var1 Freq
>1   AA  707
>2   AC   14
>3   AT    3
>
>But why, when printing it using 'cat', doesn't it print
>the desired string "AAA"?
>
>> cat(raw.count$Var1, "\n")
>1 2 3
>
>What's wrong with my cat command above?
>
>
>
>- Gundala Viswanath
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using 'cat' on data frame

2008-12-23 Thread Gundala Viswanath
Dear all,

I have the following data frame:

> raw.count
  Var1 Freq
1   AA  707
2   AC   14
3   AT    3

But why, when printing it using 'cat', doesn't it print
the desired string "AAA"?

> cat(raw.count$Var1, "\n")
1 2 3

What's wrong with my cat command above?



- Gundala Viswanath

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using SPSS Labels

2008-12-23 Thread Prof Brian Ripley
1) We didn't get the attachment.  The allowed types are given in the 
posting guide and are pretty restrictive: the suggestion is to put binary 
files on the web for download.


2) We don't have the version information asked for in the posting guide. I 
would only expect this to work in foreign 0.8-29 or 0.8-30.  If you are 
using one of those, please send me the file directly and I will take a 
look at what is happening.


3) Since you have the value labels attribute you can use it to change the 
second column into a factor.  Again, without more details I cannot tell 
you exactly what is needed.
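
For point 3, a minimal sketch (assuming the value.labels attribute on the
GENDER column is a named vector with the labels as names and the codes as
values; check with str(attributes(maine2$GENDER)). maine2 and GENDER are
taken from the code quoted below):

labs <- attr(maine2$GENDER, "value.labels")
maine2$GENDER <- factor(maine2$GENDER,
                        levels = unname(labs),   # the numeric codes
                        labels = names(labs))    # the corresponding labels
table(maine2$GENDER)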



On Tue, 23 Dec 2008, Andrew Choens wrote:


I am trying to import an SPSS .sav file into R. The attached file is not
technically the file I am trying to import, but it does replicate my
problem. The actual file is much too large to attach. No matter what I
do, I cannot get R (base or Hmisc) to apply the value labels in
the .sav file to the data frame created in R. Here's the code that I am
using.

maine <- spss.get("test.sav")
# or
maine2 <- read.spss("test.sav", read.value.labels=TRUE)

When I try to import the file, the value labels are not assigned to the
rows. This is what I get.

  ID GENDER
1   1  1
2   2  2
3   3  1
4   4  2
5   5  1
6   6  1
7   7  1
8   8  2
9   9  2
10 10  1
11 11  3

In the .sav file, 1 = Men, 2 = Women, and 3 = user-assigned missing.

The variable values are attached as a value.labels attribute. If I
remove row # 11 (gender = 3), I can import the file as I expect.

  ID GENDER
1   1  Men
2   2  Women
3   3  Men
4   4  Women
5   5  Men
6   6  Men
7   7  Men
8   8  Women
9   9  Women
10 10  Men

Given all of this: how can I import a .sav file with user-assigned
missing values correctly?

If this is not possible, what is the best way to use the value.labels
attribute when I make a table with table(Gender)?

Thanks.

--
Insert something humorous here.  :-)



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] imputing the numerical columns of a dataframe, returning the rest unchanged

2008-12-23 Thread Yihui Xie
Hi,

?sapply will tell you


 'sapply' is a user-friendly version of 'lapply' by default
 returning a vector or matrix if appropriate.


so the result of sapply() is simplified to a matrix and the columns lose
their classes; e.g.

## iris is a data.frame
> str(iris)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1
1 1 1 1 1 1 ...
## but sapply() will coerce it into a numeric matrix
> str(sapply(iris, function(x)x))
 num [1:150, 1:5] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:5] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" ...

I'd suggest you get the class of each column first, then apply
impute() to these columns (i.e. DF[, sapply(DF, class) == "numeric"])
and assign the new values to the original columns.
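
A minimal sketch of that approach (assuming impute() is Hmisc::impute,
which by default median-imputes numeric vectors, and DF is the poster's
data frame; is.numeric() is used here instead of comparing class() to
"numeric"):

num <- sapply(DF, is.numeric)        # which columns are numeric
DF[num] <- lapply(DF[num], impute)   # impute only those columns
str(DF)                              # factor/character columns keep their classes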

Regards,
Yihui
--
Yihui Xie 
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China



On Mon, Dec 22, 2008 at 11:38 PM, Mark Heckmann  wrote:
> Hi R-experts,
>
> how can I apply a function to each numeric column of a data frame and return
> the whole data frame with changes in numeric columns only?
> In my case I want to do a median imputation of the numeric columns and
> retain the other columns. My dataframe (DF) contains factors, characters and
> numerics.
>
> I tried the following but that does not work:
>
> foo <- function(x){
>  if(is.numeric(x)==TRUE) return(impute(x))
>  else(return(x))
> }
>
> sapply(DF, foo)
>
>  day version ID V1 V2  V3
>  [1,] "4" "A"   "1a" "1"   "5"  "5"
>  [2,] "4" "A"   "2a" "2"   "3"  "5"
>  [3,] "4" "B"   "3a" "3"   "5"  "5"
>
> All the variables are coerced to characters now ("day" and "version" were
> factors, "id" a character). I only want imputations on the numerics, but the
> rest to be returned unchanged.
>
> Is there a function available? If not, how can I do it?
>
> TIA and merry x-mas,
> Mark
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Conditional Counting with Table

2008-12-23 Thread Gustavo Carvalho
Hello,

Something like this should work:

table(test$V1[!test$V2 %in% c("NM","QC")])
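
A quick check with the data from the quoted message (the column names
V1/V2 are assumed, as in the original post):

test <- data.frame(V1 = c("aaa","aaa","aaa","aaa","aaa","att","att"),
                   V2 = c("chr1","chr2","NM","QC","chr10","NM","chr7"))
table(test$V1[!test$V2 %in% c("NM","QC")])
## aaa att
##   3   1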

Cheers,

Gustavo.

On Wed, Dec 24, 2008 at 3:06 AM, Gundala Viswanath  wrote:
> Dear all,
>
> I have the following data frame:
>
> V1   V2
> aaa  chr1
> aaa  chr2
> aaa  NM
> aaa  QC
> aaa  chr10
> att  NM
> att  chr7
>
> What I want to do is count the strings in V1,
> but with the condition that if V2 for a row
> is "NM" or "QC", then the count is not increased.
>
> Hence the contingency table will look like this:
>
> #tag   count
> aaa    3
> att    1
>
> Is there a compact way to achieve this in R?
> I am thinking of using "table" but can't see
> how to impose such a condition on it.
>
>
> - Gundala Viswanath
> Jakarta - Indonesia
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sem package fails when no of factors increase from 3 to 4

2008-12-23 Thread Xiaoxu LI
Dear Prof. FOX,

Good starting values are critical in real applications. If the sem
package could automatically drop one or more variables to fit a smaller
embedded model, just to provide good starting values automatically, I
think it would be even more useful. Moreover, I think the starting-value
search could be recursive.

I am writing sem sample code for a popular Chinese SEM textbook. I hope
the sem package becomes even more powerful :)


Xiaoxu

On Tue, Dec 23, 2008 at 8:32 AM, John Fox  wrote:

> Dear Xiaoxu LI,
>
> sem.mod(mod4, cor18, 500, debug=TRUE) will show you what went wrong with
> the
> optimization. Since the three-factor solutions look reasonable, I tried
> using them to get better start values for the parameters in the four-factor
> model, producing the solution shown below.
>
> As well, I noticed that your correlation matrix was given only to two
> decimal places, and that some of the correlations have only one significant
> digit. It's possible, though not necessarily the case, that using a more
> precise correlation matrix would produce the solution more easily.
>
> I hope this helps,
>  John
>
> --- snip 
>
> > mod4 <- specify.model()
> 1: X1  <-> X1, TD11, 0.30397
> 2: X2  <-> X2, TD22, 0.33656
> 3: X3  <-> X3, TD33, 0.48680
> 4: X4  <-> X4, TD44, 0.62441
> 5: X5  <-> X5, TD55, 0.78681
> 6: X6  <-> X6, TD66, 0.68547
> 7: X7  <-> X7, TD77, 0.79154
> 8: X8  <-> X8, TD88, 0.67417
> 9: X9  <-> X9, TD99, 0.60875
> 10: X10 <-> X10, TDaa, 0.37764
> 11: X11 <-> X11, TDbb, 0.74658
> 12: X12 <-> X12, TDcc, 0.85765
> 13: X1  <- xi1, LY11, 0.83428
> 14: X2  <- xi1, LY21, 0.81452
> 15: X3  <- xi1, LY31, 0.71638
> 16: X4  <- xi2, LY42, 0.61285
> 17: X5  <- xi2, LY52, 0.46173
> 18: X6  <- xi2, LY62, 0.56084
> 19: X7  <- xi3, LY73, 0.45658
> 20: X8  <- xi3, LY83, 0.57082
> 21: X9  <- xi3, LY93, 0.62550
> 22: X10 <- xi4, LXa4, 0.78890
> 23: X11 <- xi4, LXb4, 0.50340
> 24: X12 <- xi4, LXc4, 0.37729
> 25: xi1  <-> xi1, NA, 1
> 26: xi2  <-> xi2, NA, 1
> 27: xi3  <-> xi3, NA, 1
> 28: xi4  <-> xi4, NA, 1
> 29: xi1  <-> xi2, PH12, 0.13185
> 30: xi1  <-> xi3, PH13, 0.17445
> 31: xi2  <-> xi3, PH23, 0.25125
> 32: xi4  <-> xi1, PH41, 0.35819
> 33: xi4  <-> xi2, PH42, 0.12253
> 34: xi4  <-> xi3, PH43, 0.22137
> 35:
> Read 34 records
>
> > summary(sem(mod4, cor18, 500))
>
>  Model Chisquare =  80.675   Df =  48 Pr(>Chisq) = 0.0021920
>  Chisquare (null model) =  1106.4   Df =  66
>  Goodness-of-fit index =  0.9747
>  Adjusted goodness-of-fit index =  0.95888
>  RMSEA index =  0.036935   90% CI: (0.022163, 0.050657)
>  Bentler-Bonnett NFI =  0.92708
>  Tucker-Lewis NNFI =  0.95682
>  Bentler CFI =  0.9686
>  SRMR =  0.032512
>  BIC =  -217.63
>
>  Normalized Residuals
>Min.  1st Qu.   Median Mean  3rd Qu. Max.
> -1.71000 -0.23300 -0.00337  0.08850  0.26700  2.13000
>
>  Parameter Estimates
> Estimate Std Error z value Pr(>|z|)
> TD11 0.30641  0.037053   8.2694 2.2204e-16 X1 <--> X1
> TD22 0.33226  0.037158   8.9419 0.e+00 X2 <--> X2
> TD33 0.48899  0.039007  12.5358 0.e+00 X3 <--> X3
> TD44 0.62205  0.076640   8.1165 4.4409e-16 X4 <--> X4
> TD55 0.78652  0.063364  12.4126 0.e+00 X5 <--> X5
> TD66 0.68780  0.070102   9.8114 0.e+00 X6 <--> X6
> TD77 0.79474  0.062019  12.8144 0.e+00 X7 <--> X7
> TD88 0.67378  0.069039   9.7595 0.e+00 X8 <--> X8
> TD99 0.60536  0.075437   8.0247 1.1102e-15 X9 <--> X9
> TDaa 0.39902  0.094378   4.2279 2.3590e-05 X10 <--> X10
> TDbb 0.74223  0.060911  12.1854 0.e+00 X11 <--> X11
> TDcc 0.84956  0.060891  13.9523 0.e+00 X12 <--> X12
> LY11 0.83282  0.040846  20.3895 0.e+00 X1 <--- xi1
> LY21 0.81715  0.041065  19.8990 0.e+00 X2 <--- xi1
> LY31 0.71485  0.042041  17.0036 0.e+00 X3 <--- xi1
> LY42 0.61478  0.066956   9.1818 0.e+00 X4 <--- xi2
> LY52 0.46204  0.059887   7.7152 1.1990e-14 X5 <--- xi2
> LY62 0.55875  0.064082   8.7192 0.e+00 X6 <--- xi2
> LY73 0.45306  0.058293   7.7721 7.7716e-15 X7 <--- xi3
> LY83 0.57116  0.062721   9.1064 0.e+00 X8 <--- xi3
> LY93 0.62821  0.065434   9.6007 0.e+00 X9 <--- xi3
> LXa4 0.77523  0.069569  11.1434 0.e+00 X10 <--- xi4
> LXb4 0.50771  0.056580   8.9733 0.e+00 X11 <--- xi4
> LXc4 0.38786  0.056614   6.8510 7.3350e-12 X12 <--- xi4
> PH12 0.13207  0.064099   2.0604 3.9361e-02 xi2 <--> xi1
> PH13 0.17417  0.063512   2.7423 6.1006e-03 xi3 <--> xi1
> PH23 0.25059  0.077099   3.2503 1.1529e-03 xi3 <--> xi2
> PH41 0.36109  0.055310   6.5285 6.6416e-11 xi1 <--> xi4
> PH42 0.12606  0.072905   1.7292 8.3780e-02 xi2 <--> xi4
> PH43 0.22301  0.071781   3.1068 1.8913e-03 xi3 <--> xi4
>
>  Iterations =  14
> Warning message:
> In sem.mod(mod4, cor18, 500) :
>  The following observed variables are in the input covariance or raw-moment
> matrix but do not appear in the model:
> X13, X14, X15, X16, X17, X18
>
> >
>
> --
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaste

[R] Conditional Counting with Table

2008-12-23 Thread Gundala Viswanath
Dear all,

I have the following data frame:

V1   V2
aaa  chr1
aaa  chr2
aaa  NM
aaa  QC
aaa  chr10
att  NM
att  chr7

What I want to do is count the strings in V1,
but with the condition that if V2 for a row
is "NM" or "QC", then the count is not increased.

Hence the contingency table will look like this:

#tag   count
aaa    3
att    1

Is there a compact way to achieve this in R?
I am thinking of using "table" but can't see
how to impose such a condition on it.


- Gundala Viswanath
Jakarta - Indonesia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Write a data frame to Oracle Data Base

2008-12-23 Thread usstata
Hi guys:



When I use the ROracle package, I found that the function 'dbWriteTable'
does not work.

The data frame cannot be written to a 10g Oracle database. Here is what
happened to me.


>library(DBI)
>library(ROracle)
>drv <- Oracle()
>con <- dbConnect(drv , 'uid','pw','db')
>mm <- data.frame(CO2)
> dbWriteTable(con,'eve_tmp',mm)
Error in function (classes, fdef, mtable) :
unable to find an inherited method for function "dbPrepareStatement",
for signature "OraConnection", "character", "list"
[1] FALSE

I also tried the RODBC package, and its 'write' function works.
Is this a problem with the Oracle version? :(


Best regards

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Non-finite finite difference error

2008-12-23 Thread JS . Augustyn
Hello, I'm trying to use fitdistr() from the MASS package to fit a gamma  
distribution to a set of data. The data set is too large (1167 values) to  
reproduce in an email, but the summary statistics are:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  116.7   266.7   666.7  1348.0  1642.0 16720.0

The call I'm trying to make is:
fitdistr(x,"gamma")

and the error is:
Error in optim(x = c(3466.676842, 1666.749002, 2500.067852, 1200.053892, :
non-finite finite-difference value [2]
In addition: Warning message:
In dgamma(x, shape, scale, log) : NaNs produced

I found a couple of other posts from folks who were getting the same error  
from optim(), but did not find any useful tips for my situation. The error  
seems to indicate a problem with value 2 in my data set (1666.749002), but  
nothing seems odd about that value.

I'm willing to pass along the full data set as an attachment if it would  
help.

Thank you in advance!

Jason S. Augustyn, Ph.D.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using SPSS Labels

2008-12-23 Thread Andrew Choens
I am trying to import an SPSS .sav file into R. The attached file is not
technically the file I am trying to import, but it does replicate my
problem. The actual file is much too large to attach. No matter what I
do, I cannot get R (base or Hmisc) to apply the value labels in
the .sav file to the data frame created in R. Here's the code that I am
using.

maine <- spss.get("test.sav")
# or
maine2 <- read.spss("test.sav", read.value.labels=TRUE)

When I try to import the file, the value labels are not assigned to the
rows. This is what I get.

   ID GENDER
1   1  1
2   2  2
3   3  1
4   4  2
5   5  1
6   6  1
7   7  1
8   8  2
9   9  2
10 10  1
11 11  3

In the .sav file, 1 = Men, 2 = Women, and 3 = user-assigned missing.

The variable values are attached as a value.labels attribute. If I
remove row # 11 (gender = 3), I can import the file as I expect.

   ID GENDER
1   1  Men
2   2  Women
3   3  Men
4   4  Women
5   5  Men
6   6  Men
7   7  Men
8   8  Women
9   9  Women
10 10  Men

Given all of this: how can I import a .sav file with user-assigned
missing values correctly?

If this is not possible, what is the best way to use the value.labels
attribute when I make a table with table(Gender)?

Thanks.

-- 
Insert something humorous here.  :-)
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I reload sessions from a non-default directory in OS X?

2008-12-23 Thread David Winsemius


On Dec 23, 2008, at 10:02 PM, David Winsemius wrote:


On Dec 23, 2008, at 5:18 PM, Bill McNeill (UW) wrote:

I use the GUI version of R on OS X. I launch it by double-clicking on the
R icon. The process always starts in my home directory, which means that
the only .RData file that is ever read in is the one in my home directory.
I would instead like to have different R sessions saved in different
directories, but I can't figure out how to do this.

A workaround is to use the shell version of R and always change to the
directory containing my saved session before launching, but I prefer to
use the GUI version.

Is there a command to tell R to load a session from a particular location?

How do other people with Macs do this?


?load

(It's not Mac-specific.)
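
For example (the path here is only an illustration):

load("~/projects/my_analysis/.RData")   # load a workspace saved in another directory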





If you want to clean out the workspace you could use:

rm( ls() )


That won't do anything. Instead, try:

rm(list = ls())




You can also associate file types with your R.app.

--
David Winsemius



--
Bill McNeill
http://staff.washington.edu/billmcn/index.shtml

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I reload sessions from a non-default directory in OS X?

2008-12-23 Thread ronggui
You can use setwd() to change the working directory.
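
For example (the directory path is only an illustration):

setwd("~/projects/my_analysis")   # make this the working directory
load(".RData")                    # then load the workspace saved there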

Merry X-mas

On Wed, Dec 24, 2008 at 6:18 AM, Bill McNeill (UW)
 wrote:
> I use the GUI version of R on OS X.  I launch it by double-clicking on the R
> icon.  The process always starts in my home directory which means that the
> only .RData file that is ever read in is the one in my home directory.  I
> would instead like to have different R sessions saved in different
> directories, but I can't figure out how to do this.
>
> A workaround is to use the shell version of R and always change to the
> directory containing my saved session before launching, but I prefer to use
> the GUI version.
>
> Is there a command to tell R to load a session from a particular location?
> How do other people with Macs do this?
>
> --
> Bill McNeill
> http://staff.washington.edu/billmcn/index.shtml
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
HUANG Ronggui, Wincent
Tel: (00852) 3442 3832
PhD Candidate, City University of Hong Kong
Website: http://ronggui.huang.googlepages.com/
RQDA project: http://rqda.r-forge.r-project.org/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I reload sessions from a non-default directory in OS X?

2008-12-23 Thread Charles C. Berry



In the

R for Mac OS X FAQ

(available in the Help menu)

see
4.5.4 Misc Menu
and
4.5.5 Workspace Menu



This is an R-sig-mac question, so posting there would have been preferred.

Chuck


On Tue, 23 Dec 2008, Bill McNeill (UW) wrote:


I use the GUI version of R on OS X.  I launch it by double-clicking on the R
icon.  The process always starts in my home directory which means that the
only .RData file that is ever read in is the one in my home directory.  I
would instead like to have different R sessions saved in different
directories, but I can't figure out how to do this.

A workaround is to use the shell version of R and always change to the
directory containing my saved session before launching, but I prefer to use
the GUI version.

Is there a command to tell R to load a session from a particular location?
How do other people with Macs do this?

--
Bill McNeill
http://staff.washington.edu/billmcn/index.shtml

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2's qplot() not rendering title descender

2008-12-23 Thread hadley wickham
Hi Mike,

Yup, that's a bug that will be fixed in the next version.  In the
meantime one easy fix is to add an \n on the end of the title string.
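
With the example from the quoted message, that would be:

qplot(x = 1:10, y = 1:10, main = 'p q j g\n')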

Hadley

On Tue, Dec 23, 2008 at 7:02 PM, Mike Lawrence  wrote:
> On my machine (Mac OS 10.5.6, R 2.8.1) the following plot is drawn such that
> a substantial portion of the descender of the title is covered by the plot:
> library(ggplot2)
> qplot(x=1:10,y=1:10,main='p q j g')
>
> --
> Mike Lawrence
> Graduate Student
> Department of Psychology
> Dalhousie University
> www.thatmike.com
>
> Looking to arrange a meeting? Do so at:
> http://www.timetomeet.info/with/mike/
>
> ~ Certainty is folly... I think. ~
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract values based on indexes without looping?

2008-12-23 Thread Charles C. Berry

On Tue, 23 Dec 2008, Sean Zhang wrote:


Dear R-Helpers:

I am an entry-level user of R.



You need to read

An Introduction to R

particularly this section:

2.7 Index vectors; selecting and modifying subsets of a data set

...

2. A vector of positive integral quantities

Also, see

?Subscript

and run

example( Subscript )

HTH,

Chuck




Have the following question. Many thanks in advance.


# value.vec stores values
value.vec <- c('a','b','c')
#  which.vec stores the locations/indexes of values in value.vec.
which.vec <- c(3, 2, 2, 1)
# How can I obtain the following vector based on the value.vec and which.vec
# mentioned above?
# vector.I.want <- c('c', 'b', 'b', 'a')
#                        3    2    2    1


# I try to avoid using the following loop to achieve the goal because
# which.vec in reality will be very long

vector.I.want <- rep(NA,length(which.vec))
for (i in 1:length(which.vec))
{ vector.I.want[i] <- value.vec[which.vec[i]] }

# is there a faster way than looping?

Thanks in advance.

-Sean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 4 questions regarding hypothesis testing, survey package, ts on samples, plotting

2008-12-23 Thread Ben Bolker



Khawaja, Aman wrote:
> 
> I need to answer one of the questions in my open source test: What are
> the four questions asked about the parameters in hypothesis testing?
> 
> 

Please check the posting guide.
* We don't answer homework questions ("open source" doesn't mean
that other people answer the questions for you, it means you can find
the answers outside your own head -- and in any case, we don't have
any of way of knowing that the test is really open).
* this is not an R question but a statistics question
* please don't post the same question multiple times

  sincerely
Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/4-questions-regarding-hypothesis-testing%2C-survey-package%2C-ts-on-samples%2C-plotting-tp21154468p21154709.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract values based on indexes without looping?

2008-12-23 Thread Jorge Ivan Velez
How about this?
value.vec[which.vec]
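
With the example data from the quoted message, this gives:

value.vec <- c('a','b','c')
which.vec <- c(3, 2, 2, 1)
value.vec[which.vec]
## [1] "c" "b" "b" "a"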

HTH,

Jorge


On Tue, Dec 23, 2008 at 9:52 PM, Sean Zhang  wrote:

> Dear R-Helpers:
>
> I am an entry-level user of R.
>
> Have the following question. Many thanks in advance.
>
>
> # value.vec stores values
> value.vec <- c('a','b','c')
> #  which.vec stores the locations/indexes of values in value.vec.
> which.vec <- c(3, 2, 2, 1)
> # How can I obtain the following vector based on the value.vec and which.vec
> # mentioned above?
> # vector.I.want <- c('c', 'b', 'b', 'a')
> #                        3    2    2    1
>
>
> # I try to avoid using the following loop to achieve the goal because
> # which.vec in reality will be very long
>
> vector.I.want <- rep(NA,length(which.vec))
> for (i in 1:length(which.vec))
> { vector.I.want[i] <- value.vec[which.vec[i]] }
>
> # is there a faster way than looping?
>
> Thanks in advance.
>
> -Sean
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract values based on indexes without looping?

2008-12-23 Thread David Winsemius

> sapply(which.vec, function(x){ value.vec[x] } )
[1] "c" "b" "b" "a"


On Dec 23, 2008, at 9:52 PM, Sean Zhang wrote:


# value.vec stores values
value.vec <- c('a','b','c')
#  which.vec stores the locations/indexes of values in value.vec.
which.vec <- c(3, 2, 2, 1)
# How can I obtain the following vector based on the value.vec and which.vec
# mentioned above?
# vector.I.want <- c('c', 'b', 'b', 'a')



vector.I.want <- rep(NA,length(which.vec))
for (i in 1:length(which.vec))
{ vector.I.want[i] <- value.vec[which.vec[i]] }


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] wavelet

2008-12-23 Thread stephen sefick
I would like to be able to perform a DWT, filter out everything
except for 2^0, and then take that back into the time domain.  Does
anyone have any suggestions?  I am using wmtsa to try to do this.

-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I reload sessions from a non-default directory in OS X?

2008-12-23 Thread David Winsemius

On Dec 23, 2008, at 5:18 PM, Bill McNeill (UW) wrote:

I use the GUI version of R on OS X. I launch it by double-clicking on the
R icon. The process always starts in my home directory, which means that
the only .RData file that is ever read in is the one in my home directory.
I would instead like to have different R sessions saved in different
directories, but I can't figure out how to do this.

A workaround is to use the shell version of R and always change to the
directory containing my saved session before launching, but I prefer to
use the GUI version.

Is there a command to tell R to load a session from a particular location?

How do other people with Macs do this?


?load

(It's not Mac-specific.)

If you want to clean out the workspace you could use:

rm( ls() )

You can also associate file types with your R.app.

--
David Winsemius



--
Bill McNeill
http://staff.washington.edu/billmcn/index.shtml

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract values based on indexes without looping?

2008-12-23 Thread Sean Zhang
Dear R-Helpers:

I am an entry-level user of R.

Have the following question. Many thanks in advance.


# value.vec stores values
value.vec <- c('a','b','c')
#  which.vec stores the locations/indexes of values in value.vec.
which.vec <- c(3, 2, 2, 1)
# How can I obtain the following vector based on the value.vec and which.vec
# mentioned above?
# vector.I.want <- c('c', 'b', 'b', 'a')
#                        3    2    2    1


# I try to avoid using the following loop to achieve the goal because
# which.vec in reality will be very long

vector.I.want <- rep(NA,length(which.vec))
for (i in 1:length(which.vec))
{ vector.I.want[i] <- value.vec[which.vec[i]] }

# is there a faster way than looping?

Thanks in advance.

-Sean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 4 questions regarding hypothesis testing, survey package, ts on samples, plotting

2008-12-23 Thread Khawaja, Aman
I need to answer one of the questions in my open source test: What are
the four questions asked about the parameters in hypothesis testing?

Thanks.
Aman
 
- CONFIDENTIAL-
This email and any files transmitted with it are confide...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do I reload sessions from a non-default directory in OS X?

2008-12-23 Thread Bill McNeill (UW)
I use the GUI version of R on OS X.  I launch it by double-clicking on the R
icon.  The process always starts in my home directory which means that the
only .RData file that is ever read in is the one in my home directory.  I
would instead like to have different R sessions saved in different
directories, but I can't figure out how to do this.

A workaround is to use the shell version of R and always change to the
directory containing my saved session before launching, but I prefer to use
the GUI version.

Is there a command to tell R to load a session from a particular location?
How do other people with Macs do this?

-- 
Bill McNeill
http://staff.washington.edu/billmcn/index.shtml

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2's qplot() not rendering title descender

2008-12-23 Thread Mike Lawrence
On my machine (Mac OS 10.5.6, R 2.8.1) the following plot is drawn such that
a substantial portion of the descender of the title is covered by the plot:
library(ggplot2)
qplot(x=1:10,y=1:10,main='p q j g')

-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
www.thatmike.com

Looking to arrange a meeting? Do so at:
http://www.timetomeet.info/with/mike/

~ Certainty is folly... I think. ~

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 4 questions regarding hypothesis testing, survey package, ts on samples, plotting

2008-12-23 Thread Khawaja, Aman
Please provide me with information on what the four questions about the
parameters in hypothesis testing are.

Thanks.
Aman
 
- CONFIDENTIAL-
This email and any files transmitted with it are confide...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] .C and 'temporaries'

2008-12-23 Thread Duncan Murdoch

On 23/12/2008 6:55 PM, Thomas Mang wrote:

Hello,

Before I get into troubles I ask here:

I make a call to a C function using .C. What I would like to know is if 
the arguments in the .C call can also be 'temporary' objects.


An example will illustrate this:

# here we don't have a temporary
Obj1 = 5
.C("Func", as.integer(Obj1 ), ...) # certainly works

# here we do have a temporary
Obj2 = 5
.C("Func", as.integer(Obj2 + 100), ...) # is this guaranteed to work?


Is the second call valid?


Yes.  It makes a copy of the object computed as Obj2 + 100, and passes 
that to Func.



Is it also valid if DUP = FALSE, or only if DUP = TRUE ?


I suspect it is, but I would never use DUP = FALSE.  What the docs say 
is that this will let you modify the temporary object Obj2 + 100, and 
then the results will be returned in the return value of .C.  So it 
should be safe, but it is probably not tested very frequently.  If you 
are worried about the overhead of duplicating the vector, it's probably 
time to learn the .Call interface instead.
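
To illustrate the point about results coming back in the return value
(using the poster's hypothetical routine "Func"; the list returned by
.C() holds the copies the C code saw, under the argument names):

out <- .C("Func", x = as.integer(Obj2 + 100))
out$x   # the copy passed to Func, including any changes Func made to it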


Duncan Murdoch




Thanks,
Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] beginner data.frame question

2008-12-23 Thread John Fox
Dear Kirk,

Actually, co2 isn't a data frame but rather a "ts" (timeseries) object. A
nice thing about R is that you can query and examine objects:

> class(co2)
[1] "ts"

> str(co2) # structure of object
 Time-Series [1:468] from 1959 to 1998: 315 316 316 318 318 ...

> unclass(co2)
  [1] 315.42 316.31 316.50 317.56 318.13 318.00 316.39 314 . . .
 [33] 314.83 315.16 315.94 316.85 317.78 318.40 319.53 320.42 . . .
 [65] 322.06 321.73 320.27 318.54 316.54 316.71 317.53 318.55 . . .
 [97] 322.17 322.34 322.88 324.25 324.83 323.93 322.38 320.76  . . .
. . .
[449] 365.45 365.01 363.70  . . . 360.83 362.49 364.34
attr(,"tsp")
[1] 1959.000 1997.917   12.000
>

(where . . . represents output that I've elided).

I hope this helps,
 John

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
> Behalf Of Kirk Wythers
> Sent: December-23-08 6:30 PM
> To: r-help@r-project.org
> Subject: [R] beginner data.frame question
> 
> I need some help understanding how one of the example data sets is
> formatted in the basic R installation. If I load the Mauna Loa CO2
> data, with the command:
> 
>  > data(co2)
> 
> I can view the data with:
> 
>  > co2
> 
> And the data are in the form of 11 rows labeled as years (1994-2004)
> and 12 columns labeled (Jan - Dec). This structure appears to be a
> dataframe, however, if I type the command
> 
>  > plot(co2)
> 
> I get a time series with CO2 on the x axis and time on the y. Also,
> 
>  > summary(co2) gives a single Min, Median, Max.
> 
> The reason for my confusion is that I created another "similar
> looking" data set with read.table. In that case, the data looks to be
> in same format (rows as years, and columns as months). However, the
> command
> 
>  > summary(test.data)
> 
> gives a summary for each month. Completely different behavior.
> 
> If use the data.frame command:
> 
>  > data.frame(co2)
> 
> I get a single column of CO2 data, while the data.frame command on my
> test.data data, keeps its year-row, column-month format.
> 
> Can anyone help me understand the differences in how these data sets
> are formatted?
> 
> Thanks in advance.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] .C and 'temporaries'

2008-12-23 Thread Thomas Mang

Hello,

Before I get into troubles I ask here:

I make a call to a C function using .C. What I would like to know is if 
the arguments in the .C call can also be 'temporary' objects.


An example will illustrate this:

# here we don't have a temporary
Obj1 = 5
.C("Func", as.integer(Obj1 ), ...) # certainly works

# here we do have a temporary
Obj2 = 5
.C("Func", as.integer(Obj2 + 100), ...) # is this guaranteed to work?


Is the second call valid?
Is it also valid if DUP = FALSE, or only if DUP = TRUE ?

Thanks,
Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] beginner data.frame question

2008-12-23 Thread David Winsemius

> class(co2)
[1] "ts"
> is.data.frame(co2)
[1] FALSE

If you try these functions on your test.data I am guessing that the  
results will differ from above.


--
David Winsemius
On Dec 23, 2008, at 6:29 PM, Kirk Wythers wrote:

I need some help understanding how one of the example data sets is
formatted in the basic R installation. If I load the Mauna Loa CO2
data, with the command:


> data(co2)

I can view the data with:

> co2

And the data are in the form of 11 rows labeled as years (1994-2004)  
and 12 columns labeled (Jan - Dec). This structure appears to be a  
dataframe, however, if I type the command


> plot(co2)

I get a time series with CO2 on the x axis and time on the y. Also,

> summary(co2) gives a single Min, Median, Max.

The reason for my confusion is that I created another "similar  
looking" data set with read.table. In that case, the data looks to  
be in same format (rows as years, and columns as months). However,  
the command


> summary(test.data)

gives a summary for each month. Completely different behavior.

If use the data.frame command:

> data.frame(co2)

I get a single column of CO2 data, while the data.frame command on  
my test.data data, keeps its year-row, column-month format.


Can anyone help me understand the differences in how these data sets  
are formatted?


Thanks in advance.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] beginner data.frame question

2008-12-23 Thread Kirk Wythers
I need some help understanding how one of the example data sets is
formatted in the basic R installation. If I load the Mauna Loa CO2
data, with the command:


> data(co2)

I can view the data with:

> co2

And the data are in the form of 11 rows labeled as years (1994-2004)  
and 12 columns labeled (Jan - Dec). This structure appears to be a  
dataframe, however, if I type the command


> plot(co2)

I get a time series with CO2 on the x axis and time on the y. Also,

> summary(co2) gives a single Min, Median, Max.

The reason for my confusion is that I created another "similar  
looking" data set with read.table. In that case, the data looks to be  
in same format (rows as years, and columns as months). However, the  
command


> summary(test.data)

gives a summary for each month. Completely different behavior.

If use the data.frame command:

> data.frame(co2)

I get a single column of CO2 data, while the data.frame command on my  
test.data data, keeps its year-row, column-month format.


Can anyone help me understand the differences in how these data sets  
are formatted?


Thanks in advance.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] QCA and Fuzzy

2008-12-23 Thread Adrian Dusa
Hi ronggui,

I believe it is "dear Adrian and Prof. Gott" :)
(at least I feel more comfortable being called directly)

Thanks very much for this. I am developing a series of functions for fuzzy-set
operations to be included in the next release of the QCA package, and your
function seems to do a good job.

I am not currently at my office (away for the Christmas holiday), but I will 
of course resume my work once I get back in Bucharest.

In the meantime, I wish you all a Merry Christmas and lots of achievements in
the new year.
Warm regards,
Adrian

On Tuesday 23 December 2008, ronggui wrote:
> Dear  Gott and Prof Adrian DUSA ,
>
> I am learning fuzzy-set QCA, and I recently wrote a function to
> construct a truthTable, which can be passed to QCA:::eqmcc to do the
> Boolean minimization.  The function is here:
> http://code.google.com/p/asrr/source/browse/trunk/R/fs_truthTable.R
> and the help page is:
> http://code.google.com/p/asrr/source/browse/trunk/man/fs_truthTable.rd
> and the example dataset  from Ragin (2009) is here
> http://code.google.com/p/asrr/source/browse/trunk/data/Lipset_fs.rda
>
> Best
>
> On Wed, Mar 8, 2006 at 2:13 AM, Adrian DUSA  wrote:
> > Dear Prof. Gott,
> >
> > On Monday 06 March 2006 14:37, R Gott wrote:
> >> Does anybody know of anything that will help me do Qualitative
> >> Comparative Analysis (QCA) and/or fuzzy-set analysis?  Or, failing that,
> >> Quine?
> >> ta
> >> rg
> >> Prof R Gott
> >> Durham Univesrity
> >> UK
> >
> > There is a package called QCA which (in its first release) performs only
> > crisp-set analysis. I am currently adapting a graphical user interface,
> > but the functions are nevertheless useful.
> > For fuzzy set analysis, please consider Charles Ragin's web site
> > http://www.u.arizona.edu/%7Ecragin/fsQCA/index.shtml
> > which offers software (still not complete, though). Also worth considering
> > is a good program called Tosmana (http://www.tosmana.org/), which does
> > multi-value QCA.
> > I am considering writing the inclusion algorithms in the next releases of
> >  my package, but it is going to take a little while. Any contributions
> > and/or feedback are more than welcome.
> >
> > I hope this helps you,
> > Adrian
> >
> >
> > --
> > Adrian DUSA
> > Romanian Social Data Archive
> > 1, Schitu Magureanu Bd
> > 050025 Bucharest sector 5
> > Romania
> > Tel./Fax: +40 21 3126618 \
> >  +40 21 3120210 / int.101
> >
> > __
> > r-h...@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html


-- 
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \
 +40 21 3120210 / int.101
Fax: +40 21 3158391


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie problem using Design.rcs

2008-12-23 Thread Frank E Harrell Jr

sp wrote:

Dr. Harrell,

Thanks for your patient replies. I've two more questions:

1) Is your book appropriate for beginners or is it more for advanced users?


It is in the middle of those two.



2) f <- ols(y ~ rcs(x,3), data=mydata)
Function(f)
does not produce anything for me (i.e.) it's empty.


Here's a test I just ran

> y <- rnorm(100)
> x <- runif(100)
> f <- ols(y ~ rcs(x,3))
> Function(f)
function(x = NA) {-0.18307779+0.91343138* 
x-2.1908543*pmax(x-0.10568075,0)^3+3.8312836*pmax(x-0.45620427,0)^3-1.6404293*pmax(x-0.92434146,0)^3 
}



If you are doing this inside { } you will need to do print(Function(f))

Note this is a simplified version of the rcs with 2 redundant terms to 
avoid writing in terms of differences in cubes.


Frank



Sincerely,
sp


--- On Tue, 12/23/08, Frank E Harrell Jr  wrote:


From: Frank E Harrell Jr 
Subject: Re: [R] newbie problem using Design.rcs
To: to_rent_2...@yahoo.com
Cc: "David Winsemius" , r-help@r-project.org
Date: Tuesday, December 23, 2008, 4:57 PM
sp wrote:



2. I didn't have x^3 b/c that co-efficient happens

to be zero in this fitting.

That's strange.


Also, I'm forced to call win.graph() before my

first plot() to see the first plot. Is that normal?

no


I started with testing it on just x,y dimensions so

that I can visually evaluate the fitting. I tried y=x, y=x^2
etc, adding Gaussian noise each time (to the y). 

I plot original x,y and x,y' where y' is

calculated using the co-efficients returned by rcs. I find
that the regression curve differs from the actual points by
as high as 10^5 with 3 knots and roughly -10^5 with 4 knots
as I make y=x^2, y=x^3

wait until you have studied regression

Frank



If this is NOT a good way to test fitting, could you

pls tell me a better way?

Respectfully,
sp



  




--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie problem using Design.rcs

2008-12-23 Thread sp
Dr. Harrell,

Thanks for your patient replies. I've two more questions:

1) Is your book appropriate for beginners or is it more for advanced users?

2) f <- ols(y ~ rcs(x,3), data=mydata)
Function(f)
does not produce anything for me (i.e.) it's empty.

Sincerely,
sp


--- On Tue, 12/23/08, Frank E Harrell Jr  wrote:

> From: Frank E Harrell Jr 
> Subject: Re: [R] newbie problem using Design.rcs
> To: to_rent_2...@yahoo.com
> Cc: "David Winsemius" , r-help@r-project.org
> Date: Tuesday, December 23, 2008, 4:57 PM
> sp wrote:

> > 2. I didn't have x^3 b/c that co-efficient happens
> to be zero in this fitting.
> 
> That's strange.
> 
> > 
> > Also, I'm forced to call win.graph() before my
> first plot() to see the first plot. Is that normal?
> 
> no
> 
> > 
> > I started with testing it on just x,y dimensions so
> that I can visually evaluate the fitting. I tried y=x, y=x^2
> etc, adding Gaussian noise each time (to the y). 
> > I plot original x,y and x,y' where y' is
> calculated using the co-efficients returned by rcs. I find
> that the regression curve differs from the actual points by
> as high as 10^5 with 3 knots and roughly -10^5 with 4 knots
> as I make y=x^2, y=x^3
> 
> wait until you have studied regression
> 
> Frank
> 
> 
> > 
> > If this is NOT a good way to test fitting, could you
> pls tell me a better way?
> > 
> > Respectfully,
> > sp

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie problem using Design.rcs

2008-12-23 Thread Frank E Harrell Jr

sp wrote:

Sincere thanks for both the replies.

0. I agree, I'm waiting for my copy of a regression book to arrive. Meanwhile, 
I'm trying to read on google.

1. My bad, I'm using Gaussian noise.

2. I didn't have x^3 b/c that co-efficient happens to be zero in this fitting.


That's strange.



3. I used lines() b/c I wanted to superimpose the curve from regression atop my first plot of the original data points (x,y). 
I'm not sure how to use plot(f, x1 = NA) after my first plot(). The examples I managed to find on google all use plot() followed by lines(). [In Matlab, I'd just say "hold" in between these calls.]


plot(f, x1=NA)
plot(f, x2=NA, add=TRUE)



Also, I'm forced to call win.graph() before my first plot() to see the first 
plot. Is that normal?


no



4. I really could use some guidance on this part. I need to use rcs() to fit points in a high-dimensional space and I'm trying to understand and use it correctly. 


keep reading



I started with testing it on just x,y dimensions so that I can visually evaluate the fitting. I tried y=x, y=x^2 etc, adding Gaussian noise each time (to the y). 


I plot original x,y and x,y' where y' is calculated using the co-efficients 
returned by rcs. I find that the regression curve differs from the actual 
points by as high as 10^5 with 3 knots and roughly -10^5 with 4 knots as I make 
y=x^2, y=x^3


wait until you have studied regression

Frank




If this is NOT a good way to test fitting, could you pls tell me a better way?

Respectfully,
sp



--- On Tue, 12/23/08, Frank E Harrell Jr  wrote:


From: Frank E Harrell Jr 
Subject: Re: [R] newbie problem using Design.rcs
To: "David Winsemius" 
Cc: to_rent_2...@yahoo.com, r-help@r-project.org
Date: Tuesday, December 23, 2008, 9:41 AM
In addition to David's excellent response, I'll add
that your problems seem to be statistical and not
programming ones.  I recommend that you spend a significant
amount of time with a good regression text or course before
using the software.  Also, with Design you can find out the
algebraic form of the fit:

f <- ols(y ~ rcs(x,3), data=mydata)
Function(f)

Frank


David Winsemius wrote:

On Dec 22, 2008, at 11:38 PM, sp wrote:


Hi,

I read data from a file. I'm trying to

understand how to use Design.rcs by using simple test data
first. I use 1000 integer values (1,...,1000) for x (the
predictor) with some noise (x+.02*x) and I set the response
variable y=x. Then, I try rcs and ols as follows:

Not sure what sort of noise that is.


m = ( sqrt(y1) ~ ( rcs(x1,3) ) ); # I tried without sqrt also

f = ols(m, data=data_train.df);
print(f);

[I plot original x1,y1 vectors and the regression

as in

y <- coef2[1] + coef2[2]*x1 + coef2[3]*x1*x1]

That does not look as though it would capture the

structure of a restricted **cubic** spline. The usual method
in Design for plotting a model prediction would be:

plot(f, x1 = NA)



But this gives me a VERY bad fit:
"

Can you give some hint why you consider this to be a

"VERY bad fit"? It appears a rather good fit to
me, despite the test case apparently not being construct
with any curvature which is what the rcs modeling strategy
should be detecting.

-- Frank E Harrell Jr   Professor and Chair  
School of Medicine
 Department of Biostatistics  
Vanderbilt University



  




--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NetCDF within R: installation assistance

2008-12-23 Thread Prof Brian Ripley

On Tue, 23 Dec 2008, Ken Schmidt wrote:

Greetings.  I am attempting to add NetCDF libraries within R, and have 
failed.  We have R version 2.8, and are running on a 64-bit Redhat Linux 
2.6.18 kernel:


Red Hat Enterprise Linux Client release 5.2 (Tikanga)
Linux halfmoon.ncdc.noaa.gov 2.6.18-92.1.22.el5 #1 SMP Fri Dec 5 09:28:22 EST 
2008 x86_64 x86_64 x86_64 GNU/Linux


I have run the installation instructions found at 
"http://www.image.ucar.edu/Software/Netcdf/";, but not successfully.  I'm 
wondering if you can give some advice that will change our outcome?


We've run "R CMD INSTALL 
--configure-args="-with-netcdf_incdir=/usr/local/netcdf-3.6.1/include 
-with-netcdf_libdir=/usr/local/netcdf-3.6.1/lib" ncdf_1.6.tar.gz" against 
NetCDF 3.6.1, 3.6.2, and 4.0 all with the same results.
I think our being on a 64-bit Linux has to do with the failure.  Can you 
confirm this?


No, it works successfully, as the CRAN check logs will confirm at
http://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64/RNetCDF-00check.html

You need a PIC library on x86_64 Linux: it is normal to use dynamic
libraries (which are PIC), and that is the problem you need to address
(with your RHEL support, not here).




Outputs are below from the 64-bit system "halfmoon":

[r...@halfmoon local]# R --version
R version 2.8.0 (2008-10-20)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License version 2.
For more information about these matters see
http://www.gnu.org/licenses/.


Here is my output:

[r...@halfmoon local]# R CMD INSTALL 
--configure-args="-with-netcdf_incdir=/usr/local/netcdf-3.6.1/include 
-with-netcdf_libdir=/usr/local/netcdf-3.6.1/lib" ncdf_1.6.tar.gz

* Installing to library '/usr/local/lib64/R/library'
* Installing *source* package 'ncdf' ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ANSI C... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for egrep... grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking /usr/local/netcdf-3.6.1/include/netcdf.h usability... yes
checking /usr/local/netcdf-3.6.1/include/netcdf.h presence... yes
checking for /usr/local/netcdf-3.6.1/include/netcdf.h... yes
Using user-specified netCDF include dir=/usr/local/netcdf-3.6.1/include
Found netcdf.h in: /usr/local/netcdf-3.6.1/include
checking for /usr/local/netcdf-3.6.1/lib/libnetcdf.a... yes
Using user-specified netCDF library dir=/usr/local/netcdf-3.6.1/lib
Found netcdf library file libnetcdf.a in directory 
/usr/local/netcdf-3.6.1/lib

configure: creating ./config.status
config.status: creating R/load.R
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/local/lib64/R/include -I/usr/local/netcdf-3.6.1/include 
-I/usr/local/include-fpic  -g -O2 -c ncdf2.c -o ncdf2.o
gcc -std=gnu99 -I/usr/local/lib64/R/include -I/usr/local/netcdf-3.6.1/include 
-I/usr/local/include-fpic  -g -O2 -c ncdf3.c -o ncdf3.o

ncdf3.c: In function 'R_nc_get_vara_charvarid':
ncdf3.c:221: warning: assignment discards qualifiers from pointer target type
ncdf3.c: In function 'R_nc_get_vara_numvarid':
ncdf3.c:267: warning: assignment discards qualifiers from pointer target type
gcc -std=gnu99 -I/usr/local/lib64/R/include -I/usr/local/netcdf-3.6.1/include 
-I/usr/local/include-fpic  -g -O2 -c ncdf.c -o ncdf.o

ncdf.c: In function 'R_nc_ttc_to_nctype':
ncdf.c:424: warning: implicit declaration of function 'exit'
ncdf.c:424: warning: incompatible implicit declaration of built-in function 
'exit'
gcc -std=gnu99 -shared -L/usr/local/lib64 -o ncdf.so ncdf2.o ncdf3.o ncdf.o 
-L/usr/local/netcdf-3.6.1/lib -lnetcdf
/usr/bin/ld: /usr/local/netcdf-3.6.1/lib/libnetcdf.a(attr.o): relocation 
R_X86_64_32 against `a local symbol' can not be used when making a shared 
object; recompile with -fPIC

/usr/local/netcdf-3.6.1/lib/libnetcdf.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [ncdf.so] Error 1
chmod: cannot access `/usr/local/lib64/R/library/ncdf/libs/*': No such file 
or directory

ERROR: compilation failed for package 'ncdf'
** Removing '/usr/local/lib64/R/library/ncdf'

Thanks in advance,

Ken Schmidt
National Climatic Data Center
Asheville, NC 28801

_

Re: [R] newbie problem using Design.rcs

2008-12-23 Thread sp
Sincere thanks for both the replies.

0. I agree, I'm waiting for my copy of a regression book to arrive. Meanwhile, 
I'm trying to read on google.

1. My bad, I'm using Gaussian noise.

2. I didn't have x^3 b/c that coefficient happens to be zero in this fitting.

3. I used lines() b/c I wanted to superimpose the curve from regression atop my 
first plot of the original data points (x,y). 
I'm not sure how to use plot(f, x1 = NA) after my first plot(). The examples I 
managed to find on google all use plot() followed by lines(). [In Matlab, I'd 
just say "hold" in between these calls.]

Also, I'm forced to call win.graph() before my first plot() to see the first 
plot. Is that normal?

4. I really could use some guidance on this part. I need to use rcs() to fit 
points in a high-dimensional space and I'm trying to understand and use it 
correctly. 

I started with testing it on just x,y dimensions so that I can visually 
evaluate the fitting. I tried y=x, y=x^2 etc, adding Gaussian noise each time 
(to the y). 

I plot original x,y and x,y' where y' is calculated using the coefficients
returned by rcs. I find that the regression curve differs from the actual
points by as much as 10^5 with 3 knots and roughly -10^5 with 4 knots as I
make y=x^2, y=x^3.

If this is NOT a good way to test fitting, could you pls tell me a better way?

Respectfully,
sp



--- On Tue, 12/23/08, Frank E Harrell Jr  wrote:

> From: Frank E Harrell Jr 
> Subject: Re: [R] newbie problem using Design.rcs
> To: "David Winsemius" 
> Cc: to_rent_2...@yahoo.com, r-help@r-project.org
> Date: Tuesday, December 23, 2008, 9:41 AM
> In addition to David's excellent response, I'll add
> that your problems seem to be statistical and not
> programming ones.  I recommend that you spend a significant
> amount of time with a good regression text or course before
> using the software.  Also, with Design you can find out the
> algebraic form of the fit:
> 
> f <- ols(y ~ rcs(x,3), data=mydata)
> Function(f)
> 
> Frank
> 
> 
> David Winsemius wrote:
> > 
> > On Dec 22, 2008, at 11:38 PM, sp wrote:
> > 
> >> Hi,
> >> 
> >> I read data from a file. I'm trying to
> understand how to use Design.rcs by using simple test data
> first. I use 1000 integer values (1,...,1000) for x (the
> predictor) with some noise (x+.02*x) and I set the response
> variable y=x. Then, I try rcs and ols as follows:
> >> 
> > Not sure what sort of noise that is.
> > 
> >> m = ( sqrt(y1) ~ ( rcs(x1,3) ) ); #I tried without
> sqrt also
> >> f = ols(m, data=data_train.df);
> >> print(f);
> >> 
> >> [I plot original x1,y1 vectors and the regression
> as in
> >> y <- coef2[1] + coef2[2]*x1 + coef2[3]*x1*x1]
> > 
> > That does not look as though it would capture the
> structure of a restricted **cubic** spline. The usual method
> in Design for plotting a model prediction would be:
> > 
> > plot(f, x1 = NA)
> > 
> >> 
> >> 
> >> But this gives me a VERY bad fit:
> >> "
> > 
> > Can you give some hint why you consider this to be a
> "VERY bad fit"? It appears a rather good fit to
> me, despite the test case apparently not being constructed
> with any curvature which is what the rcs modeling strategy
> should be detecting.
> > 
> 
> 
> -- Frank E Harrell Jr   Professor and Chair  
> School of Medicine
>  Department of Biostatistics  
> Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Tukey on interaction means after lmer

2008-12-23 Thread Lawrence Hanser
Dear Colleagues,
I fit this model:

mod1 <- lmer(x~category*comp+(1|id),data=impchiefsrm)

where category has 4 levels and comp has 8 levels.

These work:

glht(mod1, linfct=mcp(category="Tukey"))
glht(mod1, linfct=mcp(comp="Tukey"))

What I'd like is (conceptually):

glht(mod1, linfct=mcp(category:comp="Tukey"))

but it gives a syntax error.

Any help is appreciated.

Thanks,

Larry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interval censored Data in survreg() with zero values!

2008-12-23 Thread Achim Zeileis

On Tue, 23 Dec 2008, Geraldine Henningsen wrote:


Hello,

I have interval censored data, censored between (0, 100). I used the
tobit function in the AER package, which in turn is based on survreg.
Actually I'm struggling with the distribution. Data is asymmetrically
distributed, so first choice would be a Weibull distribution.
Unfortunately  the Weibull doesn't allow for zero values in time data,
as it requires x > 0. So I tried the exponential distribution that
allows x to be >= 0 and the log-normal that sets x <= 0 to 0. Still I
get the same error:

" Fehler in survreg(formula = Surv(ifelse(A16_1_1 >= 100, 100,
ifelse(A16_1_1 <=  :
 Invalid survival times for this distribution "

The only distributions that seem to work are gaussian and logistic, but
they don't really fit the data.
I searched for this problem in the archive and found a suggestion by
Terry Therneau to set all 0  to NA, applying Weibull afterwards.  But
this solution is not very satisfying as it eliminates the left censored
data from the dataset.

So I have three questions:

1. Does anybody know why the lognormal and exponential distribution
don't work in survreg?


For these distributions, observations left-censored at zero are rather 
unlikely to occur: pexp(0) = plnorm(0) = 0.
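
A quick numerical check of that point:

pexp(0)     # 0
plnorm(0)   # 0
pnorm(0)    # 0.5 -- the gaussian, by contrast, puts mass below zero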



2.  What else could I do to find a distribution that fits the data well?

3. What about the non-parametric approach in survfit(), could that be a
solution?


Both probably depend on the questions you want to ask about your data. For 
the tools implemented in "survival", the "Modeling Survival Data" book by 
Therneau and Grambsch is the natural reference.


hth,
Z


I hope my questions aren't too stupid, as I'm not a big statistician.

Regards,

Geraldine

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] NetCDF within R: installation assistance

2008-12-23 Thread Ken Schmidt
Greetings.  I am attempting to add NetCDF libraries within R, and have 
failed.  We have R version 2.8, and are running on a 64-bit Redhat Linux 
2.6.18 kernel:


Red Hat Enterprise Linux Client release 5.2 (Tikanga)
Linux halfmoon.ncdc.noaa.gov 2.6.18-92.1.22.el5 #1 SMP Fri Dec 5 
09:28:22 EST 2008 x86_64 x86_64 x86_64 GNU/Linux


I have run the installation instructions found at 
"http://www.image.ucar.edu/Software/Netcdf/";, but not successfully.  I'm 
wondering if you can give some advice that will change our outcome?


We've run "R CMD INSTALL 
--configure-args="-with-netcdf_incdir=/usr/local/netcdf-3.6.1/include 
-with-netcdf_libdir=/usr/local/netcdf-3.6.1/lib" ncdf_1.6.tar.gz" 
against NetCDF 3.6.1, 3.6.2, and 4.0 all with the same results.
I think our being on a 64-bit Linux has to do with the failure.  Can you 
confirm this?


Outputs are below from the 64-bit system "halfmoon":

[r...@halfmoon local]# R --version
R version 2.8.0 (2008-10-20)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License version 2.
For more information about these matters see
http://www.gnu.org/licenses/.


Here is my output:

[r...@halfmoon local]# R CMD INSTALL 
--configure-args="-with-netcdf_incdir=/usr/local/netcdf-3.6.1/include 
-with-netcdf_libdir=/usr/local/netcdf-3.6.1/lib" ncdf_1.6.tar.gz

* Installing to library '/usr/local/lib64/R/library'
* Installing *source* package 'ncdf' ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ANSI C... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for egrep... grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking /usr/local/netcdf-3.6.1/include/netcdf.h usability... yes
checking /usr/local/netcdf-3.6.1/include/netcdf.h presence... yes
checking for /usr/local/netcdf-3.6.1/include/netcdf.h... yes
Using user-specified netCDF include dir=/usr/local/netcdf-3.6.1/include
Found netcdf.h in: /usr/local/netcdf-3.6.1/include
checking for /usr/local/netcdf-3.6.1/lib/libnetcdf.a... yes
Using user-specified netCDF library dir=/usr/local/netcdf-3.6.1/lib
Found netcdf library file libnetcdf.a in directory 
/usr/local/netcdf-3.6.1/lib

configure: creating ./config.status
config.status: creating R/load.R
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/local/lib64/R/include 
-I/usr/local/netcdf-3.6.1/include -I/usr/local/include-fpic  -g -O2 
-c ncdf2.c -o ncdf2.o
gcc -std=gnu99 -I/usr/local/lib64/R/include 
-I/usr/local/netcdf-3.6.1/include -I/usr/local/include-fpic  -g -O2 
-c ncdf3.c -o ncdf3.o

ncdf3.c: In function 'R_nc_get_vara_charvarid':
ncdf3.c:221: warning: assignment discards qualifiers from pointer target 
type

ncdf3.c: In function 'R_nc_get_vara_numvarid':
ncdf3.c:267: warning: assignment discards qualifiers from pointer target 
type
gcc -std=gnu99 -I/usr/local/lib64/R/include 
-I/usr/local/netcdf-3.6.1/include -I/usr/local/include-fpic  -g -O2 
-c ncdf.c -o ncdf.o

ncdf.c: In function 'R_nc_ttc_to_nctype':
ncdf.c:424: warning: implicit declaration of function 'exit'
ncdf.c:424: warning: incompatible implicit declaration of built-in 
function 'exit'
gcc -std=gnu99 -shared -L/usr/local/lib64 -o ncdf.so ncdf2.o ncdf3.o 
ncdf.o -L/usr/local/netcdf-3.6.1/lib -lnetcdf
/usr/bin/ld: /usr/local/netcdf-3.6.1/lib/libnetcdf.a(attr.o): relocation 
R_X86_64_32 against `a local symbol' can not be used when making a 
shared object; recompile with -fPIC

/usr/local/netcdf-3.6.1/lib/libnetcdf.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [ncdf.so] Error 1
chmod: cannot access `/usr/local/lib64/R/library/ncdf/libs/*': No such 
file or directory

ERROR: compilation failed for package 'ncdf'
** Removing '/usr/local/lib64/R/library/ncdf'

Thanks in advance,

Ken Schmidt
National Climatic Data Center
Asheville, NC 28801

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] quotation problem/dataframe names as function input argument.

2008-12-23 Thread David Winsemius
I think you are making it too hard. See if one of these is a path to  
the answer you seek:


> get(dframe.vec)
  V1 V2
1  1  2

#
> n.obs <- function(x) dim(x)
> n.obs(get(dframe.vec))
[1] 1 2

#---
> ng.obs <- function(x) dim(get(x))
> ng.obs(dframe.vec)
[1] 1 2

#-
> ng1.obs <- function(x) dim(get(x)[1])
> ng1.obs(dframe.vec)
[1] 1 1
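
# The same idea folded back into a general version of nobs.fun (a sketch;
# it assumes the data frames are visible from where nobs.fun is called):
nobs.fun <- function(dframe.vec) sapply(dframe.vec, function(nm) nrow(get(nm)))
nobs.fun(dframe.vec)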

Best;
David Winsemius

On Dec 23, 2008, at 3:53 PM, Sean Zhang wrote:

Dear R friends:

Can someone help me with the following problem? Many thanks in  
advance.


# Problem Description:
# I want to write functions which take a (character) vector of dataframe
names as input argument.
# For example, I want to extract the number of observations from a number of
dataframes.
# I tried the following:

nobs.fun <- function (dframe.vec)
{
 nobs.vec <- array(NA,c(length(dframe.vec),1))

 for (i in 1:length(dframe.vec))
 {
 nobs.vec[i] <- dim(dframe.vec[i])[1]
 }

 return(nobs.vec)
}

# To show the problem, I create a fake dataframe and store its name (i.e.,
dframe.1)
# in a vector (i.e., dframe.vec) of length 1.

# creation of fake dataframe
dframe.1 <- as.data.frame(matrix(seq(1:2),c(1,2)))
# store the dataframe name into a vector using c() function
dframe.vec <- c("dframe.1")

# The problem is that the following line does not work
nobs.fun(dframe.vec)

# Seems to me, the problem stems from the fact that dframe.vec[1] is
interpreted by R as "dframe.vec" (note: it is quoted)
# and dim("dframe.vec")[1] gives NULL.
# Also, I realize the following line works as expected (note: dframe.1 is
not quoted any more):
dim(dframe.1)[1]

So my question is then: how can I pass dataframe names as an input argument
for another function
without running into the quotation mark issue above?

Any hint?

Thank you in advance.
-Sean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "assign" statement in S-Plus

2008-12-23 Thread Greg Snow
If I remember correctly, frame 1 is the evaluation frame that comes into 
existence with each evaluation, then goes away at the end of the evaluation.  
The main use of it is to get past the fact that S-PLUS searches the current 
function's variables, but not those of the functions it is nested in, so a 
person could assign something to frame 1, then call another function and that 
function could look for the variable in frame 1.

To do the same thing in R depends on what you are trying to accomplish.  In 
some cases the lexical scoping of R makes this completely unneeded.  If 
function g is defined inside of function f and function f assigns a value to 
'prime' before function g is called, then function g will be able to see 
'prime' in function f without any use of assign or frame 1.  If function g 
needs to change the value of 'prime', then <<- will work.

If function g is not defined inside of function f and they both need to see the 
same variable (and it cannot be passed as an argument), then one way to do this 
is to just insure that both functions inherit from the same environment.
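
A bare-bones illustration of the nested case (the wrapper name f and the
cutoff value are placeholders, not from the original post):

f <- function(x, chuber = 1.345) {
  prime <- function(x) 1 * (abs(x) < chuber)  # sees 'chuber' in f by lexical scoping
  g <- function(x) sum(prime(x))              # sees 'prime' in f, no assign() needed
  g(x)
}
f(c(-2, 0, 0.5, 2))   # 2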

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Douglas Bates
> Sent: Tuesday, December 23, 2008 1:22 PM
> To: David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] "assign" statement in S-Plus
> 
> On Tue, Dec 23, 2008 at 1:38 PM, David Winsemius
>  wrote:
> 
> > On Dec 23, 2008, at 1:41 PM, kathie wrote:
> 
> >>
> >> Dear R users...
> >>
> >> I need to change the S+ code below to R code.
> >>
> >> I am wondering if there is a R statement equivalent for "assign"
> statement
> >> in S-plus.
> >>
> >
> > ?assign   # 
> 
> The problem is not with the assign function per se but with the use of
> frame = 1 as an argument to assign.  R uses evaluation environments
> and S used evaluation frames.  frame = 1 was special, although I have
> forgotten which one it was.
> 
> Kathie, I would try removing the call to assign altogether and seeing
> if the rest of your S-PLUS script works as intended.
> 
> >
> >
> >>
> >> 
> >>
> >>  prime <- function(x)
> >>   {
> >>   1*(abs(x) < chuber)
> >>   }
> >>  assign("prime",prime,frame=1)
> >>
> >> -
> >>
> >>
> >> Any comments will be greatly appreciated.
> >>
> >> Kathryn Lord
> >> --
> >> View this message in context:
> >> http://www.nabble.com/%22assign%22-statement-in-S-Plus-
> tp21149319p21149319.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] quotation problem/dataframe names as function input argument.

2008-12-23 Thread Sean Zhang
Dear R friends:

Can someone help me with the following problem? Many thanks in advance.

# Problem Description:
# I want to write functions which take a (character) vector of dataframe
names as input argument.
# For example, I want to extract the number of observations from a number of
dataframes.
# I tried the following:

nobs.fun <- function (dframe.vec)
{
  nobs.vec <- array(NA,c(length(dframe.vec),1))

  for (i in 1:length(dframe.vec))
  {
  nobs.vec[i] <- dim(dframe.vec[i])[1]
  }

  return(nobs.vec)
}

# To show the problem, I create a fake dataframe and store its name (i.e.,
dframe.1)
# in a vector (i.e., dframe.vec) of length 1.

# creation of fake dataframe
dframe.1 <- as.data.frame(matrix(seq(1:2),c(1,2)))
# store the dataframe name into a vector using c() function
dframe.vec <- c("dframe.1")

# The problem is that the following line does not work
nobs.fun(dframe.vec)

# Seems to me, the problem stems from the fact that dframe.vec[1] is
interpreted by R as "dframe.vec" (note: it is quoted)
# and dim("dframe.vec")[1] gives NULL.
# Also, I realize the following line works as expected (note: dframe.1 is
not quoted any more):
dim(dframe.1)[1]

So my question is then: how can I pass dataframe names as an input argument
for another function
without running into the quotation mark issue above?

Any hint?

Thank you in advance.
-Sean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "assign" statement in S-Plus

2008-12-23 Thread Douglas Bates
On Tue, Dec 23, 2008 at 1:38 PM, David Winsemius  wrote:

> On Dec 23, 2008, at 1:41 PM, kathie wrote:

>>
>> Dear R users...
>>
>> I need to change the S+ code below to R code.
>>
>> I am wondering if there is a R statement equivalent for "assign" statement
>> in S-plus.
>>
>
> ?assign   # 

The problem is not with the assign function per se but with the use of
frame = 1 as an argument to assign.  R uses evaluation environments
and S used evaluation frames.  frame = 1 was special, although I have
forgotten which one it was.

Kathie, I would try removing the call to assign altogether and seeing
if the rest of your S-PLUS script works as intended.

>
>
>>
>> 
>>
>>  prime <- function(x)
>>   {
>>   1*(abs(x) < chuber)
>>   }
>>  assign("prime",prime,frame=1)
>>
>> -
>>
>>
>> Any comments will be greatly appreciated.
>>
>> Kathryn Lord
>> --
>> View this message in context:
>> http://www.nabble.com/%22assign%22-statement-in-S-Plus-tp21149319p21149319.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ordered Multidimensional Arrays

2008-12-23 Thread Whit Armstrong
I take a similar approach by storing my vcv's in a list w/ the date
stored as a character vector "%y-%m-%d" as the list names.  That way
you can easily grab the vcv you need by casting your date to a string
and using it to index the list.

not sure if that will work for you.
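
Roughly like this (toy 2 x 2 matrices and made-up dates; any unambiguous
date format works as the names):

vcvs <- list("2008-12-22" = matrix(1:4, 2), "2008-12-23" = matrix(5:8, 2))
d <- as.Date("2008-12-23")
vcvs[[format(d, "%Y-%m-%d")]]   # the vcv for that date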

hth,
Whit


On Tue, Dec 23, 2008 at 3:08 PM, Patrick Burns  wrote:
> The old fashioned solution is to have the N x N x T
> array and use character strings of the dates as the
> dimnames on the third dimension.
>
> Is there something you think you need to do that is
> hard with such a setup?
>
>
> Patrick Burns
> patr...@burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
>
> Derek Schaeffer wrote:
>>
>> Hi,
>> I am inquiring as to what are the best practices with respect to storing
>> and
>> manipulating ordered multi-dimensional arrays.  For example, suppose I
>> have
>> a sequence of time-varying covariance matrices of asset returns.  The data
>> is ordered, but the ordering is not necessarily regular (e.g. daily data
>> omitting weekends and holidays, etc.).  The data array is say, N x N x T.
>> For example, the first two elements may look as follows:
>>
>>
>>>
>>> *result$covariance[,,1:2]
>>>
>>
>> , , 1*
>> * [,1] [,2] [,3] [,4]
>> [1,] 1.511137e-06 1.918668e-06 1.201553e-06 3.205271e-06
>> [2,] 1.918668e-06 7.488916e-06 6.593317e-06 1.203421e-05
>> [3,] 1.201553e-06 6.593317e-06 1.305861e-05 2.132272e-05
>> [4,] 3.205271e-06 1.203421e-05 2.132272e-05 4.571225e-05*
>> *, , 2*
>> * [,1] [,2] [,3] [,4]
>> [1,] 1.500858e-06 1.905574e-06 1.193412e-06 3.183290e-06
>> [2,] 1.905574e-06 7.444871e-06 6.555459e-06 1.195876e-05
>> [3,] 1.193412e-06 6.555459e-06 1.297075e-05 2.11e-05
>> [4,] 3.183290e-06 1.195876e-05 2.11e-05 4.551706e-05*
>>
>> I would like to be able to partition this sequence of matrices by date and
>> by individual element.  Partitioning by individual elements is trivial;
>> however, partitioning by time stamp is not (especially if the partitioned
>> data set must be carried through a number of downstream calculations).  I
>> could carry the data in a list complete with a date vector and the data
>> array, and partition the list as I go, but this seems somewhat clunky.
>>  Any
>> ideas?  A "zoo"-like package capable of handling multidimensional arrays
>> would be optimal, but I don't believe this exists.
>>
>> Thanks,
>> Derek
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ordered Multidimensional Arrays

2008-12-23 Thread Patrick Burns

The old fashioned solution is to have the N x N x T
array and use character strings of the dates as the
dimnames on the third dimension.

Is there something you think you need to do that is
hard with such a setup?
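
For example (toy 2 x 2 x 3 array, invented dates):

cov.arr <- array(rnorm(12), dim = c(2, 2, 3),
                 dimnames = list(NULL, NULL,
                                 c("2008-12-19", "2008-12-22", "2008-12-23")))
cov.arr[, , "2008-12-22"]                    # one date
cov.arr[, , c("2008-12-22", "2008-12-23")]   # a sub-period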


Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Derek Schaeffer wrote:

Hi,
I am inquiring as to what are the best practices with respect to storing and
manipulating ordered multi-dimensional arrays.  For example, suppose I have
a sequence of time-varying covariance matrices of asset returns.  The data
is ordered, but the ordering is not necessarily regular (e.g. daily data
omitting weekends and holidays, etc.).  The data array is say, N x N x T.
For example, the first two elements may look as follows:

  

*result$covariance[,,1:2]


, , 1*
* [,1] [,2] [,3] [,4]
[1,] 1.511137e-06 1.918668e-06 1.201553e-06 3.205271e-06
[2,] 1.918668e-06 7.488916e-06 6.593317e-06 1.203421e-05
[3,] 1.201553e-06 6.593317e-06 1.305861e-05 2.132272e-05
[4,] 3.205271e-06 1.203421e-05 2.132272e-05 4.571225e-05*
*, , 2*
* [,1] [,2] [,3] [,4]
[1,] 1.500858e-06 1.905574e-06 1.193412e-06 3.183290e-06
[2,] 1.905574e-06 7.444871e-06 6.555459e-06 1.195876e-05
[3,] 1.193412e-06 6.555459e-06 1.297075e-05 2.11e-05
[4,] 3.183290e-06 1.195876e-05 2.11e-05 4.551706e-05*

I would like to be able to partition this sequence of matrices by date and
by individual element.  Partitioning by individual elements is trivial;
however, partitioning by time stamp is not (especially if the partitioned
data set must be carried through a number of downstream calculations).  I
could carry the data in a list complete with a date vector and the data
array, and partition the list as I go, but this seems somewhat clunky.  Any
ideas?  A "zoo"-like package capable of handling multidimensional arrays
would be optimal, but I don't believe this exists.

Thanks,
Derek

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "assign" statement in S-Plus

2008-12-23 Thread David Winsemius


On Dec 23, 2008, at 1:41 PM, kathie wrote:



Dear R users...

I need to change the S+ code below to R code.

I am wondering if there is a R statement equivalent for "assign"  
statement

in S-plus.



?assign   # 






 prime <- function(x)
   {
   1*(abs(x) < chuber)
   }
 assign("prime",prime,frame=1)

-


Any comments will be greatly appreciated.

Kathryn Lord
--
View this message in context: 
http://www.nabble.com/%22assign%22-statement-in-S-Plus-tp21149319p21149319.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interval censored Data in survreg() with zero values!

2008-12-23 Thread Don MacQueen

Surv() allows left, right, or interval censoring.

Try left censoring instead of interval censoring. For the weibull or 
lognormal, think of your data as <=100 instead of [0,100].
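
A toy version of that coding (made-up values, not the poster's data):

library(survival)
y      <- c(100, 100, 12, 55, 100, 3)   # 100 means "only known to be <= 100"
status <- c(0, 0, 1, 1, 0, 1)           # 0 = left censored, 1 = observed exactly
survreg(Surv(y, status, type = "left") ~ 1, dist = "weibull")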


-Don

At 8:08 PM +0100 12/23/08, Geraldine Henningsen wrote:

Hello,

I have interval censored data, censored between (0, 100). I used the
tobit function in the AER package, which in turn is based on survreg.
Actually I'm struggling with the distribution. Data is asymmetrically
distributed, so first choice would be a Weibull distribution.
Unfortunately  the Weibull doesn't allow for zero values in time data,
as it requires x > 0. So I tried the exponential distribution that
allows x to be >= 0 and the log-normal that sets x <= 0 to 0. Still I
get the same error:

" Fehler in survreg(formula = Surv(ifelse(A16_1_1 >= 100, 100,
ifelse(A16_1_1 <=  :
  Invalid survival times for this distribution "

The only distributions that seem to work are gaussian and logistic, but
they don't really fit the data.
I searched for this problem in the archive and found a suggestion by
Terry Therneau to set all 0  to NA, applying Weibull afterwards.  But
this solution is not very satisfying as it eliminates the left censored
data from the dataset.

So I have three questions:

1. Does anybody know why the lognormal and exponential distribution
don't work in survreg?

2.  What else could I do to find a distribution that fits the data well?

3. What about the non-parametric approach in survfit(), could that be a
solution?

I hope my questions aren't too stupid, as I'm not a big statistician.

Regards,

Geraldine

__
R-help@r-project.org mailing list
https:// stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http:// www. R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error: cannot allocate vector of size 1.8 Gb

2008-12-23 Thread iamsilvermember

>What are you going to do with an agglomerative hierarchical clustering of
22283 objects?  It will not be interpretable.

As a matter of fact I was asked to do a clustering analysis on gene
expression, something like this:
http://www.ncbi.nlm.nih.gov/projects/geo/gds/analyze/analyze.cgi?datadir=UCorrelationUPGMA&ID=GDS3254&myType=0

>Why? Both will have a 3GB address space limit unless the Xeon box is
64-bit.  And this works on my 64-bit Linux boxes.

I am pretty sure the linux server is 64bit.



Sorry I am just a beginner in R.  I read the "memory-limit" help you
suggested, but I still cannot find a solution to my problem...  May I know
if there is any work around for this issue?
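
For scale: a single dist() object on 22283 rows is itself about 1.8 Gb
(rough arithmetic, assuming 8-byte doubles):

n <- 22283
n * (n - 1) / 2 * 8 / 2^30   # ~1.85 GiB for the lower-triangular distance matrix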

Thank you so much again!
-- 
View this message in context: 
http://www.nabble.com/Error%3A-cannot-allocate-vector-of-size-1.8-Gb-tp21133949p21149727.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interval censored Data in survreg() with zero values!

2008-12-23 Thread Geraldine Henningsen
Hello,

I have interval censored data, censored between (0, 100). I used the
tobit function in the AER package, which in turn is based on survreg.
Actually I'm struggling with the distribution. Data is asymmetrically
distributed, so first choice would be a Weibull distribution. 
Unfortunately  the Weibull doesn't allow for zero values in time data,
as it requires x > 0. So I tried the exponential distribution that
allows x to be >= 0 and the log-normal that sets x <= 0 to 0. Still I
get the same error:

" Fehler in survreg(formula = Surv(ifelse(A16_1_1 >= 100, 100,
ifelse(A16_1_1 <=  :
  Invalid survival times for this distribution "

The only distributions that seem to work are gaussian and logistic, but
they don't really fit the data. 
I searched for this problem in the archive and found a suggestion by
Terry Therneau to set all 0  to NA, applying Weibull afterwards.  But
this solution is not very satisfying as it eliminates the left censored
data from the dataset.

So I have three questions:

1. Does anybody know why the lognormal and exponential distribution
don't work in survreg?

2.  What else could I do to find a distribution that fits the data well?

3. What about the non-parametric approach in survfit(), could that be a
solution?

I hope my questions aren't too stupid, as I'm not a big statistician.

Regards,

Geraldine

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread David Winsemius
I have to agree with Daniel Nordlund regarding not creating subsidiary  
problems when the main problem has been cracked. Nonetheless, ...   
might you be happier with the result of changing the last data.frame()  
call in calcProfit to c()?


I get a matrix:
> str(Results2)
 num [1:14, 1:16] 3.00e+04 3.00 -4.50e+02 -1.50e-02 -1.54e-02  
7.50e-01 -5.00e-01 1.00e+04 -1.50e-02 2.00e-04 ...

 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:14] "OutTotInvestment" "OutNumInvestments.investment"  
"OutDolProf" "OutPerProf" ...

  ..$ : NULL

... if you go along with that strategy, then I think it is possible  
that you really want as.data.frame( t( Results2)) since the rows and  
columns seem to be transposed from what I would have wanted.
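
A stripped-down illustration of that reshaping (two fake iterations, only
two of the Out* columns):

res <- list(c(OutDolProf = -450, OutLong = 0.75),
            c(OutDolProf = -450, OutLong = 1.5))
m <- sapply(res, c)        # 2 x 2 matrix, one column per iteration
as.data.frame(t(m))        # iterations as rows, named columns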


Now, ...  your next task is to set up your mail-client so it sends  
unformatted text to R-help.


--
David Winsemius
Heritage Labs

On Dec 23, 2008, at 11:36 AM, Brigid Mooney wrote:


--
Problem Description:  (reproducible code below)
--
I cannot seem to get as.data.frame() to work as I would expect.

 Results2 seems to contain repeated column titles for each row, as  
well as a row name 'investment' (which is not intended), like:

 Results2
[[1]]
   OutTotInvestment OutNumInvestments OutDolProf OutPerProf  
OutNetGains OutLong OutShort OutInvestment OutStoploss OutComission  
OutPenny OutVolume OutNumU OutAccDefn
investment3 3   -450  
-0.015 -0.01540.75 -0.5 1  -0.015 
2e-043  0.02   2  0

[[2]]
   OutTotInvestment OutNumInvestments OutDolProf OutPerProf  
OutNetGains OutLong OutShort OutInvestment OutStoploss OutComission  
OutPenny OutVolume OutNumU OutAccDefn
investment3 3   -450  
-0.015 -0.0154 1.5 -0.5 1  -0.015 
2e-043  0.02   2  0

...

When I try to apply 'as.data.frame', it concatenates incremental  
numbers to the repeated row headers and gives:

as.data.frame(Results2)
   OutTotInvestment OutNumInvestments OutDolProf OutPerProf  
OutNetGains OutLong OutShort OutInvestment OutStoploss OutComission  
OutPenny OutVolume OutNumU OutAccDefn
investment3 3   -450  
-0.015 -0.01540.75 -0.5 1  -0.015 
2e-043  0.02   2  0
   OutTotInvestment.1 OutNumInvestments.1 OutDolProf.1  
OutPerProf.1 OutNetGains.1 OutLong.1 OutShort.1 OutInvestment.1  
OutStoploss.1 OutComission.1 OutPenny.1
investment  3   3 -450
-0.015   -0.0154   1.5   -0.5   1 
-0.015  2e-04  3
   OutVolume.1 OutNumU.1 OutAccDefn.1 OutTotInvestment.2  
OutNumInvestments.2 OutDolProf.2 OutPerProf.2 OutNetGains.2 OutLong. 
2 OutShort.2 OutInvestment.2
investment0.02 20   
3   3 -450   -0.015
-0.0154  0.75 -1   1

...

which is a data frame of dimension 1 224, when I am looking for a  
data frame like Results of dimension 16 14.



--
Reproducible code:
--
# --
# FUNCTION calcProfit
# --
calcProfit <- function(IterParam,  marketData, dailyForecast) #,  
long, short, investment, stoploss, comission, penny, volume, numU,  
accDefn)

  {
if (class(IterParam) == "numeric")
  {
long <- IterParam["long"]
short <- IterParam["short"]
investment <- IterParam["investment"]
stoploss <- IterParam["stoploss"]
comission <- IterParam["comission"]
penny <- IterParam["penny"]
volume <- IterParam["volume"]
numU <- IterParam["numU"]
accDefn <- IterParam["accDefn"]
  } else {
  long <- IterParam$long
  short <- IterParam$short
  investment <- IterParam$investment
  stoploss <- IterParam$stoploss
  comission <- IterParam$comission
  penny <- IterParam$penny
  volume <- IterParam$volume
  numU <- IterParam$numU
  accDefn <- IterParam$accDefn
  }
compareMarket <- merge(dailyForecast, marketData,  
by.x="SymbolID", by.y="SymbolID")


weight <- ifelse(rep(accDefn, times=length(compareMarket 
$weight))==1, compareMarket$weight, compareMark

[R] Ordered Multidimensional Arrays

2008-12-23 Thread Derek Schaeffer
Hi,
I am inquiring as to what are the best practices with respect to storing and
manipulating ordered multi-dimensional arrays.  For example, suppose I have
a sequence of time-varying covariance matrices of asset returns.  The data
is ordered, but the ordering is not necessarily regular (e.g. daily data
omitting weekends and holidays, etc.).  The data array is say, N x N x T.
For example, the first two elements may look as follows:

> *result$covariance[,,1:2]
, , 1*
* [,1] [,2] [,3] [,4]
[1,] 1.511137e-06 1.918668e-06 1.201553e-06 3.205271e-06
[2,] 1.918668e-06 7.488916e-06 6.593317e-06 1.203421e-05
[3,] 1.201553e-06 6.593317e-06 1.305861e-05 2.132272e-05
[4,] 3.205271e-06 1.203421e-05 2.132272e-05 4.571225e-05*
*, , 2*
* [,1] [,2] [,3] [,4]
[1,] 1.500858e-06 1.905574e-06 1.193412e-06 3.183290e-06
[2,] 1.905574e-06 7.444871e-06 6.555459e-06 1.195876e-05
[3,] 1.193412e-06 6.555459e-06 1.297075e-05 2.11e-05
[4,] 3.183290e-06 1.195876e-05 2.11e-05 4.551706e-05*

I would like to be able to partition this sequence of matrices by date and
by individual element.  Partitioning by individual elements is trivial;
however, partitioning by time stamp is not (especially if the partitioned
data set must be carried through a number of downstream calculations).  I
could carry the data in a list complete with a date vector and the data
array, and partition the list as I go, but this seems somewhat clunky.  Any
ideas?  A "zoo"-like package capable of handling multidimensional arrays
would be optimal, but I don't believe this exists.

Thanks,
Derek

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Approximate Entropy?

2008-12-23 Thread Ben Bolker



Stephan Kolassa wrote:
> 
> Dear guRus,
> 
> is there a package that calculates the Approximate Entropy (ApEn) of a 
> time series?
> 
> RSiteSearch only gave me a similar question in 2004, which appears not 
> to have been answered:
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/28830.html
> 
> RSeek.org didn't yield any results at all.
> 
> Happy holidays (where appropriate),
> Stephan
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

I went ahead and translated D. Kaplan's MATLAB code for this purpose into R.
Warning: it's barely tested, and I'm not sure I understand what it's doing
-- you're on your own from here ...
(it won't work if you simply cut and paste what's here, since the examples
are above the definitions
of the utility functions etc.)


## http://www.macalester.edu/~kaplan/hrv/doc/funs/apen.html

## Approximate Entropy
## Syntax

## entropy = apen( pre, post, r );

## Arguments
## pre  An embedding of data.
## post The images of the data in the embedding.
## rThe filter factor, which sets the length scale over which to compute
the approximate entropy.
## Returned Values
## entropy  The numerical value of the approximate entropy.
## Description

## The "approximate entropy" was introduced by Pincus to quantify the
## creation of information in a time series. A low value of the
## entropy indicates that the time series is deterministic; a high
## value indicates randomness.

## The "filter factor" r is an important parameter. In principle, with
## an infinite amount of data, it should approach zero. With finite
## amounts of data, or with measurement noise, it is not always clear
## what is the best value to choose. Past work on heart rate
## variability has suggested setting r to be 0.2 times the standard
## deviation of the data.

## Another important parameter is the "embedding dimension." Again,
## there is no precise means of knowing the best such dimension, but
## previous work has used a dimension of 2. The final parameter is the
## embedding lag, which is often set to 1, but perhaps more
## appropriately is set to be the smallest lag at which the
## autocorrelation function of the time series is close to zero.

## The apen function expects the data to be presented in a specific
## format. Working with a time series tseries, the following steps
## will compute the approximate entropy, with an embedding dimension
## of 2 and a lag of 1.

## edim = 2;
## lag = 1;
## edata = lagembed(tseries,edim,lag);
## [pre,post] = getimage(edata,lag);
## r = 0.2*std(tseries);
## apen(pre,post,r);

edim <- 2
lag <- 1
edata <- lagembed(tseries,edim,lag)
im <- getimage(edata,lag)
r <- 0.2*sd(tseries)
apen(im$pre,im$post,r)

apenembed <- function(tseries,edim,lag=1,relr=0.2,r) {
  edata <- lagembed(tseries,edim,lag)
  im <- getimage(edata,lag)
  if (missing(r)) r <- relr*sd(tseries)
  apen(im$pre,im$post,r)
}

## References

## * SM Pincus (1991) Proc. Natl. Acad. Sci. USA 88:2297-2301
## * D Kaplan, MI Furman, SM Pincus, SM Ryan, LA Lipsitz, AL Goldberger
(1991) "Aging and the complexity of cardiovascular dynamics," Biophys.J.
59:945-949 

## See Also
## apenhr. lagembed. getimage.
## Examples


lagembed <- function(ts,d,lag) {
  z <- embed(ts,d)
  z[seq(1,nrow(z),by=lag),]
}

getimage <- function(x,pred) {
  pre <- x[1:(nrow(x)-pred),]
  post <- x[(pred+1):nrow(x),1]
  list(pre=pre,post=post)
}

## tseries = randn(500,1);
## should have a large approximate entropy.

set.seed(1001)
apenembed(rnorm(500),edim=3)
## 0.3240493

apenembed(sin(seq(1,100,by=0.2)),edim=3)

## should have a small approximate entropy since it is deterministic. 
## 0.1001811

## examples from utils web page:
ts <- 1:5
x <- lagembed(ts,3,1)
im <- getimage(x,1)

apen <- function(pre,post,r) {

  ## translated from Kaplan D (1998) HRV software.
  ## http://www.macalester.edu/~kaplan/hrv/doc/Feb3snap.tar
  ## compute approximate entropy a la Steve Pincus

  N <- nrow(pre)
  p <- ncol(pre)
  
  ## number of pairs of points closer than r in pre/post space
  phiM <- phiMplus1 <- 0
  for (k in 1:N) {
## replicate the current point
foo <- matrix(rep(pre[k,],N),byrow=TRUE,nrow=N)
## calculate distance (max norm)
goo <- abs(foo-pre)<=r
## which ones of them are closer than r using the max norm?
closerpre <- if (p==1) goo else apply(goo,1,all)
precount <- sum(closerpre)
phiM <- phiM+log(precount)

## of the ones that were closer in the pre space, how many are closer
## in post also ?
postcount <- sum(abs(post[closerpre] - post[k]) <= r)
phiMplus1 <- phiMplus1 + log(postcount)
  }
  ## approximate entropy a la Pincus: difference of the average log-counts
  (phiM - phiMplus1)/N
}

## http://www.cbi.dongnocchi.it/glossary/ApEn.html
## http://www.physionet.org/physiotools/ApEn/

ts <- rep(61:65,10)
apenembed(ts,edim=5,r=2) ## get 0 instead of 0.00189?


-- 
View this message in context

Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread Bert Gunter
FWIW:

Good advice below! -- after all, the first rule of optimizing code is:
Don't!

For the record (yet again), the apply() family of functions (and their
packaged derivatives, of course) are "merely" very carefully written for()
loops: their main advantage is in code readability, not in efficiency gains,
which may well be small or nonexistent. True efficiency gains require
"vectorization", which essentially moves the for() loops from interpreted
code to (underlying) C code (on the underlying data structures): e.g.
compare rowMeans() [vectorized] with ave() or apply(..,1,mean).
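
A quick way to see the difference (matrix size is arbitrary):

m <- matrix(rnorm(1e6), nrow = 1e4)
system.time(r1 <- rowMeans(m))         # vectorized: the loop runs in C
system.time(r2 <- apply(m, 1, mean))   # interpreted loop over rows
all.equal(r1, r2)                      # TRUE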

Cheers,
Bert Gunter
Genentech Nonclinical Statistics

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Daniel Nordlund
Sent: Tuesday, December 23, 2008 10:01 AM
To: r-help@r-project.org
Subject: Re: [R] How can I avoid nested 'for' loops or quicken the process?

Avoiding multiple nested for loops (as requested in the subject) is usually
a good idea, especially if you can take advantage of vectorized functions.
You were able to redesign your code to use a single for loop.  I presume there
was a substantial improvement in program speed.  How much additional time is
saved by using apply to  eliminate the final for loop?  Is it worth the
additional programming time?  Enquiring minds want to know. :-)

Dan

Daniel Nordlund
Bothell, WA USA  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Brigid Mooney
> Sent: Tuesday, December 23, 2008 8:36 AM
> To: David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] How can I avoid nested 'for' loops or 
> quicken the process?
> 
> --
> 
> Problem Description:  (reproducible code below)
> --
> 
> I cannot seem to get as.data.frame() to work as I would expect.
> 
>  Results2 seems to contain repeated column titles for each 
> row, as well as a
> row name 'investment' (which is not intended), like:
>  Results2
> [[1]]
>OutTotInvestment OutNumInvestments OutDolProf OutPerProf
> OutNetGains OutLong OutShort OutInvestment OutStoploss 
> OutComission OutPenny
> OutVolume OutNumU OutAccDefn
> investment3 3   -450 -0.015
> -0.01540.75 -0.5 1  -0.0152e-04
> 3  0.02   2  0
> [[2]]
>OutTotInvestment OutNumInvestments OutDolProf OutPerProf
> OutNetGains OutLong OutShort OutInvestment OutStoploss 
> OutComission OutPenny
> OutVolume OutNumU OutAccDefn
> investment3 3   -450 -0.015
> -0.0154 1.5 -0.5 1  -0.0152e-04
> 3  0.02   2  0
> ...
> 
> When I try to apply 'as.data.frame', it concatenates 
> incremental numbers to
> the repeated row headers and gives:
> as.data.frame(Results2)
>OutTotInvestment OutNumInvestments OutDolProf OutPerProf
> OutNetGains OutLong OutShort OutInvestment OutStoploss 
> OutComission OutPenny
> OutVolume OutNumU OutAccDefn
> investment3 3   -450 -0.015
> -0.01540.75 -0.5 1  -0.0152e-04
> 3  0.02   2  0
>OutTotInvestment.1 OutNumInvestments.1 
> OutDolProf.1 OutPerProf.1
> OutNetGains.1 OutLong.1 OutShort.1 OutInvestment.1 OutStoploss.1
> OutComission.1 OutPenny.1
> investment  3   3 -450
> -0.015   -0.0154   1.5   -0.5   1
> -0.015  2e-04  3
>OutVolume.1 OutNumU.1 OutAccDefn.1 OutTotInvestment.2
> OutNumInvestments.2 OutDolProf.2 OutPerProf.2 OutNetGains.2 OutLong.2
> OutShort.2 OutInvestment.2
> investment0.02 20
> 3   3 -450   -0.015   -0.0154
> 0.75 -1   1
> ...
> 
> which is a data frame of dimension 1 224, when I am looking 
> for a data frame
> like Results of dimension 16 14.
> 
> 
> --
> 
> Reproducible code:
> --
> 
> # --
> # FUNCTION calcProfit
> # --
> calcProfit <- function(IterParam,  marketData, dailyForecast) #, long,
> short, investment, stoploss, comission, penny, volume, numU, accDefn)
>   {
> if (class(IterParam) == "numeric")
>   {
> long <- IterParam["long"]
> short <- IterParam["short"]
> investment <- IterPa

Re: [R] Approximate Entropy?

2008-12-23 Thread Stephan Kolassa

Ben,

thanks a lot for that! I have a (reasonably) good idea about what ApEn 
should be, and I'll try to understand your translation of Kaplan's 
Matlab code.


Best,
Stephan


Ben Bolker schrieb:



Stephan Kolassa wrote:

Dear guRus,

is there a package that calculates the Approximate Entropy (ApEn) of a 
time series?


RSiteSearch only gave me a similar question in 2004, which appears not 
to have been answered:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/28830.html

RSeek.org didn't yield any results at all.

Happy holidays (where appropriate),
Stephan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




I went ahead and translated D. Kaplan's MATLAB code for this purpose into R.
Warning: it's barely tested, and I'm not sure I understand what it's doing
-- you're on your own from here ...
(it won't work if you simply cut and paste what's here, since the examples
are above the definitions
of the utility functions etc.)


## http://www.macalester.edu/~kaplan/hrv/doc/funs/apen.html

## Approximate Entropy
## Syntax

## entropy = apen( pre, post, r );

## Arguments
## pre  An embedding of data.
## post The images of the data in the embedding.
## rThe filter factor, which sets the length scale over which to compute
the approximate entropy.
## Returned Values
## entropy  The numerical value of the approximate entropy.
## Description

## The "approximate entropy" was introduced by Pincus to quantify the
## creation of information in a time series. A low value of the
## entropy indicates that the time series is deterministic; a high
## value indicates randomness.

## The "filter factor" r is an important parameter. In principle, with
## an infinite amount of data, it should approach zero. With finite
## amounts of data, or with measurement noise, it is not always clear
## what is the best value to choose. Past work on heart rate
## variability has suggested setting r to be 0.2 times the standard
## deviation of the data.

## Another important parameter is the "embedding dimension." Again,
## there is no precise means of knowing the best such dimension, but
## previous work has used a dimension of 2. The final parameter is the
## embedding lag, which is often set to 1, but perhaps more
## appropriately is set to be the smallest lag at which the
## autocorrelation function of the time series is close to zero.

## The apen function expects the data to be presented in a specific
## format. Working with a time series tseries, the following steps
## will compute the approximate entropy, with an embedding dimension
## of 2 and a lag of 1.

## edim = 2;
## lag = 1;
## edata = lagembed(tseries,edim,lag);
## [pre,post] = getimage(edata,lag);
## r = 0.2*std(tseries);
## apen(pre,post,r);

edim <- 2
lag <- 1
edata <- lagembed(tseries,edim,lag)
im <- getimage(edata,lag)
r <- 0.2*sd(tseries)
apen(im$pre,im$post,r)

apenembed <- function(tseries,edim,lag=1,relr=0.2,r) {
  edata <- lagembed(tseries,edim,lag)
  im <- getimage(edata,lag)
  if (missing(r)) r <- relr*sd(tseries)
  apen(im$pre,im$post,r)
}

## References

## * SM Pincus (1991) Proc. Natl. Acad. Sci. USA 88:2297-2301
## * D Kaplan, MI Furman, SM Pincus, SM Ryan, LA Lipsitz, AL Goldberger
(1991) "Aging and the complexity of cardiovascular dynamics," Biophys.J.
59:945-949 


## See Also
## apenhr. lagembed. getimage.
## Examples


lagembed <- function(ts,d,lag) {
  z <- embed(ts,d)
  z[seq(1,nrow(z),by=lag),]
}

getimage <- function(x,pred) {
  pre <- x[1:(nrow(x)-pred),]
  post <- x[(pred+1):nrow(x),1]
  list(pre=pre,post=post)
}

## tseries = randn(500,1);
## should have a large approximate entropy.

set.seed(1001)
apenembed(rnorm(500),edim=3)
## 0.3240493

apenembed(sin(seq(1,100,by=0.2)),edim=3)

## should have a small approximate entropy since it is deterministic. 
## 0.1001811


## examples from utils web page:
ts <- 1:5
x <- lagembed(ts,3,1)
im <- getimage(x,1)

apen <- function(pre,post,r) {

  ## translated from Kaplan D (1998) HRV software.
  ## http://www.macalester.edu/~kaplan/hrv/doc/Feb3snap.tar
  ## compute approximate entropy a la Steve Pincus

  N <- nrow(pre)
  p <- ncol(pre)
  
  ## number of pairs of points closer than r in pre/post space

  phiM <- phiMplus1 <- 0
  for (k in 1:N) {
## replicate the current point
foo <- matrix(rep(pre[k,],N),byrow=TRUE,nrow=N)
## calculate distance (max norm)
goo <- abs(foo-pre)<=r
## which ones of them are closer than r using the max norm?
closerpre <- if (p==1) goo else apply(goo,1,all)
precount <- sum(closerpre)
phiM <- phiM+log(precount)

## of the ones that were closer in the pre space, how many are closer
## in post also ?
postcount <- sum(abs(post[closerpre] - post[k]) <= r)

[R] "assign" statement in S-Plus

2008-12-23 Thread kathie

Dear R users...

I need to change the S+ code below to R code.

I am wondering if there is a R statement equivalent for "assign" statement
in S-plus.




  prime <- function(x)
{
1*(abs(x) < chuber)
}
  assign("prime",prime,frame=1)

-


Any comments will be greatly appreciated.

Kathryn Lord
-- 
View this message in context: 
http://www.nabble.com/%22assign%22-statement-in-S-Plus-tp21149319p21149319.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate / tranpose data

2008-12-23 Thread Henrique Dallazuanna
Simpler:

 aggregate(x$ZIP_CODE, list(CODE_NAME = x$CODE_NAME), paste, collapse = ",")
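
A runnable sketch of this one-liner, with x built from the sample data in the
thread (the object name x is an assumption):

x <- data.frame(CODE_NAME = c("John", "John", "John", "Jane", "Jane"),
                ZIP_CODE  = c(12345, 23456, 34567, 13242, 22123))
aggregate(x$ZIP_CODE, list(CODE_NAME = x$CODE_NAME), paste, collapse = ",")
## one row per CODE_NAME, with the zip codes collapsed into a
## comma-separated string (the value column is named "x" by default)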

On Tue, Dec 23, 2008 at 3:27 PM, Henrique Dallazuanna wrote:

> Try this:
>
> do.call(rbind,
>lapply(split(x, x$CODE_NAME),
>   function(cod)
>data.frame(CODE_NAME =
> unique(cod$CODE_NAME),
>   ZIP_CODE =
> paste(cod$ZIP_CODE, collapse = ","))
>  )
>  )
>
>
>
> On Tue, Dec 23, 2008 at 3:10 PM, Ferry  wrote:
>
>> Dear R-Users,
>>
>> Suppose I have data in the following format:
>>
>> CODE_NAME ZIP_CODE
>> John   12345
>> John   23456
>> John   34567
>> Jane   13242
>> Jane   22123
>>
>> I want to transpose / convert it into:
>> CODE_NAME ZIP_CODE
>> John  12345,23456,34567
>> Jane  13242,22123
>>
>> Any idea/pointer is appreciated.
>>
>> Thanks a bunch,
>>
>> Ferry
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread Daniel Nordlund
Avoiding multiple nested for loops (as requested in the subject) is usually
a good idea, especially if you can take advantage of vectorized functions.
You were able to redesign your code to use a single for loop.  I presume there
was a substantial improvement in program speed.  How much additional time is
saved by using apply to eliminate the final for loop?  Is it worth the
additional programming time?  Enquiring minds want to know. :-)

Dan

Daniel Nordlund
Bothell, WA USA  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Brigid Mooney
> Sent: Tuesday, December 23, 2008 8:36 AM
> To: David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] How can I avoid nested 'for' loops or 
> quicken the process?
> 
> --
> 
> Problem Description:  (reproducible code below)
> --
> 
> I cannot seem to get as.data.frame() to work as I would expect.
> 
>  Results2 seems to contain repeated column titles for each 
> row, as well as a
> row name 'investment' (which is not intended), like:
>  Results2
> [[1]]
>OutTotInvestment OutNumInvestments OutDolProf OutPerProf
> OutNetGains OutLong OutShort OutInvestment OutStoploss 
> OutComission OutPenny
> OutVolume OutNumU OutAccDefn
> investment3 3   -450 -0.015
> -0.01540.75 -0.5 1  -0.0152e-04
> 3  0.02   2  0
> [[2]]
>OutTotInvestment OutNumInvestments OutDolProf OutPerProf
> OutNetGains OutLong OutShort OutInvestment OutStoploss 
> OutComission OutPenny
> OutVolume OutNumU OutAccDefn
> investment3 3   -450 -0.015
> -0.0154 1.5 -0.5 1  -0.0152e-04
> 3  0.02   2  0
> ...
> 
> When I try to apply 'as.data.frame', it concatenates 
> incremental numbers to
> the repeated row headers and gives:
> as.data.frame(Results2)
>OutTotInvestment OutNumInvestments OutDolProf OutPerProf
> OutNetGains OutLong OutShort OutInvestment OutStoploss 
> OutComission OutPenny
> OutVolume OutNumU OutAccDefn
> investment3 3   -450 -0.015
> -0.01540.75 -0.5 1  -0.0152e-04
> 3  0.02   2  0
>OutTotInvestment.1 OutNumInvestments.1 
> OutDolProf.1 OutPerProf.1
> OutNetGains.1 OutLong.1 OutShort.1 OutInvestment.1 OutStoploss.1
> OutComission.1 OutPenny.1
> investment  3   3 -450
> -0.015   -0.0154   1.5   -0.5   1
> -0.015  2e-04  3
>OutVolume.1 OutNumU.1 OutAccDefn.1 OutTotInvestment.2
> OutNumInvestments.2 OutDolProf.2 OutPerProf.2 OutNetGains.2 OutLong.2
> OutShort.2 OutInvestment.2
> investment0.02 20
> 3   3 -450   -0.015   -0.0154
> 0.75 -1   1
> ...
> 
> which is a data frame of dimension 1 224, when I am looking 
> for a data frame
> like Results of dimension 16 14.
> 
> 
> --
> 
> Reproducible code:
> --
> 
> # --
> # FUNCTION calcProfit
> # --
> calcProfit <- function(IterParam,  marketData, dailyForecast) #, long,
> short, investment, stoploss, comission, penny, volume, numU, accDefn)
>   {
> if (class(IterParam) == "numeric")
>   {
> long <- IterParam["long"]
> short <- IterParam["short"]
> investment <- IterParam["investment"]
> stoploss <- IterParam["stoploss"]
> comission <- IterParam["comission"]
> penny <- IterParam["penny"]
> volume <- IterParam["volume"]
> numU <- IterParam["numU"]
> accDefn <- IterParam["accDefn"]
>   } else {
>   long <- IterParam$long
>   short <- IterParam$short
>   investment <- IterParam$investment
>   stoploss <- IterParam$stoploss
>   comission <- IterParam$comission
>   penny <- IterParam$penny
>   volume <- IterParam$volume
>   numU <- IterParam$numU
>   accDefn <- IterParam$accDefn
>   }
> 
> compareMarket <- merge(dailyForecast, marketData, by.x="SymbolID",
> by.y="SymbolID")
> 
> weight <- ifelse(rep(accDefn, 
> times=length(compareMarket$weight))==1,
> compareMarket$weight, compareMarket$CPweight)
> 
> position <- ifelse((weight<=sho

[R] source() argument specification in packages

2008-12-23 Thread Seeliger . Curt
Folks,

I am creating a small package which builds just fine but fails the check 
during the installation phase, as it can not find the files I am 
source()ing:

cannot open file 'c:\PROGRA~1\R\R-28~1.0\library\nla\nlamets.r': No 
such file or directory

The paths to the files are predicated on the package already being 
installed, for example:

if(!(exists('codeLocation'))) { 
codeLocation <- paste(Sys.getenv("R_HOME"), "\\library\\nla\\", 
sep="")
}
source(paste(codeLocation,'nlamets.r',sep=''))

So it can't install the package because the package isn't installed yet. 
At this point, specifying a file path for a package looks like a 
chicken/egg problem.  Is there a more appropriate means for specifying 
file paths in source(), or am I overlooking something even simpler?

Thank you for your help,
cur

-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
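
One conventional way around this chicken-and-egg problem (a sketch, not from
the thread): ship the file under the package's inst/ directory (an assumption
about the layout) so it is installed at the top level of the package, and let
system.file() find it at run time instead of hard-coding R_HOME paths.  Code
that should simply be loaded with the package is better placed in R/, where no
source() call is needed at all.

f <- system.file("nlamets.r", package = "nla")
if (nzchar(f)) source(f)   # system.file() returns "" if the file is absent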


Re: [R] aggregate / tranpose data

2008-12-23 Thread Henrique Dallazuanna
Try this:

do.call(rbind,
   lapply(split(x, x$CODE_NAME),
  function(cod)
   data.frame(CODE_NAME =
unique(cod$CODE_NAME),
  ZIP_CODE =
paste(cod$ZIP_CODE, collapse = ","))
 )
 )


On Tue, Dec 23, 2008 at 3:10 PM, Ferry  wrote:

> Dear R-Users,
>
> Suppose I have data in the following format:
>
> CODE_NAME ZIP_CODE
> John   12345
> John   23456
> John   34567
> Jane   13242
> Jane   22123
>
> I want to transpose / convert it into:
> CODE_NAME ZIP_CODE
> John  12345,23456,34567
> Jane  13242,22123
>
> Any idea/pointer is appreciated.
>
> Thanks a bunch,
>
> Ferry
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] aggregate / tranpose data

2008-12-23 Thread Ferry
Dear R-Users,

Suppose I have data in the following format:

CODE_NAME ZIP_CODE
John   12345
John   23456
John   34567
Jane   13242
Jane   22123

I want to transpose / convert it into:
CODE_NAME ZIP_CODE
John  12345,23456,34567
Jane  13242,22123

Any idea/pointer is appreciated.

Thanks a bunch,

Ferry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tabular output: from R to Excel or HTML

2008-12-23 Thread Stavros Macrakis
David, Tobias,

Thanks for your pointers to the various HTML and OpenOffice tools.  I will
look into them.  odfWeave looks particularly promising since "OpenOffice can
be used to export the document to MS Word, rich text format, HTML, plain
text or pdf formats."  It looks as though I have to learn Sweave first,
though... I thought I'd left TeX behind me 20 years ago!

   -s

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sorting regression coefficients by p-value

2008-12-23 Thread Sharma, Dhruv
thanks David.

Dhruv


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Tue 12/23/2008 12:40 AM
To: Sharma, Dhruv
Cc: r-help@r-project.org
Subject: Re: [R] sorting regression coefficients by p-value
 

Assuming that you are using the example in the lm help page:

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
# The coefficients are just :
coef(lm.D9)

# The relevant section of str(lm.D9):
$ coefficients : num [1:2, 1:4] 5.032 -0.371 0.22 0.311 22.85 ...
   ..- attr(*, "dimnames")=List of 2
   .. ..$ : chr [1:2] "(Intercept)" "groupTrt"
   .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
 > as.data.frame(summary(lm.D9)$coefficients)
 Estimate Std. Error   t value Pr(>|t|)
(Intercept)5.032  0.2202177 22.850117 9.547128e-15
groupTrt  -0.371  0.3114349 -1.191260 2.490232e-01

If you set X to that object, then
cbind(rownames(X), X[,c("Estimate", "Pr(>|t|)")])
is what you asked for.
--  
David Winsemius
On Dec 22, 2008, at 10:44 PM, Sharma, Dhruv wrote:

> Hi,
>  Is there a way to get/extract a matrix of regression variable name,  
> coefficient, and p values?
>  (for lm and glm; which can be sort by p value?)
>
> thanks
> Dhruv
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
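
Putting the extraction above together with the requested sort by p-value (a
sketch, assuming the lm.D9 example from the message is in the workspace):

X <- summary(lm.D9)$coefficients                 # Estimate ... Pr(>|t|)
X[order(X[, "Pr(>|t|)"]), c("Estimate", "Pr(>|t|)"), drop = FALSE]
## rows (with their names) ordered from smallest to largest p-value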


Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread Brigid Mooney
--
Problem Description:  (reproducible code below)
--
I cannot seem to get as.data.frame() to work as I would expect.

 Results2 seems to contain repeated column titles for each row, as well as a
row name 'investment' (which is not intended), like:
 Results2
[[1]]
   OutTotInvestment OutNumInvestments OutDolProf OutPerProf
OutNetGains OutLong OutShort OutInvestment OutStoploss OutComission OutPenny
OutVolume OutNumU OutAccDefn
investment3 3   -450 -0.015
-0.01540.75 -0.5 1  -0.0152e-04
3  0.02   2  0
[[2]]
   OutTotInvestment OutNumInvestments OutDolProf OutPerProf
OutNetGains OutLong OutShort OutInvestment OutStoploss OutComission OutPenny
OutVolume OutNumU OutAccDefn
investment3 3   -450 -0.015
-0.0154 1.5 -0.5 1  -0.0152e-04
3  0.02   2  0
...

When I try to apply 'as.data.frame', it concatenates incremental numbers to
the repeated row headers and gives:
as.data.frame(Results2)
   OutTotInvestment OutNumInvestments OutDolProf OutPerProf
OutNetGains OutLong OutShort OutInvestment OutStoploss OutComission OutPenny
OutVolume OutNumU OutAccDefn
investment3 3   -450 -0.015
-0.01540.75 -0.5 1  -0.0152e-04
3  0.02   2  0
   OutTotInvestment.1 OutNumInvestments.1 OutDolProf.1 OutPerProf.1
OutNetGains.1 OutLong.1 OutShort.1 OutInvestment.1 OutStoploss.1
OutComission.1 OutPenny.1
investment  3   3 -450
-0.015   -0.0154   1.5   -0.5   1
-0.015  2e-04  3
   OutVolume.1 OutNumU.1 OutAccDefn.1 OutTotInvestment.2
OutNumInvestments.2 OutDolProf.2 OutPerProf.2 OutNetGains.2 OutLong.2
OutShort.2 OutInvestment.2
investment0.02 20
3   3 -450   -0.015   -0.0154
0.75 -1   1
...

which is a data frame of dimension 1 224, when I am looking for a data frame
like Results of dimension 16 14.


--
Reproducible code:
--
# --
# FUNCTION calcProfit
# --
calcProfit <- function(IterParam,  marketData, dailyForecast) #, long,
short, investment, stoploss, comission, penny, volume, numU, accDefn)
  {
if (class(IterParam) == "numeric")
  {
long <- IterParam["long"]
short <- IterParam["short"]
investment <- IterParam["investment"]
stoploss <- IterParam["stoploss"]
comission <- IterParam["comission"]
penny <- IterParam["penny"]
volume <- IterParam["volume"]
numU <- IterParam["numU"]
accDefn <- IterParam["accDefn"]
  } else {
  long <- IterParam$long
  short <- IterParam$short
  investment <- IterParam$investment
  stoploss <- IterParam$stoploss
  comission <- IterParam$comission
  penny <- IterParam$penny
  volume <- IterParam$volume
  numU <- IterParam$numU
  accDefn <- IterParam$accDefn
  }

compareMarket <- merge(dailyForecast, marketData, by.x="SymbolID",
by.y="SymbolID")

weight <- ifelse(rep(accDefn, times=length(compareMarket$weight))==1,
compareMarket$weight, compareMarket$CPweight)

position <- ifelse((weight<=short & compareMarket$OpeningPrice > penny &
compareMarket$noU>=numU), "S",
  ifelse((weight>=long & compareMarket$OpeningPrice > penny &
compareMarket$noU>=numU), "L", NA))
positionTF <- ifelse(position=="L" | position=="S", TRUE, FALSE)

estMaxInv <- volume*compareMarket$MinTrVol*compareMarket$YesterdayClose

investbySymbol <- ifelse(positionTF==TRUE, ifelse(estMaxInv >=
investment, investment, 0))

opClProfit <- ifelse(position=="L",
compareMarket$ClosingPrice/compareMarket$OpeningPrice-1,
ifelse(position=="S",
1-compareMarket$ClosingPrice/compareMarket$OpeningPrice, 0.0))

Gains <- investbySymbol*ifelse(opClProfit <= stoploss, stoploss,
opClProfit)

ProfitTable <- data.frame(SymbolID=compareMarket$SymbolID,
investbySymbol, Gains, percentGains=Gains/investbySymbol,
LessComm=rep(comission, times=length(Gains)),
NetGains=Gains/investbySymbol-2*comission)

AggregatesTable <- data.frame( OutTotInvestment =
sum(ProfitTable

Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread David Winsemius


On Dec 23, 2008, at 10:56 AM, Brigid Mooney wrote:


Thank you again for your help.


snip





-
With the 'apply' call, Results2 is of class list.

Results2 <- apply(CombParam, 1, calcProfit, X, Y)
---

How can I get convert Results2 from a list to a data frame like  
Results?


Have you tried as.data.frame() on Results2? Each of its elements  
should have the proper structure.


You no longer have a reproducible example, but see this session clip:
> lairq <- apply(airquality,1, function(x) x )
> str(lairq)
 num [1:6, 1:153] 41 190 7.4 67 5 1 36 118 8 72 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:6] "Ozone" "Solar.R" "Wind" "Temp" ...
  ..$ : NULL
> is.data.frame(lairq)
[1] FALSE
> is.data.frame(rbind(lairq))
[1] FALSE
> is.data.frame( as.data.frame(lairq) )
--
David Winsemius

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
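
For the specific case in this thread, where Results2 is a list of one-row data
frames with identical columns, a common way to collapse it into one data frame
(a sketch, not taken from the thread) is do.call() with rbind():

## toy stand-in for Results2
Results2 <- list(data.frame(OutLong = 0.75, OutShort = -0.5),
                 data.frame(OutLong = 1.50, OutShort = -0.5))
Results <- do.call(rbind, Results2)
str(Results)   # a data.frame with one row per list element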


Re: [R] Using transform to add a date column to a dataframe

2008-12-23 Thread Peter Dalgaard

Prof Brian Ripley wrote:



A Date object is not a vector.  In released versions of R, data.frame() 
only knows how to replicate vectors and factors.


That's a rather oblique way of saying that it actually does work with 
the unreleased version:


> head(transform(airquality,Date=as.Date("1950-01-01")))
  Ozone Solar.R Wind Temp Month Day   Date
141 190  7.4   67 5   1 1950-01-01
236 118  8.0   72 5   2 1950-01-01
312 149 12.6   74 5   3 1950-01-01
418 313 11.5   62 5   4 1950-01-01
5NA  NA 14.3   56 5   5 1950-01-01
628  NA 14.9   66 5   6 1950-01-01
> R.version.string
[1] "R version 2.9.0 Under development (unstable) (2008-12-23 r47310)"



--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread Brigid Mooney
Thank you again for your help.

I updated the parsing at the beginning of the calcProfit function with:

if (class(IterParam) == "numeric")
  {
long <- IterParam["long"]
short <- IterParam["short"]
investment <- IterParam["investment"]
stoploss <- IterParam["stoploss"]
comission <- IterParam["comission"]
penny <- IterParam["penny"]
volume <- IterParam["volume"]
numU <- IterParam["numU"]
accDefn <- IterParam["accDefn"]
  } else {
  long <- IterParam$long
  short <- IterParam$short
  investment <- IterParam$investment
  stoploss <- IterParam$stoploss
  comission <- IterParam$comission
  penny <- IterParam$penny
  volume <- IterParam$volume
  numU <- IterParam$numU
  accDefn <- IterParam$accDefn
  }

This allows everything to process as expected when calling it both in the
'for' loop I showed before and as part of 'apply'.
However, I have one other question.

With the 'for' loop, Results is of class data frame.

for (i in 1:length(CombParam$long))
  {
   if(i==1)
 { Results <- calcProfit(CombParam[i,], X, Y)
 } else {
 Results <- rbind(Results, calcProfit(CombParam[i,], X, Y))
   }
  }
---
With the 'apply' call, Results2 is of class list.

Results2 <- apply(CombParam, 1, calcProfit, X, Y)

---

How can I convert Results2 from a list to a data frame like Results?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
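
A shorter alternative to the class() branching above (a sketch, not from the
thread): coerce the argument once with as.list(), which makes the $ extraction
work both for a data-frame row and for the named numeric vector that apply()
passes:

p.row <- expand.grid(long = c(.75, 1.5), short = c(-.5, -1))[1, ]  # data frame row
p.vec <- c(long = 0.75, short = -0.5)                              # what apply() passes
as.list(p.row)$long   # 0.75
as.list(p.vec)$long   # 0.75
## so a single line inside calcProfit, IterParam <- as.list(IterParam),
## could replace the if/else block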


Re: [R] Using transform to add a date column to a dataframe

2008-12-23 Thread Prof Brian Ripley

On Tue, 23 Dec 2008, Tom La Bone wrote:




Gavin Simpson wrote:



It says that the two arguments have different numbers of observations.
The reason for which should now be pretty obvious as you provided a
single Date whereas airquality has 153 observations.




Thanks. I did look at ?transform but I was a bit confused because this
worked

  data1 <- transform(airquality,LTMDA=T)


What is "T"?  (If you mean 'TRUE', please say so, as "T" is a regular 
variable.)



whereas this did not

 data1 <- transform(airquality,Date=as.Date("1950-01-01"))

Why does the first one work with one argument but the second one does not?


A Date object is not a vector.  In released versions of R, data.frame() 
only knows how to replicate vectors and factors.


As Gavin pointed out, you were warned so why are you disregarding the 
warning?  That a subclass works does not mean that you are entitled to 
leap to conclusions about another subclass, does it?  Abductive inference 
does not apply (it would be like useRs considering you to be always in 
error based on your email address).


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Comparing Data and Simulation

2008-12-23 Thread Koen van Rhee

Hello,

I've got a function f(x). For x in -2:2 I've generated a vector y of
function values. There is also a data set, and I've constructed an
estimator which should produce values similar to y.


My question is, what is the best way to compare these vectors?

A simple example y=1 and a sample size of 10.

y = [1] 1 1 1 1 1 1 1 1 1 1
est =  [1] -2.34465214 -1.85665524 -1.36865834 -0.88066144 -0.39266455  
0.09533235  0.58332925  1.07132615  1.55932305  2.04731995


Kind regards,
Koen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
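
Some common ways to summarise the discrepancy between the two vectors (a
sketch; y and est are assumed to be equal-length numeric vectors, taken here
from the example in the message):

y   <- rep(1, 10)
est <- c(-2.34465214, -1.85665524, -1.36865834, -0.88066144, -0.39266455,
          0.09533235,  0.58332925,  1.07132615,  1.55932305,  2.04731995)

err <- est - y
c(bias = mean(err),            # average over/under-estimation
  MAE  = mean(abs(err)),       # mean absolute error
  RMSE = sqrt(mean(err^2)))    # root mean squared error

plot(y, est); abline(0, 1)     # points on the 1:1 line indicate agreement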


Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread David Winsemius


On Dec 23, 2008, at 9:55 AM, Brigid Mooney wrote:

I have used some of your advice in making some changes to my  
function and function call before reposting.


Instead of nesting many 'for' loops, I have gotten to the point  
where I only have one.(Also, please note, I am pasting the  
function calcProfit at the end of this message as it is a bit long.)


This process works correctly, but still has a 'for' loop, which I  
thought I would be able to avoid with 'apply'.

--
# Sample iteration parameters (these can be vectors of arbitrary  
length)
# Need to iterate through all possible combinations of these  
parameters

Param <- list(long=c(.75, 1.5),
  short=c(-.5, -1),
  investment=1,
  stoploss=c(-.015),
  comission=.0002,
  penny=3,
  volume=c(.02, .01),
  numU=2,
  accDefn=0:1 )
CombParam <- expand.grid(Param)

# Create sample X and Y data frames  for function call
Y <- data.frame(SymbolID=10:14, OpeningPrice = c(1,3,10,20,60),  
ClosingPrice = c(2,2.5,11,18,61.5), YesterdayClose= c(1,3,10,20,60),  
MinTrVol = rep(1000, times=5))
X <- data.frame(SymbolID=10:14, weight = c(1, .5, -3, -.75, 2),  
CPweight=c(1.5, .25, -1.75, 2, -1), noU = c(2,3,4,2,10))


for (i in 1:length(CombParam$long))
  {
   if(i==1)
 { Results <- calcProfit(CombParam[i,], X, Y)
 } else {
 Results <- rbind(Results, calcProfit(CombParam[i,], X, Y))
   }
  }
--

However, when I try to replace this for loop with 'apply', I get the  
following result:


Results2 <- apply(CombParam, 1, calcProfit, X, Y)
Error in IterParam$long : $ operator is invalid for atomic vectors


apply is giving calcProfit a named numeric vector, and then calcProfit  
is trying to parse it with "$", which is an operator for lists. Try  
serial corrections of the form:


long <- IterParam["long"]

That seemed to let the interpreter move on to the next error ;-)

> Results2 <- apply(CombParam, 1, calcProfit, X, Y)
Error in IterParam$short : $ operator is invalid for atomic vectors

--
David Winsemius



Any advice that anyone could provide would be much appreciated.

Here is the function which I am using:

--
calcProfit <- function(IterParam,  marketData, dailyForecast) {
long <- IterParam$long
short <- IterParam$short
investment <- IterParam$investment
stoploss <- IterParam$stoploss
comission <- IterParam$comission
penny <- IterParam$penny
volume <- IterParam$volume
numU <- IterParam$numU
accDefn <- IterParam$accDefn

compareMarket <- merge(dailyForecast, marketData,  
by.x="SymbolID", by.y="SymbolID")


weight <- ifelse(rep(accDefn, times=length(compareMarket 
$weight))==1, compareMarket$weight, compareMarket$CPweight)


position <- ifelse((weight<=short & compareMarket$OpeningPrice >  
penny & compareMarket$noU>=numU), "S",
  ifelse((weight>=long & compareMarket$OpeningPrice > penny &  
compareMarket$noU>=numU), "L", NA))

positionTF <- ifelse(position=="L" | position=="S", TRUE, FALSE)

estMaxInv <- volume*compareMarket$MinTrVol*compareMarket 
$YesterdayClose


investbySymbol <- ifelse(positionTF==TRUE, ifelse(estMaxInv >=  
investment, investment, 0))


opClProfit <- ifelse(position=="L", compareMarket$ClosingPrice/ 
compareMarket$OpeningPrice-1,
ifelse(position=="S", 1-compareMarket 
$ClosingPrice/compareMarket$OpeningPrice, 0.0))


Gains <- investbySymbol*ifelse(opClProfit <= stoploss, stoploss,  
opClProfit)
ProfitTable <- data.frame(SymbolID=compareMarket$SymbolID,  
investbySymbol, Gains, percentGains=Gains/investbySymbol,
LessComm=rep(comission,  
times=length(Gains)), NetGains=Gains/investbySymbol-2*comission)


AggregatesTable <- data.frame( OutTotInvestment = sum(ProfitTable 
$investbySymbol, na.rm=TRUE),
OutNumInvestments = sum(ProfitTable$investbySymbol,  
na.rm=TRUE)/investment, OutDolProf = sum(ProfitTable$Gains,  
na.rm=TRUE),
OutPerProf = sum(ProfitTable$Gains, na.rm=TRUE)/ 
sum(ProfitTable$investbySymbol, na.rm=TRUE),
OutNetGains = sum(ProfitTable$Gains, na.rm=TRUE)/ 
sum(ProfitTable$investbySymbol, na.rm=TRUE)-2*comission, OutLong =  
long,
OutShort = short, OutInvestment = investment, OutStoploss =  
stoploss, OutComission = comission, OutPenny = penny, OutVolume =  
volume,

OutNumU = numU, OutAccDefn = accDefn )

return(AggregatesTable)
  }

--



On Mon, Dec 22, 2008 at 4:32 PM, D

Re: [R] Using transform to add a date column to a dataframe

2008-12-23 Thread Tom La Bone


Gavin Simpson wrote:
> 
> 
> It says that the two arguments have different numbers of observations.
> The reason for which should now be pretty obvious as you provided a
> single Date whereas airquality has 153 observations.
> 


Thanks. I did look at ?transform but I was a bit confused because this
worked

   data1 <- transform(airquality,LTMDA=T)

whereas this did not

  data1 <- transform(airquality,Date=as.Date("1950-01-01"))

Why does the first one work with one argument but the second one does not?


-- 
View this message in context: 
http://www.nabble.com/Using-transform-to-add-a-date-column-to-a-dataframe-tp21144414p21146167.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I avoid nested 'for' loops or quicken the process?

2008-12-23 Thread Brigid Mooney
I have used some of your advice in making some changes to my function and
function call before reposting.

Instead of nesting many 'for' loops, I have gotten to the point where I only
have one. (Also, please note, I am pasting the function calcProfit at the
end of this message as it is a bit long.)

This process works correctly, but still has a 'for' loop, which I thought I
would be able to avoid with 'apply'.
--
# Sample iteration parameters (these can be vectors of arbitrary length)
# Need to iterate through all possible combinations of these parameters
Param <- list(long=c(.75, 1.5),
  short=c(-.5, -1),
  investment=1,
  stoploss=c(-.015),
  comission=.0002,
  penny=3,
  volume=c(.02, .01),
  numU=2,
  accDefn=0:1 )
CombParam <- expand.grid(Param)

# Create sample X and Y data frames  for function call
Y <- data.frame(SymbolID=10:14, OpeningPrice = c(1,3,10,20,60), ClosingPrice
= c(2,2.5,11,18,61.5), YesterdayClose= c(1,3,10,20,60), MinTrVol =
rep(1000, times=5))
X <- data.frame(SymbolID=10:14, weight = c(1, .5, -3, -.75, 2),
CPweight=c(1.5, .25, -1.75, 2, -1), noU = c(2,3,4,2,10))

for (i in 1:length(CombParam$long))
  {
   if(i==1)
 { Results <- calcProfit(CombParam[i,], X, Y)
 } else {
 Results <- rbind(Results, calcProfit(CombParam[i,], X, Y))
   }
  }
--

However, when I try to replace this for loop with 'apply', I get the
following result:

Results2 <- apply(CombParam, 1, calcProfit, X, Y)
Error in IterParam$long : $ operator is invalid for atomic vectors

Any advice that anyone could provide would be much appreciated.

Here is the function which I am using:

--
calcProfit <- function(IterParam,  marketData, dailyForecast) {
long <- IterParam$long
short <- IterParam$short
investment <- IterParam$investment
stoploss <- IterParam$stoploss
comission <- IterParam$comission
penny <- IterParam$penny
volume <- IterParam$volume
numU <- IterParam$numU
accDefn <- IterParam$accDefn

compareMarket <- merge(dailyForecast, marketData, by.x="SymbolID",
by.y="SymbolID")

weight <- ifelse(rep(accDefn, times=length(compareMarket$weight))==1,
compareMarket$weight, compareMarket$CPweight)

position <- ifelse((weight<=short & compareMarket$OpeningPrice > penny &
compareMarket$noU>=numU), "S",
  ifelse((weight>=long & compareMarket$OpeningPrice > penny &
compareMarket$noU>=numU), "L", NA))
positionTF <- ifelse(position=="L" | position=="S", TRUE, FALSE)

estMaxInv <- volume*compareMarket$MinTrVol*compareMarket$YesterdayClose

investbySymbol <- ifelse(positionTF==TRUE, ifelse(estMaxInv >=
investment, investment, 0))

opClProfit <- ifelse(position=="L",
compareMarket$ClosingPrice/compareMarket$OpeningPrice-1,
ifelse(position=="S",
1-compareMarket$ClosingPrice/compareMarket$OpeningPrice, 0.0))

Gains <- investbySymbol*ifelse(opClProfit <= stoploss, stoploss,
opClProfit)
ProfitTable <- data.frame(SymbolID=compareMarket$SymbolID,
investbySymbol, Gains, percentGains=Gains/investbySymbol,
LessComm=rep(comission, times=length(Gains)),
NetGains=Gains/investbySymbol-2*comission)

AggregatesTable <- data.frame( OutTotInvestment =
sum(ProfitTable$investbySymbol, na.rm=TRUE),
OutNumInvestments = sum(ProfitTable$investbySymbol,
na.rm=TRUE)/investment, OutDolProf = sum(ProfitTable$Gains, na.rm=TRUE),
OutPerProf = sum(ProfitTable$Gains,
na.rm=TRUE)/sum(ProfitTable$investbySymbol, na.rm=TRUE),
OutNetGains = sum(ProfitTable$Gains,
na.rm=TRUE)/sum(ProfitTable$investbySymbol, na.rm=TRUE)-2*comission, OutLong
= long,
OutShort = short, OutInvestment = investment, OutStoploss =
stoploss, OutComission = comission, OutPenny = penny, OutVolume = volume,
OutNumU = numU, OutAccDefn = accDefn )

return(AggregatesTable)
  }

--



On Mon, Dec 22, 2008 at 4:32 PM, David Winsemius wrote:

> I do agree with Dr Berry that your question failed on several grounds in
> adherence to the Posting Guide, so this is off list.
>
> Maybe this will give you  guidance that you can apply to your next question
> to the list:
>
> > alist <- list("a","b","c")
> > blist <- list("ab","ac","ad")
>
> > expand.grid(alist, blist)
>  Var1 Var2
> 1a   ab
> 2b   ab
> 3c   ab
> 4a   ac
> 5b   ac
> 6c   ac
> 7a   ad
> 8b   ad
> 9c   ad
>
> > apply( expand.grid(alist, blist), 1, fun

Re: [R] newbie problem using Design.rcs

2008-12-23 Thread Frank E Harrell Jr
In addition to David's excellent response, I'll add that your problems 
seem to be statistical and not programming ones.  I recommend that you 
spend a significant amount of time with a good regression text or course 
before using the software.  Also, with Design you can find out the 
algebraic form of the fit:


f <- ols(y ~ rcs(x,3), data=mydata)
Function(f)

Frank


David Winsemius wrote:


On Dec 22, 2008, at 11:38 PM, sp wrote:


Hi,

I read data from a file. I'm trying to understand how to use 
Design.rcs by using simple test data first. I use 1000 integer values 
(1,...,1000) for x (the predictor) with some noise (x+.02*x) and I set 
the response variable y=x. Then, I try rcs and ols as follows:



Not sure what sort of noise that is.


m = ( sqrt(y1) ~ ( rcs(x1,3) ) ); #I tried without sqrt also
f = ols(m, data=data_train.df);
print(f);

[I plot original x1,y1 vectors and the regression as in
y <- coef2[1] + coef2[2]*x1 + coef2[3]*x1*x1]


That does not look as though it would capture the structure of a 
restricted **cubic** spline. The usual method in Design for plotting a 
model prediction would be:


plot(f, x1 = NA)




But this gives me a VERY bad fit:
"


Can you give some hint why you consider this to be a "VERY bad fit"? It 
appears a rather good fit to me, despite the test case apparently not 
being constructed with any curvature, which is what the rcs modeling 
strategy should be detecting.





--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
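
A self-contained sketch along the lines of this thread (assuming the Design
package is installed; the noise term is one guess at what the poster meant):

library(Design)
set.seed(1)
x1 <- 1:1000
y1 <- x1 + rnorm(1000, sd = 0.02 * x1)     # noise proportional to x, as a guess

dd <- datadist(x1); options(datadist = "dd")
f <- ols(sqrt(y1) ~ rcs(x1, 3))
Function(f)          # algebraic form of the fitted restricted cubic spline
plot(f, x1 = NA)     # fitted curve, on the sqrt(y1) scale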


Re: [R] Using transform to add a date column to a dataframe

2008-12-23 Thread Philipp Pagel
> I would like to add a column to the airquality dataset that contains the date
> 1950-01-01 in each row. This method does not appear to work:
> 
> > attach(airquality)
> > data1 <- transform(airquality,Date=as.Date("1950-01-01"))
> 
> Error in data.frame(list(Ozone = c(41L, 36L, 12L, 18L, NA, 28L, 23L, 19L,  : 
>   arguments imply differing number of rows: 153, 1
>  
> I can't decipher what the error message is trying to tell me. Any
> suggestions on how to do this?

You already got an answer solving your problem using transform
and rep. I would like to add that the automatic recycling would
have worked in this case:

airquality$Date <- as.Date('1950-01-01')

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using transform to add a date column to a dataframe

2008-12-23 Thread Gabor Grothendieck
or not using transform at all:

data1 <- airquality
data1$Date <- as.Date("1950-01-01")

# or in just one line:

data1 <- replace(airquality, "Date", as.Date("1950-01-01"))


On Tue, Dec 23, 2008 at 9:06 AM, Gavin Simpson  wrote:
> On Tue, 2008-12-23 at 05:24 -0800, Tom La Bone wrote:
>> I would like to add a column to the airquality dataset that contains the date
>> 1950-01-01 in each row. This method does not appear to work:
>>
>> > attach(airquality)
>> > data1 <- transform(airquality,Date=as.Date("1950-01-01"))
>>
>> Error in data.frame(list(Ozone = c(41L, 36L, 12L, 18L, NA, 28L, 23L, 19L,  :
>>   arguments imply differing number of rows: 153, 1
>>
>> I can't decipher what the error message is trying to tell me. Any
>> suggestions on how to do this?
>
> It says that the two arguments have different numbers of observations.
> The reason for which should now be pretty obvious as you provided a
> single Date whereas airquality has 153 observations.
>
> You did read ?transform , which points out this "problem"? ;-)
>
> Anyway, don't assume R recycles everything if it is not of sufficient
> length to match other arguments. In this case, repeat the date as many
> times as there are rows in airquality:
>
>> data(airquality)
>> data1 <- transform(airquality, Date = rep(as.Date("1950-01-01"),
> nrow(airquality)))
>> head(data1)
>  Ozone Solar.R Wind Temp Month Day   Date
> 141 190  7.4   67 5   1 1950-01-01
> 236 118  8.0   72 5   2 1950-01-01
> 312 149 12.6   74 5   3 1950-01-01
> 418 313 11.5   62 5   4 1950-01-01
> 5NA  NA 14.3   56 5   5 1950-01-01
> 628  NA 14.9   66 5   6 1950-01-01
>
> Also, the attach(airquality) call in your example doesn't do anything
> that affects your example so is redundant.
>
> HTH
>
> G
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
>  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
>  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using transform to add a date column to a dataframe

2008-12-23 Thread Gavin Simpson
On Tue, 2008-12-23 at 05:24 -0800, Tom La Bone wrote:
> I would like to add a column to the airquality dataset that contains the date
> 1950-01-01 in each row. This method does not appear to work:
> 
> > attach(airquality)
> > data1 <- transform(airquality,Date=as.Date("1950-01-01"))
> 
> Error in data.frame(list(Ozone = c(41L, 36L, 12L, 18L, NA, 28L, 23L, 19L,  : 
>   arguments imply differing number of rows: 153, 1
>  
> I can't decipher what the error message is trying to tell me. Any
> suggestions on how to do this?

It says that the two arguments have different numbers of observations.
The reason for which should now be pretty obvious as you provided a
single Date whereas airquality has 153 observations.

You did read ?transform , which points out this "problem"? ;-)

Anyway, don't assume R recycles everything if it is not of sufficient
length to match other arguments. In this case, repeat the date as many
times as there are rows in airquality:

> data(airquality)
> data1 <- transform(airquality, Date = rep(as.Date("1950-01-01"),
nrow(airquality)))
> head(data1)
  Ozone Solar.R Wind Temp Month Day   Date
141 190  7.4   67 5   1 1950-01-01
236 118  8.0   72 5   2 1950-01-01
312 149 12.6   74 5   3 1950-01-01
418 313 11.5   62 5   4 1950-01-01
5NA  NA 14.3   56 5   5 1950-01-01
628  NA 14.9   66 5   6 1950-01-01

Also, the attach(airquality) call in your example doesn't do anything
that affects your example so is redundant.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



signature.asc
Description: This is a digitally signed message part
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using transform to add a date column to a dataframe

2008-12-23 Thread Tom La Bone

I would like to add a column to the airquality dataset that contains the date
1950-01-01 in each row. This method does not appear to work:

> attach(airquality)
> data1 <- transform(airquality,Date=as.Date("1950-01-01"))

Error in data.frame(list(Ozone = c(41L, 36L, 12L, 18L, NA, 28L, 23L, 19L,  : 
  arguments imply differing number of rows: 153, 1
 
I can't decipher what the error message is trying to tell me. Any
suggestions on how to do this?


-- 
View this message in context: 
http://www.nabble.com/Using-transform-to-add-a-date-column-to-a-dataframe-tp21144414p21144414.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie problem using Design.rcs

2008-12-23 Thread David Winsemius


On Dec 22, 2008, at 11:38 PM, sp wrote:


Hi,

I read data from a file. I'm trying to understand how to use  
Design.rcs by using simple test data first. I use 1000 integer  
values (1,...,1000) for x (the predictor) with some noise (x+.02*x)  
and I set the response variable y=x. Then, I try rcs and ols as  
follows:



Not sure what sort of noise that is.


m = ( sqrt(y1) ~ ( rcs(x1,3) ) ); #I tried without sqrt also
f = ols(m, data=data_train.df);
print(f);

[I plot original x1,y1 vectors and the regression as in
y <- coef2[1] + coef2[2]*x1 + coef2[3]*x1*x1]


That does not look as though it would capture the structure of a  
restricted **cubic** spline. The usual method in Design for plotting a  
model prediction would be:


plot(f, x1 = NA)




But this gives me a VERY bad fit:
"


Can you give some hint why you consider this to be a "VERY bad fit"?  
It appears a rather good fit to me, despite the test case apparently  
not being constructed with any curvature, which is what the rcs modeling  
strategy should be detecting.


--
David Winsemius


Linear Regression Model

ols(formula = m, data = data_train.df)

n Model L.R.   d.f. R2  Sigma
 1000   4573  2 0.9897   0.76

Residuals:
 Min1QMedian3Q   Max
-4.850930 -0.414008 -0.009648  0.418537  3.212079

Coefficients:
Value Std. Error  t Pr(>|t|)
Intercept  5.90958  0.0672612  87.860
x1 0.03679  0.0002259 162.880
x1'   -0.01529  0.0002800 -54.600

Residual standard error: 0.76 on 997 degrees of freedom
Adjusted R-Squared: 0.9897
"

I appreciate any and all help!

Sincerely,
sp

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Approximate Entropy?

2008-12-23 Thread Stephan Kolassa

Dear guRus,

is there a package that calculates the Approximate Entropy (ApEn) of a 
time series?


RSiteSearch only gave me a similar question in 2004, which appears not 
to have been answered:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/28830.html

RSeek.org didn't yield any results at all.

Happy holidays (where appropriate),
Stephan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Borders for rectangles in lattice plot key

2008-12-23 Thread Richard . Cotton
Hopefully an easy question.  When drawing rectangles in a lattice plot 
key, how do you omit the black borders?

Here is an example adapted from one on the xyplot help page:

bar.cols <- c("red", "blue")
key.list <- list(
   space="top",
   rectangles=list(col=bar.cols),
   text=list(c("foo", "bar"))
)

barchart(
   yield ~ variety | site, 
   data = barley,
   groups = year, 
   layout = c(1,6),
   ylab = "Barley Yield (bushels/acre)",
   scales = list(x = list(abbreviate = TRUE, minlength = 5)),
   col=bar.cols,
   border="transparent",
   key=key.list
)

Notice the black borders around the rectangles in the key. 

I checked to see if there was an undocumented border component for the 
rectangles component of key that I could set to "transparent" or FALSE, 
but no luck.  I also tried setting lwd=0 on the rectangle component but 
that didn't change anything either.

Regards,
Richie.

Mathematical Sciences Unit
HSL




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
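
One approach that is often suggested for this (a sketch; whether it removes
the key borders depends on the lattice version): route the colours through the
superpose.polygon settings and let auto.key pick them up, instead of building
key= by hand:

library(lattice)
bar.cols <- c("red", "blue")
barchart(yield ~ variety | site,
         data = barley, groups = year, layout = c(1, 6),
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(abbreviate = TRUE, minlength = 5)),
         auto.key = list(space = "top", rectangles = TRUE, points = FALSE,
                         text = c("foo", "bar")),
         par.settings = list(superpose.polygon =
                               list(col = bar.cols, border = "transparent")))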


Re: [R] Error: cannot allocate vector of size 1.8 Gb

2008-12-23 Thread Prof Brian Ripley

On Mon, 22 Dec 2008, iamsilvermember wrote:




dim(data)

[1] 22283    19


dm=dist(data, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)

Error: cannot allocate vector of size 1.8 Gb


That would be an object of size 1.8Gb.

See ?"Memory-limits"





Hi Guys, thank you in advance for helping. :-D

Recently I ran into the "cannot allocate vector of size 1.8GB" error.  I am
pretty sure this is not a hardware limitation because it happens whether I
run the R code on a 2.0GHz Core Duo 2GB RAM Mac or on an Intel Xeon 2x2.0GHz
quad-core 8GB RAM Linux server.


Why? Both will have a 3GB address space limits unless the Xeon box is 
64-bit.  And this works on my 64-bit Linux boxes.



I also tried to clear the workspace before running the code too, but it
didn't seem to help...

The weird thing, though, is that once in a while it will work, but then when I run
clustering on the above result

hc=hclust(dm, method = "complete", members=NULL)

it give me the same error...


See ?"Memory-limits" for the first part.


I have already searched around, but the memory.limit and memory.size methods do not
seem to help.  May I know what I can do to resolve this problem?


What are you going to do with an agglomerative hierarchical clustering of 
22283 objects?  It will not be interpretible.



Thank you so much for your help.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AR(2) coefficient interpretation

2008-12-23 Thread Prof Brian Ripley

You forgot to RTFM.  From ?arima

 Different definitions of ARMA models have different signs for the
 AR and/or MA coefficients.  The definition used here has


 'X[t] = a[1]X[t-1] + ... + a[p]X[t-p] + e[t] + b[1]e[t-1] + ... + 
b[q]e[t-q]'



 and so the MA coefficients differ in sign from those of S-PLUS.
 Further, if 'include.mean' is true (the default for an ARMA
 model), this formula applies to X - m rather than X.
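
To make the sign and mean conventions concrete, a small self-contained sketch
(simulated data, not the poster's): the reported "intercept" is the mean m,
and a one-step prediction applies the AR formula to X - m.

set.seed(1)
x <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 200) + 50
fit <- arima(x, order = c(2, 0, 0))
cf <- coef(fit)                     # ar1, ar2, intercept (= estimated mean)
m  <- cf["intercept"]
n  <- length(x)
m + cf["ar1"] * (x[n] - m) + cf["ar2"] * (x[n - 1] - m)   # by hand
predict(fit, n.ahead = 1)$pred                            # agrees with the above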

Since you have not yet produced a reproducible example (at least in a 
single email), we don't have enough information to reproduce your results.
But I hope we are not fitting AR(2) models to (potentially seasonal) time 
series of length 11.


On Mon, 22 Dec 2008, Stephen Oman wrote:



As I need your urgent help, let me modify my question. I imported the
following data set into R and ran the statements I mentioned in my previous
reply:
  Year Month   Period ab  c
1  2008   Jan 2008-Jan 105,536,785  9,322,074  9,212,111
2  2008   Feb 2008-Feb 137,239,037 10,986,047 11,718,202
3  2008   Mar 2008-Mar 130,237,985 10,653,977 11,296,096
4  2008   Apr 2008-Apr 133,634,288 10,582,305 11,729,520
5  2008   May 2008-May 161,312,530 13,486,695 13,966,435
6  2008   Jun 2008-Jun 153,091,141 12,635,693 13,360,372
7  2008   Jul 2008-Jul 176,063,906 13,882,619 15,202,934
8  2008   Aug 2008-Aug 193,584,660 14,756,116 16,083,263
9  2008   Sep 2008-Sep 180,894,120 13,874,154 14,524,268
10 2008   Oct 2008-Oct 196,691,055 14,998,119 15,802,627
11 2008   Nov 2008-Nov 184,977,893 13,748,124 14,328,875

and the AR result is
Call:
arima(x = a, order = c(2, 0, 0))

Coefficients:
ar1 ar2  intercept
 0.4683  0.4020 5.8654
s.e.  0.2889  0.3132 2.8366

sigma^2 estimated as 4.115:  log likelihood = -24.04,  aic = 56.08

The minimum amount of a is more than 100 million and the intercept is 5.86
based on the result above.
If I place all values into the formula then Xt=5.8654+0.4683*(184,977,893
)+0.4020*(196,691,055 )= 165,694,957.27. Do you think that makes sense? Did
I interpret the result incorrectly?

Also, i submit the following statement for the prediction of next period


predict<-predict(fit, n.ahead=1)
predict


it returned the value 9.397515 below, and I have no idea how to
interpret this value. Please help.

$pred
Time Series:
Start = 12
End = 12
Frequency = 1
[1] 9.397515

$se
Time Series:
Start = 12
End = 12
Frequency = 1
[1] 2.028483



Stephen Oman wrote:


I am a beginner in using R and I need help interpreting the AR
results from R.  I used 12 observations for my AR(2) model and it turned out
the intercept showed 5.23 while the first and second AR coefficients showed
0.40 and 0.46. Because my raw data are in millions, it seems the
intercept is too small and doesn't make sense. Did I make any mistake
in my code? My code is as follows:

r<-read.table("data.txt", dec=",", header=T)
attach(r)
fit<-arima(a, c(2,0,0))

Thank you for your help first.




--
View this message in context: 
http://www.nabble.com/AR%282%29-coefficient-interpretation-tp21129322p21138255.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.