Re: [R] Screen settings for point of view in lattice and misc3d

2010-03-05 Thread Deepayan Sarkar
On Wed, Mar 3, 2010 at 1:22 PM, Waichler, Scott R
 wrote:
> I'm making some 3D plots with contour3d from misc3d and wireframe from 
> lattice.  I want to view them from below; i.e. the negative z-axis.  I can't 
> figure out how to do so.  I would like my point of view looking up from 
> below, with the z, y, and x axes positive going away.  Can anyone tell me the 
> correct settings for screen to achieve this?  Here is what I've found so far:
>
>  screen=list(z=-40, x=-60, y=0), # looking down and away in negative x 
> direction
>  screen=list(z=40, x=60, y=0),  # domain turned upside down, looking up and 
> away in neg. x direction
>  screen=list(z=-40, x=60, y=0),  # domain turned upside down, looking up and 
> away in pos. x direction
>  screen=list(z=40, x=-60, y=0),     # looking down and away in positive x 
> direction

The initial view is along the positive z-axis:

wireframe(volcano, shade = TRUE, screen = list())

To change it to the negative z-axis, rotate by 180 degrees about the x-axis:

wireframe(volcano, shade = TRUE, screen = list(x = 180))

Then, add any further adjustments you wish:

wireframe(volcano, shade = TRUE, screen = list(x = 180, z = -30, x = -30))
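For instance, one (untested) guess at the sort of view asked for above, to be
tweaked as needed:

wireframe(volcano, shade = TRUE, screen = list(x = 180, z = 40, x = -60))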

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to create a line and bar panel chart with two different axes?

2010-03-05 Thread Deepayan Sarkar
On Wed, Mar 3, 2010 at 12:35 PM, DougNiu  wrote:
>
> I need to create a line and bar panel chart with two different axes. I tried
> in lattice but couldn't get it to work. Here is my code:
>
> data(barley)
> barchart(yield ~ variety | site, data = barley,
>              groups = year, layout = c(1,6), stack = F,
>              auto.key = list(points = FALSE, rectangles = TRUE, space =
> "right"),
>              ylab = "Barley Yield (bushels/acre)",
>              scales = list(x = list(rot = 45)))
>
> Suppose now I need to add two lines in each panel to show the cost (10^3
> dollars) of each type (Svansota, ..., Trebi) at different locations
> (Waseca,..., Grand Rapids) for 1931 and 1932.
>
> Can anybody tell me how to create this chart with two different
> axes (one is yield, the other is cost)?

A couple of relevant examples:

http://lmdvr.r-forge.r-project.org/figures/figures.html?chapter=05;figure=05_13;
http://lmdvr.r-forge.r-project.org/figures/figures.html?chapter=08;figure=08_06;
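As a rough, untested sketch of the general idea (deliberately simplified: no
year groups, and a made-up 'cost' column), one can overlay a line on each
barchart panel by rescaling cost onto the yield axis; a genuine second axis
needs custom axis components as in the figures linked above:

library(lattice)
data(barley)
set.seed(1)
barley$cost <- runif(nrow(barley), 5, 15)   # hypothetical cost (10^3 dollars)

barchart(yield ~ variety | site, data = barley, layout = c(1, 6),
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(rot = 45)),
         panel = function(x, y, ..., subscripts) {
             panel.barchart(x, y, ...)
             ## rescale cost onto the yield scale and overlay as a line
             sc <- barley$cost[subscripts] * max(y) / max(barley$cost)
             ord <- order(as.numeric(x))
             panel.lines(as.numeric(x)[ord], sc[ord], col = "black", lwd = 2)
         })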

-Deepayan



Re: [R] Robust SE for lrm object

2010-03-05 Thread Achim Zeileis

On Sat, 6 Mar 2010, David Winsemius wrote:


On Mar 5, 2010, at 11:54 PM, Patrick Shea wrote:



I'm trying to obtain the robust standard errors for a multinomial ordered 
logit model:


mod6 <- lrm(wdlshea ~   initdesch + concap + capasst + qualrat + 
terrain,data=full2)


The model is fine but when I try to get the RSE I get an error.

coeftest(mod6, vcov = vcovHAC(mod6))

Error in match.arg(type) :
'arg' should be one of “ordinary”, “score”, “score.binary”, “pearson”,
“deviance”, “pseudo.dep”, “partial”, etc.


I'm a novice R user and am not sure how to address this problem. I have 
also tried to use alternatives   (zelig, polr) but have had no luck. Any 
assistance on generating RSE for a multinomial order logit model would be 
appreciated


Have you loaded the library that contains the vcovHAC function?


That is in the "sandwich" package. However, I doubt that it makes sense in 
this context. Using HAC covariances would imply that you have time series 
data and want to correct for heteroskedasticity and autocorrelation. I'm 
not even sure whether sandwich standard errors would be terribly useful. 
Both would require that you correctly specified the estimating functions 
of your proportional odds logistic regression but misspecified a few other 
aspects of the remaining likelihood. Not sure whether that can be obtained 
for an ordinal multinomial response.



(And do you know whether coeftest works with Design/rms objects?)


It does (unlike its own summary function in some situations):

library("rms")
library("lmtest")
data("BankWages", package = "AER")
fm <- lrm(job ~ ., data = BankWages)
summary(fm)
coeftest(fm)

The reason why vcovHAC() or sandwich() do not work is that bread() and 
estfun() methods would need to be available for "lrm" objects, which is 
currently not the case (ditto for "polr" objects). In principle they could 
be written, see

  vignette("sandwich-OOP", package = "sandwich")
but as I said above I'm not sure whether it would be very useful.
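
For comparison, here is a minimal sketch with a model class that does provide
bread() and estfun() methods (a plain binomial glm, purely for illustration),
where sandwich() and coeftest() work directly:

library("sandwich")
fm2 <- glm(I(job == "custodial") ~ education + gender,
           data = BankWages, family = binomial)
coeftest(fm2, vcov = sandwich(fm2))   # sandwich (robust) standard errors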
Z



--

David Winsemius, MD
West Hartford, CT



Re: [R] scientific (statistical) foundation for Y-RANDOMIZATION in regression analysis

2010-03-05 Thread Greg Snow
In the stats literature these are more often called permutation tests.  Looking 
up that term should give you some results (if not, I have some references, but 
they are at work and I am not, I could probably get them for you on Monday if 
you have not found anything before then).
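
A bare-bones sketch of the idea in R, with simulated data and ordinary least
squares (variable names are made up for illustration):

set.seed(1)
n <- 50
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- 1 + 0.5 * X$x1 + rnorm(n)
r2.obs  <- summary(lm(y ~ ., data = X))$r.squared
## refit after shuffling the response ("Y-scrambling"), 100 times
r2.perm <- replicate(100, summary(lm(sample(y) ~ ., data = X))$r.squared)
mean(r2.perm >= r2.obs)   # permutation-style p-value for the observed R^2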

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Damjan Krstajic
> Sent: Friday, March 05, 2010 5:39 PM
> To: r-help@r-project.org
> Subject: [R] scientific (statistical) foundation for Y-RANDOMIZATION in
> regression analysis
> 
> 
> Dear all,
> 
> I am a statistician doing research in QSAR, building regression models
> where the dependent variable is a numerical expression of some chemical
> activity and input variables are chemical descriptors, e.g. molecular
> weight, number of carbon atoms, etc.
> 
> I am building regression models and I am confronted with a widely used
> technique called Y-RANDOMIZATION for which I have difficulties in
> finding references in general statistical literature regarding
> regression analysis. I would be grateful if someone could point me to
> papers/literature in statistical regression analysis which give
> scientific (statistical) foundation for using Y-RANDOMIZATION.
> 
> Y-RANDOMIZATION is a widely used technique in the QSAR community to ensure
> the robustness of a QSPR (regression) model. It is used after the
> "best" regression model is selected and to make sure that there are no
> chance correlations. Here is a short description. The dependent
> variable vector (Y-vector) is randomly shuffled and a new QSPR
> (regression) model is fitted using the original independent variable
> matrix. By repeating this a number of times, say 100 times, one will
> get hundred R2 and q2 (leave one out cross-validation R2) based on
> hundred shuffled Y. It is expected that the resulting regression models
> should generally have low R2 and low q2 values. However, if the
> majority of hundred regression models obtained in the Y-randomization
> have relatively high R2 and high q2 then it implies that an acceptable
> regression model cannot be obtained for the given data set by the
> current modelling method.
> 
> I cannot find any references to Y-randomization or Y-scrambling
> anywhere in the literature outside chemometrics/QSAR. Any links or
> references would be much appreciated.
> 
> Thanks in advance.
> 
> DK
> --
> Damjan Krstajic
> Director
> Research Centre for Cheminformatics
> Belgrade, Serbia
> 
> --
> 
> 


[R] complex--scatter plot

2010-03-05 Thread OEM Configuration (temporary user)


My input

"ID"  "Label"   "*Stype*" "Ntype"   "Stype_No""*log*"
"S1"  "xxx" "A/A" 1   6   2.8
"S1"  "xxx" "A/G" 2   2   3
"S1"  "xxx" "G/G" 3   1   4
"S2"  "yyy" "A/A" 1   1   6.8
"S2"  "yyy" "A/G" 2   2   7
"S2"  "yyy" "G/G" 3   6   7.4
"S2"  "yyy" "NULL""null""null"8
"S3"  "zzz" "A/A" 1   3   12
"S3"  "zzz" "A/G" 2   3   14
"S3"  "zzz" "G/G" 3   3   16
"S3"  "zzz" "NULL""null""null"18
"S3"  "zzz" "NULL""null""null"20




I would like to draw a single scatter plot for every ID (an S1 plot, an 
S2 plot, and so on), using *Stype* as the X-axis and *log* as the Y-axis, 
with the header "D_Scatterplot".

For example, the S1 plot looks like this: the X-axis takes Stype as A/A=1, 
A/G=2 and G/G=3, and their Stype_No values (6, 2, 1) mean that A/A (i.e. 1) 
is repeated 6 times, A/G (i.e. 2) 2 times and G/G (i.e. 3) 1 time, plotted 
against the log values 2.8, 3 and 4. Roughly, the plot should look like 
this:
Label_ 6 |
log   5 |
   4 |   ***
   3 |   ** **
   2 |   ** ** *
   1 |
   0 |__
  A/A   A/G   G/G   
  
  ID*
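
For reference, an untested guess at this kind of plot in lattice, assuming the
table above has been read into a data frame d (columns ID, Stype, log) and the
"NULL" rows dropped:

library(lattice)
d <- subset(d, Stype != "NULL")
xyplot(log ~ Stype | ID, data = d, main = "D_Scatterplot")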


Thanks in advance
Pearl



Re: [R] conditioning variable in panel.xyplot?

2010-03-05 Thread Deepayan Sarkar
On Fri, Mar 5, 2010 at 12:45 PM, Seth W Bigelow  wrote:
> I'm stumped after an hour or so reading about subscripts in panel.xyplot.
> Apparently the panel function is executed for each subset of data in the
> main dataset (specified by the conditioning variable, 'site' in my
> example), and the 'subscripts' keyword passes a vector of the
> corresponding row numbers to the panel function. But, if I want the panel
> function to simultaneously plot data from a different dataframe, as in the
> example below, I don't understand how having a vector of row numbers from
> a subset of the dataframe used in the main xyplot statement helps me with
> selecting data from an entirely different dataframe ('q' in my example).

It doesn't (I didn't read your original question carefully enough).

Look at ?which.packet. Continuing your example,

q.split <- split(q, q$site)

mypanel <- function(..., alt.data) {
    with(alt.data[[ which.packet()[1] ]],
         panel.xyplot(x = x, y = y, col = "red"))
    panel.xyplot(...)
}

xyplot(y ~ x | site, d, alt.data = q.split,
   panel = mypanel)

-Deepayan

>
> library(lattice)
>
> d <- data.frame(site  = c(rep("A",12), rep("B",12)),
> x=rnorm(24),y=rnorm(24))
> q <- data.frame(site  = c(rep("A",7), rep("B",7)),
> x=rnorm(14),y=rnorm(14))
>
> mypanel <- function(...){
>                panel.xyplot(q$x, q$y, col="red")
>                panel.xyplot(...)}
>
> xyplot(y ~ x | site, d,
>        panel = mypanel
>        )
>
> --Seth
>
>
> On Thu, Mar 4, 2010 at 4:42 PM, Seth W Bigelow  wrote:
>> I wish to create a multipanel plot (map) from several datasets ("d" and
>> "q" in the example below). I can condition the main xyplot statement on
>> the "site" variable, but I don't know how to pass a conditioning
> variable
>> to panel.xyplot plot so that the x-y coordinates from dataset q are only
>> plotted at the appropriate site.
>
> The keyword is 'subscripts'. Look at the entry for 'panel' in ?xyplot,
> and let us know if you still have doubts.
>
> -Deepayan
>
>>
>>
>> library(lattice)
>> d <- data.frame(site  = c(rep("A",12), rep("B",12)),
>> x=rnorm(24),y=rnorm(24))                        # Create dataframe "d",
>> with 12 x-y coordinates for each site
>> q <- data.frame(site  = c(rep("A",7), rep("B",7)),
>> x=rnorm(14),y=rnorm(14))                        # Create dataframe "q",
>> with 7 pairs of x-y coordinates for each site.
>>
>> mypanel <- function(...){
>>        panel.xyplot(q$x, q$y, col="red")               # Statement that
>> needs a "Site" conditioning variable
>>        panel.xyplot(...)}
>>
>> xyplot(y~x|site, d, panel=mypanel)      # Statement erroneously plots
> all
>> 14 x-y points in "q" on panels for sites A & B
>>
>>
>>
>> Dr. Seth  W. Bigelow
>> Biologist, USDA-FS Pacific Southwest Research Station
>> 1731 Research Park Drive, Davis California


Re: [R] Robust SE for lrm object

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 11:54 PM, Patrick Shea wrote:



I'm trying to obtain the robust standard errors for a multinomial  
ordered logit model:


mod6 <- lrm(wdlshea ~   initdesch + concap + capasst + qualrat +  
terrain,data=full2)


The model is fine but when I try to get the RSE I get an error.

coeftest(mod6, vcov = vcovHAC(mod6))

Error in match.arg(type) :
 'arg' should be one of “ordinary”, “score”, “score.binary”,   
“pearson”, “deviance”, “pseudo.dep”, “partial”, etc.


I'm a novice R user and am not sure how to address this problem. I  
have also tried to use alternatives   (zelig, polr) but have had no  
luck. Any assistance on generating RSE for a multinomial ordered logit  
model would be appreciated


Have you loaded the library that contains the vcovHAC function?

(And do you know whether coeftest works with Design/rms objects?)

--

David Winsemius, MD
West Hartford, CT



[R] Robust SE for lrm object

2010-03-05 Thread Patrick Shea

I'm trying to obtain the robust standard errors for a multinomial ordered logit 
model:

mod6 <- lrm(wdlshea ~   initdesch + concap + capasst + qualrat + 
terrain,data=full2)

The model is fine but when I try to get the RSE I get an error.

coeftest(mod6, vcov = vcovHAC(mod6))

Error in match.arg(type) : 
  'arg' should be one of “ordinary”, “score”, “score.binary”,  “pearson”, 
“deviance”, “pseudo.dep”, “partial”, etc. 

I'm a novice R user and am not sure how to address this problem. I have also 
tried to use alternatives   (zelig, polr) but have had no luck. Any assistance 
on generating RSE for a multinomial ordered logit model would be appreciated

Thanks,
Patrick 
  


Re: [R] converting multiple lines of text to a data frame

2010-03-05 Thread Phil Spector

Andrew-
   Maybe something like this:


dd = read.table(filename)
unstack(dd,V2~V1)

  A. B.   C.
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0
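
Or, self-contained (pasting the example lines from the message below into a
text connection):

txt <- "A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35"
dd <- read.table(textConnection(txt))
unstack(dd, V2 ~ V1)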

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu




On Fri, 5 Mar 2010, Andrew Yee wrote:


I'm trying to find a way for converting multiple lines of text into a
table.  I'm not sure if there's a way where you can use read.delim()
to read in multiple lines of text and create the following data frame
with something akin to reshape()?  Apologies if there is an obvious
way to do this.

A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35

Convert the above lines into the following data frame

 A   B    C
 1   2   10
34  20  6.7
 2  78   35

Thanks,
Andrew



Re: [R] Writing own simulation function in C

2010-03-05 Thread Sharpie


TheSavageSam wrote:
> 
> I am wishing to write my own random distribution simulation function using
> C programming language (for speed) via R. I am familiar with R programming
> but somewhat new to C programming. I was trying to understand "Writing R
> extensions" -guide and its part 6.16, but I found it hard to
> understand(http://cran.r-project.org/doc/manuals/R-exts.html#Standalone-Mathlib).
> I also tried to get familiar with the example codes, but it didn't make me
> any wiser.
> 
> The biggest problem seems to be how to get(what code to write in C) random
> uniform numbers using Rmath. That seems to be too complicated to me at the
> moment. So if someone of you could give a small example and it would
> definitely help me a lot. All I wish to do first is to finally write(and
> understand) my own function similar to what you run in R Command line via
> command "runif".
> 
> And all this I am hoping to do without recompiling my whole R. Instead of
> that I think that it is possible to use dyn.load("code.so") in R. (Yes, I
> use linux)
> 

If the code in this message gets mangled by the mailing list, try viewing it
on nabble:

http://n4.nabble.com/Writing-own-simulation-function-in-C-td1580190.html#a1580190

...

Actually, funny story... I wrote this post using Nabble, but the Spam
Detector refuses to let me post it claiming that there are "too many sex
words" due some constructs in the R language.  So I guess this confirms that
using R is the quickest way to increase your statistical manhood/womanhood.

As a result of this, all code chunks have been moved to a gist at GitHub--
links are provided.

...


I would suggest first learning how to implement callbacks to R functions
from C functions rather than diving straight into calling the C
implementations of R functions.  The reasons are:

  1.  You already know how the R function is supposed to be called.

  2.  You don't have to search through the R source to track down the C
implementation of the function you are after.

I would also bet that the overhead of calling an R function that calls its
compiled implementation is not that significant-- but I haven't done any
profiling.

The way I would approach this problem would be to create a new package since
the package structure helps manage the steps involved in loading and testing
the compiled code.

So, here's an example of calling the R function runif() from C (tested on
Linux Mint 8 with R 2.10.1):

Start an R terminal and execute the following:

  setwd( 'path/to/wherever/you/like/to/work' )

  # Create a dummy wrapper function to start the package with
  myRunif <- function( n, min = 0, max = 1 ){}

  # Start the package
  package.skeleton( 'RtoC' )

Now, quit R.  You'll notice that a folder called RtoC appeared in the
directory where R was working.  Edit RtoC/R/myRunif.R to contain the
following:


  see:
  http://gist.github.com/323498#file_my_runif.r


The above R function is a wrapper for our compiled C function.  It states
that we will be calling a C routine named "myRunif" and passing the
parameters n, min and max.  The .Call() interface will pass these arguments
as separate SEXP objects, which are the C-level representation of R objects. 
The alternatives to .Call are .External which gathers all the arguments up
and passes them as a single SEXP and .C() which passes them directly as
basic C variables (care must be used with .C(), you may need to wrap things
in as.double(), as.integer(), etc.).
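
Purely as an illustrative guess at what that wrapper looks like (the actual
code is in the gist linked above):

myRunif <- function( n, min = 0, max = 1 ){
    ## call the compiled routine shipped in the (hypothetical) RtoC package
    .Call( "myRunif", n, min, max, PACKAGE = "RtoC" )
}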

I am using .Call() over .C() because .Call() allows an SEXP (i.e. native R
object) to be returned directly.   .External() also returns a SEXP, but we
are calling back to an R function so we would have to split each argument
into a separate SEXP anyway.  With .C() we would have to coerce the C types
to separate SEXP variables. Furthermore, with .C(), an additional variable
to hold the return value would have to be included as an argument the
function as we are expecting to return a vector and all the arguments are
scalars.

So, now the task is to write the C implementation of myRunif.  Create the
following folder in the package:

  mkdir  RtoC/src

Any C or Fortran code placed in the /src folder of an R package will be
automagically compiled to a shared library that will be included when the
package is installed.  So, let's add a C routine, RtoC/src/myRunif.c:


  see:
  http://gist.github.com/323498#file_m_runif.c


The first thing that myRunif.c does is to locate the namespace of the stats
package.  This is important as runif() lives in that namespace and we will
be executing a call to it later.  The next thing to note is that results
returned by R are assigned inside of a PROTECT() statement.  This is because
SEXPs are *pointers* to data and we don't want the R garbage collector to
munch the data we're pointing to before we get a chance to use it.

After finding the stats namespace, the second operation is to set up the
call back to the R function runif()-- this is started by creating a vector
o

Re: [R] transposing data

2010-03-05 Thread Phil Spector

Frank -
   I think you need to create a composite time variable
to do what you want to do:

exp1.r5$key = with(exp1.r5,paste(CannonAngle,CannonOriB,
  CannonOriR,nRedPelelts,TargetColor,tbearing,sep='.'))
exp1.r5.use = subset(exp1.r5,select=-c(CannonAngle,CannonOriB,
  CannonOriR,nRedPelelts,TargetColor,tbearing))
exp1.r5.wide <- reshape(exp1.r5.use, idvar="Subject", direction = "wide",
   v.names="RT", timevar = 'key')

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu





On Fri, 5 Mar 2010, Frank Tamborello wrote:


Hi. I have repeated measures data of the form where each observation
is a trial, and trials are grouped by subject, and variables encode
whatever level of a factor was present during that trial, and the
dependent variable is response time (RT). I want to transpose the
data to a form suitable for MANOVA such that there is one observation
per subject and RT is recoded across many variables, the totality of
which represents all the combinations of all the levels of all the
independent variables.

For example, exp1.r5 is my data frame, Subject is the grouping
variable, I want to go from long to wide, the dependent variable is
RT, and I want to get RT as observed at each combination of each
level of all the variables listed in the timevar argument:
exp1.r5.wide <- reshape(exp1.r5, idvar="Subject", direction = "wide",
sep = "", v.names="RT", timevar = c("CannonAngle", "CannonOriB",
"CannonOriR", "nRedPellets", "TargetColor", "tbearing"))

Instead what I get is Subject plus six variables that are some sort
of conjunctions of all the levels of each of the variables I want.
Instead what I want is Subject plus up to 1,008 variables encoding RT
at every combination of CannonAngle, CannonOriB, CannonOriR,
nRedPelelts, TargetColor, and tbearing that occurred. Is that
something the reshape function can give me or should I be looking
elsewhere?

Sincerely,

Frank Tamborello, PhD
W. M. Keck Postdoctoral Fellow
School of Health Information Sciences
University of Texas Health Science Center, Houston




[R] converting multiple lines of text to a data frame

2010-03-05 Thread Andrew Yee
I'm trying to find a way for converting multiple lines of text into a
table.  I'm not sure if there's a way where you can use read.delim()
to read in multiple lines of text and create the following data frame
with something akin to reshape()?  Apologies if there is an obvious
way to do this.

A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35

Convert the above lines into the following data frame

 A   B    C
 1   2   10
34  20  6.7
 2  78   35

Thanks,
Andrew



[R] transposing data

2010-03-05 Thread Frank Tamborello
Hi. I have repeated measures data of the form where each observation  
is a trial, and trials are grouped by subject, and variables encode  
whatever level of a factor was present during that trial, and the  
dependent variable is response time (RT). I want to transpose the  
data to a form suitable for MANOVA such that there is one observation  
per subject and RT is recoded across many variables, the totality of  
which represents all the combinations of all the levels of all the  
independent variables.

For example, exp1.r5 is my data frame, Subject is the grouping  
variable, I want to go from long to wide, the dependent variable is  
RT, and I want to get RT as observed at each combination of each  
level of all the variables listed in the timevar argument:
exp1.r5.wide <- reshape(exp1.r5, idvar="Subject", direction = "wide",  
sep = "", v.names="RT", timevar = c("CannonAngle", "CannonOriB",  
"CannonOriR", "nRedPellets", "TargetColor", "tbearing"))

Instead what I get is Subject plus six variables that are some sort  
of conjunctions of all the levels of each of the variables I want.  
Instead what I want is Subject plus up to 1,008 variables encoding RT  
at every combination of CannonAngle, CannonOriB, CannonOriR,  
nRedPelelts, TargetColor, and tbearing that occurred. Is that  
something the reshape function can give me or should I be looking  
elsewhere?

Sincerely,

Frank Tamborello, PhD
W. M. Keck Postdoctoral Fellow
School of Health Information Sciences
University of Texas Health Science Center, Houston




Re: [R] how to make this sequence: 1,2,3,4,5,4,3,2,1

2010-03-05 Thread kMan
c(x,(x<-1:5)[4:1])
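
Or, with the turning point as a parameter (same idea, untested):

n <- 5
c(seq(1, n), seq(n - 1, 1))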

-Original Message-
From: baptiste auguie [mailto:baptiste.aug...@googlemail.com] 
Sent: Friday, March 05, 2010 1:08 AM
To: kensuguro
Cc: r-help@r-project.org
Subject: Re: [R] how to make this sequence: 1,2,3,4,5,4,3,2,1

c(x <- 1:5, rev(x[-length(x)]))



On 5 March 2010 07:04, kensuguro  wrote:
>
> I'm just beginning R, with the book Using R for Introductory Statistics, 
> and one of the early questions has me baffled.  The question is, 
> create the
> sequence: 1,2,3,4,5,4,3,2,1 using seq() and rep().
>
> Now, as a programmer, I am punching myself to not be able to figure it
out..
> I mean, as simple as a for loop, but using seq, I am stumped.  I would 
> think c(1:5, 4:1) would be the brute force method with very non 
> intelligent coding..  there has to be a way to make the "turning 
> point" (in this case 5) parametric right?  So you could change it 
> later and the sequence will reflect it.
> --
> View this message in context: 
> http://n4.nabble.com/how-to-make-this-sequence-1-2-3-4-5-4-3-2-1-tp157
> 9245p1579245.html Sent from the R help mailing list archive at 
> Nabble.com.
>



--


Baptiste Auguié

Departamento de Química Física,
Universidade de Vigo,
Campus Universitario, 36310, Vigo, Spain

tel: +34 9868 18617
http://webs.uvigo.es/coloides



[R] memory error in for loop

2010-03-05 Thread Peter

hi,

I have been attempting to run this script and am getting some strange 
results. The script connects to a database and retrieves a series of 
tables, using sequential SQL statements. I have tested all of the SQL 
statements in the PostgreSQL terminal and they all return the desired 
results. I place each table into a list and run a FOR loop for 'i' in 
the list. The script generates the first map perfectly and begins to 
draw the second and then crashes in the middle of creating the second 
plot. I am suspicious that somehow I have a problem in my for loop that 
results in memory overload but I have been struggling to figure it out 
all day with no success. I have included below the for loop in the code 
and the beginning of the error message that is produced in the terminal. 
Any suggestions would be most welcome.


thanks in advance,

peter

script:

#create a list of results tables
matrixlist <- list(r_baseline, r_badlce, r_cblend, r_fedforest, r_ffv, 
r_hiencrop, r_loencrop, r_maxfeed) 
#par(mfcol= c(5,5))

for (i in matrixlist){
   
 ptable<-i[,-1]#create plot matrix w/o price point column


 ptable [is.na(ptable)] <- 0 #convert NA values to 0
   
 maxval<-max(ptable) #create volume axis

 interval<-ceiling(maxval)/27
 mgy<- seq(0,ceiling(maxval), 
by=interval) 
 fprice <- c(i$price_point)#create price point axis

 #pdf(paste(i,"fuel_pw.pdf", sep=""), bg="white")
 matplot(ptable, fprice, type="l", col= rainbow(length(names(i))))
 #rm(i,ptable, fprice, maxval, interval, mgy)
 dev.off()
}

and the resulting error:

+   fprice <- c(i$price_point)#create price point axis
+   #pdf(paste(i,"fuel_pw.pdf", sep=""), bg="white")
+   matplot(ptable, fprice, type="l", col= rainbow(length(names(i
+   rm(i,ptable, fprice, maxval, interval, mgy)
+   dev.off()
+ }
matplot: doing 17 plots with  col= ("#FFFF" "#FF5500FF"
"#FFAA00FF" "#00FF" "#AAFF00FF" "#55FF00FF" "#00FF00FF"
"#00FF55FF" "#00FFAAFF" "#00FF" "#00AA" "#0055"
"#" "#5500" "#AA00" "#FF00" "#FF00AAFF"
"#FF0055FF") pch= ("1" "2" "3" "4" "5" "6" "7" "8" "9" "0" "a" "b" "c"
"d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
"u" "v" "w" "x" "y" "z" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K"
"L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z") ...

matplot: doing 23 plots with  col= ("#FFFF" "#FF4000FF"
"#FF8000FF" "#FFBF00FF" "#00FF" "#BFFF00FF" "#80FF00FF"
"#40FF00FF" "#00FF00FF" "#00FF40FF" "#00FF80FF" "#00FFBFFF"
"#00FF" "#00BF" "#0080" "#0040" "#"
"#4000" "#8000" "#BF00" "#FF00" "#FF00BFFF"
"#FF0080FF" "#FF0040FF") pch= ("1" "2" "3" "4" "5" "6" "7" "8" "9" "0"
"a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
"r" "s" "t" "u" "v" "w" "x" "y" "z" "A" "B" "C" "D" "E" "F" "G" "H"
"I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y"
"Z") ...

Garbage collection 3 = 2+0+1 (level 0) ...
3.6 Mbytes of cons cells used (38%)
0.8 Mbytes of vectors used (13%)
*** glibc detected *** /usr/lib/R/bin/exec/R: double free or
corruption (!prev): 0x094f8c38 ***
=== Backtrace: =
/lib/i686/cmov/libc.so.6[0xb7486824]
/lib/i686/cmov/libc.so.6[0xb74880b3]
/lib/i686/cmov/libc.so.6(cfree+0x6d)[0xb748b0dd]
/usr/lib/libpq.so.5(PQclear+0xf6)[0xb6e53976]
/home/ptittmann/R/i486-pc-linux-gnu-library/2.10/RdbiPgSQL/libs/RdbiPgSQL.so(PgSQLclearResult+0x25)[0xb6fc2665]
/usr/lib/R/lib/libR.so(R_RunWeakRefFinalizer+0x161)[0xb76721f1]
/usr/lib/R/lib/libR.so[0xb7672357]
/usr/lib/R/lib/libR.so[0xb76743fe]
/usr/lib/R/lib/libR.so(Rf_allocVector+0xb8)[0xb7675628]
/usr/lib/R/lib/libR.so(Rf_usemethod+0x41d)[0xb768306d]



Re: [R] Three most useful R package

2010-03-05 Thread kMan
(1) - nlme, lattice, stats
(2) - a usable large-file/out-of-memory regression package that abstracts
"all" the details of connections etc. from the user, except perhaps the
initial function call, so I don't have to actually know anything about the
file I'm opening, how big it is, how many lines of data it has, or how much
data my system can load into memory at once without paging or crashing R, but
that will still give me parameter estimates for multiple categorical and
continuous predictors on a TB of data in less than half an hour, and can work
with something more interesting than a matrix.

Sincerely,
KeithC.

-Original Message-
From: Ralf B [mailto:ralf.bie...@gmail.com] 
Sent: Tuesday, March 02, 2010 1:14 PM
To: r-help@r-project.org
Subject: [R] Three most useful R package

Hi R-fans,

I would like put out a question to all R users on this list and hope it will
create some feedback and discussion.

1) What are your 3 most useful R package? and

2) What R package do you still miss and why do you think it would make a
useful addition?

Pulling answers together for these questions will serve as a guide for new
users and help people who just want to get a hint where to look first. Happy
replying!

Best,
Ralf



[R] Plot interaction in multilevel model

2010-03-05 Thread dadrivr

I am trying to plot an interaction in a multilevel model.  Here is some
sample data.  In the following example, it is longitudinal (i.e., repeated
measures), so the outcome, score (at each of the three time points), is
nested within the individual.  I am interested in the interaction between
gender and happiness predicting score.

id <- c(1,1,1,2,2,2,3,3,3)
age <- c(10,15,20,10,15,20,10,15,20)
gender <- c(1,1,1,0,0,0,1,1,1)
happiness <- c(50,30,25,70,65,80,70,40,60)
score <- c(180,140,110,240,220,280,150,140,130)
mydata <- data.frame(id,age,gender,happiness,score) 

I am looking to create two plots:

1. A plot with score on the y-axis, happiness on the x-axis, gender as the
moderating variable, and a linear best-fit line for each level of gender
(male & female).  Here is my attempt, but I don't know how to make it into
linear best-fit lines:
with(mydata,interaction.plot(happiness,gender,score))

2. A plot with score on the y-axis, age on the x-axis, and 4 different best
fit lines representing the following levels of gender and happiness (male hi
happy, male lo happy, female hi happy, female lo happy).  Here is my
attempt, but I don't know how to create the 4 different best-fit lines
representing the 4 different interaction levels:
with(mydata,interaction.plot(age,gender,score))
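
For the first plot, an untested lattice sketch (type = "r" adds a simple
least-squares line per gender group; this is descriptive only, not the fitted
multilevel model):

library(lattice)
xyplot(score ~ happiness, data = mydata, groups = factor(gender),
       type = c("p", "r"), auto.key = TRUE)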

Any ideas?  Any help would be greatly appreciated with these two plots. 
Thanks so much!
-- 
View this message in context: 
http://n4.nabble.com/Plot-interaction-in-multilevel-model-tp1580370p1580370.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing own simulation function in C

2010-03-05 Thread stephen sefick
Look in the sources of runif for the C code?

On Fri, Mar 5, 2010 at 3:54 PM, TheSavageSam  wrote:
>
> I am wishing to write my own random distribution simulation function using C
> programming language (for speed) via R. I am familiar with R programming but
> somewhat new to C programming. I was trying to understand "Writing R
> extensions" -guide and its part 6.16, but I found it hard to
> understand(http://cran.r-project.org/doc/manuals/R-exts.html#Standalone-Mathlib).
> I also tried to get familiar with the example codes, but it didn't make me
> any wiser.
>
> The biggest problem seems to be how to get(what code to write in C) random
> uniform numbers using Rmath. That seems to be too complicated to me at the
> moment. So if someone of you could give a small example and it would
> definitely help me a lot. All I wish to do first is to finally write(and
> understand) my own function similar to what you run in R Command line via
> command "runif".
>
> And all this I am hoping to do without recompiling my whole R. Instead of
> that I think that it is possible to use dyn.load("code.so") in R. (Yes, I
> use linux)
> --
> View this message in context: 
> http://n4.nabble.com/Writing-own-simulation-function-in-C-tp1580190p1580190.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



[R] Plotting Comparisons with Missing Data

2010-03-05 Thread Alastair

Hi, 

I'm new to R and I've run into a problem that I'm not really sure how to
express properly in the language. I've got a data table that I've read from
a file containing some simple information about the performance of 4
algorithms. The columns are the name of the algorithm, the problem instance
and the resulting score on that problem (if it wasn't solved I mark that
with NA).

solver instance result
A prob1 40
B prob1 NA
C prob1 39
D prob1 35
A prob2 100
B prob2 50
C prob2 NA
D prob2 NA
A prob3 75
B prob3 80
C prob3 60
D prob3 70
A prob4 80
B prob4 NA
C prob4 85
D prob4 75

I've managed to read in the data as follows:
data <- read.table("./test.txt", header = TRUE, colClasses =
c("factor","factor","numeric"), na.strings = c("NA"))
and I've got a nice barchart via lattice
library(lattice)
barchart(result ~ instance, group = solver, data = data)

What I want to try and calculate (and plot somehow) is 
a) What percentage of the instances each solver can solve
and b) What percentage of the instances a solver returns a better score than
solver A for that particular problem.

These don't seem like particularly ambitious requirements, but I still don't
really know where to start. Any pointers would be most appreciated.
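
A hypothetical starting point for (a) and (b), assuming the data frame is
called data and that a higher result is better (untested):

## (a) percentage of instances each solver solved
with(data, tapply(!is.na(result), solver, mean) * 100)

## (b) percentage of instances where each solver scores higher than solver A
wide <- reshape(data, idvar = "instance", timevar = "solver", direction = "wide")
colMeans(wide[paste("result.", c("B", "C", "D"), sep = "")] > wide$result.A,
         na.rm = TRUE) * 100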

Thanks,
Alastair
-- 
View this message in context: 
http://n4.nabble.com/Plotting-Comparisons-with-Missing-Data-tp1580334p1580334.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] test question, variance

2010-03-05 Thread Phil Spector
Contact your instructor, and let them know that you are 
having problems with an assignment.  Most instructors have

office hours, where you can discuss problems like this.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Fri, 5 Mar 2010, SquareAce wrote:



Can someone get me going in the right direction with this test question? We
have gone as far as t.test in our class, but not to power or anova yet. I
have pasted the question and a summary of the dataset "gpa".

5. Given that all the GPA data in this dataset is comprised of means, would
you expect the underlying dataset of students’ individual GPA to be more
variable or less variable than this dataset as measured by the variance?
Explain. (5 points)

here is the summary of gpa:

summary(gpa)
       school        GPA        gender  classification  semester 
 Business :32   Min.   :2.200    f:80    fresh :40       fa04:40  
 Education:32   1st Qu.:2.590    m:80    junior:40       fa05:40  
 Liberal  :32   Median :2.790            senior:40       sp05:40  
 Nursing  :32   Mean   :2.775            soph  :40       su05:40  
 Sciences :32   3rd Qu.:2.922                                     
                Max.   :3.640
--

View this message in context: 
http://n4.nabble.com/test-question-variance-tp1580064p1580064.html
Sent from the R help mailing list archive at Nabble.com.



[R] scientific (statistical) foundation for Y-RANDOMIZATION in regression analysis

2010-03-05 Thread Damjan Krstajic

Dear all,

I am a statistician doing research in QSAR, building regression models where 
the dependent variable is a numerical expression of some chemical activity and 
input variables are chemical descriptors, e.g. molecular weight, number of 
carbon atoms, etc.

I am building regression models and I am confronted with a widely used technique 
called Y-RANDOMIZATION for which I have difficulties in finding references in 
general statistical literature regarding regression analysis. I would be 
grateful if someone could point me to papers/literature in statistical 
regression analysis which give scientific (statistical) foundation for using 
Y-RANDOMIZATION.

Y-RANDOMIZATION is a widely used technique in the QSAR community to ensure the 
robustness of a QSPR (regression) model. It is used after the "best" regression 
model is selected and to make sure that there are no chance correlations. Here 
is a short description. The dependent variable vector (Y-vector) is randomly 
shuffled and a new QSPR (regression) model is fitted using the original 
independent variable matrix. By repeating this a number of times, say 100 
times, one will get hundred R2 and q2 (leave one out cross-validation R2) based 
on hundred shuffled Y. It is expected that the resulting regression models 
should generally have low R2 and low q2 values. However, if the majority of 
hundred regression models obtained in the Y-randomization have relatively high 
R2 and high q2 then it implies that an acceptable regression model cannot be 
obtained for the given data set by the current modelling method.

I cannot find any references to Y-randomization or Y-scrambling anywhere in the 
literature outside chemometrics/QSAR. Any links or references would be much 
appreciated.

Thanks in advance.

DK
--
Damjan Krstajic
Director
Research Centre for Cheminformatics
Belgrade, Serbia

--

  


[R] Bootstrap standard errors

2010-03-05 Thread Rman

I have this coding to help me work out the bootstrap standard errord for the
following array/matrix
dat <-
matrix(c(8,23,14,13,9,11,25,25,14,11,4,25,19,7,13,1,8,16,2,6,3,4,6,5,7),
nrow = 5, ncol=5, byrow=TRUE)

the function is:

mybootstrap <- function(dat, nr, nc, m)
{
  # dat is a two-dimensional array
  # nr is the number of rows in the table
  # nc is the number of columns in the table
  # m is the number of bootstrap simulations
  x <- dat[,1]
  # Tries scored combinations (0-0, 0-1,...)
  y <- dat[,2]
  # Frequencies
  n <- sum(y)
  r <- nr * nc
  v <- matrix(0., r, m)
  prob <- y / sum(y)
  z <- vector(mode = "numeric", length = r)
  for (k in 1.:m){
y <-sample(x, n, replace = T, prob)
for (i in 1.:r){
  z[i] <- length(y[y == x[i]])
}
w <- matrix(0., nr, nc)
for(i in 1:nr){
  w[i, 1:nc] <- z[(nc * (i - 1) + 1) : (nc * i)]
}
rsum <- apply(w, 1., sum)
csum <- apply(w, 2., sum)
u <- matrix(0., nr, nc)
for(i in 1.:nr){
  for (j in 1.:nc){
if(w[i,j] > 0.)
  u[i,j] <- (w[i,j] * 100.* n) / (rsum[i] * csum[j])
  }
}
a <- u[1, ]
for(i in 2:nr){
  a <- c(a, u[i, ])
}
v[, k] <- a
  }
  stderr <- apply (v, 1., var)
  return(sqrt(stderr))
}

mybootstrap(dat, 5, 5, 500)

there are no errors as such, but the numbers produced are too small,
incorrect, and are reproduced. I am expecting numbers such as 18 or 21, not
0.4.

can anybody help me please.

thanks

-- 
View this message in context: 
http://n4.nabble.com/Bootstrap-standard-errors-tp1580291p1580291.html
Sent from the R help mailing list archive at Nabble.com.



[R] test question, variance

2010-03-05 Thread SquareAce

Can someone get me going in the right direction with this test question? We
have gone as far as t.test in our class, but not to power or anova yet. I
have pasted the question and a summary of the dataset "gpa".

5. Given that all the GPA data in this dataset is comprised of means, would
you expect the underlying dataset of students’ individual GPA to be more
variable or less variable than this dataset as measured by the variance?
Explain. (5 points)

here is the summary of gpa:
> summary(gpa)
       school        GPA        gender  classification  semester 
 Business :32   Min.   :2.200    f:80    fresh :40       fa04:40  
 Education:32   1st Qu.:2.590    m:80    junior:40       fa05:40  
 Liberal  :32   Median :2.790            senior:40       sp05:40  
 Nursing  :32   Mean   :2.775            soph  :40       su05:40  
 Sciences :32   3rd Qu.:2.922                                     
                Max.   :3.640
-- 
View this message in context: 
http://n4.nabble.com/test-question-variance-tp1580064p1580064.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to parse the arguments from a function call and evaluate them in a dataframe?

2010-03-05 Thread Thomas Lumley

On Fri, 5 Mar 2010, Ravi Varadhan wrote:


Hi,

I would like to write a function which has the following syntax:

myfn <- function(formula, ftime, fstatus, data) {
# step 1: obtain terms in `formula' from dataframe `data'
# step 2: obtain ftime from `data'
# step 3:  obtain fstatus from `data'
# step 4: do model estimation
# step 5: return results
}

The user would call this function as:

myfn(formula=myform, ftime=myftime, fstatus=myfstatus, data=mydata)

Where `myform' is a formula object; and the terms in `myform', and the 
variables `myftime' and `myfstatus' should be obtained from the dataframe 
`mydata'.

I am getting tripped up in trying to figure out how to do the seemingly simple 
steps of 1, 2, and 3.



I don't think steps 2 and 3 are a good idea as written -- I would strongly 
advocate that all variables to be looked up in the data frame should be 
supplied as formulas so that it is clear they are not being evaluated according 
to the usual rules.  Many existing functions work the way you suggest, but I 
still think it's unclear and makes it harder to use them in programs.

Having said that, you can use

    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "ftime", "fstatus"), names(mf), 0)
    mf <- mf[c(1, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1]] <- as.name("model.frame")
    mf <- eval.parent(mf)

to create a model frame that will contain the variables in the formula, and 
columns `(ftime)` and `(fstatus)` for the other arguments.
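
Continuing that sketch, the pieces can then be pulled out of the model frame
(the extra arguments end up as columns literally named "(ftime)" and
"(fstatus)"):

    mt <- attr(mf, "terms")
    X  <- model.matrix(mt, mf)       # design matrix from `formula'
    ftime   <- mf[["(ftime)"]]
    fstatus <- mf[["(fstatus)"]]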


If you use formulas for ftime and fstatus you would have to call model.frame() 
multiple times, which is a bit more work. You would also need to use 
na.action=na.pass() to let through any missing data and then remove missing 
data after you have all three variables.


  -thomas

Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle



Re: [R] How to assign week numbers to a time-series

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 4:46 PM, David Winsemius wrote:



On Mar 5, 2010, at 4:23 PM, Hosack, Michael wrote:


Hello everyone,

My progress has stalled on finding a way of creating a somewhat  
complicated variable to add to my existing dataframe and I am  
hoping one of you could help me out. The dataframe below contains  
only a fraction of the data of my complete dataframe, but all of  
the variables. What I want to do is add another variable named  
'WEEK' to this dataframe that is assigned 1 for row 1 and remains 1  
until the first SAT (i.e. Saturday) under variable 'DOW' (day of  
week) occurs, at which point variable 'WEEK' is now assigned 2.  
'WEEK' should continue to be assigned 2 until the following SAT  
under variable 'DOW' at which variable 'WEEK' will now be assigned  
3, and so on. In this scheme, weekdays are such that SAT=1, SUN=2,  
MON=3, ..., FRI=7. I am basically trying to assign week numbers to  
potential sampling days in a survey season for use in a program  
that will generate a fisheries creel survey schedule. I should note  
that if element 1 happens to have DOW=SAT (that is the case this  
year, since
the first day of our survey 05/01 is a Saturday), then WEEK 1  
begins on day 1 (05/01/2010) and WEEK 2 will begin on the first SAT  
under variable DOW. I hope I explained this clearly enough, if not  
let me know. If this was sent twice, I apologize.


Mike

       MM DD   YR DOW DOW. DTYPE  TOD TOD. SITENUM        DESC
1      05 01 2010 SAT    1     2 MORN    1     101   WALNUT.CK
185    05 01 2010 SAT    1     2 MORN    1     102       LAMPE
369    05 01 2010 SAT    1     2 MORN    1     103    EAST.AVE
553    05 01 2010 SAT    1     2 MORN    1     104  NORTH.EAST
737    05 01 2010 SAT    1     2 AFTN    2     101   WALNUT.CK
921    05 01 2010 SAT    1     2 AFTN    2     102       LAMPE
1105   05 01 2010 SAT    1     2 AFTN    2     103    EAST.AVE
1289   05 01 2010 SAT    1     2 AFTN    2     104  NORTH.EAST
2      05 02 2010 SUN    2     2 MORN    1     101   WALNUT.CK
186    05 02 2010 SUN    2     2 MORN    1     102       LAMPE
370    05 02 2010 SUN    2     2 MORN    1     103    EAST.AVE
554    05 02 2010 SUN    2     2 MORN    1     104  NORTH.EAST
738    05 02 2010 SUN    2     2 AFTN    2     101   WALNUT.CK
922    05 02 2010 SUN    2     2 AFTN    2     102       LAMPE
1106   05 02 2010 SUN    2     2 AFTN    2     103    EAST.AVE
1290   05 02 2010 SUN    2     2 AFTN    2     104  NORTH.EAST
3      05 03 2010 MON    3     1 MORN    1     101   WALNUT.CK
187    05 03 2010 MON    3     1 MORN    1     102       LAMPE
371    05 03 2010 MON    3     1 MORN    1     103    EAST.AVE
555    05 03 2010 MON    3     1 MORN    1     104  NORTH.EAST
739    05 03 2010 MON    3     1 AFTN    2     101   WALNUT.CK
923    05 03 2010 MON    3     1 AFTN    2     102       LAMPE
1107   05 03 2010 MON    3     1 AFTN    2     103    EAST.AVE
1291   05 03 2010 MON    3     1 AFTN    2     104  NORTH.EAST
4      05 04 2010 TUE    4     1 MORN    1     101   WALNUT.CK
188    05 04 2010 TUE    4     1 MORN    1     102       LAMPE
372    05 04 2010 TUE    4     1 MORN    1     103    EAST.AVE
...    .. ..  ... ...    .     .  ...    .     ...         ...


You could trunc() the results of this function applied to your dates  
and "2010-05-01":


> diffweek <- function(x,y) {difft <- difftime(x, y)/7; attr(difft, "units") <- "weeks"; difft}

> diffweek(Sys.Date() , as.Date("2010-01-01") )
Time difference of 9 weeks
> diffweek(Sys.Date()+1 , as.Date("2010-01-01") )
Time difference of 9.142857 weeks

There is also a week function in the tis package.

Perhaps (untested):

dfrm$weeknum <- trunc(apply(dfrm, 1, function(x)
                      diffweek(as.Date(x[4], x[2], x[3], sep="-"),
                               as.Date("2010-05-01"))))



Well, testing shows that it fails. I didn't get an apply() based  
solution to work and needed to vectorize diffweek:


diffweekV <- Vectorize(diffweek)
seqwk <-  with(dfrm , as.Date(paste(YR, MM, DD, sep="-")) )
dfrm$weeknum <- 1+ trunc(diffweekV(seqwk,  as.Date("2010-05-01")   ) )

(Which looks clearer anyway.)
I did throw a few other values at the function to see if it had  
sensible return values.



[R] How to parse the arguments from a function call and evaluate them in a dataframe?

2010-03-05 Thread Ravi Varadhan
Hi,

I would like to write a function which has the following syntax:

myfn <- function(formula, ftime, fstatus, data) {
# step 1: obtain terms in `formula' from dataframe `data'
# step 2: obtain ftime from `data'
# step 3:  obtain fstatus from `data'
# step 4: do model estimation
# step 5: return results
}

The user would call this function as:

myfn(formula=myform, ftime=myftime, fstatus=myfstatus, data=mydata)

Where `myform' is a formula object; and the terms in `myform', and the 
variables `myftime' and `myfstatus' should be obtained from the dataframe 
`mydata'.

I am getting tripped up in trying to figure out how to do the seemingly simple 
steps of 1, 2, and 3.  

I looked at the code for `lm', `coxph', `nls' etc, but they are too complicated 
for my understanding.  Is there a simple way to accomplish this?  

Thanks very much,
Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu



Re: [R] How to assign week numbers to a time-series

2010-03-05 Thread Achim Zeileis

On Fri, 5 Mar 2010, Hosack, Michael wrote:


Hello everyone,

My progress has stalled on finding a way of creating a somewhat complicated 
variable to add to my existing dataframe and I am hoping one of you could help 
me out. The dataframe below contains only a fraction of the data of my complete 
dataframe, but all of the variables. What I want to do is add another variable 
named 'WEEK' to this dataframe that is assigned 1 for row 1 and remains 1 until 
the first SAT (i.e. Saturday) under variable 'DOW' (day of week) occurs, at 
which point variable 'WEEK' is now assigned 2. 'WEEK' should continue to be 
assigned 2 until the following SAT under variable 'DOW' at which variable 
'WEEK' will now be assigned 3, and so on. In this scheme, weekdays are such 
that SAT=1, SUN=2, MON=3, ..., FRI=7. I am basically trying to assign week 
numbers to potential sampling days in a survey season for use in a program that 
will generate a fisheries creel survey schedule. I should note that if element 
1 happens to have DOW=SAT (that is the case this year, since
the first day of our survey 05/01 is a Saturday), then WEEK 1 begins on day 1 
(05/01/2010) and WEEK 2 will begin on the first SAT under variable DOW. I hope 
I explained this clearly enough, if not let me know.


If I understand correctly, then you can easily do the following:
  - create a "Date" object with YR-MM-DD
  - compute the number of days since origin (2010-04-30 in your case)
  - compute integer division by 7 plus 1

In one step:

  as.numeric(as.Date(paste(YR, MM, DD, sep = "-")) -
    as.Date("2010-05-01")) %/% 7 + 1

BTW: Some of the rest of your information (like DOW) could also be easily 
inferred from the "Date", e.g., via transformation to POSIXlt:


  as.POSIXlt(as.Date("2010-05-01"))$wday
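
A quick check on a toy data frame (column names as in the posted data frame;
note that subtracting the first survey day itself keeps Fri 2010-05-07 in
week 1):

  dfrm <- data.frame(YR = 2010, MM = 5, DD = c(1, 7, 8, 14, 15))
  dfrm$WEEK <- as.numeric(as.Date(with(dfrm, paste(YR, MM, DD, sep = "-"))) -
                          as.Date("2010-05-01")) %/% 7 + 1
  dfrm$WEEK
  ## [1] 1 1 2 2 3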

hth,
Z


Mike

        MM DD   YR DOW DOW. DTYPE  TOD TOD. SITENUM        DESC
1       05 01 2010 SAT    1     2 MORN    1     101   WALNUT.CK
185     05 01 2010 SAT    1     2 MORN    1     102       LAMPE
369     05 01 2010 SAT    1     2 MORN    1     103    EAST.AVE
553     05 01 2010 SAT    1     2 MORN    1     104  NORTH.EAST
737     05 01 2010 SAT    1     2 AFTN    2     101   WALNUT.CK
921     05 01 2010 SAT    1     2 AFTN    2     102       LAMPE
1105    05 01 2010 SAT    1     2 AFTN    2     103    EAST.AVE
1289    05 01 2010 SAT    1     2 AFTN    2     104  NORTH.EAST
2       05 02 2010 SUN    2     2 MORN    1     101   WALNUT.CK
186     05 02 2010 SUN    2     2 MORN    1     102       LAMPE
370     05 02 2010 SUN    2     2 MORN    1     103    EAST.AVE
554     05 02 2010 SUN    2     2 MORN    1     104  NORTH.EAST
738     05 02 2010 SUN    2     2 AFTN    2     101   WALNUT.CK
922     05 02 2010 SUN    2     2 AFTN    2     102       LAMPE
1106    05 02 2010 SUN    2     2 AFTN    2     103    EAST.AVE
1290    05 02 2010 SUN    2     2 AFTN    2     104  NORTH.EAST
3       05 03 2010 MON    3     1 MORN    1     101   WALNUT.CK
187     05 03 2010 MON    3     1 MORN    1     102       LAMPE
371     05 03 2010 MON    3     1 MORN    1     103    EAST.AVE
555     05 03 2010 MON    3     1 MORN    1     104  NORTH.EAST
739     05 03 2010 MON    3     1 AFTN    2     101   WALNUT.CK
923     05 03 2010 MON    3     1 AFTN    2     102       LAMPE
1107    05 03 2010 MON    3     1 AFTN    2     103    EAST.AVE
1291    05 03 2010 MON    3     1 AFTN    2     104  NORTH.EAST
4       05 04 2010 TUE    4     1 MORN    1     101   WALNUT.CK
188     05 04 2010 TUE    4     1 MORN    1     102       LAMPE
372     05 04 2010 TUE    4     1 MORN    1     103    EAST.AVE
...     .. ..   .. ...    .     .  ...    .     ...         ...

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R algorithm for maximum curvatures measures of nonlinear models

2010-03-05 Thread Walmes Marques Zeviani

Hi all,

I'm looking for the algorithm to calculate maximum intrinsic and parameter 
effects curvature of Bates & Watts (1980). I have Bates & Watts (1980) original 
article, Bates et al (1983) article, Seber & Wild (1989) and Ratkowsky (1983) 
books. Those sources show the steps and algorithm to get these measures, but I 
don't know how to translate the C code into R, and I've had no success until now.

I know and I use rms.curv() in MASS package but I would like the maximum 
curvatures measures.

Does someone have this implemented in R? Or know some paper illustrating this 
calculation with real data (but not so trivial) that can able me to create my 
functions?

Thanks a lot.
Walmes Zeviani.

Bates; Hamilton; Watts. Calculation of intrinsic and parameter-effects 
curvatures for nonlinear models. Communications in statistics - simulations and 
computation, 469-477, 1983.
Bates; Watts. Relative curvature measures of nonlinearity. Journal Royal 
Statistical Society - B. 40, 1-25, 1980.
Seber; Wild. Nonlinear regression. Wiley, 1989.
Ratkowsky. Nonlinear regression modeling. Marcel Dekker, 1983.
  
_



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Redhat Linux Install

2010-03-05 Thread Ryan Garner

I just installed R on Redhat Linux at work for the first time and have two
questions.

1. I tried to install R to have png and cairo capabilities and was
unsuccessful. Before running make, I ran ./configure --with-libpng=yes
--with-x=no --with-cairo=yes --with-readline-yes . R installed fine, but
when I run R and type capabilities()
> capabilities()
     jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets 
     TRUE     TRUE     TRUE     TRUE    FALSE    FALSE     TRUE     TRUE 
   libxml     fifo   cledit    iconv      NLS  profmem    cairo 
     TRUE    FALSE     TRUE     TRUE     TRUE     TRUE    FALSE 

Why are png and cairo still FALSE?

2. I would also like to have X11 enabled. From reading the message board,
the consensus seems to be to install xorg-dev. I'm unable to do this because
I don't have root or super user privileges. But if I'm able to log into my
work servers with PuTTY and Xming and run xemacs or xvim, does this mean
that X11 is already installed somewhere? If so, how do I specify this when
doing ./configure?

-- 
View this message in context: 
http://n4.nabble.com/Redhat-Linux-Install-tp1580181p1580181.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to assign week numbers to a time-series

2010-03-05 Thread Hosack, Michael
Hello everyone,

My progress has stalled on finding a way of creating a somewhat complicated 
variable to add to my existing dataframe and I am hoping one of you could help 
me out. The dataframe below contains only a fraction of the data of my complete 
dataframe, but all of the variables. What I want to do is add another variable 
named 'WEEK' to this dataframe that is assigned 1 for row 1 and remains 1 until 
the first SAT (i.e. Saturday) under variable 'DOW' (day of week) occurs, at 
which point variable 'WEEK' is now assigned 2. 'WEEK' should continue to be 
assigned 2 until the following SAT under variable 'DOW' at which variable 
'WEEK' will now be assigned 3, and so on. In this scheme, weekdays are such 
that SAT=1, SUN=2, MON=3, ..., FRI=7. I am basically trying to assign week 
numbers to potential sampling days in a survey season for use in a program that 
will generate a fisheries creel survey schedule. I should note that if element 
1 happens to have DOW=SAT (that is the case this year, since the first day of 
our survey 05/01 is a Saturday), then WEEK 1 begins on day 1 
(05/01/2010) and WEEK 2 will begin on the first SAT under variable DOW. I hope 
I explained this clearly enough, if not let me know.

Mike

        MM DD   YR DOW DOW. DTYPE  TOD TOD. SITENUM        DESC
1       05 01 2010 SAT    1     2 MORN    1     101   WALNUT.CK
185     05 01 2010 SAT    1     2 MORN    1     102       LAMPE
369     05 01 2010 SAT    1     2 MORN    1     103    EAST.AVE
553     05 01 2010 SAT    1     2 MORN    1     104  NORTH.EAST
737     05 01 2010 SAT    1     2 AFTN    2     101   WALNUT.CK
921     05 01 2010 SAT    1     2 AFTN    2     102       LAMPE
1105    05 01 2010 SAT    1     2 AFTN    2     103    EAST.AVE
1289    05 01 2010 SAT    1     2 AFTN    2     104  NORTH.EAST
2       05 02 2010 SUN    2     2 MORN    1     101   WALNUT.CK
186     05 02 2010 SUN    2     2 MORN    1     102       LAMPE
370     05 02 2010 SUN    2     2 MORN    1     103    EAST.AVE
554     05 02 2010 SUN    2     2 MORN    1     104  NORTH.EAST
738     05 02 2010 SUN    2     2 AFTN    2     101   WALNUT.CK
922     05 02 2010 SUN    2     2 AFTN    2     102       LAMPE
1106    05 02 2010 SUN    2     2 AFTN    2     103    EAST.AVE
1290    05 02 2010 SUN    2     2 AFTN    2     104  NORTH.EAST
3       05 03 2010 MON    3     1 MORN    1     101   WALNUT.CK
187     05 03 2010 MON    3     1 MORN    1     102       LAMPE
371     05 03 2010 MON    3     1 MORN    1     103    EAST.AVE
555     05 03 2010 MON    3     1 MORN    1     104  NORTH.EAST
739     05 03 2010 MON    3     1 AFTN    2     101   WALNUT.CK
923     05 03 2010 MON    3     1 AFTN    2     102       LAMPE
1107    05 03 2010 MON    3     1 AFTN    2     103    EAST.AVE
1291    05 03 2010 MON    3     1 AFTN    2     104  NORTH.EAST
4       05 04 2010 TUE    4     1 MORN    1     101   WALNUT.CK
188     05 04 2010 TUE    4     1 MORN    1     102       LAMPE
372     05 04 2010 TUE    4     1 MORN    1     103    EAST.AVE
...     .. ..   .. ...    .     .  ...    .     ...         ...

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Writing own simulation function in C

2010-03-05 Thread TheSavageSam

I would like to write my own random distribution simulation function in the C
programming language (for speed) and call it from R. I am familiar with R
programming but somewhat new to C programming. I was trying to understand the
"Writing R Extensions" guide, section 6.16, but I found it hard to understand
(http://cran.r-project.org/doc/manuals/R-exts.html#Standalone-Mathlib).
I also tried to get familiar with the example codes, but it didn't make me
any wiser.

The biggest problem seems to be how to get (what code to write in C) random
uniform numbers using Rmath. That seems to be too complicated to me at the
moment. So if someone of you could give a small example and it would
definitely help me a lot. All I wish to do first is to finally write(and
understand) my own function similar to what you run in R Command line via
command "runif".

And all this I am hoping to do without recompiling my whole R. Instead of
that I think that it is possible to use dyn.load("code.so") in R. (Yes, I
use linux)
-- 
View this message in context: 
http://n4.nabble.com/Writing-own-simulation-function-in-C-tp1580190p1580190.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave and optional document sections

2010-03-05 Thread Aleksey Naumov
Max,

Thank you very much! I'll give LaTeX a go (haven't used it in a while). Is
there a recommended distribution of LaTeX for Windows?

Aleksey

On Fri, Mar 5, 2010 at 1:50 PM, Max Kuhn  wrote:

> That isn't hard to do with Sweave.
>
> Within your if branch, you can have your code write out any markup you
> choose (including \section). For figures, I would suggest writing the
> image file "manually" (i.e. with the code chunk option fig = FALSE) and
> also write out the \includegraphics statements too.
>
> With odfWeave, it is more difficult. Adding sections isn't obvious to
> me, although odfCat might be able to do it. Figures are more
> problematic. The way odfWeave works right now, whenever you use fig =
> TRUE in the chunk, odfWeave sets up the figure environment in the
> document (whether you produce an actual image or not). My recollection
> was that there wasn't any way around this, but it's been a thorn in my
> side for a while.
>
> I suggest using LaTeX in the short term to do this.
>
> Max
>
>
>
> On Fri, Mar 5, 2010 at 12:55 PM, Aleksey Naumov  wrote:
> > Dear R and Sweave users,
> >
> > Is there a way to have optional sections in a Sweave-generated report
> > document, complete with section header(s), text and code chunks? In other
> > words, I'd like for my report to include or omit certain sections based
> on
> > the data itself.
> >
> > For example, If I examine the input dataset early on in the report and
> set a
> > variable has_daily_data = TRUE, then I'd like to have a section that
> looks
> > at the daily data in detail, with subsections, text and code chunks. What
> > facilities do I have in Sweave (or odfWeave) for that?
> >
> > Perhaps, for code chunks, I could wrap all the code inside in
> > if(has_daily_data) {...}? What about inserting conditional text and
> > sections? Would it make sense to split the document into subdocuments and
> > somehow include the optional subdocuments only if needed? How would I go
> > about that?
> >
> > Would appreciate any pointers. Thank you,
> > Aleksey
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
>
> Max
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Update RMySQL and ... it's no more running

2010-03-05 Thread bernese

Yep this worked. I encountered this problem only after installing the ODBC
connector plugin. Not sure if there's any correlation.  I installed the
MySQL 5.1.33 binary from 
http://biostat.mc.vanderbilt.edu/wiki/Main/RMySQL 

and that fixed 'RMySQL was compiled with MySQL 5.0.67 but loading MySQL
5.1.44', and stopped R from crashing.
-- 
View this message in context: 
http://n4.nabble.com/Update-RMySQL-and-it-s-no-more-running-tp1561401p1580116.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] convert rpart tree to phylo (in ape)

2010-03-05 Thread Chris Hane
Hello,

Has anyone written a conversion from rpart tree class to phylo or hclust
trees?  The plot.phylo method allows much more readable layouts.

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to make this sequence: 1,2,3,4,5,4,3,2,1

2010-03-05 Thread j verzani
kensuguro  gmail.com> writes:

> 
> 
> so basically, it's impossible to do with just seq() and rep()..  Doesn't seem
> like a good question for chapter 1...
> 
> Also, problem 1.13 is even more crazy..  it asks you to build the fibonacci
> sequence.  Now I'm a programmer, and so went way ahead in the book to see
> how functions were written, and just wrote my own function, but again, in
> chapter 1?  (only covered c(), seq(), and rep() at this point)  What away
> kick start a book.  Is "Using R project for Introductory Statistics" known
> to jump around or have out of sequence practice problems?  Otherwise it
> seems to be written very well.


Sorry about that one. On the errata page I have:

page 16, exercise 1.12 #5
This one is most easily done using c() and the sequence operator :. (Please
ignore request to use just seq and rep.) 

As for 1.13, I just wanted someone to use c() for that one. I don't even define
the Fibonacci sequence, I only wanted some context to a sequence of numbers that
is not an arithmetic progression.
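
For readers following along, the intended answers are presumably along these
lines:

c(1:5, 4:1)                    # 1 2 3 4 5 4 3 2 1
c(1, 1, 2, 3, 5, 8, 13, 21)    # a few Fibonacci numbers, entered directly with c()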

Not sure if the rest of the book "jumps around", but if it seems to to you, I'm
very open to comments about improvements. Just email.

John

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Equation for model generated by auto.arima

2010-03-05 Thread Stephan Kolassa

Hi,

In that case, I'd recommend reading a good book on time series analysis. 
"Forecasting: Methods and Applications" by Makridakis, Wheelwright and 
Hyndman is very accessible. Alternatively, there are probably tons of 
webpages on ARIMA, so google around.


Best,
Stephan

testuser schrieb:

Thanks for the reply, Stephan. I don't want to use R to predict the future
value. I am looking to write the logic in a programming language like Java
to predict future values using the model coefficients generated by R. For
this, I would like to know what formula to use to estimate the value at any
time t. I looked at the forecast package but cannot find how to calculate
the value. Any help is appreciated.

Thanks


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to assign week numbers to a time-series

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 4:23 PM, Hosack, Michael wrote:


Hello everyone,

My progress has stalled on finding a way of creating a somewhat  
complicated variable to add to my existing dataframe and I am hoping  
one of you could help me out. The dataframe below contains only a  
fraction of the data of my complete dataframe, but all of the  
variables. What I want to do is add another variable named 'WEEK' to  
this dataframe that is assigned 1 for row 1 and remains 1 until the  
first SAT (i.e. Saturday) under variable 'DOW' (day of week) occurs,  
at which point variable 'WEEK' is now assigned 2. 'WEEK' should  
continue to be assigned 2 until the following SAT under variable  
'DOW' at which variable 'WEEK' will now be assigned 3, and so on. In  
this scheme, weekdays are such that SAT=1, SUN=2, MON=3, ..., FRI=7.  
I am basically trying to assign week numbers to potential sampling  
days in a survey season for use in a program that will generate a  
fisheries creel survey schedule. I should note that if element 1  
happens to have DOW=SAT (that is the case this year, since the first day of  
our survey 05/01 is a Saturday), then WEEK 1  
begins on day 1 (05/01/2010) and WEEK 2 will begin on the first SAT  
under variable DOW. I hope I explained this clearly enough, if not  
let me know. If this sent twice, I apologize.


Mike

        MM DD   YR DOW DOW. DTYPE  TOD TOD. SITENUM        DESC
1       05 01 2010 SAT    1     2 MORN    1     101   WALNUT.CK
185     05 01 2010 SAT    1     2 MORN    1     102       LAMPE
369     05 01 2010 SAT    1     2 MORN    1     103    EAST.AVE
553     05 01 2010 SAT    1     2 MORN    1     104  NORTH.EAST
737     05 01 2010 SAT    1     2 AFTN    2     101   WALNUT.CK
921     05 01 2010 SAT    1     2 AFTN    2     102       LAMPE
1105    05 01 2010 SAT    1     2 AFTN    2     103    EAST.AVE
1289    05 01 2010 SAT    1     2 AFTN    2     104  NORTH.EAST
2       05 02 2010 SUN    2     2 MORN    1     101   WALNUT.CK
186     05 02 2010 SUN    2     2 MORN    1     102       LAMPE
370     05 02 2010 SUN    2     2 MORN    1     103    EAST.AVE
554     05 02 2010 SUN    2     2 MORN    1     104  NORTH.EAST
738     05 02 2010 SUN    2     2 AFTN    2     101   WALNUT.CK
922     05 02 2010 SUN    2     2 AFTN    2     102       LAMPE
1106    05 02 2010 SUN    2     2 AFTN    2     103    EAST.AVE
1290    05 02 2010 SUN    2     2 AFTN    2     104  NORTH.EAST
3       05 03 2010 MON    3     1 MORN    1     101   WALNUT.CK
187     05 03 2010 MON    3     1 MORN    1     102       LAMPE
371     05 03 2010 MON    3     1 MORN    1     103    EAST.AVE
555     05 03 2010 MON    3     1 MORN    1     104  NORTH.EAST
739     05 03 2010 MON    3     1 AFTN    2     101   WALNUT.CK
923     05 03 2010 MON    3     1 AFTN    2     102       LAMPE
1107    05 03 2010 MON    3     1 AFTN    2     103    EAST.AVE
1291    05 03 2010 MON    3     1 AFTN    2     104  NORTH.EAST
4       05 04 2010 TUE    4     1 MORN    1     101   WALNUT.CK
188     05 04 2010 TUE    4     1 MORN    1     102       LAMPE
372     05 04 2010 TUE    4     1 MORN    1     103    EAST.AVE
...     .. ..   .. ...    .     .  ...    .     ...         ...


You could trunc() the results of this function applied to your dates  
and "2010-05-01":


> diffweek <- function(x,y) {difft <- difftime( x , y)/7; attr(difft,  
"units") <- "weeks"; difft}

> diffweek(Sys.Date() , as.Date("2010-01-01") )
Time difference of 9 weeks
> diffweek(Sys.Date()+1 , as.Date("2010-01-01") )
Time difference of 9.142857 weeks

There is also a week function in the tis package.

Perhaps (untested):

dfrm$weeknum <- trunc(apply(dfrm, 1, function(x)  
diffweek(as.Date(x[4], x[2], x[3], sep="-") ,
 
as.Date("2010-05-01")

   )
  ) )


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Dylan Beaudette
On Friday 05 March 2010, Ted Harding wrote:
> Again, I have the same attitude as Gabor. I don't need to know the
> real identity of a poster, if they ask a sensible question (which
> may well be elementary).
>
> As a contribution to the discussion, may I point out that many people
> use different email addresses for different purposes, so that any
> traffic involving a particular email address gets dropped into a
> particular folder. If you are running a private server you can pick
> whatever you like. If, like Blue Sky, you are on gmail, you will
> need to pick an identity that is not already taken. So perhaps
> Blue Sky (whoever he/she is) may be using bluesky...@gmail.com
> to encapsulate their R-help correspondence.
>
> Or perhaps not; but other people do (e.g. jluo.rh...@gmail.com,
> amitrh...@yahoo.co.uk, rhelp...@gmail.com -- for the first two
> we do have a name though no affiliation; for the latter, neither
> name nor affiliation). It doesn't matter much, in my view.
>
> Some of the opinions expressed seem to interpret this kind of
> "anonymity" as a smoke-screen for students who are trying to get
> their homework done, to hide from their teachers. This no doubt
> happens, but that should not justify a knee-jerk assumption that
> it is the case.
>
> And I do find it a bit distasteful to see people being beaten about
> the head with the posting guide on slight pretexts. I hope we can
> be more relaxed.
>
> Ted.

It seems that the real problem has more to do with a user (I think it has now 
been stated several times who this is) flooding the list with questions posed 
without attention to the posting guide, AND replying to those who are trying 
to help with antagonistic and belligerent words. Then, when that user is 
sanctioned, he/she hides behind an anon email account. 

I have no problem helping someone with an anon email account- I do have a 
problem when people like (who I am referring to) waste other people's time 
and bandwidth.

Probably the best thing to do-- identify those who are acting this way, ask 
politely, then sanction, then ignore.

Cheers,
Dylan

> On 05-Mar-10 18:47:31, Matthew Dowle wrote:
> > John,
> >
> > So you want BlueSky to change their name to "Paul Smith" at
> > "New York  University", just to give a totally random, false
> > name, example,  and then you will be happy ?  I just picked a
> > popular, real name at a real, big place. Are you, or is anyone
> > else,  going to check its real ?
> >
> > We want BlueSky to ask great questions,  which haven't been
> > asked before, and to follow the posting guide.  If BlueSky
> > improves the knowledge base whats the problem?  This person
> > may well be breaking the posting guide for many other reasons
> > (I haven't looked), and if they are then you could take issue
> > with them on those points, but not for simply writing as
> > "BlueSky".
> >
> > David W has got it right when he replied to "ManInMoon". Shall
> > we stop this thread now, and follow his lead? I would have
> > picked "ManOnMoon" myself but maybe that one was taken. Its
> > rather difficult to be on a moon, let alone inside it.
> >
> > Matthew
> >
> >
> > "John Sorkin"  wrote in message
> > news:4b91068702cb00064...@medicine.umaryland.edu...
> >
>> The sad part of this interchange is that Blue Sky does not seem to be
>> amenable to suggestion. He, or she, has not taken note, or responded to
> >> the
> >> fact that a number of people believe it is good manners to give a real
> >> name and affiliation. My mother taught me that when two people tell
> >> you
> >> that you are drunk you should lie down until the inebriation goes
> >> away.
> >> Blue Sky, several people have noted that you would do well to give us
> >> your
> >> name and affiliation. Is this too much to ask given that people are
> >> good
> >> enough to help you?
> >> John
> >>
> >>
> >>
> >>
> >> John David Sorkin M.D., Ph.D.
> >> Chief, Biostatistics and Informatics
> >> University of Maryland School of Medicine Division of Gerontology
> >> Baltimore VA Medical Center
> >> 10 North Greene Street
> >> GRECC (BT/18/GR)
> >> Baltimore, MD 21201-1524
> >> (Phone) 410-605-7119
> >> (Fax) 410-605-7913 (Please call phone number above prior to faxing)>>>
> >> "Matthew Dowle"  3/5/2010 12:58 PM >>>
> >> Frank, I respect your views but I agree with Gabor.  The posting guide
> >> does
> >> not support your views.
> >>
> >> It is not any of our views that are important but we are following the
> >> posting guide.  It covers affiliation. It says only that "some"
> >> consider
> >> it
> >> "good manners to include a concise signature specifying affiliation".
> >> It
> >> does not agree that it is bad manners not to.  It is therefore going
> >> too
> >> far
> >> to urge R-gurus, whoever they might be, to ignore such postings on
> >> that
> >> basis alone.  It is up to responders (I think that is the better word
> >> which
> >> is the one used by the posting guide) whether they reply.  Missing
> >> affilia

[R] How to assign week numbers to a time-series

2010-03-05 Thread Hosack, Michael
Hello everyone,

My progress has stalled on finding a way of creating a somewhat complicated 
variable to add to my existing dataframe and I am hoping one of you could help 
me out. The dataframe below contains only a fraction of the data of my complete 
dataframe, but all of the variables. What I want to do is add another variable 
named 'WEEK' to this dataframe that is assigned 1 for row 1 and remains 1 until 
the first SAT (i.e. Saturday) under variable 'DOW' (day of week) occurs, at 
which point variable 'WEEK' is now assigned 2. 'WEEK' should continue to be 
assigned 2 until the following SAT under variable 'DOW' at which variable 
'WEEK' will now be assigned 3, and so on. In this scheme, weekdays are such 
that SAT=1, SUN=2, MON=3, ..., FRI=7. I am basically trying to assign week 
numbers to potential sampling days in a survey season for use in a program that 
will generate a fisheries creel survey schedule. I should note that if element 
1 happens to have DOW=SAT (that is the case this year, since the first day of 
our survey 05/01 is a Saturday), then WEEK 1 begins on day 1 
(05/01/2010) and WEEK 2 will begin on the first SAT under variable DOW. I hope 
I explained this clearly enough, if not let me know. If this sent twice, I 
apologize.

Mike

        MM DD   YR DOW DOW. DTYPE  TOD TOD. SITENUM        DESC
1       05 01 2010 SAT    1     2 MORN    1     101   WALNUT.CK
185     05 01 2010 SAT    1     2 MORN    1     102       LAMPE
369     05 01 2010 SAT    1     2 MORN    1     103    EAST.AVE
553     05 01 2010 SAT    1     2 MORN    1     104  NORTH.EAST
737     05 01 2010 SAT    1     2 AFTN    2     101   WALNUT.CK
921     05 01 2010 SAT    1     2 AFTN    2     102       LAMPE
1105    05 01 2010 SAT    1     2 AFTN    2     103    EAST.AVE
1289    05 01 2010 SAT    1     2 AFTN    2     104  NORTH.EAST
2       05 02 2010 SUN    2     2 MORN    1     101   WALNUT.CK
186     05 02 2010 SUN    2     2 MORN    1     102       LAMPE
370     05 02 2010 SUN    2     2 MORN    1     103    EAST.AVE
554     05 02 2010 SUN    2     2 MORN    1     104  NORTH.EAST
738     05 02 2010 SUN    2     2 AFTN    2     101   WALNUT.CK
922     05 02 2010 SUN    2     2 AFTN    2     102       LAMPE
1106    05 02 2010 SUN    2     2 AFTN    2     103    EAST.AVE
1290    05 02 2010 SUN    2     2 AFTN    2     104  NORTH.EAST
3       05 03 2010 MON    3     1 MORN    1     101   WALNUT.CK
187     05 03 2010 MON    3     1 MORN    1     102       LAMPE
371     05 03 2010 MON    3     1 MORN    1     103    EAST.AVE
555     05 03 2010 MON    3     1 MORN    1     104  NORTH.EAST
739     05 03 2010 MON    3     1 AFTN    2     101   WALNUT.CK
923     05 03 2010 MON    3     1 AFTN    2     102       LAMPE
1107    05 03 2010 MON    3     1 AFTN    2     103    EAST.AVE
1291    05 03 2010 MON    3     1 AFTN    2     104  NORTH.EAST
4       05 04 2010 TUE    4     1 MORN    1     101   WALNUT.CK
188     05 04 2010 TUE    4     1 MORN    1     102       LAMPE
372     05 04 2010 TUE    4     1 MORN    1     103    EAST.AVE
...     .. ..   .. ...    .     .  ...    .     ...         ...


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave and optional document sections

2010-03-05 Thread Max Kuhn
I haven't used Windows for a while... when I did I used MiKTeX and TeXnicCenter.

Max

On Fri, Mar 5, 2010 at 2:27 PM, Aleksey Naumov  wrote:
> Max,
>
> Thank you very much! I'll give LaTeX a go (haven't used it in a while). Is
> there a recommended distribution of LaTeX for Windows?
>
> Aleksey
>
> On Fri, Mar 5, 2010 at 1:50 PM, Max Kuhn  wrote:
>>
>> That isn't hard to do with Sweave.
>>
>> Within your if branch, you can have your code write out any markup you
>> choose (including \section). For figures, I would suggest writing the
>> image file "manually" (i.e. with the code chunk option fig = FALSE) and
>> also write out the \includegraphics statements too.
>>
>> With odfWeave, it is more difficult. Adding sections isn't obvious to
>> me, although odfCat might be able to do it. Figures are more
>> problematic. The way odfWeave works right now, whenever you use fig =
>> TRUE in the chunk, odfWeave sets up the figure environment in the
>> document (whether you produce an actual image or not). My recollection
>> was that there wasn't any way around this, but it's been a thorn in my
>> side for a while.
>>
>> I suggest using LaTeX in the short term to do this.
>>
>> Max
>>
>>
>>
>> On Fri, Mar 5, 2010 at 12:55 PM, Aleksey Naumov  wrote:
>> > Dear R and Sweave users,
>> >
>> > Is there a way to have optional sections in a Sweave-generated report
>> > document, complete with section header(s), text and code chunks? In
>> > other
>> > words, I'd like for my report to include or omit certain sections based
>> > on
>> > the data itself.
>> >
>> > For example, If I examine the input dataset early on in the report and
>> > set a
>> > variable has_daily_data = TRUE, then I'd like to have a section that
>> > looks
>> > at the daily data in detail, with subsections, text and code chunks.
>> > What
>> > facilities do I have in Sweave (or odfWeave) for that?
>> >
>> > Perhaps, for code chunks, I could wrap all the code inside in
>> > if(has_daily_data) {...}? What about inserting conditional text and
>> > sections? Would it make sense to split the document into subdocuments
>> > and
>> > somehow include the optional subdocuments only if needed? How would I go
>> > about that?
>> >
>> > Would appreciate any pointers. Thank you,
>> > Aleksey
>> >
>> >        [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>>
>> --
>>
>> Max
>
>



-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] conditioning variable in panel.xyplot?

2010-03-05 Thread Seth W Bigelow
I'm stumped after an hour or so reading about subscripts in panel.xyplot. 
Apparently the panel function is executed for each subset of data in the 
main dataset (specified by the conditioning variable, 'site' in my 
example), and the 'subscripts' keyword passes a vector of the 
corresponding row numbers to the panel function. But, if I want the panel 
function to simultaneously plot data from a different dataframe, as in the 
example below, I don't understand how having a vector of row numbers from 
a subset of the dataframe used in the main xyplot statement helps me with 
selecting data from an entirely different dataframe ('q' in my example). 
 
library(lattice)

d <- data.frame(site  = c(rep("A",12), rep("B",12)), 
x=rnorm(24),y=rnorm(24))
q <- data.frame(site  = c(rep("A",7), rep("B",7)), 
x=rnorm(14),y=rnorm(14))

mypanel <- function(...){
panel.xyplot(q$x, q$y, col="red")
panel.xyplot(...)}

xyplot(y ~ x | site, d,
panel = mypanel
)
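
One possible way to tie `q' to the panel being drawn (a sketch, untested;
which.packet() returns the index of the conditioning level for the current
panel):

mypanel <- function(x, y, ...) {
    this.site <- levels(factor(d$site))[which.packet()]  # site shown in this panel
    qq <- q[q$site == this.site, ]
    panel.xyplot(qq$x, qq$y, col = "red")                 # overlay the matching rows of q
    panel.xyplot(x, y, ...)
}
xyplot(y ~ x | site, d, panel = mypanel)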
 
--Seth


On Thu, Mar 4, 2010 at 4:42 PM, Seth W Bigelow  wrote:
> I wish to create a multipanel plot (map) from several datasets ("d" and
> "q" in the example below). I can condition the main xyplot statement on
> the "site" variable, but I don't know how to pass a conditioning 
variable
> to panel.xyplot plot so that the x-y coordinates from dataset q are only
> plotted at the appropriate site.

The keyword is 'subscripts'. Look at the entry for 'panel' in ?xyplot,
and let us know if you still have doubts.

-Deepayan

>
>
> library(lattice)
> d <- data.frame(site  = c(rep("A",12), rep("B",12)),
> x=rnorm(24),y=rnorm(24))# Create dataframe "d",
> with 12 x-y coordinates for each site
> q <- data.frame(site  = c(rep("A",7), rep("B",7)),
> x=rnorm(14),y=rnorm(14))# Create dataframe "q",
> with 7 pairs of x-y coordinates for each site.
>
> mypanel <- function(...){
>panel.xyplot(q$x, q$y, col="red")   # Statement that
> needs a "Site" conditioning variable
>panel.xyplot(...)}
>
> xyplot(y~x|site, d, panel=mypanel)  # Statement erroneously plots 
all
> 14 x-y points in "q" on panels for sites A & B
>
>
>
> Dr. Seth  W. Bigelow
> Biologist, USDA-FS Pacific Southwest Research Station
> 1731 Research Park Drive, Davis California
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] zoo Package Question

2010-03-05 Thread Gabor Grothendieck
If your series is nearly regular, that is it is regularly spaced
except for a few missing points then as.ts(z) will convert it to a
regular ts series and as.zoo(as.ts(z)) does that and converts it back.
You may need to add back the class.  For example see:

https://stat.ethz.ch/pipermail/r-help/2010-March/230700.html

Also see the three zoo vignettes and R News 4/1.
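
A small sketch of the 15-minute case, with made-up times (start() and end()
give the first and last index of the zoo object, which answers the seq() part
of the question):

library(zoo)

tt <- as.POSIXct("2010-03-05 09:00", tz = "GMT") + c(0, 20, 45, 70, 90) * 60
z  <- zoo(c(1, 3, 2, 5, 4), tt)                 # irregular toy series

grid <- seq(start(z), end(z), by = "15 min")    # regular 15-minute grid
z15  <- na.approx(merge(z, zoo(, grid)))[grid]  # align, interpolate, keep grid times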

On Fri, Mar 5, 2010 at 10:32 AM, testuser  wrote:
>
> I want to use the zoo package to convert an irregular time series to a
> regular one at every 15 min. interval. I would like to read in the csv file
> as a zoo object. When trying to do a seq(from,to,by), how can I specify the
> "from" to be the first time element in the zoo object and "to" to be the
> last element in the zoo object. Thanks for your help.
> --
> View this message in context: 
> http://n4.nabble.com/zoo-Package-Question-tp1579735p1579735.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to recursively update a function?

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 11:06 AM, Seeker wrote:


Thanks for your suggestion, Carl. Actually I am looking for a series
of functions like these

exp(-x)*.5^x
exp(-x)*.5^x*(1-.4^x)
exp(-x)*.5^x*(1-.4^x)*(1-.3^x)
where (.5, .4, .3, ...) are from the incoming results.


Those are not really functions but rather "expressions". Does this help?

> tvalues <- c(.5,.4,.3)
> expr <- substitute(exp(-x)*t1^x*(1-t2^x)*(1-t3^x),
  list(t1=tvalues[1], t2=tvalues[2], t3=tvalues[3]) )
> expr
# exp(-x) * 0.5^x * (1 - 0.4^x) * (1 - 0.3^x)

I suppose they could be wrapped up into functions but would need to be  
further wrapped inside eval() or else they will return an expression  
string.


> tfn <- function(x) eval( substitute(exp(-x)*t1^x*(1-t2^x)*(1-t3^x),
list(t1=tvalues[1], t2=tvalues[2],  
t3=tvalues[3]) ) )


> tfn(.5)
[1] 0.07129393
> tfn2 <- function(x) exp(-x) * 0.5^x * (1 - 0.4^x) * (1 - 0.3^x)
>
> tfn2(.5)
[1] 0.07129393
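
If the aim is to keep updating the function itself as each new value arrives,
one closure-based sketch (using the 0.5, 0.4, 0.3 from the example; scalar x
assumed):

post <- function(x) exp(-x) * 0.5^x              # after the first result
for (t.new in c(0.4, 0.3)) {
    old  <- post
    post <- local({ t <- t.new; f <- old         # capture the current t and f
                    function(x) f(x) * (1 - t^x) })
}
post(0.5)                                        # 0.07129393, same as tfn2(0.5)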

--
David.



I am not sure how to construct such an irregular fucntion vector.

On Mar 4, 6:40 pm, Carl Witthoft  wrote:

My foolish move for this week: I'm going to go way out on a limb and
guess what the OP wanted was something like this.

i=1, foo = x*exp(-x)

i=2, foo= x^2*exp(-x)
i=3, foo = x^3*exp(-x)
.
.
.

In which case he really should create a vector bar <- rep(NA, 5),
and then inside the loop,

bar[i]<-x^i*foo(x)

Carl

quoted material:
Date: Thu, 04 Mar 2010 11:37:23 -0800 (PST)

I need to update posterior dist function upon the coming results and
find the posterior mean each time.

On Mar 4, 1:31 pm, jim holtman  wrote:
 > What exactly are you trying to do?  'foo' calls 'foo' calls  
'foo' 

 >  How did you expect it to stop the recursive calls?
 >
 >
 >
 >
 >
 > On Thu, Mar 4, 2010 at 2:08 PM, Seeker   
wrote:

 > > Here is the test code.
 >
 > > foo<-function(x) exp(-x)
 > > for (i in 1:5)
 > > {
 > > foo<-function(x) foo(x)*x
 > > foo(2)
 > > }

__
r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/ 
listinfo/r-help

PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Douglas Bates
On Fri, Mar 5, 2010 at 12:16 PM, Gabor Grothendieck
 wrote:
> On Fri, Mar 5, 2010 at 12:58 PM, Matthew Dowle  wrote:
>> As far as I know you are wrong that there is no moderator.  There are in
>> fact an uncountable number of people who are empowered to moderate i.e. all
>> of us. In other words its up to the responders to moderate.  The posting
>
> I think moderator is being used in the sense of a person who receives
> posts before they become public and allows or disallows each post.
> Using that definition there is no moderator.

Not quite.  Postings from those subscribed to the list, with one
exception, are passed to the list without being held for moderator
approval.  Postings from those not subscribed are held for moderator
approval.

The one exception is Peng Yu  who, a few months
ago, flooded the list with queries not unlike those from ol' Blue Sky
and was sufficiently argumentative that he even wore down the patience
of Martin Maechler.  As a result, he was sanctioned by having his
postings held for moderator approval.  This is not a terrible sanction
because there are many people who can approve postings so this is done
fairly rapidly.

(By the way, I say "he" and "his" because the .ut in the email address
leads me to believe that the email belongs to this person,
www.cerc.utexas.edu/~yupeng/, who seems recently to have developed an
interest in Statistics and Bioinformatics.)

If you look at the history of postings by Peng Yu and by Blue Sky you
will see that the postings by Blue Sky started around the time that
Peng Yu was sanctioned. Indeed the headers from some of the early
postings indicate that they were posted on behalf of the email address
pengu...@gmail.com (although current postings do not).

Unfortunately email lists like R-help are, like any public resource,
subject to the "Tragedy of the commons" phenomenon
(http://en.wikipedia.org/wiki/Tragedy_of_the_commons).

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] REvolutions blog: February Roundup

2010-03-05 Thread David M Smith
I write about R every weekday at the Revolutions blog:
 http://blog.revolution-computing.com
and every month I post a summary of articles from the previous month
of particular interest to readers of r-help.

http://bit.ly/9GoTVd announced the availability on YouTube of "What is
R", a 4-part video based on a recent webcast I hosted.

http://bit.ly/bVSSaH announced a webinar I hosted on REvolution's
debugger for R (a recorded replay is now available).

http://bit.ly/b86kGB seeks a Parallel Computing developer to work at
REvolution Computing.

http://bit.ly/9i2qeO reviewed an application of R to create social
networks from 10Gb of phone call data.

http://bit.ly/bPwsfz linked to a slide presentation by Ryan Rosario
explaining the base graphics system in R.

http://bit.ly/9MPlec updated a previous geographic visualization of an
election, illustrating that color scales do matter.

http://bit.ly/aOmo1z noted the great lineup for R/Finance 2010 in
Chicago (register now!).

http://bit.ly/cMkjDa reviewed CRAN packages released and updated in
January & February.

http://bit.ly/cXiO8H linked to information about Frank Harrell's rms.
and Hmisc packages, and his upcoming course.

http://bit.ly/bZ6dLI linked to a story about creating a cluster in
Amazon EC2 for parallel computations with the multicore package.

http://bit.ly/bvkiQ2 gave some examples of creating pretty HTML and
LaTeX tables with the xtable package.

http://bit.ly/9M508I showed how to create a mosaic plot (or treemap) in R.

http://bit.ly/coPTj9 noted media attention for the R Project, named as
2010 Editor's Choice at Intelligent Enterprise.

http://bit.ly/aK7PAU linked to Dirk Eddelbuettel's presentation about
the Rcpp interface for C++ code in R.

http://bit.ly/bzPaII linked to some useful tips on speeding up R code
with the Rprof function.

http://bit.ly/9u5Fwv linked to a useful introduction to R's basic
object types (vectors, data frames, etc.)

http://bit.ly/dbaGau linked to a Sudoku solver for R (using a
different method than the sudoko package)

http://bit.ly/cucF8I noted that Tex Hull, co-founder of SPSS, has
joined the team at REvolution Computing.

http://bit.ly/btmKQO linked Salvio Rodrigues at the Open Source blog,
who found that Robert Gentleman's appointment to the REvolution Board
was "a great impetus ... to look at R again".

Other non-R-specific posts in the past month covered a newspaper
miscalculating a simple probability (http://bit.ly/a8ZRLV), the fate
of the employees of the collapsed megabucks (http://bit.ly/bPIYXy) and
(on a lighter note) Carl Sagan singing again, this time about
evolution (http://bit.ly/a7d2sr), and visualizing what happens when
you reply-all to an email list (http://bit.ly/cuLaNh).

The R Community Calendar has also been updated at:
http://blog.revolution-computing.com/calendar.html

If you're looking for more articles about R, you can find summaries
from previous months at http://bit.ly/dt1AZe . Join the REvolution
mailing list at http://bit.ly/bOISmy to be alerted to new articles on
a monthly basis.

As always, thanks for the comments and please keep sending suggestions
to me at da...@revolution-computing.com . Don't forget you can also
follow the blog using an RSS reader like Google Reader, or by
following me on Twitter (I'm @revodavid).

Kind regards to all,
# David Smith

--
David M Smith 
VP of Marketing, REvolution Computing  http://blog.revolution-computing.com
Tel: +1 (650) 330-0553 x205 (Palo Alto, CA, USA)

Download REvolution R free:
www.revolution-computing.com/downloads/revolution-r.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Ben Bolker
blue sky  gmail.com> writes:


   I almost certainly shouldn't feed the trolls, but:

1. ?kruskal.test (listed in "see also" in ?wilcox.test)
2. One of the disadvantages of nonparametric tests is that it is
in general difficult to generalize them to analogues of arbitrarily
complex linear mixed models.
3. You might look for "multi-response permutation procedures" (MRPP).
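
A minimal illustration of the kruskal.test() suggestion in 1., with made-up
data:

set.seed(1)
y <- rnorm(30) + rep(c(0, 0.5, 1), each = 10)
g <- gl(3, 10)
kruskal.test(y ~ g)    # rank-based analogue of one-way ANOVA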

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Robert A LaBudde

A search on "bluesky...@gmail.com" shows the user is in Norfolk, VA, USA.

At 01:26 PM 3/5/2010, John Sorkin wrote:
The sad part of this interchange is that Blue Sky does not seem to 
be amenable to suggestion. He, or she, has not taken note, or 
responded to the fact that a number of people believe it is good 
manners to give a real name and affiliation. My mother taught me 
that when two people tell you that you are drunk you should lie down 
until the inebriation goes away. Blue Sky, several people have noted 
that you would do well to give us your name and affiliation. Is this 
too much to ask given that people are good enough to help you?

John




John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)>>> "Matthew Dowle"  3/5/2010 12:58 PM >>>

Frank, I respect your views but I agree with Gabor.  The posting guide does
not support your views.

It is not any of our views that are important but we are following the
posting guide.  It covers affiliation. It says only that "some" consider it
"good manners to include a concise signature specifying affiliation". It
does not agree that it is bad manners not to.  It is therefore going too far
to urge R-gurus, whoever they might be, to ignore such postings on that
basis alone.  It is up to responders (I think that is the better word which
is the one used by the posting guide) whether they reply.  Missing
affiliation is ok by the posting guide.  Users shouldn't be put off from
posting because of that alone.

Sending from an anonymous email address such as "BioStudent" is also fine by
the posting guide as far as my eyes read it. It says only that the email
address should work. I would also answer such anonymous posts, providing
they demonstrate they made best efforts to follow the posting guide, as
usual for all requests for help.  Its so easy to send from a false, but
apparently real name, why worry about that?

If you disagree with the posting guide then you could make a suggestion to
get the posting guide changed with respect to these points.  But, currently,
good and practice is defined by the posting guide, and I can't see that your
view is backed up by it.  In fact it seems to me that these points were
carefully considered, and the wording is careful on these points.

As far as I know you are wrong that there is no moderator.  There are in
fact an uncountable number of people who are empowered to moderate i.e. all
of us. In other words its up to the responders to moderate.  The posting
guide is our guide.  As a last resort we can alert the list administrator
(which I believe is the correct name for him in that role), who has powers
to remove an email address from the list if he thinks that is appropriate,
or act otherwise, or not at all.  It is actually up to responders (i.e. all
of us) to ensure the posting guide is followed.

My view is that the problems started with some responders on some occasions.
They sometimes forgot, a little bit, to encourage and remind posters to
follow the posting guide when it was not followed. This then may have
encouraged more posters to think it was ok not to follow the posting guide.
That is my own personal view,  not a statistical one backed up by any
evidence.

Matthew


"Frank E Harrell Jr"  wrote in message
news:4b913880.9020...@vanderbilt.edu...
> Gabor Grothendieck wrote:
>> I am happy to answer posts to r-help regardless of the name and email
>> address of the poster but would draw the line at someone excessively
>> posting without a reasonable effort to find the answer first or using
>> it for homework since such requests could flood the list making it
>> useless for everyone.
>
> Gabor I respectfully disagree.  It is bad practice to allow anonymous
> postings.  We need to see real names and real affiliations.
>
> r-help is starting to border on uselessness because of the age old problem
> of the same question being asked every two days, a high frequency of
> specialty questions, and answers given with the best of intentions in
> incremental or contradictory e-mail pieces (as opposed to a cumulative
> wiki or hierarchically designed discussion web forum), as there is no
> moderator for the list.  We don't need even more traffic from anonymous
> postings.
>
> Frank
>
>>
>> On Fri, Mar 5, 2010 at 10:55 AM, Ravi Varadhan 
>> wrote:
>>> David,
>>>
>>> I agree with your sentiments.  I also think that it is bad posting
>>> etiquette not to sign one's genuine name and affiliation when asking for
>>> help, which "blue sky" seems to do a lot.  Bert Gunter has already
>>> raised this issue, and I completely agree with him. I would also like to
>>> urge the R-gurus to ignore such postings.
>>>
>>> Best,
>>> Ravi.
>>> 

Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Frank E Harrell Jr

Matthew Dowle wrote:

John,

So you want BlueSky to change their name to "Paul Smith" at "New York 
University",   just to give a totally random, false name, example,  and then 
you will be happy ?  I just picked a popular, real name at a real, big 
place.   Are you, or is anyone else,  going to check its real ?


Matthew that is poorly stated.  We want real names backed up by 
affiliations that if anyone wanted to check they could.  It is just 
common courtesy, and helps some of us feel good about helping others.


Frank



We want BlueSky to ask great questions,  which haven't been asked before, 
and to follow the posting guide.  If BlueSky improves the knowledge base 
whats the problem?  This person may well be breaking the posting guide for 
many other reasons  (I haven't looked), and if they are then you could take 
issue with them on those points, but not for simply writing as "BlueSky".


David W has got it right when he replied to "ManInMoon".   Shall we stop 
this thread now,  and follow his lead ?   I would have picked "ManOnMoon" 
myself but maybe that one was taken. Its rather difficult to be on a moon, 
let alone inside it.


Matthew




--
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how can I look at .Internal(model.matrix(t, data))?

2010-03-05 Thread Werner W.
Hi,

I would like to see how model.matrix expands a factor column into a set of dummy 
columns. I think that is done in .Internal(model.matrix(t, data)), which is 
called from model.matrix.default. But I have no idea how I can look at this 
function. How can I get to such internal functions?

Thanks so much!
  Werner
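
The .Internal part is implemented in C, so it cannot be printed at the prompt;
it has to be read in the R source tree (grep the src/ directory for
"modelmatrix"). The factor-to-dummy expansion itself can, however, be inspected
from R on a small example (factor made up here):

f <- factor(c("a", "b", "c", "a"))
model.matrix(~ f)        # treatment contrasts: intercept plus dummies for "b", "c"
model.matrix(~ f - 1)    # one indicator column per level
contrasts(f)             # the coding matrix that drives the expansion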




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Ted Harding
Again, I have the same attitude as Gabor. I don't need to know the
real identity of a poster, if they ask a sensible question (which
may well be elementary).

As a contribution to the discussion, may I point out that many people
use different email addresses for different purposes, so that any
traffic involving a particular email address gets dropped into a
particular folder. If you are running a private server you can pick
whatever you like. If, like Blue Sky, you are on gmail, you will
need to pick an identity that is not already taken. So perhaps
Blue Sky (whoever he/she is) may be using bluesky...@gmail.com
to encapsulate their R-help correspondence.

Or perhaps not; but other people do (e.g. jluo.rh...@gmail.com,
amitrh...@yahoo.co.uk, rhelp...@gmail.com -- for the first two
we do have a name though no affiliation; for the latter, neither
name nor affiliation). It doesn't matter much, in my view.

Some of the opinions expressed seem to interpret this kind of
"anonymity" as a smoke-screen for students who are trying to get
their homework done, to hide from their teachers. This no doubt
happens, but that should not justify a knee-jerk assumption that
it is the case.

And I do find it a bit distasteful to see people being beaten about
the head with the posting guide on slight pretexts. I hope we can
be more relaxed.

Ted.

On 05-Mar-10 18:47:31, Matthew Dowle wrote:
> John,
> 
> So you want BlueSky to change their name to "Paul Smith" at
> "New York  University", just to give a totally random, false
> name, example,  and then you will be happy ?  I just picked a
> popular, real name at a real, big place. Are you, or is anyone
> else,  going to check its real ?
> 
> We want BlueSky to ask great questions,  which haven't been
> asked before, and to follow the posting guide.  If BlueSky
> improves the knowledge base whats the problem?  This person
> may well be breaking the posting guide for many other reasons
> (I haven't looked), and if they are then you could take issue
> with them on those points, but not for simply writing as
> "BlueSky".
> 
> David W has got it right when he replied to "ManInMoon". Shall
> we stop this thread now, and follow his lead? I would have
> picked "ManOnMoon" myself but maybe that one was taken. Its
> rather difficult to be on a moon, let alone inside it.
> 
> Matthew
> 
> 
> "John Sorkin"  wrote in message 
> news:4b91068702cb00064...@medicine.umaryland.edu...
>> The sad part of this interchange is that Blue Sky does not seem to be
>> amenable to suggestion. He, or she, has not taken note, or responded to
>> the 
>> fact that a number of people believe it is good manners to give a real
>> name and affiliation. My mother taught me that when two people tell
>> you 
>> that you are drunk you should lie down until the inebriation goes
>> away. 
>> Blue Sky, several people have noted that you would do well to give us
>> your 
>> name and affiliation. Is this too much to ask given that people are
>> good 
>> enough to help you?
>> John
>>
>>
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)>>>
>> "Matthew Dowle"  3/5/2010 12:58 PM >>>
>> Frank, I respect your views but I agree with Gabor.  The posting guide
>> does
>> not support your views.
>>
>> It is not any of our views that are important but we are following the
>> posting guide.  It covers affiliation. It says only that "some"
>> consider 
>> it
>> "good manners to include a concise signature specifying affiliation".
>> It
>> does not agree that it is bad manners not to.  It is therefore going
>> too 
>> far
>> to urge R-gurus, whoever they might be, to ignore such postings on
>> that
>> basis alone.  It is up to responders (I think that is the better word 
>> which
>> is the one used by the posting guide) whether they reply.  Missing
>> affiliation is ok by the posting guide.  Users shouldn't be put off
>> from
>> posting because of that alone.
>>
>> Sending from an anonymous email address such as "BioStudent" is also
>> fine 
>> by
>> the posting guide as far as my eyes read it. It says only that the
>> email
>> address should work. I would also answer such anonymous posts,
>> providing
>> they demonstrate they made best efforts to follow the posting guide,
>> as
>> usual for all requests for help.  Its so easy to send from a false,
>> but
>> apparently real name, why worry about that?
>>
>> If you disagree with the posting guide then you could make a
>> suggestion to
>> get the posting guide changed with respect to these points.  But, 
>> currently,
>> good and practice is defined by the posting guide, and I can't see
>> that 
>> your
>> view is backed up by it.  In fact it seems to me that these points
>> were
>> carefully

[R] Assistance with pointers to code for B-spline derivatives (S-plus related).

2010-03-05 Thread Jeffrey Racine
Hi.

I have been using the splines package for my work, in particular, the bs() 
function and associated predict() method. I now find myself in need of the 
derivatives of this beast.

In the man page for predict.bSpline I found `predict(object, x, nseg=50, 
deriv=0, ...)' but alas this is not implemented (deriv= is ignored). I 
contacted the package maintainers who were most helpful. Bill Venables kindly 
replied "Curiously, the S-PLUS code did have that feature, I recall, but as I 
no longer have S-PLUS I've lost contact with it.  If you know someone with 
S-PLUS and the splines code, you may be able to get somewhere with that."

Sadly, I do not have access to this code either. However, I was hoping that 
there is a large enough community of spline smoothers using R that perhaps they 
have written/adapted code that accomplishes what I need.

So, if you have any pointers as to how to generate derivatives for the bs() 
function in the splines package (or can point me to functionally equivalent 
code elsewhere that can return derivatives), I would be most grateful to hear 
from you.

Thanks in advance for your time.
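One possible workaround, sketched here with made-up data and df, is to rebuild the knot sequence that bs() used and evaluate the derivative basis with splineDesign()'s derivs argument:

library(splines)
set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.1)
fit <- lm(y ~ bs(x, df = 8))

bsx <- bs(x, df = 8)
ord <- attr(bsx, "degree") + 1                 # 4 for a cubic spline
kn  <- sort(c(rep(attr(bsx, "Boundary.knots"), ord), attr(bsx, "knots")))

x.new <- seq(min(x), max(x), length = 101)
Bp <- splineDesign(kn, x.new, ord = ord, derivs = rep(1, length(x.new)))
Bp <- Bp[, -1, drop = FALSE]                   # bs() drops the first column (intercept = FALSE)
dy <- drop(Bp %*% coef(fit)[-1])               # derivative of the fitted spline
plot(x.new, dy, type = "l")                    # should roughly track cos(x.new)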

-- Jeff


Professor J. S. Racine Phone:  (905) 525 9140 x 23825
Department of EconomicsFAX:(905) 521-8232
McMaster Universitye-mail: raci...@mcmaster.ca
1280 Main St. W.,Hamilton, URL: http://www.economics.mcmaster.ca/racine/
Ontario, Canada. L8S 4M4

`The generation of random numbers is too important to be left to chance'

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] End of line marker?

2010-03-05 Thread jonas garcia
Jim, Duncan and David,

Thanks to you guys I managed to solve the problem and I have learnt a lot.
Best regards

J

On Fri, Mar 5, 2010 at 4:55 AM, Duncan Murdoch  wrote:

> On 04/03/2010 11:40 PM, David Winsemius wrote:
>
>> On Mar 4, 2010, at 10:58 PM, Duncan Murdoch wrote:
>>
>> On 04/03/2010 10:32 PM, David Winsemius wrote:
>>>
 On Mar 4, 2010, at 9:47 PM, jonas garcia wrote:

> When I opened the file with a hex-editor, the problematic  character
>  turned out to be “1a”
> I am attaching a sample DAT file with 3 lines (the second line is   the
> one with the undesirable character).
>
> The furthest I could get was through readBin:
>
> tmp<- readBin("new.dat", what = "raw", n=1)
>>
>  [1] 30 32 3a 33 35 3a 33 32 2c 20 34 34 30 33 2c 20 33 37 2e 31  31
>  34 2c 2d 32 30 2e 38 33 36 2c 31
> [33] 35 35 2e 39 2c 30 30 2e 37 36 2c 31 31 35 36 0d 0a 30 32 3a  33
>  35 3a 33 35 2c 20 34 34 33 32 2c
> [65] 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36 2c 31 35 35 2e  38
>  2c 1a 30 2e 38 31 2c 31 31 35 37
> [97] 0d 0a 30 32 3a 33 35 3a 33 39 2c 20 34 34 36 37 2c 20 33 37  2e
>  31 31 34 2c 2d 32 30 2e 38 33 36
> [129] 2c 31 35 35 2e 38 2c 30 30 2e 38 31 2c 31 31 35 38
>
>
> tmp[87]
>>
> [1] 1a
>
 I got a different "interpretation" of that character when I let R  look
  at it. And I cannot figure out why \032 should be causing  problems??? :

>>> Hex 1a and octal 032 both correspond to Ctrl-Z, which is the MSDOS  EOF
>>> marker.  I forget whether R's text reading routines pay  attention to that,
>>> or whether it's the C runtime, but it makes sense  that it would cause
>>> problems on Windows.
>>>
>>> Duncan Murdoch
>>>
>>
>> Thanks. I was interpreting \032 as decimal, so couldn't figure out why  it
>> should equal 0x1A. You've explained the basis (or base) of my  confusion.
>>
>
> By the way, here's one way to remove the bad char.  Read it using readBin
> as above, then
>
> tmp <- tmp[tmp != 0x1a]
>
> to remove the bad chars, or
>
> tmp[tmp == 0x1a] <- charToRaw(" ")
>
> to replace them with spaces.  Then write the tmp vector out to a file with
> writeBin.
>
> Duncan Murdoch
>
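A minimal end-to-end version of that recipe might look like this (file names are made up, and the data are assumed to be comma-separated with no header):

tmp <- readBin("new.dat", what = "raw", n = file.info("new.dat")$size)
tmp <- tmp[tmp != as.raw(0x1a)]        # drop the Ctrl-Z (MSDOS EOF) bytes
writeBin(tmp, "new_clean.dat")
dat <- read.csv("new_clean.dat", header = FALSE)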

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I can't find "rpart" help (linux)

2010-03-05 Thread Grześ

Thank you very much! :)
-- 
View this message in context: 
http://n4.nabble.com/I-can-t-find-rpart-help-linux-tp1579403p1579846.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] install rJava in linux

2010-03-05 Thread Grześ

Thank you very much! =^D
-- 
View this message in context: 
http://n4.nabble.com/install-rJava-in-linux-tp1579395p1579852.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to recursively update a function?

2010-03-05 Thread Seeker
Thanks for your suggestion, Carl. Actually I am looking for a series
of functions like these

exp(-x)*.5^x
exp(-x)*.5^x*(1-.4^x)
exp(-x)*.5^x*(1-.4^x)*(1-.3^x)
where (.5, .4, .3, ...) come from the incoming results.

I am not sure how to construct such an irregular function vector.

On Mar 4, 6:40 pm, Carl Witthoft  wrote:
> My foolish move for this week: I'm going to go way out on a limb and
> guess what the OP wanted was something like this.
>
> i=1, foo = x*exp(-x)
>
> i=2, foo= x^2*exp(-x)
> i=3, foo = x^3*exp(-x)
> .
> .
> .
>
> In which case he really should create a vector bar<-rep(NA,5) ,
> and then inside the loop,
>
> bar[i]<-x^i*foo(x)
>
> Carl
>
> quoted material:
> Date: Thu, 04 Mar 2010 11:37:23 -0800 (PST)
>
> I need to update posterior dist function upon the coming results and
> find the posterior mean each time.
>
> On Mar 4, 1:31 pm, jim holtman  wrote:
>  > What exactly are you trying to do?  'foo' calls 'foo' calls 'foo' 
>  >  How did you expect it to stop the recursive calls?
>  >
>  >
>  >
>  >
>  >
>  > On Thu, Mar 4, 2010 at 2:08 PM, Seeker  wrote:
>  > > Here is the test code.
>  >
>  > > foo<-function(x) exp(-x)
>  > > for (i in 1:5)
>  > > {
>  > > foo<-function(x) foo(x)*x
>  > > foo(2)
>  > > }
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to recursively update a function?

2010-03-05 Thread Ming Zhong
I was trying to replicate a CRM simulation. The following code works but
seems redundant, so I want to create a loop.

# O'Quigley CRM example 1
f1<-function(x) exp(-x)
dose<-c(-1.47,-1.1,-.69,-.42,0.0,.42)
p<-c(0.05,0.1,0.2,0.3,0.5,0.7)
y<-c(0,0,1,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,1,1)

f2<-function(x) f1(x)*(1-0.2^x)
denom2<-integrate(f2,0,100)$value
alpha2<-integrate(function(x) x*f2(x),0,100)$value/denom2
id<-which.min(abs(-.5*log(0.2^(-1/alpha2)-1)-dose))  ### id is used to
locate the next prob.

f3<-function(x) f2(x)*(1-.3^x)
denom3<-integrate(f3,0,100)$value
alpha3<-integrate(function(x) x*f3(x),0,100)$value/denom3
id<-which.min(abs(-.5*log(0.2^(-1/alpha3)-1)-dose))

f4<-function(x) f3(x)*.3^x
denom4<-integrate(f4,0,100)$value
alpha4<-integrate(function(x) x*f4(x),0,100)$value/denom4
id<-which.min(abs(-.5*log(0.2^(-1/alpha4)-1)-dose))
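One way to fold those repeated updates into a loop is to rebuild f inside the loop from its previous value, capturing it with local() so that the new function does not call itself. This is only a sketch; the observed probabilities and the toxicity pattern are read off the code above, everything else is assumed:

f <- function(x) exp(-x)
dose  <- c(-1.47, -1.1, -0.69, -0.42, 0.0, 0.42)
p.obs <- c(0.2, 0.3, 0.3)          # probabilities used in f2, f3, f4 above
tox   <- c(0, 0, 1)                # 0 -> multiply by (1 - p^x), 1 -> multiply by p^x

for (k in seq_along(p.obs)) {
  f <- local({
    f.old <- f; pk <- p.obs[k]; tk <- tox[k]
    function(x) {
      w <- if (tk == 1) pk^x else 1 - pk^x
      f.old(x) * w
    }
  })
  denom <- integrate(f, 0, 100)$value
  alpha <- integrate(function(x) x * f(x), 0, 100)$value / denom
  id <- which.min(abs(-0.5 * log(0.2^(-1/alpha) - 1) - dose))
}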

2010/3/5 Uwe Ligges 

>
>
> On 05.03.2010 01:40, Carl Witthoft wrote:
>
>> My foolish move for this week: I'm going to go way out on a limb and
>> guess what the OP wanted was something like this.
>>
>> i=1, foo = x*exp(-x)
>>
>> i=2, foo= x^2*exp(-x)
>> i=3, foo = x^3*exp(-x)
>> .
>> .
>> .
>>
>>
>> In which case he really should create a vector bar<-rep(NA,5) ,
>> and then inside the loop,
>>
>> bar[i]<-x^i*foo(x)
>>
>
> Since in this case foo(x) is independent of i, you are wasting resources.
> Moreover you could calculate it for a whole matrix at once. Say you want to
> calculate this for i=1, ..., n with n=5 for some (here pseudo random x),
> then you could do it simpler after defining some data as in:
>
> set.seed(123)
> x <- rnorm(10)
> n <- 5
>
>
> using the single and probably most efficient line:
>
>  outer(x, 1:n, "^") * exp(-x)
>
> or if x is a length 1 vector then even simpler:
>
> set.seed(123)
> x <- rnorm(1)
> n <- 5
>
>  x^(1:5) * exp(-x)
>
> But we still do not know if this is really the question ...
>
> Uwe Ligges
>
>
>
>
>> Carl
>>
>>
>>
>> quoted material:
>> Date: Thu, 04 Mar 2010 11:37:23 -0800 (PST)
>>
>>
>> I need to update posterior dist function upon the coming results and
>> find the posterior mean each time.
>>
>>
>> On Mar 4, 1:31 pm, jim holtman  wrote:
>>  > What exactly are you trying to do? 'foo' calls 'foo' calls 'foo' 
>>  > How did you expect it to stop the recursive calls?
>>  >
>>  >
>>  >
>>  >
>>  >
>>  > On Thu, Mar 4, 2010 at 2:08 PM, Seeker  wrote:
>>  > > Here is the test code.
>>  >
>>  > > foo<-function(x) exp(-x)
>>  > > for (i in 1:5)
>>  > > {
>>  > > foo<-function(x) foo(x)*x
>>  > > foo(2)
>>  > > }
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] zoo Package Question

2010-03-05 Thread testuser

I want to use the zoo package to convert an irregular time series to a
regular one at 15-minute intervals. I would like to read in the csv file
as a zoo object. When trying to do a seq(from, to, by), how can I specify
"from" to be the first time element in the zoo object and "to" to be the
last element in the zoo object? Thanks for your help.
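A sketch of one way to do this, assuming the first csv column holds timestamps in a standard POSIXct format (file name made up):

library(zoo)
z     <- read.zoo("data.csv", header = TRUE, sep = ",", FUN = as.POSIXct)
grid  <- seq(from = start(z), to = end(z), by = "15 min")
z.reg <- na.approx(merge(z, zoo(, grid)))   # or na.locf() to carry values forward
z.reg <- window(z.reg, index = grid)        # keep only the regular 15-min points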
-- 
View this message in context: 
http://n4.nabble.com/zoo-Package-Question-tp1579735p1579735.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Breadt-first-search algorithm

2010-03-05 Thread Thomas Jensen

Dear R-list,

does any of you know whether there exists a breadth-first-search
algorithm for R?
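In case a plain-R sketch is enough, a breadth-first search over an adjacency list can be written in a few lines (the small example graph is made up); graph packages on CRAN/Bioconductor may also provide one:

bfs <- function(adj, start) {
  visited <- start
  queue   <- start
  while (length(queue)) {
    node  <- queue[1]; queue <- queue[-1]
    nbrs  <- setdiff(adj[[node]], visited)   # unvisited neighbours
    visited <- c(visited, nbrs)
    queue   <- c(queue, nbrs)
  }
  visited
}

adj <- list(a = c("b", "c"), b = "d", c = "d", d = character(0))
bfs(adj, "a")   # "a" "b" "c" "d"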


Best, Thomas Jensen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave and optional document sections

2010-03-05 Thread Max Kuhn
That isn't hard to do with Sweave.

Within your if branch, you can have your code write out any markup you
choose (including \section). For figures, I would suggest writing the
image file "manually" (i.e. with the code chunk option fig = FALSE) and
also writing out the \includegraphics statements.

With odfWeave, it is more difficult. Adding sections isn't obvious to
me, although odfCat might be able to do it. Figures are more
problematic. The way odfWeave works right now, whenever you use fig =
TRUE in the chunk, odfWeave sets up the figure environment in the
document (whether you produce an actual image or not). My recollection
was that there wasn't any way around this, but it's been a thorn in my
side for a while.

I suggest using LaTeX in the short term to do this.
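For example, a chunk along these lines writes the section header, the figure file, and the \includegraphics line only when needed (has_daily_data comes from the original question; the daily data frame and file name are made up):

<<daily, echo=FALSE, results=tex, fig=FALSE>>=
if (has_daily_data) {
    cat("\\section{Daily data}\n")
    pdf("daily-plot.pdf")
    plot(daily$date, daily$value, type = "l")
    dev.off()
    cat("\\includegraphics{daily-plot}\n")
}
@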

Max



On Fri, Mar 5, 2010 at 12:55 PM, Aleksey Naumov  wrote:
> Dear R and Sweave users,
>
> Is there a way to have optional sections in a Sweave-generated report
> document, complete with section header(s), text and code chunks? In other
> words, I'd like for my report to include or omit certain sections based on
> the data itself.
>
> For example, If I examine the input dataset early on in the report and set a
> variable has_daily_data = TRUE, then I'd like to have a section that looks
> at the daily data in detail, with subsections, text and code chunks. What
> facilities do I have in Sweave (or odfWeave) for that?
>
> Perhaps, for code chunks, I could wrap all the code inside in
> if(has_daily_data) {...}? What about inserting conditional text and
> sections? Would it make sense to split the document into subdocuments and
> somehow include the optional subdocuments only if needed? How would I go
> about that?
>
> Would appreciate any pointers. Thank you,
> Aleksey
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthew Dowle
John,

So you want BlueSky to change their name to "Paul Smith" at "New York 
University", just to give a totally random, false-name example, and then 
you will be happy?  I just picked a popular, real name at a real, big 
place.  Are you, or is anyone else, going to check it's real?

We want BlueSky to ask great questions,  which haven't been asked before, 
and to follow the posting guide.  If BlueSky improves the knowledge base 
what's the problem?  This person may well be breaking the posting guide for 
many other reasons  (I haven't looked), and if they are then you could take 
issue with them on those points, but not for simply writing as "BlueSky".

David W has got it right when he replied to "ManInMoon".   Shall we stop 
this thread now,  and follow his lead ?   I would have picked "ManOnMoon" 
myself but maybe that one was taken. It's rather difficult to be on a moon, 
let alone inside it.

Matthew


"John Sorkin"  wrote in message 
news:4b91068702cb00064...@medicine.umaryland.edu...
> The sad part of this interchanges is that Blue Sky does not seem to be 
> amiable to suggestion. He, or she, has not taken note, or responded to the 
> fact that a number of people believe it is good manners to give a real 
> name and affiliation. My mother taught me that when two people tell you 
> that you are drunk you should lie down until the inebriation goes away. 
> Blue Sky, several people have noted that you would do well to give us your 
> name and affiliation. Is this too much to ask given that people are good 
> enough to help you?
> John
>
>
>
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)>>> 
> "Matthew Dowle"  3/5/2010 12:58 PM >>>
> Frank, I respect your views but I agree with Gabor.  The posting guide 
> does
> not support your views.
>
> It is not any of our views that are important but we are following the
> posting guide.  It covers affiliation. It says only that "some" consider 
> it
> "good manners to include a concise signature specifying affiliation". It
> does not agree that it is bad manners not to.  It is therefore going too 
> far
> to urge R-gurus, whoever they might be, to ignore such postings on that
> basis alone.  It is up to responders (I think that is the better word 
> which
> is the one used by the posting guide) whether they reply.  Missing
> affiliation is ok by the posting guide.  Users shouldn't be put off from
> posting because of that alone.
>
> Sending from an anonymous email address such as "BioStudent" is also fine 
> by
> the posting guide as far as my eyes read it. It says only that the email
> address should work. I would also answer such anonymous posts, providing
> they demonstrate they made best efforts to follow the posting guide, as
> usual for all requests for help.  Its so easy to send from a false, but
> apparently real name, why worry about that?
>
> If you disagree with the posting guide then you could make a suggestion to
> get the posting guide changed with respect to these points.  But, 
> currently,
> good and practice is defined by the posting guide, and I can't see that 
> your
> view is backed up by it.  In fact it seems to me that these points were
> carefully considered, and the wording is careful on these points.
>
> As far as I know you are wrong that there is no moderator.  There are in
> fact an uncountable number of people who are empowered to moderate i.e. 
> all
> of us. In other words its up to the responders to moderate.  The posting
> guide is our guide.  As a last resort we can alert the list administrator
> (which I believe is the correct name for him in that role), who has powers
> to remove an email address from the list if he thinks that is appropriate,
> or act otherwise, or not at all.  It is actually up to responders (i.e. 
> all
> of us) to ensure the posting guide is followed.
>
> My view is that the problems started with some responders on some 
> occasions.
> They sometimes forgot, a little bit, to encourage and remind posters to
> follow the posting guide when it was not followed. This then may have
> encouraged more posters to think it was ok not to follow the posting 
> guide.
> That is my own personal view,  not a statistical one backed up by any
> evidence.
>
> Matthew
>
>
> "Frank E Harrell Jr"  wrote in message
> news:4b913880.9020...@vanderbilt.edu...
>> Gabor Grothendieck wrote:
>>> I am happy to answer posts to r-help regardless of the name and email
>>> address of the poster but would draw the line at someone excessively
>>> posting without a reasonable effort to find the answer first or using
>>> it for homework since such requests could flood the list making it
>>> useless for everyone.
>>
>> Gabor I respectfully disagree.  

Re: [R] About the interaction A:B

2010-03-05 Thread Jeff Laake
You are correct that you need to use ~ -1 + A:B.  I use that all the time 
and just spaced it out when I was writing the response.  Using ~A:B will 
produce one too many columns.

Didn't follow your other question.  You can always look at the result of 
model.matrix to see if it is correct or dig into the internal code.  
Personally, I've never found it to have a problem.

--jeff

On 3/5/2010 10:16 AM, blue sky wrote:

On Fri, Mar 5, 2010 at 11:41 AM, Jeff Laake  wrote:
   

On 3/5/2010 9:19 AM, Frank E Harrell Jr wrote:
 

You neglected to state your name and affiliation, and your question
demonstrates an allergy to R documentation.

Frank
   

I agree with Frank but will try to answer some of your questions as I
understand it.

First, model.matrix uses the options$contrasts or what is set specifically
as the contrast for a factor using contrasts().  The default is
treatment contrasts and for a factor that means the first level of a factor
variable is the intercept and the remainder are "treatment" effects
which are added to intercept.  If you specify  Y~A+B+A:B, you are specifying
the model with main effects (A, B) and interactions (A:B).
 

So my interpretation of how R does internally when dealing with
interaction terms A:B is to look if each individual term (each sub
interactions in highway interactions like, A:B, A:C, etc., in A:B:C:D)
appears in the formula or not, and deciding how to construct the
columns of the model matrix according to A:B, right?

This seem quite complicated when a formula is arbitrarily complex, let
along different type of data (namely ordered, not ordered, numerical
numbers). The piece of information that I feel that is still missing
to me is the detailed procedure that R generate model.matrix for
arbitrary complex formulas and how it is proved the way that R does is
always correct for these complex formulas.

   

If A has m levels and B has n levels then there will be an intercept (1),
m-1 for A, n-1 for B and (m-1)*(n-1) for the interaction.
If you specify the model as Y~A:B it will specify the model fully with
interactions which will have m*n separate parameters and none will be
NA as long as you have data in each of the m*n cells.  It makes absolutely
no sense to me to have an intercept and then all but one of the
remaining interactions included.
 

I think that 'Y ~ A:B' indeed has the intercept term unless '-1' is
included. You may need to adjust your interpretation of A:B and A:B -1
in these cases.
   

dim(my_subset(model.matrix(Y ~ A:B - 1,fr)))
[1] 12 12

dim(my_subset(model.matrix(Y ~ A:B,fr)))
[1] 12 13


   

Note that you'll get quite different results if A and/or B are not factor
variables.

--jeff
 

blue sky wrote:
   

The following is the code for the model.matrix. But it still doesn't
answer why A:B is interpreted differently in Y~A+B+A:B and Y~A:B. By
'why', I mean how R internally does it and what is the rational behind
the way of doing it?

And it didn't answer why in the model.matrix of Y~A, there are a-1
terms from A plus the intercept, but in the model.matrix of Y~A:B,
there are a*b terms (rather than a*b-1 terms) plus the intercept.
Since the one coefficient of the lm of Y~A:B is going to be NA, why
bother to include the corresponding term in the model matrix?

code below

set.seed(0)

a=3
b=4

AB_effect=data.frame(
  name=paste(
unlist(
  do.call(
rbind
, rep(list(paste('A', letters[1:a],sep='')), b)
)
  )
, unlist(
  do.call(
cbind
, rep(list(paste('B', letters[1:b],sep='')), a)
)
  )
, sep=':'
)
  , value=rnorm(a*b)
  , stringsAsFactors=F
  )

max_n=10
n=sample.int(max_n, a*b, replace=T)

AB=mapply(function(name, n){rep(name,n)}, AB_effect$name, n)

Y=AB_effect$value[match(unlist(AB), AB_effect$name)]

Y=Y+a*b*rnorm(length(Y))

sub_fr=as.data.frame(do.call(rbind, strsplit(unlist(AB), ':')))
rownames(sub_fr)=NULL
colnames(sub_fr)=c('A', 'B')

fr=data.frame(Y=Y,sub_fr)

my_subset=function(amm) {
  coding=apply(
amm
,1
, function(x) {
  paste(x, collapse='')
}
)
  amm[match(unique(coding), coding),]
}

my_subset(model.matrix(Y ~ A*B,fr))
my_subset(model.matrix(Y ~ (A+B)^2,fr))
my_subset(model.matrix(Y ~ A + B + A:B,fr))
my_subset(model.matrix(Y ~ A:B - 1,fr))
my_subset(model.matrix(Y ~ A:B,fr))

On Fri, Mar 5, 2010 at 8:45 AM, Gabor Grothendieck
  wrote:
 

The way to understand this is to look at the output of model.matrix:

model.matrix(fo, fr)

for each formula you tried.  If your data is large you will have to
use a subset not to be overwhelmed with output.

On Fri, Mar 5, 2010 at 9:08 AM, blue sky  wrote:
   

Suppose, 'fr' is data.frame with columns 'Y', 'A' and 'B'. 'A' has
levels 'Aa'
'Ab' and 'Ac', and 'B' has levels 'Ba', 'Bb', 'Bc' and 'Bd'. 'Y'
columns are numbers.

I tried the following three sets of commands. I understand that A*B is
equivalent to A+B+A:B. However, A

Re: [R] How to match vector with a list ?

2010-03-05 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Carlos Petti
> Sent: Friday, March 05, 2010 9:43 AM
> To: r-help@r-project.org
> Subject: [R] How to match vector with a list ?
> 
> Dear list,
> 
> I have a vector of characters and a list of two named elements :
> 
> i <- c("a","a","b","b","b","c","c","d")
> 
> j <- list(j1 = c("a","c"), j2 = c("b","d"))
> 
> I'm looking for a fast way to obtain a vector with names, as follows :
> 
> [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

A request with such a nice copy-and-pastable
example in it deserves an answer.

It looks to me like you want to map the item names
in i to the group names that are the names of the list j,
which maps group names to the items in each group.
When there are lots of groups it can be faster to
first invert the list j into a mapping vector pair,
as in:

f2 <- function (i, j) {
groupNames <- rep(names(j), sapply(j, length)) # map to groupName
itemNames <- unlist(j, use.names = FALSE) # map from itemName
groupNames[match(i, itemNames, nomatch = NA)]
}

I put your original code into a function, as this makes
testing and development easier:

f0 <- function (i, j) {
match <- lapply(j, function(x) {
which(i %in% x)
})
k <- vector()
for (y in 1:length(match)) {
k[match[[y]]] <- names(match[y])
}
k
}

With your original data these give identical results:

> identical(f0(i,j), f2(i,j))
[1] TRUE

I made a list describing 1000 groups, each containing
an average of 10 members:

jBig <- split(paste("N",1:10000,sep=""),
sample(paste("G",1:1000,sep=""),size=10000,replace=TRUE))

and a vector of a million items sampled from the those
member names:

iBig <- sample(paste("N",1:10000,sep=""), replace=TRUE, size=1e6)

Then I compared the times it took f0 and f2 to compute
the result and verified that their outputs were identical:

> system.time(r0<-f0(iBig,jBig))
   user  system elapsed 
 100.89   10.20  111.27 
> system.time(r2<-f2(iBig,jBig))
   user  system elapsed 
   0.140.000.14 
> identical(r0,r2)
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> 
> I used :
> 
> match <- lapply(j, function (x) {which(i %in% x)})
> k <- vector()
> for (y  in 1:length(match)) {
> k[match[[y]]] <- names(match[y])}
> k
> [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
> 
> But, I think a better way exists ...
> 
> Thanks in advance,
> Carlos
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to match vector with a list ?

2010-03-05 Thread jim holtman
try this:

> i <- c("a","a","b","b","b","c","c","d")
>
> j <- list(j1 = c("a","c"), j2 = c("b","d"))
>
> # create a matrix for mapping
> map <- do.call(rbind, lapply(names(j), function(x) cbind(x, j[[x]])))
>  # generate your output
> map[match(i, map[,2]),1]
[1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
>


On Fri, Mar 5, 2010 at 12:42 PM, Carlos Petti  wrote:
> Dear list,
>
> I have a vector of characters and a list of two named elements :
>
> i <- c("a","a","b","b","b","c","c","d")
>
> j <- list(j1 = c("a","c"), j2 = c("b","d"))
>
> I'm looking for a fast way to obtain a vector with names, as follows :
>
> [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
>
> I used :
>
> match <- lapply(j, function (x) {which(i %in% x)})
> k <- vector()
> for (y  in 1:length(match)) {
> k[match[[y]]] <- names(match[y])}
> k
> [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
>
> But, I think a better way exists ...
>
> Thanks in advance,
> Carlos
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread John Sorkin
The sad part of this interchange is that Blue Sky does not seem to be amenable 
to suggestion. He, or she, has not taken note of, or responded to, the fact that a 
number of people believe it is good manners to give a real name and 
affiliation. My mother taught me that when two people tell you that you are 
drunk you should lie down until the inebriation goes away. Blue Sky, several 
people have noted that you would do well to give us your name and affiliation. 
Is this too much to ask given that people are good enough to help you?
John




John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)>>> "Matthew 
Dowle"  3/5/2010 12:58 PM >>>
Frank, I respect your views but I agree with Gabor.  The posting guide does 
not support your views.

It is not any of our views that are important but we are following the 
posting guide.  It covers affiliation. It says only that "some" consider it 
"good manners to include a concise signature specifying affiliation". It 
does not agree that it is bad manners not to.  It is therefore going too far 
to urge R-gurus, whoever they might be, to ignore such postings on that 
basis alone.  It is up to responders (I think that is the better word which 
is the one used by the posting guide) whether they reply.  Missing 
affiliation is ok by the posting guide.  Users shouldn't be put off from 
posting because of that alone.

Sending from an anonymous email address such as "BioStudent" is also fine by 
the posting guide as far as my eyes read it. It says only that the email 
address should work. I would also answer such anonymous posts, providing 
they demonstrate they made best efforts to follow the posting guide, as 
usual for all requests for help.  Its so easy to send from a false, but 
apparently real name, why worry about that?

If you disagree with the posting guide then you could make a suggestion to 
get the posting guide changed with respect to these points.  But, currently, 
good and practice is defined by the posting guide, and I can't see that your 
view is backed up by it.  In fact it seems to me that these points were 
carefully considered, and the wording is careful on these points.

As far as I know you are wrong that there is no moderator.  There are in 
fact an uncountable number of people who are empowered to moderate i.e. all 
of us. In other words its up to the responders to moderate.  The posting 
guide is our guide.  As a last resort we can alert the list administrator 
(which I believe is the correct name for him in that role), who has powers 
to remove an email address from the list if he thinks that is appropriate, 
or act otherwise, or not at all.  It is actually up to responders (i.e. all 
of us) to ensure the posting guide is followed.

My view is that the problems started with some responders on some occasions. 
They sometimes forgot, a little bit, to encourage and remind posters to 
follow the posting guide when it was not followed. This then may have 
encouraged more posters to think it was ok not to follow the posting guide. 
That is my own personal view,  not a statistical one backed up by any 
evidence.

Matthew


"Frank E Harrell Jr"  wrote in message 
news:4b913880.9020...@vanderbilt.edu...
> Gabor Grothendieck wrote:
>> I am happy to answer posts to r-help regardless of the name and email
>> address of the poster but would draw the line at someone excessively
>> posting without a reasonable effort to find the answer first or using
>> it for homework since such requests could flood the list making it
>> useless for everyone.
>
> Gabor I respectfully disagree.  It is bad practice to allow anonymous 
> postings.  We need to see real names and real affiliations.
>
> r-help is starting to border on uselessness because of the age old problem 
> of the same question being asked every two days, a high frequency of 
> specialty questions, and answers given with the best of intentions in 
> incremental or contradictory e-mail pieces (as opposed to a cumulative 
> wiki or hierarchically designed discussion web forum), as there is no 
> moderator for the list.  We don't need even more traffic from anonymous 
> postings.
>
> Frank
>
>>
>> On Fri, Mar 5, 2010 at 10:55 AM, Ravi Varadhan  
>> wrote:
>>> David,
>>>
>>> I agree with your sentiments.  I also think that it is bad posting 
>>> etiquette not to sign one's genuine name and affiliation when asking for 
>>> help, which "blue sky" seems to do a lot.  Bert Gunter has already 
>>> raised this issue, and I completely agree with him. I would also like to 
>>> urge the R-gurus to ignore such postings.
>>>
>>> Best,
>>> Ravi.
>>> 
>>>
>>> Ravi Varadhan, Ph.D.
>>> Assistant Professor,
>>> Di

Re: [R] [Rd] Changing the Prompt for browser()

2010-03-05 Thread Prof Brian Ripley

On Fri, 5 Mar 2010, Andrew Redd wrote:


Is there a way that I can change the prompt within a browser() call?  I
often use code like


with(obj1,browser())

Browse[1]>

Is there a way that I can set it so that I can get something like


with(obj1,browser(prompt="obj1"))

obj1[1]>

I know that prompt is not a valid option for browser, but it would be nice
if it were.  There is an options("prompt") but that does not affect the
prompt for browser.  Can I change this and how?


Only by changing the sources.  The prompt is hard-coded in 
src/main/main.c.




Thanks,
Andrew

[[alternative HTML version deleted]]


Please respect the request in the posting guide not to send HTML.


__
r-de...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Gabor Grothendieck
On Fri, Mar 5, 2010 at 12:58 PM, Matthew Dowle  wrote:
> As far as I know you are wrong that there is no moderator.  There are in
> fact an uncountable number of people who are empowered to moderate i.e. all
> of us. In other words its up to the responders to moderate.  The posting

I think moderator is being used in the sense of a person who receives
posts before they become public and allows or disallows each post.
Using that definition there is no moderator.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About the interaction A:B

2010-03-05 Thread blue sky
On Fri, Mar 5, 2010 at 11:41 AM, Jeff Laake  wrote:
> On 3/5/2010 9:19 AM, Frank E Harrell Jr wrote:
>>
>> You neglected to state your name and affiliation, and your question
>> demonstrates an allergy to R documentation.
>>
>> Frank
>
> I agree with Frank but will try to answer some of your questions as I
> understand it.
>
> First, model.matrix uses the options$contrasts or what is set specifically
> as the contrast for a factor using contrasts().  The default is
> treatment contrasts and for a factor that means the first level of a factor
> variable is the intercept and the remainder are "treatment" effects
> which are added to intercept.  If you specify  Y~A+B+A:B, you are specifying
> the model with main effects (A, B) and interactions (A:B).

So my interpretation of what R does internally when dealing with an
interaction term A:B is that it looks at whether each individual term
(each sub-interaction in higher-way interactions like A:B, A:C, etc., in
A:B:C:D) appears in the formula or not, and decides how to construct the
columns of the model matrix for A:B accordingly, right?

This seems quite complicated when a formula is arbitrarily complex, let
alone with different types of data (namely ordered factors, unordered
factors, and numerical variables). The piece of information that I feel is
still missing is the detailed procedure by which R generates the
model.matrix for arbitrarily complex formulas, and how it is proved that
the way R does it is always correct for these complex formulas.

> If A has m levels and B has n levels then there will be an intercept (1),
> m-1 for A, n-1 for B and (m-1)*(n-1) for the interaction.
> If you specify the model as Y~A:B it will specify the model fully with
> interactions which will have m*n separate parameters and none will be
> NA as long as you have data in each of the m*n cells.  It makes absolutely
> no sense to me to have an intercept and then all but one of the
> remaining interactions included.

I think that 'Y ~ A:B' indeed has the intercept term unless '-1' is
included. You may need to adjust your interpretation of A:B and A:B -1
in these cases.
> dim(my_subset(model.matrix(Y ~ A:B - 1,fr)))
[1] 12 12
> dim(my_subset(model.matrix(Y ~ A:B,fr)))
[1] 12 13


> Note that you'll get quite different results if A and/or B are not factor
> variables.
>
> --jeff
>>
>> blue sky wrote:
>>>
>>> The following is the code for the model.matrix. But it still doesn't
>>> answer why A:B is interpreted differently in Y~A+B+A:B and Y~A:B. By
>>> 'why', I mean how R internally does it and what is the rational behind
>>> the way of doing it?
>>>
>>> And it didn't answer why in the model.matrix of Y~A, there are a-1
>>> terms from A plus the intercept, but in the model.matrix of Y~A:B,
>>> there are a*b terms (rather than a*b-1 terms) plus the intercept.
>>> Since the one coefficient of the lm of Y~A:B is going to be NA, why
>>> bother to include the corresponding term in the model matrix?
>>>
>>> code below
>>>
>>> set.seed(0)
>>>
>>> a=3
>>> b=4
>>>
>>> AB_effect=data.frame(
>>>  name=paste(
>>>unlist(
>>>  do.call(
>>>rbind
>>>, rep(list(paste('A', letters[1:a],sep='')), b)
>>>)
>>>  )
>>>, unlist(
>>>  do.call(
>>>cbind
>>>, rep(list(paste('B', letters[1:b],sep='')), a)
>>>)
>>>  )
>>>, sep=':'
>>>)
>>>  , value=rnorm(a*b)
>>>  , stringsAsFactors=F
>>>  )
>>>
>>> max_n=10
>>> n=sample.int(max_n, a*b, replace=T)
>>>
>>> AB=mapply(function(name, n){rep(name,n)}, AB_effect$name, n)
>>>
>>> Y=AB_effect$value[match(unlist(AB), AB_effect$name)]
>>>
>>> Y=Y+a*b*rnorm(length(Y))
>>>
>>> sub_fr=as.data.frame(do.call(rbind, strsplit(unlist(AB), ':')))
>>> rownames(sub_fr)=NULL
>>> colnames(sub_fr)=c('A', 'B')
>>>
>>> fr=data.frame(Y=Y,sub_fr)
>>>
>>> my_subset=function(amm) {
>>>  coding=apply(
>>>amm
>>>,1
>>>, function(x) {
>>>  paste(x, collapse='')
>>>}
>>>)
>>>  amm[match(unique(coding), coding),]
>>> }
>>>
>>> my_subset(model.matrix(Y ~ A*B,fr))
>>> my_subset(model.matrix(Y ~ (A+B)^2,fr))
>>> my_subset(model.matrix(Y ~ A + B + A:B,fr))
>>> my_subset(model.matrix(Y ~ A:B - 1,fr))
>>> my_subset(model.matrix(Y ~ A:B,fr))
>>>
>>> On Fri, Mar 5, 2010 at 8:45 AM, Gabor Grothendieck
>>>  wrote:

 The way to understand this is to look at the output of model.matrix:

 model.matrix(fo, fr)

 for each formula you tried.  If your data is large you will have to
 use a subset not to be overwhelmed with output.

 On Fri, Mar 5, 2010 at 9:08 AM, blue sky  wrote:
>
> Suppose, 'fr' is data.frame with columns 'Y', 'A' and 'B'. 'A' has
> levels 'Aa'
> 'Ab' and 'Ac', and 'B' has levels 'Ba', 'Bb', 'Bc' and 'Bd'. 'Y'
> columns are numbers.
>
> I tried the following three sets of commands. I understand that A*B is
> equivalent to A+B+A:B. However, A:B in A+B+A:B is different from A:B
> just by itself (see the 3rd and 4th set of commands). Would you please
> hel

Re: [R] About the interaction A:B

2010-03-05 Thread RICHARD M. HEIBERGER
On Fri, Mar 5, 2010 at 12:41 PM, Jeff Laake  wrote:

> On 3/5/2010 9:19 AM, Frank E Harrell Jr wrote:
>
>> You neglected to state your name and affiliation, and your question
>> demonstrates an allergy to R documentation.
>
>
A:B indicates a two-level naming scheme.  The naming is identical in any of
the formulas
~ A:B
~ A + A:B
~ A + B + A:B

The degrees of freedom associated with A:B depend on what other terms are
in the model.
The number of dummy variables that are retained matches the number of
degrees of freedom.
Any dummy variables that are linearly dependent on the dummy variables for
the earlier terms
are suppressed.

Please read a good intermediate statistics book for the full story.
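A quick way to see the column counts, with made-up three- and four-level factors:

d <- expand.grid(A = factor(c("Aa","Ab","Ac")), B = factor(c("Ba","Bb","Bc","Bd")))
ncol(model.matrix(~ A + B + A:B, d))   # 12: 1 + 2 + 3 + 6
ncol(model.matrix(~ A:B,         d))   # 13: intercept + 12 cells (one ends up aliased in lm)
ncol(model.matrix(~ A:B - 1,     d))   # 12: one dummy per cell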

Rich

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave and optional document sections

2010-03-05 Thread Gabor Grothendieck
Its possible to do that using latex if/then/else macros which you can
find in latex package ifthen; however, once you are getting to that
level of complexity, Sweave is more of a hindrence than a help and is
probably no longer the appropriate tool for the job.

On Fri, Mar 5, 2010 at 12:55 PM, Aleksey Naumov  wrote:
> Dear R and Sweave users,
>
> Is there a way to have optional sections in a Sweave-generated report
> document, complete with section header(s), text and code chunks? In other
> words, I'd like for my report to include or omit certain sections based on
> the data itself.
>
> For example, If I examine the input dataset early on in the report and set a
> variable has_daily_data = TRUE, then I'd like to have a section that looks
> at the daily data in detail, with subsections, text and code chunks. What
> facilities do I have in Sweave (or odfWeave) for that?
>
> Perhaps, for code chunks, I could wrap all the code inside in
> if(has_daily_data) {...}? What about inserting conditional text and
> sections? Would it make sense to split the document into subdocuments and
> somehow include the optional subdocuments only if needed? How would I go
> about that?
>
> Would appreciate any pointers. Thank you,
> Aleksey
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthew Dowle
Frank, I respect your views but I agree with Gabor.  The posting guide does 
not support your views.

It is not any of our views that are important but we are following the 
posting guide.  It covers affiliation. It says only that "some" consider it 
"good manners to include a concise signature specifying affiliation". It 
does not agree that it is bad manners not to.  It is therefore going too far 
to urge R-gurus, whoever they might be, to ignore such postings on that 
basis alone.  It is up to responders (I think that is the better word which 
is the one used by the posting guide) whether they reply.  Missing 
affiliation is ok by the posting guide.  Users shouldn't be put off from 
posting because of that alone.

Sending from an anonymous email address such as "BioStudent" is also fine by 
the posting guide as far as my eyes read it. It says only that the email 
address should work. I would also answer such anonymous posts, providing 
they demonstrate they made best efforts to follow the posting guide, as 
usual for all requests for help.  It's so easy to send from a false, but 
apparently real name, why worry about that?

If you disagree with the posting guide then you could make a suggestion to 
get the posting guide changed with respect to these points.  But, currently, 
good practice is defined by the posting guide, and I can't see that your 
view is backed up by it.  In fact it seems to me that these points were 
carefully considered, and the wording is careful on these points.

As far as I know you are wrong that there is no moderator.  There are in 
fact an uncountable number of people who are empowered to moderate i.e. all 
of us. In other words its up to the responders to moderate.  The posting 
guide is our guide.  As a last resort we can alert the list administrator 
(which I believe is the correct name for him in that role), who has powers 
to remove an email address from the list if he thinks that is appropriate, 
or act otherwise, or not at all.  It is actually up to responders (i.e. all 
of us) to ensure the posting guide is followed.

My view is that the problems started with some responders on some occasions. 
They sometimes forgot, a little bit, to encourage and remind posters to 
follow the posting guide when it was not followed. This then may have 
encouraged more posters to think it was ok not to follow the posting guide. 
That is my own personal view,  not a statistical one backed up by any 
evidence.

Matthew


"Frank E Harrell Jr"  wrote in message 
news:4b913880.9020...@vanderbilt.edu...
> Gabor Grothendieck wrote:
>> I am happy to answer posts to r-help regardless of the name and email
>> address of the poster but would draw the line at someone excessively
>> posting without a reasonable effort to find the answer first or using
>> it for homework since such requests could flood the list making it
>> useless for everyone.
>
> Gabor I respectfully disagree.  It is bad practice to allow anonymous 
> postings.  We need to see real names and real affiliations.
>
> r-help is starting to border on uselessness because of the age old problem 
> of the same question being asked every two days, a high frequency of 
> specialty questions, and answers given with the best of intentions in 
> incremental or contradictory e-mail pieces (as opposed to a cumulative 
> wiki or hierarchically designed discussion web forum), as there is no 
> moderator for the list.  We don't need even more traffic from anonymous 
> postings.
>
> Frank
>
>>
>> On Fri, Mar 5, 2010 at 10:55 AM, Ravi Varadhan  
>> wrote:
>>> David,
>>>
>>> I agree with your sentiments.  I also think that it is bad posting 
>>> etiquette not to sign one's genuine name and affiliation when asking for 
>>> help, which "blue sky" seems to do a lot.  Bert Gunter has already 
>>> raised this issue, and I completely agree with him. I would also like to 
>>> urge the R-gurus to ignore such postings.
>>>
>>> Best,
>>> Ravi.
>>> 
>>>
>>> Ravi Varadhan, Ph.D.
>>> Assistant Professor,
>>> Division of Geriatric Medicine and Gerontology
>>> School of Medicine
>>> Johns Hopkins University
>>>
>>> Ph. (410) 502-2619
>>> email: rvarad...@jhmi.edu
>>>
>>>
>>> - Original Message -
>>> From: David Winsemius 
>>> Date: Friday, March 5, 2010 9:25 am
>>> Subject: Re: [R] Nonparametric generalization of ANOVA
>>> To: blue sky 
>>> Cc: r-h...@stat.math.ethz.ch
>>>
>>>
  On Mar 5, 2010, at 8:19 AM, blue sky wrote:

  > My interpretation of the relation between 1-way ANOVA and Wilcoxon's
  > test (wilcox.test() in R) is the following.
  >
  > 1-way ANOVA is to test if two or multiple distributions are the 
 same,
  > assuming all the distributions are normal and have equal variances.
  > Wilcoxon's test is to test two distributions are the same without
  > assuming what their distributions are.
  >
  > In this sense, I'm wondering what

[R] Sweave and optional document sections

2010-03-05 Thread Aleksey Naumov
Dear R and Sweave users,

Is there a way to have optional sections in a Sweave-generated report
document, complete with section header(s), text and code chunks? In other
words, I'd like for my report to include or omit certain sections based on
the data itself.

For example, if I examine the input dataset early on in the report and set a
variable has_daily_data = TRUE, then I'd like to have a section that looks
at the daily data in detail, with subsections, text and code chunks. What
facilities do I have in Sweave (or odfWeave) for that?

Perhaps, for code chunks, I could wrap all the code inside
if(has_daily_data) {...}? What about inserting conditional text and
sections? Would it make sense to split the document into subdocuments and
somehow include the optional subdocuments only if needed? How would I go
about that?

Would appreciate any pointers. Thank you,
Aleksey

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to match vector with a list ?

2010-03-05 Thread Carlos Petti
Dear list,

I have a vector of characters and a list of two named elements :

i <- c("a","a","b","b","b","c","c","d")

j <- list(j1 = c("a","c"), j2 = c("b","d"))

I'm looking for a fast way to obtain a vector with names, as follows :

[1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

I used :

match <- lapply(j, function (x) {which(i %in% x)})
k <- vector()
for (y  in 1:length(match)) {
k[match[[y]]] <- names(match[y])}
k
[1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

But, I think a better way exists ...

Thanks in advance,
Carlos

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About the interaction A:B

2010-03-05 Thread Jeff Laake

On 3/5/2010 9:19 AM, Frank E Harrell Jr wrote:
You neglected to state your name and affiliation, and your question 
demonstrates an allergy to R documentation.


Frank

I agree with Frank but will try to answer some of your questions as I 
understand it.


First, model.matrix uses the options$contrasts or what is set 
specifically as the contrast for a factor using contrasts().  The 
default is
treatment contrasts and for a factor that means the first level of a 
factor variable is the intercept and the remainder are "treatment" effects
which are added to intercept.  If you specify  Y~A+B+A:B, you are 
specifying the model with main effects (A, B) and interactions (A:B).
If A has m levels and B has n levels then there will be an intercept 
(1), m-1 for A, n-1 for B and (m-1)*(n-1) for the interaction.
If you specify the model as Y~A:B it will specify the model fully with 
interactions which will have m*n separate parameters and none will be
NA as long as you have data in each of the m*n cells.  It makes 
absolutely no sense to me to have an intercept and then all but one of the
remaining interactions included.

Note that you'll get quite different results if A and/or B are not 
factor variables.


--jeff


blue sky wrote:

The following is the code for the model.matrix. But it still doesn't
answer why A:B is interpreted differently in Y~A+B+A:B and Y~A:B. By
'why', I mean how R internally does it and what is the rational behind
the way of doing it?

And it didn't answer why in the model.matrix of Y~A, there are a-1
terms from A plus the intercept, but in the model.matrix of Y~A:B,
there are a*b terms (rather than a*b-1 terms) plus the intercept.
Since the one coefficient of the lm of Y~A:B is going to be NA, why
bother to include the corresponding term in the model matrix?

code below

set.seed(0)

a=3
b=4

AB_effect=data.frame(
  name=paste(
unlist(
  do.call(
rbind
, rep(list(paste('A', letters[1:a],sep='')), b)
)
  )
, unlist(
  do.call(
cbind
, rep(list(paste('B', letters[1:b],sep='')), a)
)
  )
, sep=':'
)
  , value=rnorm(a*b)
  , stringsAsFactors=F
  )

max_n=10
n=sample.int(max_n, a*b, replace=T)

AB=mapply(function(name, n){rep(name,n)}, AB_effect$name, n)

Y=AB_effect$value[match(unlist(AB), AB_effect$name)]

Y=Y+a*b*rnorm(length(Y))

sub_fr=as.data.frame(do.call(rbind, strsplit(unlist(AB), ':')))
rownames(sub_fr)=NULL
colnames(sub_fr)=c('A', 'B')

fr=data.frame(Y=Y,sub_fr)

my_subset=function(amm) {
  coding=apply(
amm
,1
, function(x) {
  paste(x, collapse='')
}
)
  amm[match(unique(coding), coding),]
}

my_subset(model.matrix(Y ~ A*B,fr))
my_subset(model.matrix(Y ~ (A+B)^2,fr))
my_subset(model.matrix(Y ~ A + B + A:B,fr))
my_subset(model.matrix(Y ~ A:B - 1,fr))
my_subset(model.matrix(Y ~ A:B,fr))

On Fri, Mar 5, 2010 at 8:45 AM, Gabor Grothendieck
 wrote:

The way to understand this is to look at the output of model.matrix:

model.matrix(fo, fr)

for each formula you tried.  If your data is large you will have to
use a subset not to be overwhelmed with output.

On Fri, Mar 5, 2010 at 9:08 AM, blue sky  wrote:
Suppose, 'fr' is data.frame with columns 'Y', 'A' and 'B'. 'A' has 
levels 'Aa'

'Ab' and 'Ac', and 'B' has levels 'Ba', 'Bb', 'Bc' and 'Bd'. 'Y'
columns are numbers.

I tried the following three sets of commands. I understand that A*B is
equivalent to A+B+A:B. However, A:B in A+B+A:B is different from A:B
just by itself (see the 3rd and 4th set of commands). Would you please
help me understand why the meanings of A:B are different in different
contexts?

I also see the coefficient of AAc:BBd is NA (the last set of
commands). I'm wondering why this coefficient is not removed from the
'coefficients' vector. Since lm(Y~A) has coefficients for (intercept),
Ab, Ac (there are no NA's), I think that it is reasonable to make sure
that there are no NA's when there are interaction terms, namely, A:B
in this case.

Thank you for answering my questions!


alm=lm(Y ~ A*B,fr)
alm$coefficients
(Intercept)         AAb         AAc         BBb         BBc         BBd
  -3.548176   -2.086586    7.003857    4.367800   11.887356   -3.470840
    AAb:BBb     AAc:BBb     AAb:BBc     AAc:BBc     AAb:BBd     AAc:BBd
   5.160865  -11.858425  -12.853116  -20.289611    6.727401   -2.327173

alm=lm(Y ~ A + B + A:B,fr)
alm$coefficients
(Intercept)         AAb         AAc         BBb         BBc         BBd
  -3.548176   -2.086586    7.003857    4.367800   11.887356   -3.470840
    AAb:BBb     AAc:BBb     AAb:BBc     AAc:BBc     AAb:BBd     AAc:BBd
   5.160865  -11.858425  -12.853116  -20.289611    6.727401   -2.327173

alm=lm(Y ~ A:B - 1,fr)
alm$coefficients
   AAa:BBa    AAb:BBa    AAc:BBa    AAa:BBb    AAb:BBb    AAc:BBb    AAa:BBc
-3.5481765 -5.6347625  3.4556808  0.8196231  3.8939016 -4.0349449  8.3391795
   AAb:BBc    AAc:BBc    AAa:BBd    AAb:BBd    AAc:BBd
-6.6005222 -4.9465744 -7.0190168 -2.3782017 -2.34

Re: [R] Running complete R script from Java

2010-03-05 Thread Romain Francois

You can source the script, e.g. run the command:

eval( "source( '" + script + "') " ) ;

Questions about rJava/JRI are better asked on the stats-rosuda-devel mailing 
list:

http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel

Romain

On 03/05/2010 03:51 AM, Ralf B wrote:


Is it possible to run a R script from Java (via JRI (part of rJava):
http://www.rforge.net/rJava/) without adding it line by line into a
JRI java application?

Ralf


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
`- http://tr.im/O1wO : highlight 0.1-5

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread blue sky
I don't see which link has the GENERAL and COMPLETE MATHEMATICAL
description of nonparametric ANOVA for an ARBITRARY MODEL. Would you
please be specific about which one does so?

On Fri, Mar 5, 2010 at 10:52 AM, Jeremy Miles  wrote:
> Two links for you which will get your answer much quicker than a mailing list:
>
> http://lmgtfy.com/?q=non-parametric+anova+R
>
> or
>
> http://www.justfuckinggoogleit.com/search.pl?query=non+parametric+anova+R
>
> Jeremy
>
>
> On 5 March 2010 05:19, blue sky  wrote:
>> My interpretation of the relation between 1-way ANOVA and Wilcoxon's
>> test (wilcox.test() in R) is the following.
>>
>> 1-way ANOVA is to test if two or multiple distributions are the same,
>> assuming all the distributions are normal and have equal variances.
>> Wilcoxon's test is to test two distributions are the same without
>> assuming what their distributions are.
>>
>> In this sense, I'm wondering what is the generalization of Wilcoxon's
>> test to more than two distributions. And, more general, what is the
>> generalization of Wilcoxon's test to multi-way ANOVA with arbitrary
>> complex model formula? What are the equivalent F statistics and t
>> statistics in the generalization of Wilcoxon's test?
>>
>> Note that I'm not interested in looking for a specific nonparametric
>> test for a particular dataset right now, although this is important in
>> practice. What I'm interested the general nonparametric statistical
>> framework that parallels ANOVA. Could somebody give some hints on what
>> references I should look for? I have google searched this topic, but
>> don't find a page that exactly answered my question.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jeremy Miles
> Psychology Research Methods Wiki: www.researchmethodsinpsychology.com
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About the interaction A:B

2010-03-05 Thread Frank E Harrell Jr
You neglected to state your name and affiliation, and your question 
demonstrates an allergy to R documentation.


Frank

blue sky wrote:

The following is the code for the model.matrix. But it still doesn't
answer why A:B is interpreted differently in Y~A+B+A:B and Y~A:B. By
'why', I mean how R does it internally and what the rationale behind
that behaviour is.

It also doesn't answer why the model.matrix of Y~A has a-1
terms from A plus the intercept, while the model.matrix of Y~A:B
has a*b terms (rather than a*b-1 terms) plus the intercept.
Since one coefficient of the lm of Y~A:B is going to be NA, why
bother to include the corresponding term in the model matrix?

code below

set.seed(0)

a=3
b=4

AB_effect=data.frame(
  name=paste(
unlist(
  do.call(
rbind
, rep(list(paste('A', letters[1:a],sep='')), b)
)
  )
, unlist(
  do.call(
cbind
, rep(list(paste('B', letters[1:b],sep='')), a)
)
  )
, sep=':'
)
  , value=rnorm(a*b)
  , stringsAsFactors=F
  )

max_n=10
n=sample.int(max_n, a*b, replace=T)

AB=mapply(function(name, n){rep(name,n)}, AB_effect$name, n)

Y=AB_effect$value[match(unlist(AB), AB_effect$name)]

Y=Y+a*b*rnorm(length(Y))

sub_fr=as.data.frame(do.call(rbind, strsplit(unlist(AB), ':')))
rownames(sub_fr)=NULL
colnames(sub_fr)=c('A', 'B')

fr=data.frame(Y=Y,sub_fr)

my_subset=function(amm) {
  coding=apply(
amm
,1
, function(x) {
  paste(x, collapse='')
}
)
  amm[match(unique(coding), coding),]
}

my_subset(model.matrix(Y ~ A*B,fr))
my_subset(model.matrix(Y ~ (A+B)^2,fr))
my_subset(model.matrix(Y ~ A + B + A:B,fr))
my_subset(model.matrix(Y ~ A:B - 1,fr))
my_subset(model.matrix(Y ~ A:B,fr))

On Fri, Mar 5, 2010 at 8:45 AM, Gabor Grothendieck
 wrote:

The way to understand this is to look at the output of model.matrix:

model.matrix(fo, fr)

for each formula you tried.  If your data is large you will have to
use a subset not to be overwhelmed with output.

On Fri, Mar 5, 2010 at 9:08 AM, blue sky  wrote:

Suppose, 'fr' is data.frame with columns 'Y', 'A' and 'B'. 'A' has levels 'Aa'
'Ab' and 'Ac', and 'B' has levels 'Ba', 'Bb', 'Bc' and 'Bd'. 'Y'
columns are numbers.

I tried the following three sets of commands. I understand that A*B is
equivalent to A+B+A:B. However, A:B in A+B+A:B is different from A:B
just by itself (see the 3rd and 4th set of commands). Would you please
help me understand why the meanings of A:B are different in different
contexts?

I also see the coefficient of AAc:BBd is NA (the last set of
commands). I'm wondering why this coefficient is not removed from the
'coefficients' vector. Since lm(Y~A) has coefficients for (intercept),
Ab, Ac (there are no NA's), I think that it is reasonable to make sure
that there are no NA's when there are interaction terms, namely, A:B
in this case.

Thank you for answering my questions!


alm=lm(Y ~ A*B,fr)
alm$coefficients

(Intercept)         AAb         AAc         BBb         BBc         BBd
 -3.548176   -2.086586    7.003857    4.367800   11.887356   -3.470840
    AAb:BBb     AAc:BBb     AAb:BBc     AAc:BBc     AAb:BBd     AAc:BBd
   5.160865  -11.858425  -12.853116  -20.289611    6.727401   -2.327173

alm=lm(Y ~ A + B + A:B,fr)
alm$coefficients

(Intercept)         AAb         AAc         BBb         BBc         BBd
 -3.548176   -2.086586    7.003857    4.367800   11.887356   -3.470840
    AAb:BBb     AAc:BBb     AAb:BBc     AAc:BBc     AAb:BBd     AAc:BBd
   5.160865  -11.858425  -12.853116  -20.289611    6.727401   -2.327173

alm=lm(Y ~ A:B - 1,fr)
alm$coefficients

   AAa:BBa    AAb:BBa    AAc:BBa    AAa:BBb    AAb:BBb    AAc:BBb    AAa:BBc
-3.5481765 -5.6347625  3.4556808  0.8196231  3.8939016 -4.0349449  8.3391795
   AAb:BBc    AAc:BBc    AAa:BBd    AAb:BBd    AAc:BBd
-6.6005222 -4.9465744 -7.0190168 -2.3782017 -2.3423322

alm=lm(Y ~ A:B,fr)
alm$coefficients

(Intercept) AAa:BBa AAb:BBa AAc:BBa AAa:BBb AAb:BBb
-2.34233221 -1.20584424 -3.29243033  5.79801305  3.16195534  6.23623377
  AAc:BBb AAa:BBc AAb:BBc AAc:BBc AAa:BBd AAb:BBd
-1.69261273 10.68151168 -4.25819000 -2.60424217 -4.67668454 -0.03586951
  AAc:BBd
   NA



--
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread blue sky
On Fri, Mar 5, 2010 at 9:42 AM, David Winsemius  wrote:
>
> On Mar 5, 2010, at 10:34 AM, Matthias Gondan wrote:
>
>>> This is your first of three postings in the last hour and they are all in
>>> a category that could well be described as requests for tutoring in basic
>>> statistical topics. I am of the impression you have been requested not to
>>> engage in such behavior on this list. For this question for instance
>>> there is an entire CRAN Task View available and you have been in
>>> particular asked to sue such resource before posting.
>>
>> Please allow me to ask for details on this task view, because I am
>> interested in the topic of nonparametric ANOVAs, as well. To my knowledge,
>> there are some R scripts from Brunner et al. available on his website
>>
>> http://www.ams.med.uni-goettingen.de/de/sof/ld/index.html

I don't understand German. There are two references in English, though.
Do either of them give a description of nonparametric ANOVA in a very
general way?

Brunner, E. , Domhof, S. und Langer,F. (2002): Nonparametric Analysis
of Longitudinal Data in Factorial Experiments. Wiley, New York.
Brunner, E. und Puri, M.L.. (2001): Nonparametric Methods in Factorial
Designs. Statistical Papers 42, 1-52.

>> But they seem not to be working with current R versions.
>
> http://finzi.psych.upenn.edu/views/Robust.html

I think that robust analysis and nonparametric analysis are different,
if I correctly understand the discussion in the introduction of
Robust Statistics, 2nd Ed., by Huber and Ronchetti.

>> Best regards,
>>
>> Matthias Gondan
>>
>>
> --
>
> David Winsemius, MD
> West Hartford, CT
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About the interaction A:B

2010-03-05 Thread blue sky
The following is the code for the model.matrix. But it still doesn't
answer why A:B is interpreted differently in Y~A+B+A:B and Y~A:B. By
'why', I mean how R does it internally and what the rationale behind
that behaviour is.

It also doesn't answer why the model.matrix of Y~A has a-1
terms from A plus the intercept, while the model.matrix of Y~A:B
has a*b terms (rather than a*b-1 terms) plus the intercept.
Since one coefficient of the lm of Y~A:B is going to be NA, why
bother to include the corresponding term in the model matrix?

code below

set.seed(0)

a=3
b=4

AB_effect=data.frame(
  name=paste(
unlist(
  do.call(
rbind
, rep(list(paste('A', letters[1:a],sep='')), b)
)
  )
, unlist(
  do.call(
cbind
, rep(list(paste('B', letters[1:b],sep='')), a)
)
  )
, sep=':'
)
  , value=rnorm(a*b)
  , stringsAsFactors=F
  )

max_n=10
n=sample.int(max_n, a*b, replace=T)

AB=mapply(function(name, n){rep(name,n)}, AB_effect$name, n)

Y=AB_effect$value[match(unlist(AB), AB_effect$name)]

Y=Y+a*b*rnorm(length(Y))

sub_fr=as.data.frame(do.call(rbind, strsplit(unlist(AB), ':')))
rownames(sub_fr)=NULL
colnames(sub_fr)=c('A', 'B')

fr=data.frame(Y=Y,sub_fr)

my_subset=function(amm) {
  coding=apply(
amm
,1
, function(x) {
  paste(x, collapse='')
}
)
  amm[match(unique(coding), coding),]
}

my_subset(model.matrix(Y ~ A*B,fr))
my_subset(model.matrix(Y ~ (A+B)^2,fr))
my_subset(model.matrix(Y ~ A + B + A:B,fr))
my_subset(model.matrix(Y ~ A:B - 1,fr))
my_subset(model.matrix(Y ~ A:B,fr))

On Fri, Mar 5, 2010 at 8:45 AM, Gabor Grothendieck
 wrote:
> The way to understand this is to look at the output of model.matrix:
>
> model.matrix(fo, fr)
>
> for each formula you tried.  If your data is large you will have to
> use a subset not to be overwhelmed with output.
>
> On Fri, Mar 5, 2010 at 9:08 AM, blue sky  wrote:
>> Suppose, 'fr' is data.frame with columns 'Y', 'A' and 'B'. 'A' has levels 
>> 'Aa'
>> 'Ab' and 'Ac', and 'B' has levels 'Ba', 'Bb', 'Bc' and 'Bd'. 'Y'
>> columns are numbers.
>>
>> I tried the following three sets of commands. I understand that A*B is
>> equivalent to A+B+A:B. However, A:B in A+B+A:B is different from A:B
>> just by itself (see the 3rd and 4th set of commands). Would you please
>> help me understand why the meanings of A:B are different in different
>> contexts?
>>
>> I also see the coefficient of AAc:BBd is NA (the last set of
>> commands). I'm wondering why this coefficient is not removed from the
>> 'coefficients' vector. Since lm(Y~A) has coefficients for (intercept),
>> Ab, Ac (there are no NA's), I think that it is reasonable to make sure
>> that there are no NA's when there are interaction terms, namely, A:B
>> in this case.
>>
>> Thank you for answering my questions!
>>
>>> alm=lm(Y ~ A*B,fr)
>>> alm$coefficients
>> (Intercept)         AAb         AAc         BBb         BBc         BBd
>>  -3.548176   -2.086586    7.003857    4.367800   11.887356   -3.470840
>>   AAb:BBb     AAc:BBb     AAb:BBc     AAc:BBc     AAb:BBd     AAc:BBd
>>  5.160865  -11.858425  -12.853116  -20.289611    6.727401   -2.327173
>>>
>>> alm=lm(Y ~ A + B + A:B,fr)
>>> alm$coefficients
>> (Intercept)         AAb         AAc         BBb         BBc         BBd
>>  -3.548176   -2.086586    7.003857    4.367800   11.887356   -3.470840
>>   AAb:BBb     AAc:BBb     AAb:BBc     AAc:BBc     AAb:BBd     AAc:BBd
>>  5.160865  -11.858425  -12.853116  -20.289611    6.727401   -2.327173
>>>
>>> alm=lm(Y ~ A:B - 1,fr)
>>> alm$coefficients
>>  AAa:BBa    AAb:BBa    AAc:BBa    AAa:BBb    AAb:BBb    AAc:BBb    AAa:BBc
>> -3.5481765 -5.6347625  3.4556808  0.8196231  3.8939016 -4.0349449  8.3391795
>>  AAb:BBc    AAc:BBc    AAa:BBd    AAb:BBd    AAc:BBd
>> -6.6005222 -4.9465744 -7.0190168 -2.3782017 -2.3423322
>>>
>>> alm=lm(Y ~ A:B,fr)
>>> alm$coefficients
>> (Intercept)     AAa:BBa     AAb:BBa     AAc:BBa     AAa:BBb     AAb:BBb
>> -2.34233221 -1.20584424 -3.29243033  5.79801305  3.16195534  6.23623377
>>   AAc:BBb     AAa:BBc     AAb:BBc     AAc:BBc     AAa:BBd     AAb:BBd
>> -1.69261273 10.68151168 -4.25819000 -2.60424217 -4.67668454 -0.03586951
>>   AAc:BBd
>>        NA
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Frank E Harrell Jr

Gabor Grothendieck wrote:

I am happy to answer posts to r-help regardless of the name and email
address of the poster but would draw the line at someone excessively
posting without a reasonable effort to find the answer first or using
it for homework since such requests could flood the list making it
useless for everyone.


Gabor, I respectfully disagree.  It is bad practice to allow anonymous
postings.  We need to see real names and real affiliations.


r-help is starting to border on uselessness because of the age-old 
problem of the same question being asked every two days, a high 
frequency of specialty questions, and answers given with the best of 
intentions in incremental or contradictory e-mail pieces (as opposed to 
a cumulative wiki or hierarchically designed discussion web forum), as 
there is no moderator for the list.  We don't need even more traffic 
from anonymous postings.


Frank



On Fri, Mar 5, 2010 at 10:55 AM, Ravi Varadhan  wrote:

David,

I agree with your sentiments.  I also think that it is bad posting etiquette not to sign 
one's genuine name and affiliation when asking for help, which "blue sky" seems 
to do a lot.  Bert Gunter has already raised this issue, and I completely agree with him. 
I would also like to urge the R-gurus to ignore such postings.

Best,
Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: David Winsemius 
Date: Friday, March 5, 2010 9:25 am
Subject: Re: [R] Nonparametric generalization of ANOVA
To: blue sky 
Cc: r-h...@stat.math.ethz.ch



 On Mar 5, 2010, at 8:19 AM, blue sky wrote:

 > My interpretation of the relation between 1-way ANOVA and Wilcoxon's
 > test (wilcox.test() in R) is the following.
 >
 > 1-way ANOVA is to test if two or multiple distributions are the same,
 > assuming all the distributions are normal and have equal variances.
 > Wilcoxon's test is to test two distributions are the same without
 > assuming what their distributions are.
 >
 > In this sense, I'm wondering what is the generalization of Wilcoxon's
 > test to more than two distributions. And, more general, what is the
 > generalization of Wilcoxon's test to multi-way ANOVA with arbitrary
 > complex model formula? What are the equivalent F statistics and t
 > statistics in the generalization of Wilcoxon's test?
 >
 > Note that I'm not interested in looking for a specific nonparametric
 > test for a particular dataset right now, although this is important
in
 > practice. What I'm interested the general nonparametric statistical
 > framework that parallels ANOVA. Could somebody give some hints on what
 > references I should look for? I have google searched this topic, but
 > don't find a page that exactly answered my question.

 This is your first of three postings in the last hour and they are
all
 in a category that could well be described as requests for tutoring
in
 basic statistical topics. I am of the impression you have been
 requested not to engage in such behavior on this list. For this
 question for instance there is an entire CRAN Task View available and

 you have been in particular asked to sue such resource before posting.

 It's not the described role of the r-help list to remediate your lack

 of statistical background, but rather to deal with difficulties in
 applying the R-language to particular, discrete and exemplified
 problems.

 --

 David Winsemius, MD
 West Hartford, CT

 __
 R-help@r-project.org mailing list

 PLEASE do read the posting guide
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cross tabulation with fixed dimensions

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 11:43 AM, Tarmo Remmel wrote:


Hello,

This is a seemingly simple task, but it has been frustrating me for too
long, so I am turning to this list for some help.

I have two vectors of factors which are quite long; two simple examples are
shown here:


No, those are not "factors" in the R sense... but they _should_ be:

> a <- factor(c(1,2,3,4,5), levels=1:6)
> b <- factor(c(1,2,5,5,6), levels=1:6)
>
> tab <- table(a,b)
> tab
   b
a   1 2 3 4 5 6
  1 1 0 0 0 0 0
  2 0 1 0 0 0 0
  3 0 0 0 0 1 0
  4 0 0 0 0 1 0
  5 0 0 0 0 0 1
  6 0 0 0 0 0 0




a <- c(1,2,3,4,5)
b <- c(1,2,5,5,6)


If I produce a cross-tabulation of these vectors, I get the following:


tab <- table(a,b)



table(a,b)

  b
a   1 2 5 6
 1 1 0 0 0
 2 0 1 0 0
 3 0 0 1 0
 4 0 0 1 0
 5 0 0 0 1

I really want to have this cross-tabulation expanded to include all factors
present in the input data.  Thus, I need additional columns for categories 3
and 4, and a row for category 6, albeit, they will be filled with zeros.  Is
there an easy way of doing this?

Thanks,

Tarmo




Tarmo K. Remmel Ph.D.
Assistant Professor, Department of Geography
York University, N413A Ross Building
Toronto, Ontario, M3J 1P3, Canada
Tel: 416-736-2100 x22496; Fax: 416-736-5988
http://www.yorku.ca/remmelt
Skype: tarmoremmel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cross tabulation with fixed dimensions

2010-03-05 Thread Gabor Grothendieck
Those are numerical, not factors.  If they were factors all levels
would be represented.

> a <- factor(c(1,2,3,4,5), levels = 1:5)
> b <- factor(c(1,2,5,5,6), levels = 1:6)
> table(a, b)
   b
a   1 2 3 4 5 6
  1 1 0 0 0 0 0
  2 0 1 0 0 0 0
  3 0 0 0 0 1 0
  4 0 0 0 0 1 0
  5 0 0 0 0 0 1
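
If a and b start out as plain numeric vectors, as in the original post below,
a small sketch of one way to get the complete table is to give both the same
set of levels first:

a <- c(1, 2, 3, 4, 5)
b <- c(1, 2, 5, 5, 6)

lev <- sort(union(a, b))      # every category observed in either vector
table(factor(a, levels = lev), factor(b, levels = lev))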

On Fri, Mar 5, 2010 at 11:43 AM, Tarmo Remmel  wrote:
> Hello,
>
> This is a seemingly simple task, but it has been frustrating me for too
> long, so I am turning to this list for some help.
>
> I have two vectors of factors which are quite long; two simple examples are
> shown here:
>
>> a <- c(1,2,3,4,5)
>> b <- c(1,2,5,5,6)
>
> If I produce a cross-tabulation of these vectors, I get the following:
>
>> tab <- table(a,b)
>
>> table(a,b)
>   b
> a   1 2 5 6
>  1 1 0 0 0
>  2 0 1 0 0
>  3 0 0 1 0
>  4 0 0 1 0
>  5 0 0 0 1
>
> I really want to have this cross-tabulation expanded to include all factors
> present in the input data.  Thus, I need additional columns for categories 3
> and 4, and a row for category 6, albeit, they will be filled with zeros.  Is
> there an easy way of doing this?
>
> Thanks,
>
> Tarmo
>
>
>
> 
> Tarmo K. Remmel Ph.D.
> Assistant Professor, Department of Geography
> York University, N413A Ross Building
> Toronto, Ontario, M3J 1P3, Canada
> Tel: 416-736-2100 x22496; Fax: 416-736-5988
> http://www.yorku.ca/remmelt
> Skype: tarmoremmel
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Jeremy Miles
Two links for you which will get your answer much quicker than a mailing list:

http://lmgtfy.com/?q=non-parametric+anova+R

or

http://www.justfuckinggoogleit.com/search.pl?query=non+parametric+anova+R

Jeremy


On 5 March 2010 05:19, blue sky  wrote:
> My interpretation of the relation between 1-way ANOVA and Wilcoxon's
> test (wilcox.test() in R) is the following.
>
> 1-way ANOVA is to test if two or multiple distributions are the same,
> assuming all the distributions are normal and have equal variances.
> Wilcoxon's test is to test two distributions are the same without
> assuming what their distributions are.
>
> In this sense, I'm wondering what is the generalization of Wilcoxon's
> test to more than two distributions. And, more general, what is the
> generalization of Wilcoxon's test to multi-way ANOVA with arbitrary
> complex model formula? What are the equivalent F statistics and t
> statistics in the generalization of Wilcoxon's test?
>
> Note that I'm not interested in looking for a specific nonparametric
> test for a particular dataset right now, although this is important in
> practice. What I'm interested the general nonparametric statistical
> framework that parallels ANOVA. Could somebody give some hints on what
> references I should look for? I have google searched this topic, but
> don't find a page that exactly answered my question.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jeremy Miles
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changing the Prompt for browser()

2010-03-05 Thread Andrew Redd
Is there a way that I can change the prompt within a browser() call?  I
often use code like

> with(obj1,browser())
Browse[1]>

Is there a way that I can set it so that I can get something like

> with(obj1,browser(prompt="obj1"))
obj1[1]>

I know that prompt is not a valid option for browser, but it would be nice
if it were.   There is an options("prompt") setting, but that does not affect the
prompt for browser.  Can I change this, and how?

Thanks,
Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cross tabulation with fixed dimensions

2010-03-05 Thread Tarmo Remmel
Hello,

This is a seemingly simple task, but it has been frustrating me for too
long, so I am turning to this list for some help.

I have two vectors of factors which are quite long; two simple examples are
shown here:

> a <- c(1,2,3,4,5)
> b <- c(1,2,5,5,6)

If I produce a cross-tabulation of these vectors, I get the following:

> tab <- table(a,b)

> table(a,b)
   b
a   1 2 5 6
  1 1 0 0 0
  2 0 1 0 0
  3 0 0 1 0
  4 0 0 1 0
  5 0 0 0 1

I really want to have this cross-tabulation expanded to include all factors
present in the input data.  Thus, I need additional columns for categories 3
and 4, and a row for category 6, albeit, they will be filled with zeros.  Is
there an easy way of doing this?

Thanks,

Tarmo




Tarmo K. Remmel Ph.D.
Assistant Professor, Department of Geography
York University, N413A Ross Building
Toronto, Ontario, M3J 1P3, Canada
Tel: 416-736-2100 x22496; Fax: 416-736-5988
http://www.yorku.ca/remmelt
Skype: tarmoremmel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I can't find "rpart" help (linux)

2010-03-05 Thread Uwe Ligges



On 05.03.2010 10:51, Grześ wrote:


Hi
I have installed rpart on my Linux (PLD) system, but I don't know how to
open the help for this package.

Here is my instalaction:


  install.packages("rpart",dependencies=TRUE)

--- Please select a CRAN mirror for use in this session ---
trying URL 'http://r.meteo.uni.wroc.pl/src/contrib/rpart_3.1-46.tar.gz'
Content type 'application/x-gzip' length 136572 bytes (133 Kb)
opened URL
==
downloaded 133 Kb

* installing *source* package ‘rpart’ ...
** libs
i686-pld-linux-gcc -std=gnu99 -I/usr/lib/R/include  -D_FORTIFY_SOURCE=2
-fpic  -O2 -fno-strict-aliasing -fwrapv -march=i686 -mtune=pentium4
-gdwarf-2 -g2  -c anova.c -o anova.o
i686-pld-linux-gcc -std=gnu99 -I/usr/lib/R/include  -D_FORTIFY_SOURCE=2
-fpic  -O2 -fno-strict-aliasing -fwrapv -march=i686 -mtune=pentium4
-gdwarf-2 -g2  -c anovapred.c -o anovapred.o
.
.
.
usersplit.o xval.o -L/usr/lib/R/lib -lR
** R
** data
**  moving datasets to lazyload DB
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
* DONE (rpart)

The downloaded packages are in
 ‘/root/tmp/RtmpxWZkKu/downloaded_packages’
Updating HTML index of packages in '.Library'


next I do:

library(rpart)


and as result I get:


help(rpart)

/usr/lib/R/bin/pager[11]: /usr/bin/less: not found

Could somebody help me, please?



Your Linux installation is broken:
your pager is linked to less which seems to be unavailable. Hence 
installing less or reconfiguring your default pager will solve the 
problem (which is not an R problem).
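
For what it's worth, one session-level workaround from within R (just a
sketch; the path to 'more' is an assumption about the system) is to point
R's pager option at a program that does exist:

options(pager = "/bin/more")   # or "cat"; see ?options, entry 'pager'
help(rpart)                    # should now display without the 'less' error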


Uwe Ligges





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible to recursively update a function?

2010-03-05 Thread Uwe Ligges



On 05.03.2010 01:40, Carl Witthoft wrote:

My foolish move for this week: I'm going to go way out on a limb and
guess what the OP wanted was something like this.

i=1, foo = x*exp(-x)

i=2, foo= x^2*exp(-x)
i=3, foo = x^3*exp(-x)
.
.
.


In which case he really should create a vector bar <- rep(NA, 5),
and then inside the loop,

bar[i]<-x^i*foo(x)


Since in this case foo(x) is independent of i, you are wasting
resources. Moreover, you could calculate it for a whole matrix at once.
Say you want to calculate this for i = 1, ..., n with n = 5 and some x
(here pseudo-random); then you can do it more simply after defining some
data, as in:


set.seed(123)
x <- rnorm(10)
n <- 5


using the single and probably most efficient line:

 outer(x, 1:n, "^") * exp(-x)

or if x is a length 1 vector then even simpler:

set.seed(123)
x <- rnorm(1)
n <- 5

 x^(1:5) * exp(-x)
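
And a quick sanity check (a sketch re-using the pseudo-random data from
above) that the vectorised matrix version agrees with an explicit loop over i:

set.seed(123)
x <- rnorm(10)
n <- 5

vec <- outer(x, 1:n, "^") * exp(-x)          # column i holds x^i * exp(-x)

loop <- matrix(NA_real_, nrow = length(x), ncol = n)
for (i in 1:n) loop[, i] <- x^i * exp(-x)    # element-by-element version

all.equal(vec, loop)                         # TRUE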

But we still do not know if this is really the question ...

Uwe Ligges





Carl



quoted material:
Date: Thu, 04 Mar 2010 11:37:23 -0800 (PST)


I need to update posterior dist function upon the coming results and
find the posterior mean each time.

On Mar 4, 1:31 pm, jim holtman  wrote:
 > What exactly are you trying to do? 'foo' calls 'foo' calls 'foo' 
 > How did you expect it to stop the recursive calls?
 >
 >
 >
 >
 >
 > On Thu, Mar 4, 2010 at 2:08 PM, Seeker  wrote:
 > > Here is the test code.
 >
 > > foo<-function(x) exp(-x)
 > > for (i in 1:5)
 > > {
 > > foo<-function(x) foo(x)*x
 > > foo(2)
 > > }

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Gabor Grothendieck
I am happy to answer posts to r-help regardless of the name and email
address of the poster but would draw the line at someone excessively
posting without a reasonable effort to find the answer first or using
it for homework since such requests could flood the list making it
useless for everyone.

On Fri, Mar 5, 2010 at 10:55 AM, Ravi Varadhan  wrote:
> David,
>
> I agree with your sentiments.  I also think that it is bad posting etiquette 
> not to sign one's genuine name and affiliation when asking for help, which 
> "blue sky" seems to do a lot.  Bert Gunter has already raised this issue, and 
> I completely agree with him. I would also like to urge the R-gurus to ignore 
> such postings.
>
> Best,
> Ravi.
> 
>
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
>
> Ph. (410) 502-2619
> email: rvarad...@jhmi.edu
>
>
> - Original Message -
> From: David Winsemius 
> Date: Friday, March 5, 2010 9:25 am
> Subject: Re: [R] Nonparametric generalization of ANOVA
> To: blue sky 
> Cc: r-h...@stat.math.ethz.ch
>
>
>>  On Mar 5, 2010, at 8:19 AM, blue sky wrote:
>>
>>  > My interpretation of the relation between 1-way ANOVA and Wilcoxon's
>>  > test (wilcox.test() in R) is the following.
>>  >
>>  > 1-way ANOVA is to test if two or multiple distributions are the same,
>>  > assuming all the distributions are normal and have equal variances.
>>  > Wilcoxon's test is to test two distributions are the same without
>>  > assuming what their distributions are.
>>  >
>>  > In this sense, I'm wondering what is the generalization of Wilcoxon's
>>  > test to more than two distributions. And, more general, what is the
>>  > generalization of Wilcoxon's test to multi-way ANOVA with arbitrary
>>  > complex model formula? What are the equivalent F statistics and t
>>  > statistics in the generalization of Wilcoxon's test?
>>  >
>>  > Note that I'm not interested in looking for a specific nonparametric
>>  > test for a particular dataset right now, although this is important
>> in
>>  > practice. What I'm interested the general nonparametric statistical
>>  > framework that parallels ANOVA. Could somebody give some hints on what
>>  > references I should look for? I have google searched this topic, but
>>  > don't find a page that exactly answered my question.
>>
>>  This is your first of three postings in the last hour and they are
>> all
>>  in a category that could well be described as requests for tutoring
>> in
>>  basic statistical topics. I am of the impression you have been
>>  requested not to engage in such behavior on this list. For this
>>  question for instance there is an entire CRAN Task View available and
>>
>>  you have been in particular asked to sue such resource before posting.
>>
>>  It's not the described role of the r-help list to remediate your lack
>>
>>  of statistical background, but rather to deal with difficulties in
>>  applying the R-language to particular, discrete and exemplified
>>  problems.
>>
>>  --
>>
>>  David Winsemius, MD
>>  West Hartford, CT
>>
>>  __
>>  r-h...@r-project.org mailing list
>>
>>  PLEASE do read the posting guide
>>  and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Still on poLCA

2010-03-05 Thread 千早 ケンジ
Dear all,

I have just sent a message asking about poLCA but I thought of another question 
I wanted to ask. I get the G^2 statistic in my output and want to test for its 
significance. I get that the degrees of freedom for the test are (S-1-p) where 
S is the number of different patterns observed and p is the number of estimated 
parameters. Are these the "residual degrees of freedom" that I get in the 
output? 

= 
Fit for 2 latent classes: 
= 
number of observations: 1559 
number of estimated parameters: 25 
residual degrees of freedom: 1534 
maximum log-likelihood: -7419.601 
 
AIC(2): 14889.20
BIC(2): 15023.00
G^2(2): 1088.866 (Likelihood ratio/deviance statistic) 
X^2(2): 2284.071 (Chi-square goodness of fit) 
= 

If I use these degrees of freedom for the test, I get really high probabilities 
for the model even with only 2 classes. Am I doing something wrong? If these 
are not the degrees of freedom for the test, is there any way to calculate them 
(i.e., finding the S to substitute into the S-1-p formula)?
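
For what it's worth, a minimal sketch of that calculation (the data frame name
'manifest' for the indicator variables is hypothetical; the parameter count and
G^2 value are taken from the output above):

S  <- nrow(unique(manifest))     # number of distinct observed response patterns
p  <- 25                         # estimated parameters, from the output above
df <- S - 1 - p                  # degrees of freedom for the G^2 test
pchisq(1088.866, df = df, lower.tail = FALSE)   # p-value for G^2(2)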

Kind regards

Guilherme Kenji
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running script with double-click

2010-03-05 Thread Gabor Grothendieck
Note that if you use Rterm.bat in http://batchfiles.googlecode.com in
place of Rterm.exe then you won't have to adjust the Run action when
you upgrade R.  Rterm.bat is a single self contained Windows batch
file that you simply place on your Windows path.

On Thu, Mar 4, 2010 at 8:51 PM, Steve Taylor  wrote:
> You can create a right-mouse menu command to run an R program as follows 
> (although the details may be different for different versions of Windows).
>
> In Windows Explorer:
> 1. Tools / Folder Options / File Types
> 2. find the extension for R files and push Advanced
> 3. add a new action called "Run" with command as follows:
> "C:\Program Files\R\R-2.10.1\bin\Rterm.exe" --quiet --no-save --no-restore -f 
> "%1"
> If/when you have a different version of R installed, you'll need to adjust 
> the above command.
>
> Steve
>

>
> From: Matt Asher 
> To:
> Date: 5/Mar/2010 12:45p
> Subject: [R] Running script with double-click
> Hi,
>
> I need to be able to run an R script by double-clicking the file name in
> Windows. I've tried associating the .r extension with the different R
> .exe's in /bin but none seems to work. Some open R then close right
> away, and Rgui.exe gives the message ARGUMENT "/my/file.r" __ignored__
> before opening a new, blank session.
>
> I've tried Google and looking in the R for Windows FAQ but didn't see
> anything.
>
> Thanks.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Ravi Varadhan
David,

I agree with your sentiments.  I also think that it is bad posting etiquette 
not to sign one's genuine name and affiliation when asking for help, which 
"blue sky" seems to do a lot.  Bert Gunter has already raised this issue, and I 
completely agree with him. I would also like to urge the R-gurus to ignore such 
postings.

Best,
Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: David Winsemius 
Date: Friday, March 5, 2010 9:25 am
Subject: Re: [R] Nonparametric generalization of ANOVA
To: blue sky 
Cc: r-h...@stat.math.ethz.ch


>  On Mar 5, 2010, at 8:19 AM, blue sky wrote:
>  
>  > My interpretation of the relation between 1-way ANOVA and Wilcoxon's
>  > test (wilcox.test() in R) is the following.
>  >
>  > 1-way ANOVA is to test if two or multiple distributions are the same,
>  > assuming all the distributions are normal and have equal variances.
>  > Wilcoxon's test is to test two distributions are the same without
>  > assuming what their distributions are.
>  >
>  > In this sense, I'm wondering what is the generalization of Wilcoxon's
>  > test to more than two distributions. And, more general, what is the
>  > generalization of Wilcoxon's test to multi-way ANOVA with arbitrary
>  > complex model formula? What are the equivalent F statistics and t
>  > statistics in the generalization of Wilcoxon's test?
>  >
>  > Note that I'm not interested in looking for a specific nonparametric
>  > test for a particular dataset right now, although this is important 
> in
>  > practice. What I'm interested the general nonparametric statistical
>  > framework that parallels ANOVA. Could somebody give some hints on what
>  > references I should look for? I have google searched this topic, but
>  > don't find a page that exactly answered my question.
>  
>  This is your first of three postings in the last hour and they are 
> all  
>  in a category that could well be described as requests for tutoring 
> in  
>  basic statistical topics. I am of the impression you have been  
>  requested not to engage in such behavior on this list. For this  
>  question for instance there is an entire CRAN Task View available and 
>  
>  you have been in particular asked to sue such resource before posting.
>  
>  It's not the described role of the r-help list to remediate your lack 
>  
>  of statistical background, but rather to deal with difficulties in  
>  applying the R-language to particular, discrete and exemplified  
>  problems.
>  
>  -- 
>  
>  David Winsemius, MD
>  West Hartford, CT
>  
>  __
>  R-help@r-project.org mailing list
>  
>  PLEASE do read the posting guide 
>  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 10:34 AM, Matthias Gondan wrote:

This is your first of three postings in the last hour and they are all in
a category that could well be described as requests for tutoring in basic
statistical topics. I am of the impression you have been requested not to
engage in such behavior on this list. For this question for instance
there is an entire CRAN Task View available and you have been in
particular asked to use such resources before posting.


Please allow me to ask for details on this task view, because I am
interested in the topic of nonparametric ANOVAs, as well. To my knowledge,
there are some R scripts from Brunner et al. available on his website

http://www.ams.med.uni-goettingen.de/de/sof/ld/index.html

But they seem not to be working with current R versions.


http://finzi.psych.upenn.edu/views/Robust.html



Best regards,

Matthias Gondan



--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthias Gondan
> This is your first of three postings in the last hour and they are all in 
> a category that could well be described as requests for tutoring in basic 
> statistical topics. I am of the impression you have been requested not to 
> engage in such behavior on this list. For this question for instance 
> there is an entire CRAN Task View available and you have been in 
> particular asked to sue such resource before posting.

Please allow me to ask for details on this task view, because I am interested 
in the topic of nonparametric ANOVAs, as well. To my knowledge,
there are some R scripts from Brunner et al. available on his website

http://www.ams.med.uni-goettingen.de/de/sof/ld/index.html

But they seem not to be working with current R versions.

Best regards,

Matthias Gondan



-- 
Safer, faster, and easier. The latest internet browsers -
download now for free! http://portal.gmx.net/de/go/atbrowser

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hier.part

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 9:52 AM, Marco Jorge wrote:


Hi everyone,

A beginner question.
 - How shall I import 9 different ASCII files (created from a GIS layer (ArcMap
grid)) into R to create a single data frame for use in hier.part?
Should I use read.table, then turn each created object into a single vector
using unlist, and finally use data.frame to join all the vectors into a
single data frame (xcan in the usage of hier.part)?
I've been trying to run it, but I keep getting different errors.


cbind has a data.frame method.
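
For instance, a rough sketch of that approach (the file names are hypothetical,
and it assumes each ASCII export reads cleanly with read.table and yields
vectors of equal length):

files <- paste("layer", 1:9, ".txt", sep = "")    # hypothetical file names
vecs  <- lapply(files, function(f) unlist(read.table(f), use.names = FALSE))
names(vecs) <- paste("layer", 1:9, sep = "")
xcan  <- do.call(data.frame, vecs)                # one column per input grid
str(xcan)                                         # ready to pass as xcan to hier.part()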




Thanks in advance,
Marco

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] for help on building a R package with several R function and a bunch of c, c++

2010-03-05 Thread Whit Armstrong
Pick up Rcpp, make your life easier.

http://dirk.eddelbuettel.com/code/rcpp.html

-Whit
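
If you prefer the plain interface instead, a bare-bones sketch of the R side
(the entry-point, argument and package names are hypothetical, the C++ entry
point must follow the .Call/SEXP conventions, and a src/Makevars file is
usually preferable to a hand-written makefile):

## R wrapper around one compiled entry point in the package's shared library
run_model <- function(x) {
  .Call("run_model", as.numeric(x), PACKAGE = "mypkg")
}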


On Fri, Mar 5, 2010 at 9:19 AM,   wrote:
> I hope I can get quick help here. I have a bunch of C and C++ files, including a 
> main function and a makefile. It works well on both UNIX and Windows. I tried to 
> build an R package which includes this C++ program and several other R functions. 
> The R functions here are independent of the C++ code. I prefer to define one R 
> function to call this C++ program.
>
> Do you know any easy way to do it? I am reading the manual "Writing R 
> Extensions", but I didn't catch the key point. I know how to build an R package 
> with just R functions.  If I put all the C++ code and the makefile in /src, what do I need to do?
>
> Thank you in advance!
>
> Alex
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] install rJava in linux

2010-03-05 Thread Ista Zahn
On Fri, Mar 5, 2010 at 4:41 AM, Grześ  wrote:
>
> Hi
> I have a problem installing "rJava" in Linux.
>
> I have got in my system: java-sun, java-sun-tools, java-sun-jre,
> but when I try to install rJava I get a problem like this:
>
>>  install.packages("rJava", dependencies = TRUE)
> --- Please select a CRAN mirror for use in this session ---
> trying URL 'http://cran.mirroring.de/src/contrib/rJava_0.8-2.tar.gz'
> Content type 'application/x-gzip' length 471971 bytes (460 Kb)
> opened URL
> ==
> downloaded 460 Kb
>
> * installing *source* package ‘rJava’ ...

> configure: error: Java Development Kit (JDK) is missing or not registered in
> R
> Make sure R is configured with full Java support (including JDK). Run
> R CMD javareconf
> as root to add Java support to R.
>
> If you don't have root privileges, run
> R CMD javareconf -e
> to set all Java-related variables and then install rJava.


Start by following these instructions. Make sure you have some version
of the JDK installed (use your distribution's package management system to
install it if needed), then run R CMD javareconf as instructed above.

Best,
Ista


>
> ERROR: configuration failed for package ‘rJava’
> * removing ‘/usr/lib/R/library/rJava’
>
> The downloaded packages are in
>        ‘/root/tmp/RtmpA0Qvcf/downloaded_packages’
> Updating HTML index of packages in '.Library'
> Warning message:
> In install.packages("rJava", dependencies = TRUE) :
>  installation of package 'rJava' had non-zero exit status
>>
>  What is wrong?
> --
> View this message in context: 
> http://n4.nabble.com/install-rJava-in-linux-tp1579395p1579395.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Improved Nelder-Mead algorithm and Matlab's fminsearch. Was: Hi

2010-03-05 Thread Ravi Varadhan
The `fminsearch' in Matlab uses a version of the Nelder-Mead simplex search 
algorithm, which is a derivative-free search technique.  The Nelder-Mead is 
also the default algorithm in optim().  Therefore, you can simply call optim() 
to get your job done.  However, I would like to mention that the Nelder-Mead 
implementation in optim() is not the best, and I have had others (who use 
Matlab) tell me that `fminsearch' generally performs better than optim's 
Nelder-Mead.  I have written an improved version of Nelder-Mead that performs 
better than optim's Nelder-Mead.  I will soon release it as a package (I have 
been saying this for several months now!), but I can send it to you if you are 
interested.
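
To make that concrete, here is a rough sketch of how the Matlab call quoted
below might translate to optim() -- the interfaces of the R versions of pbond
and ustrs are assumptions on my part, and optim() will warn that Nelder-Mead
can be unreliable in one dimension:

c2  <- c/2                                   # the Matlab code halves c first
u   <- ustrs(set_date, mat_date)             # assumed to return delta, epsilon, nofcup in a list
fit <- optim(par = 0.15,
             fn  = function(y) pbond(y, p, c2, u$nofcup, u$delta/u$epsilon),
             method  = "Nelder-Mead",
             control = list(reltol = 1e-9, maxit = 1e8))
y   <- 200 * fit$par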

Hope this helps,
Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: stephen sefick 
Date: Thursday, March 4, 2010 4:02 pm
Subject: Re: [R] Hi
To: hussain abu-saaq 
Cc: r-help@r-project.org


> I would help, but I don't know matlab.
>  
>  Stephen
>  
>  On Thu, Mar 4, 2010 at 2:50 PM, hussain abu-saaq 
>  wrote:
>  >
>  > How Can I write this this matlab code in R:
>  >
>  >
>  > 
> options=optimset('TolFun',1e-9,'TolX',1e-9,'MaxIter',1e8,'MaxFunEvals',1e8);
>  > c=c/2;
>  > [alpha, delta, epsilon, nofcup] = ustrs(set_date,mat_date);
>  > y = fminsearch('pbond',.15,options,p,c,nofcup,delta/epsilon);
>  > y = 200*y;
>  >
>  >
>  >
>  > Note
>  > pbond is a function in Matlab  I already wrote in R
>  >
>  >
>  > ustrs is a function in Matlab I already convert into r
>  >
>  >
>  > Thank you
>  >
>  > HI
>  >
>  >
>  >        [[alternative HTML version deleted]]
>  >
>  > __
>  > R-help@r-project.org mailing list
>  > 
>  > PLEASE do read the posting guide 
>  > and provide commented, minimal, self-contained, reproducible code.
>  >
>  
>  
>  
>  -- 
>  Stephen Sefick
>  
>  Let's not spend our time and resources thinking about things that are
>  so little or so large that all they really do for us is puff us up and
>  make us feel like gods.  We are mammals, and have not exhausted the
>  annoying little problems of being mammals.
>  
>   -K. Mullis
>  
>  __
>  R-help@r-project.org mailing list
>  
>  PLEASE do read the posting guide 
>  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame column

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 8:26 AM, ManInMoon wrote:



I have a big data frame and I have extracted a bit by doing:


y<-d[1:10,6]
y
[1] Headings 0-49     -98      -49      -41      -120     -155     -204     -169
92329 Levels: -0 -1 -10 -100 -1000 -1 -10001 -10002 -10003 -10004 -10005
-10006 -10007 -10008 -10009 -1001 -10010 -10011 -10012 -10013 -10014 -10015
-10016 -10017 -10018 -10019 -1002 -10020 -10021 -10022 -10023 -10024 ... Headings




What does the "levels" means?


See reply to earlier related question.



If I create a similar object as below - I don't get the levels  
message.



x <-c(3,4,5,6,3,2,1)
x

[1] 3 4 5 6 3 2 1


Not similar... that's a vector. Review your basic R texts.


--



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame query

2010-03-05 Thread David Winsemius


On Mar 5, 2010, at 7:40 AM, ManInMoon wrote:



I have created a large dataframe (d) by getting data from file using
read.table.

I now have 79 columns and 3 million rows. How can I plot the 6th column? I
tried plot(d[,6]) but it doesn't look right. When I try to do just d[,6] the
console gets some odd "levels" message I don't understand.


How can we help you understand it if you don't include it? (It
sounds as though you may have attempted to plot a factor.)




--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Operating System Variable

2010-03-05 Thread Prof Brian Ripley

See
?R.version
?Sys.info
?.Platform

On Windows, ?win.version

It all depends what you mean by 'Operating System'.

On Fri, 5 Mar 2010, Rob Helpert wrote:


Hi.

Is there an easy way (analogous to the $^O variable in perl) to find


That's a pretty basic description.  Even R.version$os often gives more 
info.  E.g.


gannet% Rscript -e 'R.version$os'
[1] "linux-gnu"
gannet% perl -e 'print "$^O\n"'
linux
blacklark% Rscript -e 'R.version$os'
[1] "darwin9.8.0"
blacklark% perl -e 'print "$^O\n"'
darwin
blackhawk% Rscript -e 'R.version$os'
[1] "solaris2.10"
blackhawk% perl -e 'print "$^O\n"'
solaris

but note that on binary versions of R you may get the OS which it was 
compiled under (as in blacklark, which is running Mac OS X 10.6.2 aka 
Darwin 10.2.0) and that the name may be neither the name the OS itself 
reports nor the commonly used name.  (Darwin vs Mac OS X, SunOS vs 
Solaris are examples.)


system('uname -s', intern=TRUE)

may be closer to what you are looking for (except on Windows).
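
Putting the pieces mentioned in this thread together, a small sketch:

.Platform$OS.type              # "unix" or "windows"
Sys.info()[["sysname"]]        # e.g. "Linux", "Darwin", "SunOS", "Windows"
R.version$os                   # e.g. "linux-gnu", "darwin9.8.0", "mingw32"
if (.Platform$OS.type == "unix") system("uname -s", intern = TRUE)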


out what operating system R is currently using?


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Operating System Variable

2010-03-05 Thread Duncan Murdoch

On 05/03/2010 9:45 AM, Rob Helpert wrote:

Hi.

Is there an easy way (analogous to the $^O variable in perl) to find
out what operating system R is currently using?


.Platform is probably what you want.  R.version gives related 
information. I don't think there's anything built in to determine 
platform version numbers.


For instance, running 32-bit R on 64-bit Windows 7 .Platform says

> .Platform
$OS.type
[1] "windows"

$file.sep
[1] "/"

$dynlib.ext
[1] ".dll"

$GUI
[1] "Rgui"

$endian
[1] "little"

$pkgType
[1] "win.binary"

$path.sep
[1] ";"

$r_arch
[1] ""


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

