Re: [R] Alternate to for-loop

2009-02-17 Thread Wacek Kusnierczyk
Stefan Evert wrote:
> A couple of remarks on vQ's naive benchmark:
>
>> f.rep = function(n, m) replicate(n, rnorm(m))
>
> I suppose you meant
>
> f.rep = function(n, m) replicate(n, mean(rnorm(m)))
>
> which doesn't make a substantial speed difference, though.

indeed, thanks;  i've already posted a correction, and as you say, it
doesn't make much difference for these particular benchmark values.


>
>
>> f.pat = function(n, m) colMeans(array(rnorm(n*m), c(n, m)))
>>
>> system.time(f.pat(1000, 1000))
>> system.time(f.rep(1000, 1000))
>>
>> makes me believe that there is no significant difference in efficiency
>> between the 'professionally-looking' replicate-based solution and the
>> 'as fast as possible' pat's solution.
>>
>
> True, I get the same timing results on my machine.  But then you
> should also point out that the original for-loop:
>
> f.for = function(n, m) { res <- numeric(n); for (i in 1:n) res[i]
> <- mean(rnorm(m)); res }
>
> is exactly as fast as replicate().  So apart from "looking more
> professional", there isn't any difference between an explicit loop and
> replicate().

indeed, and pat seems correct in blaming for loops for the inefficiency
of replicate in cases where log10(n/m) > 2.

>
> Perhaps loops in R aren't always as slow (compared to matrix
> operations) as one seemed to think.

depends how and where you use them.  in the problem discussed here, they
do slow down the code for some class of inputs and do not speedup for
the others, compared to the array version of pat.

>   I ran into a similar issue with a simple benchmark the other day,
> where a plain loop in Lua was faster than vectorised code in R ...

hmm, would you be saying that r's vectorised performance is overhyped? 
or is it just that non-vectorised code in r is slow?


>
>
> I have to say, though, that like Patrick I assumed the goal was to
> obtain a large number of replicates for relatively small sets of
> random numbers, in which case the matrix solution is indeed faster
> (though not as much as I would have thought):
>
> > system.time(f.for(10, 100))
>user  system elapsed
>   4.212   0.025   4.273
> > system.time(f.rep(10, 100))
>user  system elapsed
>   4.109   0.028   4.172
> > system.time(f.pat(10, 100))
>user  system elapsed
>   1.580   0.134   1.739
>
>

sure;  it's even more pronounced when n = 10^6 and m=10.

vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] stopping stepAIC (MASS package)

2009-02-17 Thread Peter Jepsen
Dear R-listers,

I am building a prediction model starting with many, many variables. I
use the 'stepAIC' procedure in the MASS package, and the model building
process takes several hours to complete. At the very end, I am
occasionally confronted with warnings like this one:

Warning messages:
1: In fitter(X, Y, strata = Strata, offset = offset, weights = weights,
:
  Loglik converged before variable  4 ; beta may be infinite. 

In these instances, I take it that I have stretched my data too thin,
and I discard the final model. Therefore, I would save valuable time if
the procedure would stop automatically when it encounters any anomaly
that causes it to warn me. Is this possible? And if so, how?

Thank you in advance for any assistance.

Peter Jepsen, MD


> sessionInfo()
R version 2.8.1 (2008-12-22) 
i386-pc-mingw32 

locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=
Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


other attached packages:
[1] MASS_7.2-45
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with dates in zoo package

2009-02-17 Thread CJ Rubio

i have the following code - assimilating the maximum annual discharge each
year ffrom a daily discharge record from year 1989-2005.

m <- read.table("D:/documents/5 stations/01014000.csv", sep =",")
z <- zoo(m[,4],as.Date(as.character(m[,3]), "%m/%d/%Y")) 
x <- aggregate(z, floor(as.numeric(as.yearmon(time(z, max)   #code
suggested by Gabor Grothendieck

which gives me the maximum discharge each year and the year itself.. what i
am trying now is to produce the date when the maximum discharge was observed
in the pattern -mm-dd, like:

1988 11/9/1988  18600
1989  5/8/1989   49000
...
2005 4/26/2005111000 

thank you all in advance...
-- 
View this message in context: 
http://www.nabble.com/problem-with-dates-in-zoo-package-tp22053103p22053103.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Efficient matrix computations

2009-02-17 Thread Shimrit Abraham
Hi,

I am looking for two ways to speed up my computations:

1. Is there a function that efficiently computes the 'sandwich product' of
three matrices, say, ZPZ'
2. Is there a function that efficiently computes the determinant of a
positive definite symmetric matrix?

Thanks,

S.A.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] difficulty in starting time sequence from 00:00:00

2009-02-17 Thread Suresh_FSFM

Dear R-experts,

Need your help.

Dear R- Experts, 

Seek your help. 

I created a time sequence using: 
x[i] <-chron(dates, tt, format=c(dates="y-m-d", tt="h:m:s")) 
first element in the list is displayed as: (09-01-01 00:00:00) 
Further elements are:
(09-01-01 00:01:00)
(09-01-01 00:02:00)
(09-01-01 00:03:00)
...
and so on.

I want to compare each value with the timestamp values from database.
I cannot simply compare above values with the timestamp values, as the above
values are not in "date" format.

Therefore, I used: 
format.Date(x[1],"%y-%m-%d %H:%M:%S"), I expect following value: 
"09-01-01 00:00:00" 

HOWEVER, the value displayed as: "09-01-01 01:00:00" 

Unnecessary addition of 1 hour.
I want to create time sequence starting from "09-01-01 00:00:00" !!! 
Can any one please help?

Thank you.


-- 
View this message in context: 
http://www.nabble.com/difficulty-in-starting-time-sequence-from-00%3A00%3A00-tp22053632p22053632.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to add direction of time to plot.circular()

2009-02-17 Thread Jim Lemon

Michael Kubovy wrote:

Dear r-helpers,

I want to show that time is flowing CCW in the following:

require(circular)
len <- 8
labl <- as.character(c(0, 1, 1, 1, 0, 0, 1, 0))
r <- circular(2*pi* (rep(c(1, 3, 6), each = 200)/len + rnorm(600, 0,  
0.025)))

r.dens <- density(r, bw = 25, adjust = 4, kernel = 'vonmises')
plot(r, shrink = 2.5, axes = FALSE, ticks = FALSE, pch = 1, col =  
'lightblue', stack = TRUE, bins = 12 * len)
axis.circular(at = circular(seq(0, (len - 1) * 2 * pi/len, length.out  
= len)), label = labl)

lines(r.dens, col = 2)

I had imagined a directed arc with a smaller radius than the black  
circle, running from 0 to 315 deg. I also thought that adding a short  
horizontal line at its beginning might be helpful. I would appreciate  
advice on how best to do this or anything else that would provide the  
required information.
  

Hi Michael,
You might find that clock24.plot (plotrix) will do what you want, 
although that is in the conventional clockwise direction. You can 
reverse this by using radial.plot and recreating the clock grid in the 
opposite orientation.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create package with Fortran 90 and C code

2009-02-17 Thread Prof Brian Ripley

On Tue, 17 Feb 2009, Nathan S. Watson-Haigh wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I'm trying to add some Fortran 90 code to an existing package.

When I compile and load the file manually like:
SHELL> R CMD SHLIB file.f90
R> dyn.load("file.so")

I can use the .Fortran() fine. However, when I try to build, install and load
the library I seem to be missing something.

I do a:
SHELL> R CMD build dir
SHELL> R CMD INSTALL pkg_version.tar.gz

Things seem to progress smoothly. However, in R when I try to load the package I
get an error like:

Error in dyn.load(file, DLLpath = DLLpath, ...) :
 unable to load shared library
'/cs/home/cslsi/wat410/R/ia64-unknown-linux-gnu-library/2.7/pkg/libs/pkg.so':
 /cs/home/cslsi/wat410/R/ia64-unknown-linux-gnu-library/2.7/pkg/libs/pkg.so:
undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
Error: package/namespace load failed for 'pkg'

Can anyone suggest what I might do to solve this?


See the posting guide!  This is a question about compiled code, hence 
for the R-devel list.


You are apparently using an obsolete R, and it rather looks as if you 
are using C++ with Fortran 90, something that is not supported (since 
in general it does not work, and you need to tell us the compilers you 
are using).


So please post a much more complete description on R-devel, and 
perhaps make the failing package available for potential helpers to 
look at.



Cheers,
Nathan



- 
Dr. Nathan S. Watson-Haigh
OCE Post Doctoral Fellow
CSIRO Livestock Industries
Queensland Bioscience Precinct
St Lucia, QLD 4067
Australia

Tel: +61 (0)7 3214 2922
Fax: +61 (0)7 3214 2900
Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Subset Regression Package

2009-02-17 Thread Alex Roy
Dear all ,
  Is there any subset regression (subset selection
regression) package in R other than "leaps"?


Thanks and regards

Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Efficient matrix computations

2009-02-17 Thread Dimitris Rizopoulos
it would be a bit more helpful if we knew more info regarding these 
matrices, for instance is P diagonal, etc. In any case, you could have a 
look at


crossprod()
# and
tcorssprod()

and, for the determinant maybe

prod(eigen(mat, symmetric = TRUE, only.values = FALSE)$values)
# or
prod(diag(chol(mat)))^2

are a bit faster than det(), but I haven't tested it.

I hope it helps.

Best,
Dimitris


Shimrit Abraham wrote:

Hi,

I am looking for two ways to speed up my computations:

1. Is there a function that efficiently computes the 'sandwich product' of
three matrices, say, ZPZ'
2. Is there a function that efficiently computes the determinant of a
positive definite symmetric matrix?

Thanks,

S.A.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Ingore this message. I have already solved the problem..

2009-02-17 Thread Suresh_FSFM

Ingore this message. I have already solved the problem..

Regards,
Suresh


Suresh_FSFM wrote:
> 
> Dear R-experts,
> 
> Need your help.
> 
> Dear R- Experts, 
> 
> Seek your help. 
> 
> I created a time sequence using: 
> x[i] <-chron(dates, tt, format=c(dates="y-m-d", tt="h:m:s")) 
> first element in the list is displayed as: (09-01-01 00:00:00) 
> Further elements are:
> (09-01-01 00:01:00)
> (09-01-01 00:02:00)
> (09-01-01 00:03:00)
> ...
> and so on.
> 
> I want to compare each value with the timestamp values from database.
> I cannot simply compare above values with the timestamp values, as the
> above values are not in "date" format.
> 
> Therefore, I used: 
> format.Date(x[1],"%y-%m-%d %H:%M:%S"), I expect following value: 
> "09-01-01 00:00:00" 
> 
> HOWEVER, the value displayed as: "09-01-01 01:00:00" 
> 
> Unnecessary addition of 1 hour.
> I want to create time sequence starting from "09-01-01 00:00:00" !!! 
> Can any one please help?
> 
> Thank you.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/difficulty-in-starting-time-sequence-from-00%3A00%3A00-tp22053632p22054075.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Efficient matrix computations

2009-02-17 Thread Dimitris Rizopoulos

sorry, in my previous e-mail it should be

tcrossprod()
# and
prod(eigen(mat, symmetric = TRUE, only.values = TRUE)$values)


Best,
Dimitris

Shimrit Abraham wrote:

Hi,

I am looking for two ways to speed up my computations:

1. Is there a function that efficiently computes the 'sandwich product' of
three matrices, say, ZPZ'
2. Is there a function that efficiently computes the determinant of a
positive definite symmetric matrix?

Thanks,

S.A.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Alternate to for-loop

2009-02-17 Thread Stefan Evert





 I ran into a similar issue with a simple benchmark the other day,
where a plain loop in Lua was faster than vectorised code in R ...


hmm, would you be saying that r's vectorised performance is overhyped?
or is it just that non-vectorised code in r is slow?


What I meant, I guess, was (apart from a little bit of trolling) that  
I'd had misconceptions about the speed differences between loops and  
vectorised code.  In particular, I had entertained the naive belief  
that vectorised solutions are always highly efficient (I wonder if I'm  
the only one who was naive enough to think this ..), so I was very  
much surprised to find a loop in an interpreted language like Lua to  
be faster than vectorised R code.


My silly little benchmark translated the Lua code

sum = 0
for i=1,N do sum = sum + i end

into vectorised R

sum(as.numeric(1:N))

The performance results were as follows:

for loop in R:  0.75 Mops/s  (200 ops in 2.66 s)
vectorised R:   29.75 Mops/s  (5000 ops in 1.68 s)
Lua:51.54 Mops/s  (1 ops in 1.94 s)
Perl:   8.26 Mops/s  (1000 ops in 1.21 s)

Note that Lua is an interpreted language (compiled to byte code); with  
the just in time compiler I get more than 230 Mops/s.


I suspect that this has to do with cache trashing, since the  
vectorised code in R operates on large vectors that have to be read  
from / written to RAM, while the Lua loop presumably runs entirely  
from the L1 cache.  (Before you ask, I split the vectorised R code  
into a loop that processes 1 million numbers at a time; I tried  
different ways of coding the benchmark and picked the fastest solution.)



Perhaps loops in R aren't always as slow (compared to matrix
operations) as one seemed to think.


depends how and where you use them.  in the problem discussed here,  
they

do slow down the code for some class of inputs and do not speedup for
the others, compared to the array version of pat.


My mistake was to think that vectorisation will always give a  
substantial performance boost and that for-loops should be avoided  
whenever possible. But it's really just the inner loops that need to  
be vectorised: iterating over the outer margins of a matrix doesn't  
add much overhead, especially if the vectorised solution would have to  
operate on huge matrices.


Guess that's a bad habit from my old Matlab days (back in the early  
90s) ...


Best,
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Efficient matrix computations

2009-02-17 Thread Shimrit Abraham
Thanks for your suggestions.

I'll try to implement what you suggested.

Perhaps the following information can help you to think of alternative ways
to speed up computations:

I am coding the Kalman Filter in R because I have certain requirements that
are not provided by the packages regarding State Space  models (dlm, sspir,
dse, are there any more? ) .
Since I don't see a way to avoid a for loop, I am trying to make the matrix
computations as efficent as possible by using the properties that some of
these matrices possess, hence my questions.

The Kalman Filter requires sandwich matrix products, say ZPZ', where P is a
covariance matrix, hence positive definite and symmetric (Z is not
necessarily a square matrix). To compute the likelihood, I also need an
efficient way to calculate the determinant of a covariance matrix, which is
again positive definite and symmetric.

Please let me know if this information gives you some new ideas on how to
solve my problem.

Thanks,

S.A.



On Tue, Feb 17, 2009 at 11:09 AM, Dimitris Rizopoulos <
d.rizopou...@erasmusmc.nl> wrote:

> sorry, in my previous e-mail it should be
>
> tcrossprod()
> # and
> prod(eigen(mat, symmetric = TRUE, only.values = TRUE)$values)
>
>
> Best,
> Dimitris
>
> Shimrit Abraham wrote:
>
>> Hi,
>>
>> I am looking for two ways to speed up my computations:
>>
>> 1. Is there a function that efficiently computes the 'sandwich product' of
>> three matrices, say, ZPZ'
>> 2. Is there a function that efficiently computes the determinant of a
>> positive definite symmetric matrix?
>>
>> Thanks,
>>
>> S.A.
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> --
> Dimitris Rizopoulos
> Assistant Professor
> Department of Biostatistics
> Erasmus Medical Center
>
> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
> Tel: +31/(0)10/7043478
> Fax: +31/(0)10/7043014
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scatterplot and correlation for weird data format

2009-02-17 Thread Jim Lemon

William Simpson wrote:

I have data in a format like this:

namessexsex viewnum rating  rt
ahl4f   m   f   56  -1082246
ahl4f   m   f   74  85  1444
ahl4f   m   f   52  151 1595
ahl4f   m   f   85  1   1447
ahl4f   m   f   53  46  1716
ahl4f   m   f   37  145 1276
ahl4f   m   f   50  98  1465
ahl4f   m   f   51  -26 1322
ahl4f   m   f   38  -97 1790
ahl4f   m   f   14  -158865
...
ahl4f   m   p   43  -1361669
ahl4f   m   p   10  -59 808
ahl4f   m   p   67  -1111279
ahl4f   m   p   85  -86 994
ahl4f   m   p   100 134 1337
ahl4f   m   p   76  56  665
ahl4f   m   p   51  -49 594
ahl4f   m   p   33  -118505
ahl4f   m   p   49  -1561283
...
and so on for many subjects (name)

I would like to do a scatterplot of the rating given by each subject
(with identifier "name") for the frontal (view=="f") and profile
(view=="p") views of each face (each face has an identifier "num").
I'd like to find the correlation as well.
For each subject, since there are 100 faces, there will be 100 points
on the scatterplot. I would just lump all the subjects' data together
for the plot and correlation I think (unless somebody tells me I
should do each subject separately).

I'm stumped on how to do this. Thanks very much for any help!
  

Hi Bill,
The first thing that comes to mind is a variation on count.overplot, a 
function that displays the number of overplotted points for a given 
tolerance rather than a blur of separate symbols. The problem would be 
separating the various categories of experimental stimuli in your case. 
You could use, say, "F" and "P" as suffixes for the counts to indicate 
orientation, color to indicate sex of face, male/female symbol for sex 
of respondent, and so on. The problem is that you end up with a 
difficult to interpret plot, as each entry (of which there will still be 
many) must be decoded by the viewer. If you think this is worth 
pursuing, email me and I will try to outline a way to do it.


Another, perhaps simpler way is to define a summary score for each 
subject for each class of face and plot that.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Alternate to for-loop

2009-02-17 Thread Wacek Kusnierczyk
Stefan Evert wrote:
>
>> hmm, would you be saying that r's vectorised performance is overhyped?
>> or is it just that non-vectorised code in r is slow?
>
> What I meant, I guess, was (apart from a little bit of trolling) that
> I'd had misconceptions about the speed differences between loops and
> vectorised code.  In particular, I had entertained the naive belief
> that vectorised solutions are always highly efficient (I wonder if I'm
> the only one who was naive enough to think this ..), so I was very
> much surprised to find a loop in an interpreted language like Lua to
> be faster than vectorised R code.
>
> My silly little benchmark translated the Lua code
>
> sum = 0
> for i=1,N do sum = sum + i end
>
> into vectorised R
>
> sum(as.numeric(1:N))
>

what you're benchmarking here is a lump of vector allocation and the
actual summing.  assuming that you already have generated and stored the
values to be summed, the runtime differs a bit:

system.time(sum(as.numeric(1:10^8)))

system.time({ x = 1:10^8 })
system.time({ y = as.numeric(x) })
system.time(sum(y))

though again, it's a factor below one order of magnitude.

vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with dates in zoo package

2009-02-17 Thread Gabor Grothendieck
See ?which.max :

library(zoo)
z <- read.zoo("myfile.dat", sep = ",", format = "%m/%d/%Y")
f <- function(x) time(x)[which.max(x)]
ix <- tapply(z, floor(as.yearmon(time(z))), f)
z[ix]


On Tue, Feb 17, 2009 at 3:58 AM, CJ Rubio  wrote:
>
> i have the following code - assimilating the maximum annual discharge each
> year ffrom a daily discharge record from year 1989-2005.
>
> m <- read.table("D:/documents/5 stations/01014000.csv", sep =",")
> z <- zoo(m[,4],as.Date(as.character(m[,3]), "%m/%d/%Y"))
> x <- aggregate(z, floor(as.numeric(as.yearmon(time(z, max)   #code
> suggested by Gabor Grothendieck
>
> which gives me the maximum discharge each year and the year itself.. what i
> am trying now is to produce the date when the maximum discharge was observed
> in the pattern -mm-dd, like:
>
> 1988 11/9/1988  18600
> 1989  5/8/1989   49000
> ...
> 2005 4/26/2005111000
>
> thank you all in advance...
> --
> View this message in context: 
> http://www.nabble.com/problem-with-dates-in-zoo-package-tp22053103p22053103.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combination

2009-02-17 Thread Dani Valverde

Hello,
I have a sequence of numbers:
seq(1:50)
and I would like to have all the possible combinations with this numbers 
without repeating any combination:

11, 12, 13, ... ,22,23,24,...
How can I do it?
Best,

Dani

--
Daniel Valverde Saubí

Grup de Biologia Molecular de Llevats
Facultat de Veterinària de la Universitat Autònoma de Barcelona
Edifici V, Campus UAB
08193 Cerdanyola del Vallès- SPAIN

Centro de Investigación Biomédica en Red
en Bioingeniería, Biomateriales y
Nanomedicina (CIBER-BBN)

Grup d'Aplicacions Biomèdiques de la RMN
Facultat de Biociències
Universitat Autònoma de Barcelona
Edifici Cs, Campus UAB
08193 Cerdanyola del Vallès- SPAIN
+34 93 5814126

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combination

2009-02-17 Thread Henrique Dallazuanna
Try this:

apply(expand.grid(1:50, 1:50), 1, paste, collapse = '')

On Tue, Feb 17, 2009 at 8:37 AM, Dani Valverde wrote:

> Hello,
> I have a sequence of numbers:
> seq(1:50)
> and I would like to have all the possible combinations with this numbers
> without repeating any combination:
> 11, 12, 13, ... ,22,23,24,...
> How can I do it?
> Best,
>
> Dani
>
> --
> Daniel Valverde Saubí
>
> Grup de Biologia Molecular de Llevats
> Facultat de Veterinària de la Universitat Autònoma de Barcelona
> Edifici V, Campus UAB
> 08193 Cerdanyola del Vallès- SPAIN
>
> Centro de Investigación Biomédica en Red
> en Bioingeniería, Biomateriales y
> Nanomedicina (CIBER-BBN)
>
> Grup d'Aplicacions Biomèdiques de la RMN
> Facultat de Biociències
> Universitat Autònoma de Barcelona
> Edifici Cs, Campus UAB
> 08193 Cerdanyola del Vallès- SPAIN
> +34 93 5814126
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combination

2009-02-17 Thread Eik Vettorazzi

Hi Dani,
see ?combn .

combn(1:50,2)
gives you all combinations as matrix.
you can do sth like

apply(combn(1:50,2),2, paste, sep="", collapse="")

to get concatenated results.

hth.

Dani Valverde schrieb:

Hello,
I have a sequence of numbers:
seq(1:50)
and I would like to have all the possible combinations with this 
numbers without repeating any combination:

11, 12, 13, ... ,22,23,24,...
How can I do it?
Best,

Dani



--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/42803-8243
F ++49/40/42803-7790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with calculating the residuals in an AR(p)-model

2009-02-17 Thread Spiderschwein

Hi,

I'm trying to calculate the residuals in a one-sided AR(p) model:
sum (phi_j (X_t-j - EX)) = epsilon

My X is an ARMA(1,1)-model, which can be represented as an AR(infty) model.
I calculate the order of the model with AIC and the parameters with the
Yule-Walker method.

For the residuals, I tried:

for (k in (p+1):n){ 
 for (j in 1:p){
  eps[k] <- sum(phi[j] * (X[k-j] - mean(X)))
 }
 ceps[k] <- eps[k] - sum(eps)/(n-p+1) #centering the residuals
}


The residuals in my ARMA(1,1)-models are t-distributed with 6 degrees of
freedom.
Unfortunately my code results in residuals, which are all around zero. So
ecdf(eps) or ecdf(ceps) are very different from a t(6)-distribution.

Where's the bug in my code?

Regards,
Martin
-- 
View this message in context: 
http://www.nabble.com/Problem-with-calculating-the-residuals-in-an-AR%28p%29-model-tp22055764p22055764.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using eval in multinom argument

2009-02-17 Thread Crouch, Daniel
Thanks for your help,

I managed to get it working like this:

fml<-as.formula(paste("y ~", paste(PCnames, collapse="+")))
y = class.ind(grp[outgroup])
z1=multinom(formula = fml, data=data.frame(scores))

Daniel Crouch



**QUOTE:
Forget eval(parse(text = ))

See

?as.formula
?update.formula

and try out the example() s there.

HTH,

Chuck




On Mon, 16 Feb 2009, Crouch, Daniel wrote:

> > Hi,
> >
> > I am having difficulty entering a 'programmable' argument into the multinom 
> > function from the nnet package. Interactively, I can get the function to 
> > work fine by calling it this way:
> >
> > z1=multinom(formula = class.ind(grp[-outgroup])~ (PC1 + PC2 + PC3), 
> > data=data.frame(scores))
> >
> > However I need to be able to change the number of variables I am looking 
> > for in 'scores' and so am trying to call it this way...
> >
> > z1=multinom(formula = class.ind(grp[-outgroup])~ eval(parse(text=PCnames)), 
> > data=data.frame(scores))
> >
> > ...where, for example, PCnames = c("PC1", "+",   "PC2", "+",   "PC3")
> >
> > This gives no error messages, but only the last variable (in this case PC3) 
> > gets considered in the model. z1 looks like this:
> >
> >  (Intercept) eval(parse(text = PCnames))
> > 23.530352  -116.87140
> > 3   -1.30861313.59134
> > 43.662172   -57.52198
> > 5   -1.216041  -242.38827
> > 6   -9.377894  -367.71614
> > 7   -3.145738  -286.19766
> >
> > Rather than this:
> >
> >  (Intercept)   PC1PC2   PC3
> > 2   288.97131   889.281  3776.5837 -2105.751
> > 3  -712.53519  2775.663  8490.5724  8602.834
> > 4   229.17772  4234.950   329.6995 -2182.238
> > 585.54585 -3036.657  3968.2517 -3450.070
> > 6  -676.55377 -9545.785  2422.5340 -7183.686
> > 7  -631.91921 10997.432 -3310.2905 -5348.513
> >
> > Any help would be much appreciated.
> >
> > Thanks,
> > Daniel Crouch
> >
> >
> > Daniel Crouch
> > Research Student
> > Department of Medical & Molecular Genetics
> > King's College London
> > 8th Floor, Tower Wing
> > Guy's Hospital
> > London SE1 9RT
> > United Kingdom
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >

Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901





Daniel Crouch
Research Student
Department of Medical & Molecular Genetics
King's College London
8th Floor, Tower Wing
Guy's Hospital
London SE1 9RT
United Kingdom
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Mark Difford

Hi Dylan,

>> Am I trying to use contrast.Design() for something that it was not
>> intended for? ...

I think Prof. Harrell's main point had to do with how interactions are
handled. You can also get the kind of contrasts that Patrick was interested
in via multcomp. If we do this using your artificial data set we see that
the contrasts are the same as those got by fitting the model using
contr.sdif, but a warning is generated about interactions in the model &c.
[see example code]

Part of Prof. Harrell's "system" is that in generating contrasts via
contr.Design an appropriate value is automatically chosen for the
interacting variable (in this case the median value of x) so that a
reasonable default set of contrasts is calculated. This can of course be
changed.

Coming to your question [?] about how to generate the kind of contrasts that
Patrick wanted using contrast.Design. Well, it is not that straightforward,
though I may have missed something in the documentation to the function. In
the past I have cobbled them together using the kind of hack given below:

## Exampe code
x <- rnorm(100) 
y1 <- x[1:25] * 2 + rnorm(25, mean=1)
y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)


d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) 


# now try with contrast.Design(): 
library(multcomp)
dd <- datadist(d) 
options(datadist='dd') 
l <- ols(y ~ x * f, data=d)

## model fitted using successive difference contrasts
library(MASS)
T.lm <- lm(y ~ x * f, contrasts=list(f="contr.sdif"), data=d)
summary(T.lm)

## show the warning: model fitted using ols() and treatment contrasts
summary(glht(l, linfct=mcp(f="Seq")))

## "custom" successive difference contrasts using contrast.Design
TFin <- {}
for (i in 1:(length(levels(d$f))-1)) {
tcont <- contrast(l, a=list(f=levels(d$f)[i]),
b=list(f=levels(d$f)[i+1]))
TFin <- rbind(TFin, tcont)
rownames(TFin)[i] <-  paste(levels(d$f)[i], levels(d$f)[i+1], sep="-")
}
TFin[,1:9]

Regards, Mark.



Dylan Beaudette-2 wrote:
> 
> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>  wrote:
>> Greg Snow a écrit :
>>> One approach is to create your own contrasts matrix:
>>>
>>>
 mycmat <- diag(8)
 mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
 mycmati <- solve(mycmat)
 contrasts(agefactor) <- mycmati[,-1]

>>>
>>> Now when you use agefactor, the intercept will be the first age group
>>> and the slopes will be the differences between the pairs of groups (make
>>> sure that the order of the levels of agefactor is correct).
>>>
>>> The difference between this method and the contr.sdif function in MASS
>>> is how the intercept will end up being interpreted (and the dimnames).
>>>
>>> Hope this helps,
>>>
>>>
>>
>> Actually, exactly what I needed including the reference to contr.sdif in
>> MASS I did not spot before (although I am a faithful reader of the
>> yellow book... but so many things still escape to me). Again thanks a
>> lot.
>>
>> Patrick
>>
> 
> Glad you were able to solve your problem. Frank replied earlier with
> the suggestion to use contrast.Design() to perform the tests after the
> initial model has been fit. Perhaps I am a little denser than normal,
> but I cannot figure out how to apply contrast.Design() to a similar
> model with several levels of a factor. Example data:
> 
> # need these
> library(lattice)
> library(Design)
> 
> # replicate an important experimental dataset
> set.seed(10101010)
> x <- rnorm(100)
> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
> 
> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4],
> each=25)))
> 
> # plot
> xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard
> Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'),
> ylab='Number of Pirates', xlab='Distance from Land')
> 
> # standard comparison to base level of f
> summary(lm(y ~ x * f, data=d))
> 
> # compare to level 4 of f
> summary(lm(y ~ x * C(f, base=4), data=d))
> 
> 
> # now try with contrast.Design():
> dd <- datadist(d)
> options(datadist='dd')
> l <- ols(y ~ x * f, data=d)
> 
> # probably the wrong approach, as the results do not look too familiar:
> contrast(l, list(f=levels(d$f)))
> 
>  x  f Contrast  S.E.  Lower   Upper t Pr(>|t|)
>  -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515  2.02 0.0460
>  -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456  2.96 0.0039
>  -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0.
>  -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0.
> 
> 
> This is adjusting the output to a given value of 'x'... Am I trying to
> use contrast.Design() for something that it was not intended for? Any
> tips Frank?
> 
> Cheers,
> Dylan
> 
> __
> R-help@r-proje

Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Chuck Cleland
On 2/16/2009 10:18 PM, Dylan Beaudette wrote:
> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>  wrote:
>> Greg Snow a écrit :
>>> One approach is to create your own contrasts matrix:
>>>
>>>
 mycmat <- diag(8)
 mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
 mycmati <- solve(mycmat)
 contrasts(agefactor) <- mycmati[,-1]

>>> Now when you use agefactor, the intercept will be the first age group and 
>>> the slopes will be the differences between the pairs of groups (make sure 
>>> that the order of the levels of agefactor is correct).
>>>
>>> The difference between this method and the contr.sdif function in MASS is 
>>> how the intercept will end up being interpreted (and the dimnames).
>>>
>>> Hope this helps,
>>>
>>>
>> Actually, exactly what I needed including the reference to contr.sdif in
>> MASS I did not spot before (although I am a faithful reader of the
>> yellow book... but so many things still escape to me). Again thanks a lot.
>>
>> Patrick
>>
> 
> Glad you were able to solve your problem. Frank replied earlier with
> the suggestion to use contrast.Design() to perform the tests after the
> initial model has been fit. Perhaps I am a little denser than normal,
> but I cannot figure out how to apply contrast.Design() to a similar
> model with several levels of a factor. Example data:
> 
> # need these
> library(lattice)
> library(Design)
> 
> # replicate an important experimental dataset
> set.seed(10101010)
> x <- rnorm(100)
> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
> 
> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25)))
> 
> # plot
> xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard
> Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'),
> ylab='Number of Pirates', xlab='Distance from Land')
> 
> # standard comparison to base level of f
> summary(lm(y ~ x * f, data=d))
> 
> # compare to level 4 of f
> summary(lm(y ~ x * C(f, base=4), data=d))
> 
> # now try with contrast.Design():
> dd <- datadist(d)
> options(datadist='dd')
> l <- ols(y ~ x * f, data=d)
> 
> # probably the wrong approach, as the results do not look too familiar:
> contrast(l, list(f=levels(d$f)))
> 
>  x  f Contrast  S.E.  Lower   Upper t Pr(>|t|)
>  -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515  2.02 0.0460
>  -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456  2.96 0.0039
>  -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0.
>  -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. 
> 
> This is adjusting the output to a given value of 'x'... Am I trying to
> use contrast.Design() for something that it was not intended for? Any
> tips Frank?

  Does this help?

# Compare with summary(lm(y ~ x * f, data=d))
contrast(l, a=list(f=levels(d$f)[2:4], x=0),
b=list(f=levels(d$f)[1],   x=0))

 f Contrast  S.E.  Lower  Upper t Pr(>|t|)
 b 0.3673455 0.2724247 -0.1737135 0.9084046  1.35 0.1808
 c 4.1310015 0.2714011  3.5919754 4.6700275 15.22 0.
 d 4.4308653 0.2731223  3.8884207 4.9733098 16.22 0.

Error d.f.= 92

# Compare with summary(lm(y ~ x * C(f, base=4), data=d))
contrast(l, a=list(f=levels(d$f)[1:3], x=0),
b=list(f=levels(d$f)[4],   x=0))

 f Contrast   S.E.  Lower  Upper  t  Pr(>|t|)
 a -4.4308653 0.2731223 -4.9733098 -3.8884207 -16.22 0.
 b -4.0635198 0.2782704 -4.6161888 -3.5108507 -14.60 0.
 c -0.2998638 0.2772684 -0.8505427  0.2508151  -1.08 0.2823

Error d.f.= 92

> Cheers,
> Dylan
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assuming AR(1) residuals in OLS

2009-02-17 Thread constantine
Thank you for the lightening replies.

I tested various corStruct objects (?corClasses) using the nlme
package and all work flawlessly.

My best regards to all...

Constantine Tsardounis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combination

2009-02-17 Thread Dimitris Rizopoulos

Eik Vettorazzi wrote:

Hi Dani,
see ?combn .

combn(1:50,2)
gives you all combinations as matrix.
you can do sth like

apply(combn(1:50,2),2, paste, sep="", collapse="")


note that you could simplify the above by

combn(50, 2, paste, collapse = "")


Best,
Dimitris



to get concatenated results.

hth.

Dani Valverde schrieb:

Hello,
I have a sequence of numbers:
seq(1:50)
and I would like to have all the possible combinations with this 
numbers without repeating any combination:

11, 12, 13, ... ,22,23,24,...
How can I do it?
Best,

Dani





--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Mark Difford

Hi Dylan, Chuck,

>> contrast(l, a=list(f=levels(d$f)[1:3], x=0), b=list(f=levels(d$f)[4],  
>> x=0))

There is a subtlety here that needs to be emphasized. Setting the
interacting variable (x) to zero is reasonable in this case, because the
mean value of rnorm(n) is zero. However, in the real world of real data it
is somewhat unusual for zero to be a reasonable value for an interacting
variable. In fact, it is this setting-to-zero that causes multcomp's "bells
to ring" with a word of warning.

##
glht(l, linfct=mcp(f="Seq"))$linfct

Regards, Mark.



Chuck Cleland wrote:
> 
> On 2/16/2009 10:18 PM, Dylan Beaudette wrote:
>> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>>  wrote:
>>> Greg Snow a écrit :
 One approach is to create your own contrasts matrix:


> mycmat <- diag(8)
> mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
> mycmati <- solve(mycmat)
> contrasts(agefactor) <- mycmati[,-1]
>
 Now when you use agefactor, the intercept will be the first age group
 and the slopes will be the differences between the pairs of groups
 (make sure that the order of the levels of agefactor is correct).

 The difference between this method and the contr.sdif function in MASS
 is how the intercept will end up being interpreted (and the dimnames).

 Hope this helps,


>>> Actually, exactly what I needed including the reference to contr.sdif in
>>> MASS I did not spot before (although I am a faithful reader of the
>>> yellow book... but so many things still escape to me). Again thanks a
>>> lot.
>>>
>>> Patrick
>>>
>> 
>> Glad you were able to solve your problem. Frank replied earlier with
>> the suggestion to use contrast.Design() to perform the tests after the
>> initial model has been fit. Perhaps I am a little denser than normal,
>> but I cannot figure out how to apply contrast.Design() to a similar
>> model with several levels of a factor. Example data:
>> 
>> # need these
>> library(lattice)
>> library(Design)
>> 
>> # replicate an important experimental dataset
>> set.seed(10101010)
>> x <- rnorm(100)
>> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
>> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
>> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
>> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
>> 
>> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4],
>> each=25)))
>> 
>> # plot
>> xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard
>> Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'),
>> ylab='Number of Pirates', xlab='Distance from Land')
>> 
>> # standard comparison to base level of f
>> summary(lm(y ~ x * f, data=d))
>> 
>> # compare to level 4 of f
>> summary(lm(y ~ x * C(f, base=4), data=d))
>> 
>> # now try with contrast.Design():
>> dd <- datadist(d)
>> options(datadist='dd')
>> l <- ols(y ~ x * f, data=d)
>> 
>> # probably the wrong approach, as the results do not look too familiar:
>> contrast(l, list(f=levels(d$f)))
>> 
>>  x  f Contrast  S.E.  Lower   Upper t Pr(>|t|)
>>  -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515  2.02 0.0460
>>  -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456  2.96 0.0039
>>  -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0.
>>  -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. 
>> 
>> This is adjusting the output to a given value of 'x'... Am I trying to
>> use contrast.Design() for something that it was not intended for? Any
>> tips Frank?
> 
>   Does this help?
> 
> # Compare with summary(lm(y ~ x * f, data=d))
> contrast(l, a=list(f=levels(d$f)[2:4], x=0),
> b=list(f=levels(d$f)[1],   x=0))
> 
>  f Contrast  S.E.  Lower  Upper t Pr(>|t|)
>  b 0.3673455 0.2724247 -0.1737135 0.9084046  1.35 0.1808
>  c 4.1310015 0.2714011  3.5919754 4.6700275 15.22 0.
>  d 4.4308653 0.2731223  3.8884207 4.9733098 16.22 0.
> 
> Error d.f.= 92
> 
> # Compare with summary(lm(y ~ x * C(f, base=4), data=d))
> contrast(l, a=list(f=levels(d$f)[1:3], x=0),
> b=list(f=levels(d$f)[4],   x=0))
> 
>  f Contrast   S.E.  Lower  Upper  t  Pr(>|t|)
>  a -4.4308653 0.2731223 -4.9733098 -3.8884207 -16.22 0.
>  b -4.0635198 0.2782704 -4.6161888 -3.5108507 -14.60 0.
>  c -0.2998638 0.2772684 -0.8505427  0.2508151  -1.08 0.2823
> 
> Error d.f.= 92
> 
>> Cheers,
>> Dylan
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code. 
> 
> -- 
> Chuck Cleland, Ph.D.
> NDRI, Inc. (www.ndri.org)
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailm

[R] Chromatogram deconvolution and peak matching

2009-02-17 Thread bartjoosen

Hi,

I'm trying to match peaks between chromatographic runs.
I'm able to match peaks when they are chromatographed with the same method,
but not when there are different methods are used and spectra comes in to
play.

While searching I found the ALS package which should be usefull for my
application, but I couldn't figure it out.

I made some dummy chroms with R, which mimic my actual datasets, to play
with, but after looking at the manuals of ALS, I'm affraid I can't get the
job done. Can someone put me on the right way?

Here is my code to generate the dummy chroms, which also plots the 2 chroms
and the spectra of the 3 peaks:

#2D chromatogram generation
par(mfrow=c(3,1))
time <- seq(0,20,by=0.05)
f <- function(x,rt) dnorm((x-rt),mean=0,sd=rt/35)
c1 <- f(time,6.1)
c2 <- f(time,5.6)
c3 <- f(time,15)
plot(c1+c2+c3~time,type="l",main="chrom1")

#spectrum generation
spectra <- function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3)
x <- 220:300
s1 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0)
s2 <- spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20)
s3 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20)

chrom1.tot <-
data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
names(chrom.tot)[-1] <- x

#generation of chromatogram 2
c1 <- f(time,2.1)
c2 <- f(time,4)
c3 <- f(time,8) 
plot(c1+c2+c3~time,type="l",main="chrom2")

chrom2.tot <-
data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
names(chrom.tot)[-1] <- x

plot(s1~x,type="l",main="spectra")
lines(s2~x,col=2)
lines(s3~x,col=3)

Thanks for your time

Kind Regards

Bart
-- 
View this message in context: 
http://www.nabble.com/Chromatogram-deconvolution-and-peak-matching-tp22057592p22057592.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stopping stepAIC (MASS package)

2009-02-17 Thread Ben Bolker
Peter Jepsen  DCE.AU.DK> writes:
> [snip]
>  ... I would save valuable time if
> the procedure would stop automatically when it encounters any anomaly
> that causes it to warn me. Is this possible? And if so, how?
> 

  Maybe options(warn=2)  will do what you want.

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assuming AR(1) residuals in OLS

2009-02-17 Thread constantine
Thank you Gabor Grothendieck for your message.
I would surely like to say, that if someone wants to assume AR(1)
residuals, running the regression y ~ x, could run


gls(y~x, correlation = corAR1(0, ~1))




Constantine Tsardounis
http://www.costis.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] joining "one-to-many"

2009-02-17 Thread Monica Pisica

Hello list,
 
I am wondering if a joining "one-to-many" can be done a little bit easier. I 
tried merge function but I was not able to do it, so I end up using for and if.
 
Suppose you have a table with locations, each location repeated several times, 
and some attributes at that location. The second table has the same locations, 
but only once with a different set of attributes. I would like to add the 
second set of attributes to the first table.
 
Example:
 
set.seed <- 123
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
val1 <- round(rnorm(10),2)
val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
t1 <- data.frame(loc, val1, val2)
t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
67, 48))
 
# join one-to-many
 
n <- nrow(t1)
m <- nrow(t2)
t1$val3 <- rep(1, n)
t1$val4 <- rep(1, n)
 
for (i in 1:n) {
for (j in 1:m){
if (t1$loc[i]==t2$loc[j]) {
t1$val3[i] <- as.character(t2$val3[j])
t1$val4[i] <- t2$val4[j]
}
}
}
 
Desired result:

t1
   loc  val1 val2 val3 val4
1   L1 -0.41am   25
2   L1 -0.69bm   25
3   L1  0.36cm   25
4   L2  1.11an   67
5   L2  0.15bn   67
6   L2 -0.80dn   67
7   L2 -0.08fn   67
8   L2 -1.01en   67
9   L3 -1.01bp   48
10  L3 -2.50ep   48

 
This code works OK but it is slow if the data frames are actually bigger than 
my little example. I hope somebody knows of a better way of doing these type of 
things.
 
Thanks,
 
Monica
_


22009
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] joining "one-to-many"

2009-02-17 Thread hadley wickham
On Tue, Feb 17, 2009 at 8:33 AM, Monica Pisica  wrote:
>
> Hello list,
>
> I am wondering if a joining "one-to-many" can be done a little bit easier. I 
> tried merge function but I was not able to do it, so I end up using for and 
> if.
>
> Suppose you have a table with locations, each location repeated several 
> times, and some attributes at that location. The second table has the same 
> locations, but only once with a different set of attributes. I would like to 
> add the second set of attributes to the first table.
>
> Example:
>
> set.seed <- 123
> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
> val1 <- round(rnorm(10),2)
> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
> t1 <- data.frame(loc, val1, val2)
> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
> 67, 48))
>
> # join one-to-many
>
> n <- nrow(t1)
> m <- nrow(t2)
> t1$val3 <- rep(1, n)
> t1$val4 <- rep(1, n)
>
> for (i in 1:n) {
>for (j in 1:m){
>if (t1$loc[i]==t2$loc[j]) {
>t1$val3[i] <- as.character(t2$val3[j])
>t1$val4[i] <- t2$val4[j]
>}
>}
> }
>
> Desired result:
>
> t1
>   loc  val1 val2 val3 val4
> 1   L1 -0.41am   25
> 2   L1 -0.69bm   25
> 3   L1  0.36cm   25
> 4   L2  1.11an   67
> 5   L2  0.15bn   67
> 6   L2 -0.80dn   67
> 7   L2 -0.08fn   67
> 8   L2 -1.01en   67
> 9   L3 -1.01bp   48
> 10  L3 -2.50ep   48
>
>
> This code works OK but it is slow if the data frames are actually bigger than 
> my little example. I hope somebody knows of a better way of doing these type 
> of things.

merge(t1, t2)


Hadley
-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] joining "one-to-many"

2009-02-17 Thread Marc Schwartz
on 02/17/2009 08:33 AM Monica Pisica wrote:
> Hello list,
>  
> I am wondering if a joining "one-to-many" can be done a little bit
easier. I tried merge function but I was not able to do it, so I end up
using for and if.
> 
> Suppose you have a table with locations, each location repeated
several times, and some attributes at that location. The second table
has the same locations, but only once with a different set of
attributes. I would like to add the second set of attributes to the
first table.

> Example:
>  
> set.seed <- 123

This needs to be set.seed(123)

See ?set.seed

:-)

> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
> val1 <- round(rnorm(10),2)
> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
> t1 <- data.frame(loc, val1, val2)
> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
> 67, 48))
>  
> # join one-to-many
>  
> n <- nrow(t1)
> m <- nrow(t2)
> t1$val3 <- rep(1, n)
> t1$val4 <- rep(1, n)
>  
> for (i in 1:n) {
> for (j in 1:m){
> if (t1$loc[i]==t2$loc[j]) {
> t1$val3[i] <- as.character(t2$val3[j])
> t1$val4[i] <- t2$val4[j]
> }
> }
> }
>  
> Desired result:
> 
> t1
>loc  val1 val2 val3 val4
> 1   L1 -0.41am   25
> 2   L1 -0.69bm   25
> 3   L1  0.36cm   25
> 4   L2  1.11an   67
> 5   L2  0.15bn   67
> 6   L2 -0.80dn   67
> 7   L2 -0.08fn   67
> 8   L2 -1.01en   67
> 9   L3 -1.01bp   48
> 10  L3 -2.50ep   48
> 
>  
> This code works OK but it is slow if the data frames are actually
bigger than my little example. I hope somebody knows of a better way of
doing these type of things.

> merge(t1, t2, by = "loc")
   loc  val1 val2 val3 val4
1   L1 -0.32am   25
2   L1 -1.50bm   25
3   L1 -0.31cm   25
4   L2  1.42an   67
5   L2  0.32bn   67
6   L2 -0.12dn   67
7   L2  0.33fn   67
8   L2 -1.74en   67
9   L3  0.88bp   48
10  L3  1.88ep   48

> system.time(merge(t1, t2, by = "loc"))
   user  system elapsed
  0.004   0.000   0.019


HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] frequency table for multiple variables

2009-02-17 Thread Hans Ekbrand
Hi r-help!

Consider the following data-frame:

   var1 var2 var3 
1 314 
2 223 
3 223 
4 44   NA 
5 435 
6 223 
7 343 

How can I get R to convert this into the following?

Value 1  2  3  4  5 
var1  0  3  2  2  0
var2  1  3  1  2  0 
var3  0  0  4  1  1

TIA,

-- 
Hans Ekbrand (http://sociologi.cjb.net) 
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] joining "one-to-many"

2009-02-17 Thread Gabor Grothendieck
Try merge(t1, t2)


On Tue, Feb 17, 2009 at 9:33 AM, Monica Pisica  wrote:
>
> Hello list,
>
> I am wondering if a joining "one-to-many" can be done a little bit easier. I 
> tried merge function but I was not able to do it, so I end up using for and 
> if.
>
> Suppose you have a table with locations, each location repeated several 
> times, and some attributes at that location. The second table has the same 
> locations, but only once with a different set of attributes. I would like to 
> add the second set of attributes to the first table.
>
> Example:
>
> set.seed <- 123
> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
> val1 <- round(rnorm(10),2)
> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
> t1 <- data.frame(loc, val1, val2)
> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
> 67, 48))
>
> # join one-to-many
>
> n <- nrow(t1)
> m <- nrow(t2)
> t1$val3 <- rep(1, n)
> t1$val4 <- rep(1, n)
>
> for (i in 1:n) {
>for (j in 1:m){
>if (t1$loc[i]==t2$loc[j]) {
>t1$val3[i] <- as.character(t2$val3[j])
>t1$val4[i] <- t2$val4[j]
>}
>}
> }
>
> Desired result:
>
> t1
>   loc  val1 val2 val3 val4
> 1   L1 -0.41am   25
> 2   L1 -0.69bm   25
> 3   L1  0.36cm   25
> 4   L2  1.11an   67
> 5   L2  0.15bn   67
> 6   L2 -0.80dn   67
> 7   L2 -0.08fn   67
> 8   L2 -1.01en   67
> 9   L3 -1.01bp   48
> 10  L3 -2.50ep   48
>
>
> This code works OK but it is slow if the data frames are actually bigger than 
> my little example. I hope somebody knows of a better way of doing these type 
> of things.
>
> Thanks,
>
> Monica
> _
>
>
> 22009
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Harry Haupt

Hi,
after updating to foreign version 0.8-32, I experienced the following error
when I tried to load a SPSS file:

Fehler in inherits(x, "factor") : objekt "cp" nicht gefunden
Zusätzlich: Warning message:
In read.spss("***l.sav", use.value.labels = TRUE, to.data.frame = TRUE) :
  ***.sav: File-indicated character representation code (1252) looks like a
Windows codepage

It worked without problems with earlier versions.

Thanks for any clues,
best,
Harry
-- 
View this message in context: 
http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22059259.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Uninstall question

2009-02-17 Thread ANJAN PURKAYASTHA
I need to uninstall R 2.7.1 from my Mac. What is the best way to uninstall
it? Simply delete the R icon in the Applications folder?
Or is it more involved?
TIA,
Anjan

-- 
=
anjan purkayastha, phd
bioinformatics analyst
whitehead institute for biomedical research
nine cambridge center
cambridge, ma 02142

purkayas [at] wi [dot] mit [dot] edu
703.740.6939

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] joining "one-to-many"

2009-02-17 Thread ONKELINX, Thierry
Hi Monica,

merge(t1, t2) works on your example. So why don't you use merge?

HTH,

Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens Monica Pisica
Verzonden: dinsdag 17 februari 2009 15:33
Aan: R help project
Onderwerp: [R] joining "one-to-many"


Hello list,

I am wondering if a joining "one-to-many" can be done a little bit
easier. I tried merge function but I was not able to do it, so I end up
using for and if.

Suppose you have a table with locations, each location repeated several
times, and some attributes at that location. The second table has the
same locations, but only once with a different set of attributes. I
would like to add the second set of attributes to the first table.

Example:

set.seed <- 123
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
val1 <- round(rnorm(10),2)
val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
t1 <- data.frame(loc, val1, val2)
t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 =
c(25, 67, 48))

# join one-to-many

n <- nrow(t1)
m <- nrow(t2)
t1$val3 <- rep(1, n)
t1$val4 <- rep(1, n)

for (i in 1:n) {
for (j in 1:m){
if (t1$loc[i]==t2$loc[j]) {
t1$val3[i] <- as.character(t2$val3[j])
t1$val4[i] <- t2$val4[j]
}
}
}

Desired result:

t1
   loc  val1 val2 val3 val4
1   L1 -0.41am   25
2   L1 -0.69bm   25
3   L1  0.36cm   25
4   L2  1.11an   67
5   L2  0.15bn   67
6   L2 -0.80dn   67
7   L2 -0.08fn   67
8   L2 -1.01en   67
9   L3 -1.01bp   48
10  L3 -2.50ep   48


This code works OK but it is slow if the data frames are actually bigger
than my little example. I hope somebody knows of a better way of doing
these type of things.

Thanks,

Monica
_


22009
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Uninstall question

2009-02-17 Thread Sundar Dorai-Raj
This is on the Mac FAQ:

http://cran.cnr.berkeley.edu/bin/macosx/RMacOSX-FAQ.html#How-can-R-for-Mac-OS-X-be-uninstalled_003f

HTH,

--sundar

On Tue, Feb 17, 2009 at 7:17 AM, ANJAN PURKAYASTHA
 wrote:
> I need to uninstall R 2.7.1 from my Mac. What is the best way to uninstall
> it? Simply delete the R icon in the Applications folder?
> Or is it more involved?
> TIA,
> Anjan
>
> --
> =
> anjan purkayastha, phd
> bioinformatics analyst
> whitehead institute for biomedical research
> nine cambridge center
> cambridge, ma 02142
>
> purkayas [at] wi [dot] mit [dot] edu
> 703.740.6939
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] joining "one-to-many"

2009-02-17 Thread Monica Pisica


Ok,
 
I feel properly ashamed. I suppose my "real" data is a little bit different 
than my toy data (although i don't know how) because i did try the merge 
function as simple as merge(t1, t2) and did not work. Maybe a reset of my 
session will solve my problems and more coffee my confusion.
 
Again, thanks for your help,
 
Monica
 

> Date: Tue, 17 Feb 2009 10:09:17 -0500
> Subject: Re: [R] joining "one-to-many"
> From: ggrothendi...@gmail.com
> To: pisican...@hotmail.com
> CC: r-help@r-project.org
>
> Try merge(t1, t2)
>
>
> On Tue, Feb 17, 2009 at 9:33 AM, Monica Pisica wrote:
>>
>> Hello list,
>>
>> I am wondering if a joining "one-to-many" can be done a little bit easier. I 
>> tried merge function but I was not able to do it, so I end up using for and 
>> if.
>>
>> Suppose you have a table with locations, each location repeated several 
>> times, and some attributes at that location. The second table has the same 
>> locations, but only once with a different set of attributes. I would like to 
>> add the second set of attributes to the first table.
>>
>> Example:
>>
>> set.seed <- 123
>> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
>> val1 <- round(rnorm(10),2)
>> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
>> t1 <- data.frame(loc, val1, val2)
>> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = 
>> c(25, 67, 48))
>>
>> # join one-to-many
>>
>> n <- nrow(t1)
>> m <- nrow(t2)
>> t1$val3 <- rep(1, n)
>> t1$val4 <- rep(1, n)
>>
>> for (i in 1:n) {
>> for (j in 1:m){
>> if (t1$loc[i]==t2$loc[j]) {
>> t1$val3[i] <- as.character(t2$val3[j])
>> t1$val4[i] <- t2$val4[j]
>> }
>> }
>> }
>>
>> Desired result:
>>
>> t1
>> loc val1 val2 val3 val4
>> 1 L1 -0.41 a m 25
>> 2 L1 -0.69 b m 25
>> 3 L1 0.36 c m 25
>> 4 L2 1.11 a n 67
>> 5 L2 0.15 b n 67
>> 6 L2 -0.80 d n 67
>> 7 L2 -0.08 f n 67
>> 8 L2 -1.01 e n 67
>> 9 L3 -1.01 b p 48
>> 10 L3 -2.50 e p 48
>>
>>
>> This code works OK but it is slow if the data frames are actually bigger 
>> than my little example. I hope somebody knows of a better way of doing these 
>> type of things.
>>
>> Thanks,
>>
>> Monica
>> _
>>
>>
>> 22009
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Bernhard Reinhardt

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John Fox.

As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function out 
of these values and how I extract the values from the object? I´m sure 
the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Arthur Allignol

Hi,

See ?survfit.object

if fit is the object you get using survfit,
fit$surv will give you the survival probability.

Best,
arthur

Bernhard Reinhardt wrote:

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John Fox.

As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function out 
of these values and how I extract the values from the object? I´m sure 
the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Peter Dalgaard
Harry Haupt wrote:
> Hi,
> after updating to foreign version 0.8-32, I experienced the following error
> when I tried to load a SPSS file:
> 
> Fehler in inherits(x, "factor") : objekt "cp" nicht gefunden
> Zusätzlich: Warning message:
> In read.spss("***l.sav", use.value.labels = TRUE, to.data.frame = TRUE) :
>   ***.sav: File-indicated character representation code (1252) looks like a
> Windows codepage
> 
> It worked without problems with earlier versions.
> 
> Thanks for any clues,
> best,
> Harry

Yes, something in the logic appears to have gotten garbled.

It's in this part of read,spss:

if (is.character(reencode)) {
cp <- reencode
reencode <- TRUE
}
else if (codepage <= 500 || codepage >= 2000) {
attr(rval, "codepage") <- NULL
reencode <- FALSE
}
else if (m <- match(cp, knownCP, 0L))
cp <- names(knownCP)[m]

if you get to the 2nd "else if" then cp is unset. Possible the attempted
match should be of codepage? But it still looks wrong: Why restrict to
codepages between 500 and 2000 when knownCP contains several values
above 1???

A workaround is to set reencode="ascii" (or whatever is relevant).

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R scripts and parameters

2009-02-17 Thread mauede
A couple of weeks ago I asked how it is possible to run an R script (not a 
function) passing some parameters.
Someone suggested the function "commandArgs()".
I read the on-line help and found no clarifying example. Therefore I do not 
know how to use it appropriately.
I noticed this function returns the pathname of the R executable which is not 
what I need.

I meant to ask if it is possible to open an R session and launch a script 
passing parameters that the script can retrieve and use itself.
Just like in C I can run a program and call it with  some arguments 

> Example_Prog A B C
  
The program "Example_Prog" can acess its own arguments through the data 
structures "argc" an "argv".

How can I launch an R script simulating the above mechanism ? 
Shall I use source ("script-name") ?
Where are the arguments to be passed, as part of the source call ?
Is the function "commandArgs" to be places as one of the first code lines of 
the script in order to access its own arguments ?
Is there any "commandArgs" usage example ?

Thank you very much in advance.
Maura






tutti i telefonini TIM!


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] frequency table for multiple variables

2009-02-17 Thread Marc Schwartz
on 02/17/2009 09:06 AM Hans Ekbrand wrote:
> Hi r-help!
> 
> Consider the following data-frame:
> 
>var1 var2 var3 
> 1 314 
> 2 223 
> 3 223 
> 4 44   NA 
> 5 435 
> 6 223 
> 7 343 
> 
> How can I get R to convert this into the following?
> 
> Value 1  2  3  4  5 
> var1  0  3  2  2  0
> var2  1  3  1  2  0 
> var3  0  0  4  1  1


> t(sapply(DF, function(x) table(factor(x, levels = 1:5
 1 2 3 4 5
var1 0 3 2 2 0
var2 1 3 1 2 0
var3 0 0 4 1 1


The key is to turn each column into a factor with explicitly defined
common levels for tabulation. This enables the table result to have a
consistent format across each column, allowing for a matrix to be
created, rather than a list.

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R crash after fGarch update

2009-02-17 Thread John Kerpel
Prof Ripley:

Many thanks - it did indeed say it cannot find fGarch after I tried your
advice - but a completely clean re-install did the trick.

John

On Tue, Feb 17, 2009 at 1:09 AM, Prof Brian Ripley wrote:

> Start R with --vanilla, or rename youe saved workspace (.RData).
> Then
>
> library(fGarch)
> load(".RData")  # or whatever you renamed it to.
>
> This will either work or (more ikely) tell you it cannot find fGarch or a
> package it depends on).
>
> On Mon, 16 Feb 2009, John Kerpel wrote:
>
> Hi folks!
>> After updating my packages my R seems to have completely crashed as will
>> not
>> start up - even after I installed 2.8.1 from 2.8.0.
>>
>
> You haven't told us your OS: I am guesing Windows.
>
>  I get the following:
>>
>> Fatal error: unable to restore saved data in .Rdata
>>
>> Error in loadNamespeace(name): there is no package called fGarch
>>
>> But I do have a package called fGarch.
>>
>> After I hit ok, it crashes and exits.  I cannot use any functionality at
>> all.  What do I do?
>>
>> John
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] word wrap using Hershey fonts

2009-02-17 Thread SLCMSR

Hi,

is it possible to wrap a text using Hershey fonts? "\n" does not work!

Thanks in advance,
Martina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread David Winsemius
I seriously doubt that a survfit object could only contain that  
information. I suspect that you are erroneously thinking that what  
print.survfit offers is the entire story.


What does str(survfit(, data=) ) show you?

> data(aml)
> aml.mdl <- survfit(Surv(time, status) ~ x, data=aml)
# this is with survfit.Design since I load Hmisc/Design by default now.
> str(aml.mdl)
List of 18
 $ n: int 23
 $ time : num [1:20] 9 13 18 23 28 31 34 45 48 161 ...
 $ n.risk   : num [1:20] 11 10 8 7 6 5 4 3 2 1 ...
 $ n.event  : num [1:20] 1 1 1 1 0 1 1 0 1 0 ...
 $ surv : num [1:20] 0.909 0.818 0.716 0.614 0.614 ...
 $ type : chr "right"
 $ ntimes.strata: Named int [1:2] 10 10
  ..- attr(*, "names")= chr [1:2] "1" "2"
 $ strata   : Named num [1:2] 10 10
  ..- attr(*, "names")= chr [1:2] "Maintained" "Nonmaintained"
 $ strata.all   : Named int [1:2] 11 12
  ..- attr(*, "names")= chr [1:2] "Maintained" "Nonmaintained"
 $ std.err  : num [1:20] 0.0953 0.1421 0.1951 0.2487 0.2487 ...
 $ upper: num [1:20] 0.987 0.951 0.899 0.835 0.835 ...
 $ lower: num [1:20] 0.508 0.447 0.35 0.266 0.266 ...
 $ conf.type: chr "log-log"
 $ conf.int : num 0.95
 $ maxtime  : num 161
 $ units: chr "Day"
 $ time.label   : chr "time"
 $ call : language survfit(formula = Surv(time, status) ~ x,  
data = aml)

 - attr(*, "class")= chr "survfit"
>

I also don't think survfit returns a Cox model.


On Feb 17, 2009, at 10:37 AM, Bernhard Reinhardt wrote:


Hi!

I came across R just a few days ago since I was looking for a  
toolbox for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from  
John Fox.


As described therein plotting survival-functions works well  
(plot(survfit(model))). But I´d like to do some manipulation with  
the survival-functions before plotting them e.g. dividing one  
survival-function by another.


survfit() only returns an object of the following structure

n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function  
out of these values and how I extract the values from the object? I 
´m sure the calculation is done by the corresponding plot-routine,  
but I couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R scripts and parameters

2009-02-17 Thread Gabor Grothendieck
Try this:

A <- 1
B <- 2
C <- 3
source("myfile.R")

Now the code in myfile can access A, B and C.

On Tue, Feb 17, 2009 at 10:55 AM,   wrote:
> A couple of weeks ago I asked how it is possible to run an R script (not a 
> function) passing some parameters.
> Someone suggested the function "commandArgs()".
> I read the on-line help and found no clarifying example. Therefore I do not 
> know how to use it appropriately.
> I noticed this function returns the pathname of the R executable which is not 
> what I need.
>
> I meant to ask if it is possible to open an R session and launch a script 
> passing parameters that the script can retrieve and use itself.
> Just like in C I can run a program and call it with  some arguments
>
>> Example_Prog A B C
>
> The program "Example_Prog" can acess its own arguments through the data 
> structures "argc" an "argv".
>
> How can I launch an R script simulating the above mechanism ?
> Shall I use source ("script-name") ?
> Where are the arguments to be passed, as part of the source call ?
> Is the function "commandArgs" to be places as one of the first code lines of 
> the script in order to access its own arguments ?
> Is there any "commandArgs" usage example ?
>
> Thank you very much in advance.
> Maura
>
>
>
>
>
>
> tutti i telefonini TIM!
>
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Harry Haupt


Peter Dalgaard wrote:
> 
> Yes, something in the logic appears to have gotten garbled.
> 
> It's in this part of read,spss:
> 
> if (is.character(reencode)) {
> cp <- reencode
> reencode <- TRUE
> }
> else if (codepage <= 500 || codepage >= 2000) {
> attr(rval, "codepage") <- NULL
> reencode <- FALSE
> }
> else if (m <- match(cp, knownCP, 0L))
> cp <- names(knownCP)[m]
> 
> if you get to the 2nd "else if" then cp is unset. Possible the attempted
> match should be of codepage? But it still looks wrong: Why restrict to
> codepages between 500 and 2000 when knownCP contains several values
> above 1???
> 
> A workaround is to set reencode="ascii" (or whatever is relevant).
> 
> -- 
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

Dear Peter,
thanks, reencode="ascii" fixed it (and leaves just the warning message,
which seems to have no effect).
Best,
Harry

-
---
Centre for Statistics
Bielefeld University, Germany
-- 
View this message in context: 
http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22060774.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R scripts and parameters

2009-02-17 Thread Duncan Murdoch

On 2/17/2009 10:55 AM, mau...@alice.it wrote:

A couple of weeks ago I asked how it is possible to run an R script (not a 
function) passing some parameters.
Someone suggested the function "commandArgs()".
I read the on-line help and found no clarifying example. Therefore I do not 
know how to use it appropriately.
I noticed this function returns the pathname of the R executable which is not 
what I need.

I meant to ask if it is possible to open an R session and launch a script 
passing parameters that the script can retrieve and use itself.
Just like in C I can run a program and call it with  some arguments 


Example_Prog A B C
  
The program "Example_Prog" can acess its own arguments through the data structures "argc" an "argv".


How can I launch an R script simulating the above mechanism ? 
Shall I use source ("script-name") ?

Where are the arguments to be passed, as part of the source call ?
Is the function "commandArgs" to be places as one of the first code lines of 
the script in order to access its own arguments ?
Is there any "commandArgs" usage example ?



Gabor gave you a solution from within R.  If you want to run a script 
from the command line, then use commandArgs(TRUE).  For example, put 
this into the file test.R:


commandArgs(TRUE)

(The TRUE says you only want to see the trailing arguments, not 
everything else on the command line.)


Then from the command line, do

Rscript test.R A B C

and you'll see the output

[1] "A" "B" "C"

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ARIMA models

2009-02-17 Thread emj83

is there some sort of R function which can advise me of the best ARIMA(p,q,r)
model to use based on the Schwarz criterion e.g for e.g p=0-5, q =0, r=0-5
or for example p+r< 5???

or is this something I will have to write my own code for?

Thanks Emma
-- 
View this message in context: 
http://www.nabble.com/ARIMA-models-tp22059382p22059382.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009

2009-02-17 Thread Paul Heinrich Dietrich

Hi all,
I've managed to get JAGS working on my Ubuntu Hardy Linux with a 32-bit
computer and AMD processors using R 2.8.1.  JAGS is great.  I've read that
JAGS is the fastest, but that hasn't been my experience.  At any rate, I
have more experience with WinBUGS under Windows and would like a version of
that working as well.

It seems like I've read a lot on the subject and tried a lot, but haven't
managed to get BUGS to work yet.  The most success I've had is to install
WinBUGS or OpenBUGS using this method:
http://www.math.aau.dk/~slb/kurser/bayes-08/install.html

What you also need to know is that you need to open Wine and add a drive. 
Although Z is recommended, I haven't been able to specify it, but have
gotten a D drive to work, using:

wine D:/opt/OpenBUGS/winbugs.exe

Using this method, OpenBUGS opens.  Now, to be able to open it with R.  I've
read all sorts of discussions about BRugs (which is no longer on CRAN, but
old versions can still be found), rbugs, and R2WinBUGS (which I'm used to
using on Windows with WinBUGS).  Some people say R2WinBUGS cannot run
OpenBUGS on Linux, some claim they've done it (I think).  It seems the same
thing with everything else.  I've tried making the linbugs and cbugs file
recommended elsewhere online.  It's all very confusing.

Can someone show a method that works currently, along with some sample code? 
I'm also new to Linux, and confused by path conventions.  For example, in
rbugs, it shows an example of a path such as
"/var/scratch/jyan/wine-20040408/wine", and I don't see how to modify this. 
I have no /var/scratch to begin with, and think Wine is installed in
/home/me/.wine...(I don't have Linux in front of me right now).

Please help.  Thanks.
-- 
View this message in context: 
http://www.nabble.com/How-to-connect-R-and-WinBUGS-OpenBUGS-LinBUGS-in-Linux-in-Feb.-2009-tp22058716p22058716.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Eik Vettorazzi

Hi Bernhard,
I'm wondering what you will expect to get in "dividing" two proportional 
survival curves from a fitted cox model.
Anyway, you can provide a newdata object to the survfit function 
containing any combination of cofactors you are interested in and then 
use summary, eg:


fit <- coxph( Surv(futime,fustat)~resid.ds+rx+ecog.ps,data=ovarian)
summary(survfit( fit,newdata=data.frame(rx=1, ecog.ps=2, resid.ds=0)))

hth.

Bernhard Reinhardt schrieb:

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John 
Fox.


As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function 
out of these values and how I extract the values from the object? I´m 
sure the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/42803-8243
F ++49/40/42803-7790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Mark Difford

Hi Dylan, Chuck,

Mark Difford wrote:
>> Coming to your question [?] about how to generate the kind of contrasts
>> that Patrick wanted 
>> using contrast.Design. Well, it is not that straightforward, though I may
>> have missed 
>> something in the documentation to the function. In the past I have
>> cobbled them together 
>> using the kind of hack given below:

Here is a much simpler route, and a correction to my earlier posting (helped
by a little off-list prompting from Prof. Harrell). All that is required to
get successive difference (i.e. sdif) contrasts using contrast.Design is the
following:


contrast(l, a=list(f=c("a","b","c")), b=list(f=c("b","c","d")))

## Run a full example
set.seed(10101010) 
x <- rnorm(100) 
y1 <- x[1:25] * 2 + rnorm(25, mean=1) 
y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5) 
y3 <- x[51:75] * 2.9 + rnorm(25, mean=5) 
y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5) 

d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25)))

dd <- datadist(d) 
options(datadist='dd') 
l <- ols(y ~ x * f, data=d)

## compare with multcomp by setting x=0
summary(glht(l, linfct=mcp(f="Seq")))
contrast(l, b=list(x=0, f=c("a","b","c")), a=list(x=0, f=c("b","c","d")))

Regards, and apologies for any confusion, Mark.


Mark Difford wrote:
> 
> Hi Dylan,
> 
>>> Am I trying to use contrast.Design() for something that it was not
>>> intended for? ...
> 
> I think Prof. Harrell's main point had to do with how interactions are
> handled. You can also get the kind of contrasts that Patrick was
> interested in via multcomp. If we do this using your artificial data set
> we see that the contrasts are the same as those got by fitting the model
> using contr.sdif, but a warning is generated about interactions in the
> model &c. [see example code]
> 
> Part of Prof. Harrell's "system" is that in generating contrasts via
> contr.Design an appropriate value is automatically chosen for the
> interacting variable (in this case the median value of x) so that a
> reasonable default set of contrasts is calculated. This can of course be
> changed.
> 
> Coming to your question [?] about how to generate the kind of contrasts
> that Patrick wanted using contrast.Design. Well, it is not that
> straightforward, though I may have missed something in the documentation
> to the function. In the past I have cobbled them together using the kind
> of hack given below:
> 
> ## Exampe code
> x <- rnorm(100) 
> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
> 
> 
> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4],
> each=25))) 
> 
> 
> # now try with contrast.Design(): 
> library(multcomp)
> dd <- datadist(d) 
> options(datadist='dd') 
> l <- ols(y ~ x * f, data=d)
> 
> ## model fitted using successive difference contrasts
> library(MASS)
> T.lm <- lm(y ~ x * f, contrasts=list(f="contr.sdif"), data=d)
> summary(T.lm)
> 
> ## show the warning: model fitted using ols() and treatment contrasts
> summary(glht(l, linfct=mcp(f="Seq")))
> 
> ## "custom" successive difference contrasts using contrast.Design
> TFin <- {}
> for (i in 1:(length(levels(d$f))-1)) {
> tcont <- contrast(l, a=list(f=levels(d$f)[i]),
> b=list(f=levels(d$f)[i+1]))
> TFin <- rbind(TFin, tcont)
> rownames(TFin)[i] <-  paste(levels(d$f)[i], levels(d$f)[i+1], sep="-")
> }
> TFin[,1:9]
> 
> Regards, Mark.
> 
> 
> 
> Dylan Beaudette-2 wrote:
>> 
>> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>>  wrote:
>>> Greg Snow a écrit :
 One approach is to create your own contrasts matrix:


> mycmat <- diag(8)
> mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
> mycmati <- solve(mycmat)
> contrasts(agefactor) <- mycmati[,-1]
>

 Now when you use agefactor, the intercept will be the first age group
 and the slopes will be the differences between the pairs of groups
 (make sure that the order of the levels of agefactor is correct).

 The difference between this method and the contr.sdif function in MASS
 is how the intercept will end up being interpreted (and the dimnames).

 Hope this helps,


>>>
>>> Actually, exactly what I needed including the reference to contr.sdif in
>>> MASS I did not spot before (although I am a faithful reader of the
>>> yellow book... but so many things still escape to me). Again thanks a
>>> lot.
>>>
>>> Patrick
>>>
>> 
>> Glad you were able to solve your problem. Frank replied earlier with
>> the suggestion to use contrast.Design() to perform the tests after the
>> initial model has been fit. Perhaps I am a little denser than normal,
>> but I cannot figure out how to apply contrast.Design() to a similar
>> model with several levels of a factor. Example data:
>> 
>> # need these
>> library(lattice)
>> library(Design)
>> 
>> # replicate an important experimental dataset
>> set.seed(10101010)
>> x <- rnorm(100)
>> y

Re: [R] Chromatogram deconvolution and peak matching

2009-02-17 Thread Katharine Mullen
Hoi Bart,

I think you're right that ALS should be applicable to this problem.
Unfortunately in writing I see that there is a bug when the spectra are
NOT constrained to nonnegative values (the package has been used to my
knowledge only in fitting multiway mass spectra thus far, where this
constraint is always applied); I'll have a patched version soon.

I'd be interested in hearing about where the manual could be improved,
too.

#2D chromatogram generation

time <- seq(0,20,by=0.05)
f <- function(x,rt) dnorm((x-rt),mean=0,sd=rt/35)
C1 <- matrix(c( f(time,6.1), f(time,5.6), f(time,15)), nrow=401,ncol=3)
C2 <- matrix(c( f(time,2.1), f(time,4), f(time,8)), nrow=401,ncol=3)

#spectrum generation
spectra <- function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3)
x <- 220:300
s1 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0)
s2 <- spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20)
s3 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20)
spec <- matrix(c(s1,s2,s3), nrow=81,ncol=3)

## data matrix 1
chrom1 <- C1 %*% t(spec)

## data matrix 2
chrom2 <- C2 %*% t(spec)

require(ALS)

## you want to recover 2 (401 x 3) matrices C1 and C2 and a
## (82 x 3) matrix E such that
## chrom1 = C1 E^T, chrom2 = C2 E^T

## some starting guess for elution profiles
## these need to be pretty good
Cstart1 <- matrix(c( f(time,4), f(time,6), f(time,12)), nrow=401,ncol=3)
Cstart2 <- matrix(c( f(time,3), f(time,5), f(time,10)), nrow=401,ncol=3)

xx <- als( CList = list(Cstart1, Cstart2), PsiList = list(chrom1, chrom2),
  S=matrix(1, nrow=81, ncol=3), x=time, x2=x, uniC=TRUE,
  normS=TRUE, nonnegS=TRUE)

par(mfrow=c(3,1))

matplot(time, xx$CList[[1]], col=2, type="l",
main="elution profiles chrom1", lty=1,
ylab="amplitude")

matplot(time, C1, col=1, add=TRUE, type="l", lty=1)

legend(10,2,legend=c("real", "estimated"), col=1:2, lty=1)

matplot(time, xx$CList[[2]], col=2, type="l",
main="elution profiles chrom2", lty=1,
ylab="amplitude")

matplot(time, C2, col=1, add=TRUE, type="l", lty=1)

legend(10,6,legend=c("real", "estimated"), col=1:2, lty=1)

matplot(x, xx$S, col=2, type="l", main="spectra", lty=1,
ylab="amplitude")

matplot(x, spec, col=1, add=TRUE, type="l", lty=1)

legend(270,.95,legend=c("real", "estimated"), col=1:2, lty=1)


On Tue, 17 Feb 2009, bartjoosen wrote:

>
> Hi,
>
> I'm trying to match peaks between chromatographic runs.
> I'm able to match peaks when they are chromatographed with the same method,
> but not when there are different methods are used and spectra comes in to
> play.
>
> While searching I found the ALS package which should be usefull for my
> application, but I couldn't figure it out.
>
> I made some dummy chroms with R, which mimic my actual datasets, to play
> with, but after looking at the manuals of ALS, I'm affraid I can't get the
> job done. Can someone put me on the right way?
>
> Here is my code to generate the dummy chroms, which also plots the 2 chroms
> and the spectra of the 3 peaks:
>
> #2D chromatogram generation
> par(mfrow=c(3,1))
> time <- seq(0,20,by=0.05)
> f <- function(x,rt) dnorm((x-rt),mean=0,sd=rt/35)
> c1 <- f(time,6.1)
> c2 <- f(time,5.6)
> c3 <- f(time,15)
> plot(c1+c2+c3~time,type="l",main="chrom1")
>
> #spectrum generation
> spectra <- function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3)
> x <- 220:300
> s1 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0)
> s2 <- spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20)
> s3 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20)
>
> chrom1.tot <-
> data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
> names(chrom.tot)[-1] <- x
>
> #generation of chromatogram 2
> c1 <- f(time,2.1)
> c2 <- f(time,4)
> c3 <- f(time,8)
> plot(c1+c2+c3~time,type="l",main="chrom2")
>
> chrom2.tot <-
> data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
> names(chrom.tot)[-1] <- x
>
> plot(s1~x,type="l",main="spectra")
> lines(s2~x,col=2)
> lines(s3~x,col=3)
>
> Thanks for your time
>
> Kind Regards
>
> Bart
> --
> View this message in context: 
> http://www.nabble.com/Chromatogram-deconvolution-and-peak-matching-tp22057592p22057592.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] creating a map

2009-02-17 Thread Alina Sheyman
I'm trying to create a fairly basic map using R. What i want to get is the
map of the country with circles representing a count of students in each
state.
What I've done so far is as following -
  map("state")
 symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F)

this gives me the map of the country, but one that's not populated by my
counts.
Does anyone know what I'm doing wrong?

Also, if anyone can recommend a good reference for creating maps in R, I'd
really appreciate that.

thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ARIMA models

2009-02-17 Thread Gabor Grothendieck
see auto.arima in the forecast package.

On Tue, Feb 17, 2009 at 10:20 AM, emj83  wrote:
>
> is there some sort of R function which can advise me of the best ARIMA(p,q,r)
> model to use based on the Schwarz criterion e.g for e.g p=0-5, q =0, r=0-5
> or for example p+r< 5???
>
> or is this something I will have to write my own code for?
>
> Thanks Emma
> --
> View this message in context: 
> http://www.nabble.com/ARIMA-models-tp22059382p22059382.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009

2009-02-17 Thread Uwe Ligges



Paul Heinrich Dietrich wrote:

Hi all,
I've managed to get JAGS working on my Ubuntu Hardy Linux with a 32-bit
computer and AMD processors using R 2.8.1.  JAGS is great.  I've read that
JAGS is the fastest, but that hasn't been my experience.  At any rate, I
have more experience with WinBUGS under Windows and would like a version of
that working as well.

It seems like I've read a lot on the subject and tried a lot, but haven't
managed to get BUGS to work yet.  The most success I've had is to install
WinBUGS or OpenBUGS using this method:
http://www.math.aau.dk/~slb/kurser/bayes-08/install.html

What you also need to know is that you need to open Wine and add a drive. 
Although Z is recommended, I haven't been able to specify it, but have

gotten a D drive to work, using:

wine D:/opt/OpenBUGS/winbugs.exe

Using this method, OpenBUGS opens.  Now, to be able to open it with R.  I've
read all sorts of discussions about BRugs (which is no longer on CRAN, but
old versions can still be found), rbugs, and R2WinBUGS (which I'm used to
using on Windows with WinBUGS).  Some people say R2WinBUGS cannot run
OpenBUGS on Linux, some claim they've done it (I think).  It seems the same
thing with everything else.  I've tried making the linbugs and cbugs file
recommended elsewhere online.  It's all very confusing.


For short: It is quite unlikely that BRugs / OpenBUGS (which is called 
LinBUGS under Linux) works natively under your Linux (although it might 
work under very specific settings). BRugs is available for Windows users 
from the "CRAN extras" repsository maintained by Brian Ripley. We moved 
it in order to meet GPL compliance issues.


Hence a standard recommendation is to use R2WinBUGS under native R under 
Linux with WinBUGS running under wine. R2WinBUGS can use wine to do so. 
See the help page ?bugs once you have loaded R2WinBUGS.


Best wishes,
Uwe Ligges


Can someone show a method that works currently, along with some sample code? 
I'm also new to Linux, and confused by path conventions.  For example, in

rbugs, it shows an example of a path such as
"/var/scratch/jyan/wine-20040408/wine", and I don't see how to modify this. 
I have no /var/scratch to begin with, and think Wine is installed in

/home/me/.wine...(I don't have Linux in front of me right now).

Please help.  Thanks.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Percentiles/Quantiles with Weighting

2009-02-17 Thread Brigid Mooney
Hi All,

I am looking at applications of percentiles to time sequenced data.  I had
just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh recent
data more heavily.

I wrote the following function, but it seems quite inefficient, and not
really very flexible in its applications - so if anyone has any suggestions
on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with the most
recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
  Xprime <- NA

  for(i in 1:length(X))
  {
Xprime <- c(Xprime, rep(X[i], times=i))
  }

  print("Percentiles:")
  print(quantile(X, pctile))
  print("Weighted:")
  print(Xprime)
  print("Weighted Percentiles:")
  print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a map

2009-02-17 Thread David Winsemius

Two places that have worked examples leap to mind:

--- Sarkar's online accompaniment to his book:
http://lmdvr.r-forge.r-project.org/figures/figures.html
Thumbing through the hard copy I see Figure 6.5 might of interest.

--- Addicted to R's graphics gallery:
http://addictedtor.free.fr/graphiques/search.php?q=map&engine=RGG

--  
David Winsemius


On Feb 17, 2009, at 11:53 AM, Alina Sheyman wrote:

I'm trying to create a fairly basic map using R. What i want to get  
is the
map of the country with circles representing a count of students in  
each

state.
What I've done so far is as following -
 map("state")
symbols 
(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F)


this gives me the map of the country, but one that's not populated  
by my

counts.
Does anyone know what I'm doing wrong?

Also, if anyone can recommend a good reference for creating maps in  
R, I'd

really appreciate that.

thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] merging files with different structures

2009-02-17 Thread Grant Gillis
Hello list,

Thanks in advance for any help.

I have many (approx 20) files that I have merged.  For example

d1<-read.csv("AlleleReport.csv")
d2<-read.csv("AlleleReport.csv")

m1 <- merge(d1, d2,  by = c("IND", intersect(colnames(d1), colnames(d2))),
all = TRUE)
m2 <- merge(m1, d3,  by = c("IND", intersect(colnames(m1), colnames(d3))),
all = TRUE)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Prof Brian Ripley
It is a issue specific to 0.8-32 and some files (most likely those 
with some (not all) Windows codepages declared).


We are trying to collect together some examples, and will update 
foreign accordingly later in the week.


On Tue, 17 Feb 2009, Harry Haupt wrote:



Hi,
after updating to foreign version 0.8-32, I experienced the following error
when I tried to load a SPSS file:

Fehler in inherits(x, "factor") : objekt "cp" nicht gefunden
Zusätzlich: Warning message:
In read.spss("***l.sav", use.value.labels = TRUE, to.data.frame = TRUE) :
 ***.sav: File-indicated character representation code (1252) looks like a
Windows codepage

It worked without problems with earlier versions.

Thanks for any clues,
best,
Harry
--
View this message in context: 
http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22059259.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merging files with different structures

2009-02-17 Thread Grant Gillis
> Hello list,
>
> I am sorry for the previous half post.  I accidentily hit send.  Thanks
> again in advance for any help.
>
> I have many (approx 20) files that I have merged.  Each data set contains
> rows for individuals and data in 2 - 5 columns (depending upon which data
> set).  The individuals in each data set are not necessarily the same and are
> all duplicated (different data in columns) across sheets I am trying to
> merge.  I have used the merge function
>
For example
>
> d1<-read.csv("AlleleReport.csv")
> d2<-read.csv("AlleleReport.csv")
>
> m1 <- merge(d1, d2,  by = c("IND", intersect(colnames(d1), colnames(d2))),
> all = TRUE)
> m2 <- merge(m1, d3,  by = c("IND", intersect(colnames(m1), colnames(d3))),
> all = TRUE)



My problem is that when the data is merged it looks something like


Ind   L1 L1.1L2 L2.1L3   L3.1
a 12  13  NA NA  NA  NA
a  NA NA 22 43   34   45
b  14  1545   64   NA  NA
b   NANANA  NA 99  84

Is there a way that I can merge the rows for each individual?

Cheers


Grant







>
>
>
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Percentiles/Quantiles with Weighting

2009-02-17 Thread David Winsemius
I do know that Harrell's Quantile function in the Hmisc package will  
allow quantile estimates from models. Whether it is general enough to  
extend to time series, I have no experience and cannot say.


--
David Winsemius


On Feb 17, 2009, at 11:57 AM, Brigid Mooney wrote:


Hi All,

I am looking at applications of percentiles to time sequenced data.   
I had

just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh  
recent

data more heavily.

I wrote the following function, but it seems quite inefficient, and  
not
really very flexible in its applications - so if anyone has any  
suggestions

on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with  
the most

recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
 Xprime <- NA

 for(i in 1:length(X))
 {
   Xprime <- c(Xprime, rep(X[i], times=i))
 }

 print("Percentiles:")
 print(quantile(X, pctile))
 print("Weighted:")
 print(Xprime)
 print("Weighted Percentiles:")
 print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a map

2009-02-17 Thread Greg Snow
You need to give the symbols function the locations where you want the centers 
of the circles to be.  Some datesets with map information also have centers of 
the states that you can use, for the USA, there is the state.center dataset 
that may work for you, or the maptools package function get.Pcent will compute 
a center for polygons (there are probably other similar functions in other 
packages).

For adding circles to a map of the USA, you may want to look at the state.vbm 
data in the TeachingDemos package (works with maptools package), but you will 
need to computer the centers of the polygons, they don't match state.center or 
others.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Alina Sheyman
> Sent: Tuesday, February 17, 2009 9:53 AM
> To: r-help@r-project.org
> Subject: [R] creating a map
> 
> I'm trying to create a fairly basic map using R. What i want to get is
> the
> map of the country with circles representing a count of students in
> each
> state.
> What I've done so far is as following -
>   map("state")
> 
> symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch
> es=F)
> 
> this gives me the map of the country, but one that's not populated by
> my
> counts.
> Does anyone know what I'm doing wrong?
> 
> Also, if anyone can recommend a good reference for creating maps in R,
> I'd
> really appreciate that.
> 
> thank you
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion with binomial distribution

2009-02-17 Thread Ben Bolker
Jessica L Hite/hitejl/O/VCU  vcu.edu> writes:

> I am attempting to run a glm with a binomial model to analyze proportion
> data.
> I have been following Crawley's book closely and am wondering if there is
> an accepted standard for how much is too much overdispersion? (e.g. change
> in AIC has an accepted standard of 2).

  In principle, in the null case (i.e. data are really binomial)
the deviance is  chi-squared distributed with the df equal
to the residual df.

  For example:

example(glm)
deviance(glm.D93) ## 5.13
summary(glm.D93)$dispersion ## 1 (by definition)
dfr <- df.residual(glm.D93)
deviance(glm.D93)/dfr ## 1.28
d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
(disp2 <- d2/dfr)  ## 1.293

gg2 <- update(glm.D93,family=quasipoisson)
summary(gg2)$dispersion  ## 1.293, same as above

pchisq(d2,df=dfr,lower.tail=FALSE)

all.equal(coef(glm.D93),coef(gg2)) ## TRUE

se1 <- coef(summary(glm.D93))[,"Std. Error"]
se2 <- coef(summary(gg2))[,"Std. Error"]
se2/se1

# (Intercept)outcome2outcome3  treatment2  treatment3 
#   1.1372341.1372341.1372341.1372341.137234 

sqrt(disp2)
# [1] 1.137234

> My code and output are below, given the example in the book, these data are
> WAY overdispersed .do I mention this and go on or does this signal the
> need to try a different model? If so, any suggestions on the type of
> distribution (gamma or negative binomial ?)?

  Way overdispersed may indicate model lack of fit.  Have
you examined residuals/data for outliers etc.?  

  quasibinomial should be fine, or you can try beta-binomial
(see the aod package) ...


> attach(Clutch2)
>  y<-cbind(Total,Size-Total)
> glm1<-glm(y~Pred,"binomial")
> summary(glm1)
> 
> Call:
> glm(formula = y ~ Pred, family = "binomial")
> 
> Deviance Residuals:
> Min   1Q   Median   3Q  Max
> -9.1022  -2.7899  -0.4781   2.6058   8.4852
> 
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept)  1.350950.06612  20.433  < 2e-16 ***
> PredF   -0.348110.11719  -2.970  0.00297 **
> PredSN  -3.291560.10691 -30.788  < 2e-16 ***
> PredW   -1.464510.12787 -11.453  < 2e-16 ***
> PredWF  -0.564120.13178  -4.281 1.86e-05 ***
> ---
>  the output for residual deviance and df does not change even when I
> use quasibinomial, is this ok?  #

  That's as expected.
 
>  library(MASS)

  you don't really need MASS for quasibinomial.

> > glm2<-glm(y~Pred,"quasibinomial")
> > summary(glm2)
> 
> Call:
> glm(formula = y ~ Pred, family = "quasibinomial")
> 
> Deviance Residuals:
> Min   1Q   Median   3Q  Max
> -9.1022  -2.7899  -0.4781   2.6058   8.4852
> 
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1.3510 0.2398   5.633 1.52e-07 ***
> PredF-0.3481 0.4251  -0.819  0.41471
> PredSN   -3.2916 0.3878  -8.488 1.56e-13 ***
> PredW-1.4645 0.4638  -3.157  0.00208 **
> PredWF   -0.5641 0.4780  -1.180  0.24063
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> 
> (Dispersion parameter for quasibinomial family taken to be 13.15786)
> 
> Null deviance: 2815.5  on 108  degrees of freedom
> Residual deviance: 1323.5  on 104  degrees of freedom
>   (3 observations deleted due to missingness)
> AIC: NA
> 
> Number of Fisher Scoring iterations: 5
> 
> > anova(glm2,test="F")
> Analysis of Deviance Table
> 
> Model: quasibinomial, link: logit
> 
> Response: y
> 
> Terms added sequentially (first to last)
> 
>   Df Deviance Resid. Df Resid. Dev  F   Pr(>F)
> NULL108 2815.5
> Pred   4   1492.0   104 1323.5 28.349 6.28e-16 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> > model1<-update(glm2,~.-Pred)
> > anova(glm2,model1,test="F")
> Analysis of Deviance Table
> 
> Model 1: y ~ Pred
> Model 2: y ~ 1
>   Resid. Df Resid. Dev  Df Deviance  F   Pr(>F)
> 1   104 1323.5
> 2   108 2815.5  -4  -1492.0 28.349 6.28e-16 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> > coef(glm2)
> (Intercept)   PredF  PredSN   PredW  PredWF
>   1.3509550  -0.3481096  -3.2915601  -1.4645097  -0.5641223
> 
> Thanks
> Jessica
> hitejl  vcu.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a map

2009-02-17 Thread Alina Sheyman
Thanks Greg,
  do you know where i can find the sate.center dataset that you mention?

On Tue, Feb 17, 2009 at 12:28 PM, Greg Snow  wrote:

> You need to give the symbols function the locations where you want the
> centers of the circles to be.  Some datesets with map information also have
> centers of the states that you can use, for the USA, there is the
> state.center dataset that may work for you, or the maptools package function
> get.Pcent will compute a center for polygons (there are probably other
> similar functions in other packages).
>
> For adding circles to a map of the USA, you may want to look at the
> state.vbm data in the TeachingDemos package (works with maptools package),
> but you will need to computer the centers of the polygons, they don't match
> state.center or others.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> > project.org] On Behalf Of Alina Sheyman
> > Sent: Tuesday, February 17, 2009 9:53 AM
> > To: r-help@r-project.org
> > Subject: [R] creating a map
> >
> > I'm trying to create a fairly basic map using R. What i want to get is
> > the
> > map of the country with circles representing a count of students in
> > each
> > state.
> > What I've done so far is as following -
> >   map("state")
> >
> > symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch
> > es=F)
> >
> > this gives me the map of the country, but one that's not populated by
> > my
> > counts.
> > Does anyone know what I'm doing wrong?
> >
> > Also, if anyone can recommend a good reference for creating maps in R,
> > I'd
> > really appreciate that.
> >
> > thank you
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error with "make" with R-devel

2009-02-17 Thread Lana Schaffer
Hi,
I am getting an error compiling the R-devel on a suse architecture
64-bit architecture.  The cp attribute is sending 'trusted.lov'
and an error.  This is a sample of the output:

>  make[3]: Entering directory
>`/lustre/people/schaffer/R-devel/src/library/base'
>building package 'base'
>make[4]: Entering directory
>`/lustre/people/schaffer/R-devel/src/library/base'
>all.R is unchanged
>cmp: EOF on ../../../library/base/R/base
>cp: getting attribute `trusted.lov' of `all.R': Operation not permitted
>make[4]: *** [mkR] Error 1
>make[4]: Leaving directory
>`/lustre/people/schaffer/R-devel/src/library/base'
>make[3]: *** [all] Error 2

Is there someone who can modify the Makefile so that I am able
to compile R-devel?

Lana Schaffer
Biostatistics/Informatics
The Scripps Research Institute
DNA Array Core Facility
La Jolla, CA 92037
(858) 784-2263
(858) 784-2994
schaf...@scripps.edu 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] printing out the summary for lm into a txt file

2009-02-17 Thread kayj

Hi All,


I am trying to run several linear regressions and print out the summay and
the anova reslts on the top of 
each other for each model. Below is a sample progarm that did not work. is
it possible to print the
anova below the summary of lm in one file?

thanks for your help



##

data<-read.table("data.txt", header=T, sep='\t')

for (i in 1:100){

y<-data[,i]

lm.model<-lm(y~data$x1+data$x2+data$x3+data)

sink("results.txt", append=T)

s<-summary(lm.Model)
print(s)
sink()

an<-anova(lm.Model)
print(an)
sink()

} 
-- 
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to control the overall shape of a figure?

2009-02-17 Thread Oliver
Thanks very much, exactly what I need.

Oliver

On Feb 16, 10:36 pm, Marc Schwartz  wrote:
> on 02/16/2009 07:51 PM Oliver wrote:
>
>
>
> > hi,
>
> > I am a R beginner. One thing I notice is that when do graphing is,
>
> > if I want to draw two figures in a row such as this:
>
> > par(mfrow(1, 2))
> > plot(...)
> > plot(...)
>
> > Each figure inside will be rectangle instead of the familiar square
> > shape.
> > Though you can drag the edge the window to resize it. I would have
> > prefer this can be done automatically ... also when I do the pdf
> > export for example. Is this possible?
>
> > The thing is if you do par (mfrow(2,2)), then you got all 4 figures in
> > perfect square shape ... but one row two figures are pretty common to
> > me.
>
> > thanks for help
>
> > Oliver
>
> The overall shape of the plot region is controlled by par("pty"), which
> is set to "m" by default, for 'maximal' plotting region.
>
> Set this to "s" to create a square plot region. For example:
>
> par(mfrow = c(1, 2), pty = "s")
> plot(1)
> plot(1)
>
> In order to do this with a PDF file, ensure that the overall plot
> dimensions are in the proper relative aspect ratio. For example:
>
> pdf(file = "SquarePlots.pdf", height = 4, width = 8)
> par(mfrow = c(1, 2), pty= "s")
> plot(1)
> plot(1)
> dev.off()
>
> This will create 2 square plots, side-by-side, within an overall plot
> size of 4x8. You can further adjust the plot margins and other
> characteristics as may be required.
>
> See ?par and ?Devices for more information regarding these
> customizations and options to set size arguments for specific devices.
>
> HTH,
>
> Marc Schwartz
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple levels of nesting with trellis plots

2009-02-17 Thread R User R User
Hi guys,
I have a tricky problem that I'd appreciate your help with.

I have two categorical variables, say varA and varB and an associated
frequency Freq for combinations of the levels of varA and varB. This was
created with a table() call.

I'd now like to make panel plots of the frequency. I can do a panel plot by
varA:
barchart(data$Freq ~ data$varB | data$varA)

How do I achieve two levels of nesting for say a third categorical variable
varC?
That is, how do I specify 2 'given's in the the panel plot, and preferably
arranged the panels vertically within the 'inner' panel/

thanks very much,
Richie

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion with binomial distribution

2009-02-17 Thread Prof Brian Ripley

On Tue, 17 Feb 2009, Ben Bolker wrote:


Jessica L Hite/hitejl/O/VCU  vcu.edu> writes:


I am attempting to run a glm with a binomial model to analyze proportion
data.
I have been following Crawley's book closely and am wondering if there is
an accepted standard for how much is too much overdispersion? (e.g. change
in AIC has an accepted standard of 2).



 In principle, in the null case (i.e. data are really binomial)
the deviance is  chi-squared distributed with the df equal
to the residual df.


*Approximately*, provided the expected counts are not near or below 
one.  See MASS §7.5 for an analysis of the size of the approximation 
errors (which can be large and in both directions).


Given that I once had a consulting job where the over-dispersion was 
causing something close ot panic and was entirely illusory, the lack 
of the 'approximately' can have quite serious consequences.



 For example:

example(glm)
deviance(glm.D93) ## 5.13
summary(glm.D93)$dispersion ## 1 (by definition)
dfr <- df.residual(glm.D93)
deviance(glm.D93)/dfr ## 1.28
d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
(disp2 <- d2/dfr)  ## 1.293

gg2 <- update(glm.D93,family=quasipoisson)
summary(gg2)$dispersion  ## 1.293, same as above

pchisq(d2,df=dfr,lower.tail=FALSE)

all.equal(coef(glm.D93),coef(gg2)) ## TRUE

se1 <- coef(summary(glm.D93))[,"Std. Error"]
se2 <- coef(summary(gg2))[,"Std. Error"]
se2/se1

# (Intercept)outcome2outcome3  treatment2  treatment3
#   1.1372341.1372341.1372341.1372341.137234

sqrt(disp2)
# [1] 1.137234


My code and output are below, given the example in the book, these data are
WAY overdispersed .do I mention this and go on or does this signal the
need to try a different model? If so, any suggestions on the type of
distribution (gamma or negative binomial ?)?


 Way overdispersed may indicate model lack of fit.  Have
you examined residuals/data for outliers etc.?

 quasibinomial should be fine, or you can try beta-binomial
(see the aod package) ...



attach(Clutch2)
 y<-cbind(Total,Size-Total)
glm1<-glm(y~Pred,"binomial")
summary(glm1)

Call:
glm(formula = y ~ Pred, family = "binomial")

Deviance Residuals:
Min   1Q   Median   3Q  Max
-9.1022  -2.7899  -0.4781   2.6058   8.4852

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.350950.06612  20.433  < 2e-16 ***
PredF   -0.348110.11719  -2.970  0.00297 **
PredSN  -3.291560.10691 -30.788  < 2e-16 ***
PredW   -1.464510.12787 -11.453  < 2e-16 ***
PredWF  -0.564120.13178  -4.281 1.86e-05 ***
---
 the output for residual deviance and df does not change even when I
use quasibinomial, is this ok?  #


 That's as expected.


 library(MASS)


 you don't really need MASS for quasibinomial.


glm2<-glm(y~Pred,"quasibinomial")
summary(glm2)


Call:
glm(formula = y ~ Pred, family = "quasibinomial")

Deviance Residuals:
Min   1Q   Median   3Q  Max
-9.1022  -2.7899  -0.4781   2.6058   8.4852

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.3510 0.2398   5.633 1.52e-07 ***
PredF-0.3481 0.4251  -0.819  0.41471
PredSN   -3.2916 0.3878  -8.488 1.56e-13 ***
PredW-1.4645 0.4638  -3.157  0.00208 **
PredWF   -0.5641 0.4780  -1.180  0.24063
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

(Dispersion parameter for quasibinomial family taken to be 13.15786)

Null deviance: 2815.5  on 108  degrees of freedom
Residual deviance: 1323.5  on 104  degrees of freedom
  (3 observations deleted due to missingness)
AIC: NA

Number of Fisher Scoring iterations: 5


anova(glm2,test="F")

Analysis of Deviance Table

Model: quasibinomial, link: logit

Response: y

Terms added sequentially (first to last)

  Df Deviance Resid. Df Resid. Dev  F   Pr(>F)
NULL108 2815.5
Pred   4   1492.0   104 1323.5 28.349 6.28e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

model1<-update(glm2,~.-Pred)
anova(glm2,model1,test="F")

Analysis of Deviance Table

Model 1: y ~ Pred
Model 2: y ~ 1
  Resid. Df Resid. Dev  Df Deviance  F   Pr(>F)
1   104 1323.5
2   108 2815.5  -4  -1492.0 28.349 6.28e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

coef(glm2)

(Intercept)   PredF  PredSN   PredW  PredWF
  1.3509550  -0.3481096  -3.2915601  -1.4645097  -0.5641223

Thanks
Jessica
hitejl  vcu.edu



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http:

Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread David Winsemius

" did not work."

Might that mean errors? Care to share?

Running that through my wetware R interpreter, I think I am seeing you  
ask for creation of models with variable outcomes from a the first 100  
columns of "data". And then you are specifying a formula that includes  
a weird mixture of 3 vectors, data[,"x1"],x2 x3,... plus all of the  
data.frame, "data"? Have you tested this loop without the sink's to  
see if it produces anything meaningful?


Once you test the process, then issue the sink(filename, append=TRUE)  
once, before the loop, do your analyses in the loop, and then close  
with sink() after the loop.


--
David Winsemius
On Feb 17, 2009, at 12:52 PM, kayj wrote:



Hi All,


I am trying to run several linear regressions and print out the  
summay and

the anova reslts on the top of
each other for each model. Below is a sample progarm that did not  
work. is

it possible to print the
anova below the summary of lm in one file?

thanks for your help



##

data<-read.table("data.txt", header=T, sep='\t')

for (i in 1:100){

y<-data[,i]

lm.model<-lm(y~data$x1+data$x2+data$x3+data)

sink("results.txt", append=T)

s<-summary(lm.Model)
print(s)
sink()

an<-anova(lm.Model)
print(an)
sink()

}
--
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion with binomial distribution

2009-02-17 Thread Ben Bolker
 Thanks for the clarification.
  I actually had MASS open to that page while
I was composing my reply but forgot to mention
it (trying to do too many things at once) ...

  Ben Bolker

Prof Brian Ripley wrote:
> On Tue, 17 Feb 2009, Ben Bolker wrote:
> 
>> Jessica L Hite/hitejl/O/VCU  vcu.edu> writes:
>>
>>> I am attempting to run a glm with a binomial model to analyze proportion
>>> data.
>>> I have been following Crawley's book closely and am wondering if there is
>>> an accepted standard for how much is too much overdispersion? (e.g. change
>>> in AIC has an accepted standard of 2).
> 
>>  In principle, in the null case (i.e. data are really binomial)
>> the deviance is  chi-squared distributed with the df equal
>> to the residual df.
> 
> *Approximately*, provided the expected counts are not near or below 
> one.  See MASS §7.5 for an analysis of the size of the approximation 
> errors (which can be large and in both directions).
> 
> Given that I once had a consulting job where the over-dispersion was 
> causing something close ot panic and was entirely illusory, the lack 
> of the 'approximately' can have quite serious consequences.
> 
>>  For example:
>>
>> example(glm)
>> deviance(glm.D93) ## 5.13
>> summary(glm.D93)$dispersion ## 1 (by definition)
>> dfr <- df.residual(glm.D93)
>> deviance(glm.D93)/dfr ## 1.28
>> d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
>> (disp2 <- d2/dfr)  ## 1.293
>>
>> gg2 <- update(glm.D93,family=quasipoisson)
>> summary(gg2)$dispersion  ## 1.293, same as above
>>
>> pchisq(d2,df=dfr,lower.tail=FALSE)
>>
>> all.equal(coef(glm.D93),coef(gg2)) ## TRUE
>>
>> se1 <- coef(summary(glm.D93))[,"Std. Error"]
>> se2 <- coef(summary(gg2))[,"Std. Error"]
>> se2/se1
>>
>> # (Intercept)outcome2outcome3  treatment2  treatment3
>> #   1.1372341.1372341.1372341.1372341.137234
>>
>> sqrt(disp2)
>> # [1] 1.137234
>>
>>> My code and output are below, given the example in the book, these data are
>>> WAY overdispersed .do I mention this and go on or does this signal the
>>> need to try a different model? If so, any suggestions on the type of
>>> distribution (gamma or negative binomial ?)?
>>  Way overdispersed may indicate model lack of fit.  Have
>> you examined residuals/data for outliers etc.?
>>
>>  quasibinomial should be fine, or you can try beta-binomial
>> (see the aod package) ...
>>
>>
>>> attach(Clutch2)
>>>  y<-cbind(Total,Size-Total)
>>> glm1<-glm(y~Pred,"binomial")
>>> summary(glm1)
>>>
>>> Call:
>>> glm(formula = y ~ Pred, family = "binomial")
>>>
>>> Deviance Residuals:
>>> Min   1Q   Median   3Q  Max
>>> -9.1022  -2.7899  -0.4781   2.6058   8.4852
>>>
>>> Coefficients:
>>> Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)  1.350950.06612  20.433  < 2e-16 ***
>>> PredF   -0.348110.11719  -2.970  0.00297 **
>>> PredSN  -3.291560.10691 -30.788  < 2e-16 ***
>>> PredW   -1.464510.12787 -11.453  < 2e-16 ***
>>> PredWF  -0.564120.13178  -4.281 1.86e-05 ***
>>> ---
>>>  the output for residual deviance and df does not change even when I
>>> use quasibinomial, is this ok?  #
>>  That's as expected.
>>
>>>  library(MASS)
>>  you don't really need MASS for quasibinomial.
>>
 glm2<-glm(y~Pred,"quasibinomial")
 summary(glm2)
>>> Call:
>>> glm(formula = y ~ Pred, family = "quasibinomial")
>>>
>>> Deviance Residuals:
>>> Min   1Q   Median   3Q  Max
>>> -9.1022  -2.7899  -0.4781   2.6058   8.4852
>>>
>>> Coefficients:
>>> Estimate Std. Error t value Pr(>|t|)
>>> (Intercept)   1.3510 0.2398   5.633 1.52e-07 ***
>>> PredF-0.3481 0.4251  -0.819  0.41471
>>> PredSN   -3.2916 0.3878  -8.488 1.56e-13 ***
>>> PredW-1.4645 0.4638  -3.157  0.00208 **
>>> PredWF   -0.5641 0.4780  -1.180  0.24063
>>> ---
>>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>>>
>>> (Dispersion parameter for quasibinomial family taken to be 13.15786)
>>>
>>> Null deviance: 2815.5  on 108  degrees of freedom
>>> Residual deviance: 1323.5  on 104  degrees of freedom
>>>   (3 observations deleted due to missingness)
>>> AIC: NA
>>>
>>> Number of Fisher Scoring iterations: 5
>>>
 anova(glm2,test="F")
>>> Analysis of Deviance Table
>>>
>>> Model: quasibinomial, link: logit
>>>
>>> Response: y
>>>
>>> Terms added sequentially (first to last)
>>>
>>>   Df Deviance Resid. Df Resid. Dev  F   Pr(>F)
>>> NULL108 2815.5
>>> Pred   4   1492.0   104 1323.5 28.349 6.28e-16 ***
>>> ---
>>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 model1<-update(glm2,~.-Pred)
 anova(glm2,model1,test="F")
>>> Analysis of Deviance Table
>>>
>>> Model 1: y ~ Pred
>>> Model 2: y ~ 1
>>>   Resid. Df Resid. Dev  Df Deviance  F   Pr(>F)
>>> 1   104 1323.5
>>> 2   108 2815.5  -4  -1492.0 28.349 6.28e-16 ***
>>> ---
>>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.0

Re: [R] creating a map

2009-02-17 Thread Greg Snow
It should be in the datasets package that is automatically loaded with R (at 
least my copy), try ?state and you should see the help for it and others.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: Alina Sheyman [mailto:alina...@gmail.com]
Sent: Tuesday, February 17, 2009 11:02 AM
To: Greg Snow
Cc: r-help@r-project.org
Subject: Re: [R] creating a map

Thanks Greg,
  do you know where i can find the sate.center dataset that you mention?
On Tue, Feb 17, 2009 at 12:28 PM, Greg Snow 
mailto:greg.s...@imail.org>> wrote:
You need to give the symbols function the locations where you want the centers 
of the circles to be.  Some datesets with map information also have centers of 
the states that you can use, for the USA, there is the state.center dataset 
that may work for you, or the maptools package function get.Pcent will compute 
a center for polygons (there are probably other similar functions in other 
packages).

For adding circles to a map of the USA, you may want to look at the state.vbm 
data in the TeachingDemos package (works with maptools package), but you will 
need to computer the centers of the polygons, they don't match state.center or 
others.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-
> project.org] On Behalf Of Alina Sheyman
> Sent: Tuesday, February 17, 2009 9:53 AM
> To: r-help@r-project.org
> Subject: [R] creating a map
>
> I'm trying to create a fairly basic map using R. What i want to get is
> the
> map of the country with circles representing a count of students in
> each
> state.
> What I've done so far is as following -
>   map("state")
>
> symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch
> es=F)
>
> this gives me the map of the country, but one that's not populated by
> my
> counts.
> Does anyone know what I'm doing wrong?
>
> Also, if anyone can recommend a good reference for creating maps in R,
> I'd
> really appreciate that.
>
> thank you
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread Greg Snow
The sink() command stops the sinking, so you send the lm output to the file, 
then stop the sinking before printing out the anova result.  So the simplest 
thing to try is to put the first sink (with the filename and append=T) before 
you start the loop, remove all calls to sink within the loop, then do a single 
sink() after the loop ends.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of kayj
> Sent: Tuesday, February 17, 2009 10:52 AM
> To: r-help@r-project.org
> Subject: [R] printing out the summary for lm into a txt file
> 
> 
> Hi All,
> 
> 
> I am trying to run several linear regressions and print out the summay
> and
> the anova reslts on the top of
> each other for each model. Below is a sample progarm that did not work.
> is
> it possible to print the
> anova below the summary of lm in one file?
> 
> thanks for your help
> 
> 
> 
> ##
> 
> data<-read.table("data.txt", header=T, sep='\t')
> 
> for (i in 1:100){
> 
> y<-data[,i]
> 
> lm.model<-lm(y~data$x1+data$x2+data$x3+data)
> 
> sink("results.txt", append=T)
> 
> s<-summary(lm.Model)
> print(s)
> sink()
> 
> an<-anova(lm.Model)
> print(an)
> sink()
> 
> }
> --
> View this message in context: http://www.nabble.com/printing-out-the-
> summary-for-lm-into-a-txt-file-tp22062643p22062643.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Percentiles/Quantiles with Weighting

2009-02-17 Thread Brigid Mooney
Thanks for pointing me to the quantreg package as a resource.  I was hoping
to ask be able to address one quick follow-up question...

I get slightly different variants between using the rq funciton with formula
= mydata ~ 1 as I would if I ran the same data using the quantile function.

Example:

mydata <- (1:10)^2/2
pctile <- seq(.59, .99, .1)

quantile(mydata, pctile)
59%69%79%89%99%
20.015 26.075 32.935 40.595 49.145

rq(mydata~1, tau=pctile)
Call:
rq(formula = mydata ~ 1, tau = pctile)
Coefficients:
tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
(Intercept)18  24.532  40.550
Degrees of freedom: 10 total; 9 residual

Is it correct to assume this is due to the different accepted methods of
calculating quantiles?  If so, do you know where I would be able to see the
algorithms used in these functions?  I'm not finding it in the documentation
for function rq, and am new enough to R that I don't know where those
references would generally be.




On Tue, Feb 17, 2009 at 12:29 PM, roger koenker  wrote:

> http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869
>
> gives one possibility...
>
> url:www.econ.uiuc.edu/~rogerRoger Koenker
> emailrkoen...@uiuc.eduDepartment of Economics
> vox: 217-333-4558University of Illinois
> fax:   217-244-6678Champaign, IL 61820
>
>
>
>
> On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote:
>
>   Hi All,
>>
>> I am looking at applications of percentiles to time sequenced data.  I had
>> just been using the quantile function to get percentiles over various
>> periods, but am more interested in if there is an accepted (and/or
>> R-implemented) method to apply weighting to the data so as to weigh recent
>> data more heavily.
>>
>> I wrote the following function, but it seems quite inefficient, and not
>> really very flexible in its applications - so if anyone has any
>> suggestions
>> on how to look at quantiles/percentiles within R while also using a
>> weighting schema, I would be very interested.
>>
>> Note - this function supposes the data in X is time-sequenced, with the
>> most
>> recent (and thus heaviest weighted) data at the end of the vector
>>
>> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
>> {
>>  Xprime <- NA
>>
>>  for(i in 1:length(X))
>>  {
>>   Xprime <- c(Xprime, rep(X[i], times=i))
>>  }
>>
>>  print("Percentiles:")
>>  print(quantile(X, pctile))
>>  print("Weighted:")
>>  print(Xprime)
>>  print("Weighted Percentiles:")
>>  print(quantile(Xprime, pctile, na.rm=TRUE))
>> }
>>
>> WtPercentile(1:10)
>> WtPercentile(rnorm(10))
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Percentiles/Quantiles with Weighting

2009-02-17 Thread roger koenker


url:www.econ.uiuc.edu/~rogerRoger Koenker
emailrkoen...@uiuc.eduDepartment of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820



On Feb 17, 2009, at 1:58 PM, Brigid Mooney wrote:

Thanks for pointing me to the quantreg package as a resource.  I was  
hoping to ask be able to address one quick follow-up question...


I get slightly different variants between using the rq funciton with  
formula = mydata ~ 1 as I would if I ran the same data using the  
quantile function.

Example:

mydata <- (1:10)^2/2
pctile <- seq(.59, .99, .1)

quantile(mydata, pctile)
59%69%79%89%99%
20.015 26.075 32.935 40.595 49.145

rq(mydata~1, tau=pctile)
Call:
rq(formula = mydata ~ 1, tau = pctile)
Coefficients:
tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
(Intercept)18  24.532  40.550
Degrees of freedom: 10 total; 9 residual

Is it correct to assume this is due to the different accepted  
methods of calculating quantiles?  If so, do you know where I would  
be able to see the algorithms used in these functions?  I'm not  
finding it in the documentation for function rq, and am new enough  
to R that I don't know where those references would generally be.




Yes,  quantile() in base R documents 9 varieties of quantiles, 2 more  
than

William Empson's famous 7 Types of Ambiguity.  In quantreg the function
rq() finds a solution to an underlying optimization problem and  
doesn't ask

any further into the nature of the ambiguity -- it does often produce a
warning indicating that there may be more than one solution.  The  
default
base R quantile is interpolated, while the default rq() with method =  
"br"
using the simplex algorithm finds an order statistic, typically.  If  
you prefer

something more like interpolation, you can try rq() with method = "fn"
which is using an interior point algorithm and when there are multiple
solutions it tends to produce something more like the centroid of the
solution set.  I hope that this helps.




On Tue, Feb 17, 2009 at 12:29 PM, roger koenker   
wrote:

http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869

gives one possibility...

url:www.econ.uiuc.edu/~rogerRoger Koenker
emailrkoen...@uiuc.eduDepartment of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820




On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote:

Hi All,

I am looking at applications of percentiles to time sequenced data.   
I had

just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh  
recent

data more heavily.

I wrote the following function, but it seems quite inefficient, and  
not
really very flexible in its applications - so if anyone has any  
suggestions

on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with  
the most

recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
 Xprime <- NA

 for(i in 1:length(X))
 {
  Xprime <- c(Xprime, rep(X[i], times=i))
 }

 print("Percentiles:")
 print(quantile(X, pctile))
 print("Weighted:")
 print(Xprime)
 print("Weighted Percentiles:")
 print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread kayj

Thanks a lot for your help, it worked.




Greg Snow-2 wrote:
> 
> The sink() command stops the sinking, so you send the lm output to the
> file, then stop the sinking before printing out the anova result.  So the
> simplest thing to try is to put the first sink (with the filename and
> append=T) before you start the loop, remove all calls to sink within the
> loop, then do a single sink() after the loop ends.
> 
> Hope this helps,
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
> 
> 
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
>> project.org] On Behalf Of kayj
>> Sent: Tuesday, February 17, 2009 10:52 AM
>> To: r-help@r-project.org
>> Subject: [R] printing out the summary for lm into a txt file
>> 
>> 
>> Hi All,
>> 
>> 
>> I am trying to run several linear regressions and print out the summay
>> and
>> the anova reslts on the top of
>> each other for each model. Below is a sample progarm that did not work.
>> is
>> it possible to print the
>> anova below the summary of lm in one file?
>> 
>> thanks for your help
>> 
>> 
>> 
>> ##
>> 
>> data<-read.table("data.txt", header=T, sep='\t')
>> 
>> for (i in 1:100){
>> 
>> y<-data[,i]
>> 
>> lm.model<-lm(y~data$x1+data$x2+data$x3+data)
>> 
>> sink("results.txt", append=T)
>> 
>> s<-summary(lm.Model)
>> print(s)
>> sink()
>> 
>> an<-anova(lm.Model)
>> print(an)
>> sink()
>> 
>> }
>> --
>> View this message in context: http://www.nabble.com/printing-out-the-
>> summary-for-lm-into-a-txt-file-tp22062643p22062643.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22065292.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread kayj


this is the error message that I was getting 

 "In sink() ... : no sink to remove"


i got it to work . thanks for the help


David Winsemius wrote:
> 
> " did not work."
> 
> Might that mean errors? Care to share?
> 
> Running that through my wetware R interpreter, I think I am seeing you  
> ask for creation of models with variable outcomes from a the first 100  
> columns of "data". And then you are specifying a formula that includes  
> a weird mixture of 3 vectors, data[,"x1"],x2 x3,... plus all of the  
> data.frame, "data"? Have you tested this loop without the sink's to  
> see if it produces anything meaningful?
> 
> Once you test the process, then issue the sink(filename, append=TRUE)  
> once, before the loop, do your analyses in the loop, and then close  
> with sink() after the loop.
> 
> -- 
> David Winsemius
> On Feb 17, 2009, at 12:52 PM, kayj wrote:
> 
>>
>> Hi All,
>>
>>
>> I am trying to run several linear regressions and print out the  
>> summay and
>> the anova reslts on the top of
>> each other for each model. Below is a sample progarm that did not  
>> work. is
>> it possible to print the
>> anova below the summary of lm in one file?
>>
>> thanks for your help
>>
>>
>>
>> ##
>>
>> data<-read.table("data.txt", header=T, sep='\t')
>>
>> for (i in 1:100){
>>
>> y<-data[,i]
>>
>> lm.model<-lm(y~data$x1+data$x2+data$x3+data)
>>
>> sink("results.txt", append=T)
>>
>> s<-summary(lm.Model)
>> print(s)
>> sink()
>>
>> an<-anova(lm.Model)
>> print(an)
>> sink()
>>
>> }
>> -- 
>> View this message in context:
>> http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22065365.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subset Regression Package

2009-02-17 Thread Ben Bolker



Alex Roy wrote:
> 
> Dear all ,
>   Is there any subset regression (subset selection
> regression) package in R other than "leaps"?
> 
> 

RSiteSearch("{subset regression}") doesn't turn up much other than
special-purpose tools for ARMA models etc..  What does leaps not do that you
need it to do?

  Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/Subset-Regression-Package-tp22054068p22066217.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Whitening Time Series

2009-02-17 Thread Pele

Hi Bob - your suggesting worked out great... Many thanks!  

Also, thanks everyone for the other suggestions!


Bob McCall wrote:
> 
> Look in the package "forecast" for the function "Arima". It will do what
> you want. It's different than arima function in the stats package.
> Bob
> 
> Pele wrote:
>> 
>> Hi R users,
>> 
>> I am doing cross correlation analysis on  2 time series (call them
>> y-series and x-series) where I need the use the model developed on the
>> x-series to prewhiten the yseries..  Can someone point me to a
>> function/filter in R that would allow me to do that? 
>> 
>> Thanks in advance for any help!
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Whitening-Time-Series-tp22041765p22066246.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cross classified or Multiple membership or Hierarchical (3 level ) logistic models using Umacs

2009-02-17 Thread Luwis Tapiwa Diya
Dear R users,

I would like to fit cross classified or multiple membership logistic models
or a 3 level hierarchical logistic model using the Umacs package. Can anyone
advise me on how to proceed or better point me to examples of  how its done.

Regards,

-- 
Luwis Diya,
Leuven Biostatistics and Statistical Bioinformatics Centre (L-BioStat),
Kapucijnenvoer 35 blok d - bus 7001,
3000 Leuven,
Belgium

Tel: +32 16 336886 or +32 16 336892
Fax: +32 16 337015

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] using sapply to apply function to some columns of a dataframe

2009-02-17 Thread mwestphal

Hello:

I would like to sum every x columns of a dataframe for each row.  For instance,
if x is 10, then for dataframe df, this function will sum the first ten elements
together and then the next ten:

sapply(list(colnames(df)[1:10], colnames(df)[11:20]),function(x)apply( df[,x],
1, sum))

If the number of columns is quite large (1000's), then manually entering the
list above is not practical.  Any suggestions?

I would also like to do a variant of the above, where I sum every nth element.


Thanks,

Michael

--
Michael I. Westphal, PhD
Africa Region Water Resources (AFTWR)
South Asia Region Sustainable Development (SASSD)
World Development Report 2010: "Development in a Changing Climate"
www.worldbank.org/wdr2010
Room J6-007 (mail stop: J6-603)
Tel: 202.473.1217
The World Bank
1818  H St NW, Washington DC 20433, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subset Regression Package

2009-02-17 Thread Hans W. Borchers
Alex Roy  gmail.com> writes:

> 
> Dear all ,
>   Is there any subset regression (subset selection
> regression) package in R other than "leaps"?


Lars and Lasso are other 'subset selection' methods, see the corresponding
packages 'lars' and 'lasso2' and its description in The Elements of Statistical
Learning.
Also, 'dr', "Methods for dimension reduction for regression", or  'relaimpo',
"Relative importance of regressors in linear models", can be considered.


> Thanks and regards
> 
> Alex
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cumsum vs. sum

2009-02-17 Thread Stavros Macrakis
I recently traced a bug of mine to the fact that cumsum(s)[length(s)]
is not always exactly equal to sum(s).

For example,

 x<-1/(12:14)
 sum(x) - cumsum(x)[3]  => 2.8e-17

Floating-point addition is of course not exact, and in particular is
not associative, so there are various possible reasons for this.
Perhaps sum uses clever summing tricks to get more accurate results?
In some quick experiments, it does seem to get more accurate results
than cumsum.

It might be worth documenting.

 -s

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using sapply to apply function to some columns of a dataframe

2009-02-17 Thread Duncan Murdoch

On 17/02/2009 4:42 PM, mwestp...@worldbank.org wrote:

Hello:

I would like to sum every x columns of a dataframe for each row.  For instance,
if x is 10, then for dataframe df, this function will sum the first ten elements
together and then the next ten:

sapply(list(colnames(df)[1:10], colnames(df)[11:20]),function(x)apply( df[,x],
1, sum))

If the number of columns is quite large (1000's), then manually entering the
list above is not practical.  Any suggestions?

I would also like to do a variant of the above, where I sum every nth element.


I think the easiest way to do this is to convert the dataframe into an 
array with 3 indices, and sum over one of them.  For example:


rows <- 20
cols <- 120

df <- matrix(1:(rows*cols), rows, cols)
# in your case, df <- as.matrix( df )

arr <- array( df, c(rows, 10, cols/10))
sums <- apply( arr, c(1,3), sum)

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] matrix output

2009-02-17 Thread phoebe kong
Hi friends,

I have questions about printing a pretty big size matrix.

As you could see from below, the matrix wasn't showed in R at full size
(11X11), but it was cut partly into three smaller matrices (11X4,11X4,11X3).
I'm wondering if there is a way to show the whole matrix with dimension
11X11, do you know how to make it?

If R really couldn't fit the full big matrix at once, what about output the
FULL matrix to a .pdf document? I have been wondering for a long time if we
could output something other than graphic, like data frame or text, to a
.pdf file.

Thanks in advance for your patience and answer.

SY

Sldur Slonset Waso  Sboutn
Sldur1  -0.6744252  -0.08427312  -0.2871798
Slonset327   1.000   0.14257353   0.1981339
Waso   327 327.000   1.   0.5104723
Sboutn 327 327.000 327.   1.000
MidSleep   327 327.000 327. 327.000
SleepEff   327 327.000 327. 327.000
NumTrials   22  22.000  22.  22.000
MeanTrials  22  22.000  22.  22.000
MedianTrials22  22.000  22.  22.000
NS3all5tage296 296.000 296. 296.000
HGsoc118   325 325.000 325. 325.000

 MidSleepSleepEff   NumTrials
MeanTrials
Sldur -0.14512244   0.6721889  0.26482213
0.26256352
Slonset0.73991223  -0.5613362 -0.09429701
-0.02540937
Waso   0.08729977  -0.6836098  0.18577075
0.26369283
Sboutn 0.06169839  -0.4895902  0.10897798
0.07058159
MidSleep   1.  -0.1498673 -0.17786561
0.13043478
SleepEff 327.   1.000 -0.01637493
-0.20835686
NumTrials 22.  22.000  1.
-0.16768424
MeanTrials22.  22.000 46.
1.
MedianTrials  22.  22.000 46.
46.
NS3all5tage  296. 296.000 44.
44.
HGsoc118 325. 325.000 44.
44.

 MedianTrials  NS3all5tage HGsoc118
Sldur   0.1857708   0.07849234  -0.03751973
Slonset 0.2919255  -0.14858206   0.05323562
Waso0.3856578  -0.08148054   0.09341454
Sboutn  0.2557877  -0.05049218   0.06847118
MidSleep0.4172784  -0.09344423   0.05287818
SleepEff   -0.3088650   0.12099185  -0.08453191
NumTrials  -0.1108233  -0.06779422  -0.13164200
MeanTrials  0.9203207  -0.07625088   0.10584919
MedianTrials1.000  -0.10894996   0.13615222
NS3all5tage44.000   1.  -0.10554137
HGsoc118   44.000 632.   1.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] matrix output

2009-02-17 Thread Duncan Murdoch

On 17/02/2009 5:31 PM, phoebe kong wrote:

Hi friends,

I have questions about printing a pretty big size matrix.

As you could see from below, the matrix wasn't showed in R at full size
(11X11), but it was cut partly into three smaller matrices (11X4,11X4,11X3).
I'm wondering if there is a way to show the whole matrix with dimension
11X11, do you know how to make it?


If you increase the output width, e.g.

options(width=1)

won't wrap (but it'll probably be too wide to be useful).  You can also 
reduced the number of decimal places, e.g.


SYrounded <- round(SY, 3)




If R really couldn't fit the full big matrix at once, what about output the
FULL matrix to a .pdf document? I have been wondering for a long time if we
could output something other than graphic, like data frame or text, to a
.pdf file.


The main way to produce nice text in a PDF document through R is to use 
Sweave and LaTeX.  Too much explanation needed for a simple example here.


Duncan Murdoch



Thanks in advance for your patience and answer.

SY

Sldur Slonset Waso  Sboutn
Sldur1  -0.6744252  -0.08427312  -0.2871798
Slonset327   1.000   0.14257353   0.1981339
Waso   327 327.000   1.   0.5104723
Sboutn 327 327.000 327.   1.000
MidSleep   327 327.000 327. 327.000
SleepEff   327 327.000 327. 327.000
NumTrials   22  22.000  22.  22.000
MeanTrials  22  22.000  22.  22.000
MedianTrials22  22.000  22.  22.000
NS3all5tage296 296.000 296. 296.000
HGsoc118   325 325.000 325. 325.000

 MidSleepSleepEff   NumTrials
MeanTrials
Sldur -0.14512244   0.6721889  0.26482213
0.26256352
Slonset0.73991223  -0.5613362 -0.09429701
-0.02540937
Waso   0.08729977  -0.6836098  0.18577075
0.26369283
Sboutn 0.06169839  -0.4895902  0.10897798
0.07058159
MidSleep   1.  -0.1498673 -0.17786561
0.13043478
SleepEff 327.   1.000 -0.01637493
-0.20835686
NumTrials 22.  22.000  1.
-0.16768424
MeanTrials22.  22.000 46.
1.
MedianTrials  22.  22.000 46.
46.
NS3all5tage  296. 296.000 44.
44.
HGsoc118 325. 325.000 44.
44.

 MedianTrials  NS3all5tage HGsoc118
Sldur   0.1857708   0.07849234  -0.03751973
Slonset 0.2919255  -0.14858206   0.05323562
Waso0.3856578  -0.08148054   0.09341454
Sboutn  0.2557877  -0.05049218   0.06847118
MidSleep0.4172784  -0.09344423   0.05287818
SleepEff   -0.3088650   0.12099185  -0.08453191
NumTrials  -0.1108233  -0.06779422  -0.13164200
MeanTrials  0.9203207  -0.07625088   0.10584919
MedianTrials1.000  -0.10894996   0.13615222
NS3all5tage44.000   1.  -0.10554137
HGsoc118   44.000 632.   1.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Python and R

2009-02-17 Thread Esmail Bonakdarian

Hello all,

I am just wondering if any of you are doing most of your scripting
with Python instead of R's programming language and then calling
the relevant R functions as needed?

And if so, what is your experience with this and what sort of
software/library  do you use in combination with Python to be able
to access R's functionality.

Is there much of a performance hit either way? (as both are interpreted
languages)

Thanks,
Esmail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Percentiles/Quantiles with Weighting

2009-02-17 Thread Stavros Macrakis
Here is one kind of weighted quantile function.

The basic idea is very simple:

wquantile <- function( v, w, p )
  {
v <- v[order(v)]
w <- w[order(v)]
v [ which.max( cumsum(w) / sum(w) >= p ) ]
  }

With some more error-checking and general clean-up, it looks like this:

# Simple weighted quantile
#
# v  A numeric vector of observations
# w  A numeric vector of positive weights
# p  The probability 0<=p<=1
#
# Nothing fancy: no interpolation etc.

# Basic idea

wquantile <- function( v, w, p )
  {
v <- v[order(v)]
w <- w[order(v)]
v [ which.max( cumsum(w) / sum(w) >= p ) ]
  }

# Simple weighted quantile
#
# v  A numeric vector of observations
# w  A numeric vector of positive weights
# p  The probability 0<=p<=1
#
# Nothing fancy: no interpolation etc.

wquantile <- function(v,w=rep(1,length(v)),p=.5)
   {
 if (!is.numeric(v) || !is.numeric(w) || length(v) != length(w))
   stop("Values and weights must be equal-length numeric vectors")
 if ( !is.numeric(p) || any( p<0 | p>1 ) )
   stop("Quantiles must be 0<=p<=1")
 ranking <- order(v)
 sumw <- cumsum(w[ranking])
 if ( is.na(w[1]) || w[1]<0 ) stop("Weights must be non-negative numbers")
 plist <- sumw/sumw[length(sumw)]
 sapply(p, function(p) v [ ranking [ which.max( plist >= p ) ] ])
   }

I would appreciate any comments people have on this -- whether
correctness, efficiency, style, 

  -s


On Tue, Feb 17, 2009 at 11:57 AM, Brigid Mooney  wrote:
> Hi All,
>
> I am looking at applications of percentiles to time sequenced data.  I had
> just been using the quantile function to get percentiles over various
> periods, but am more interested in if there is an accepted (and/or
> R-implemented) method to apply weighting to the data so as to weigh recent
> data more heavily.
>
> I wrote the following function, but it seems quite inefficient, and not
> really very flexible in its applications - so if anyone has any suggestions
> on how to look at quantiles/percentiles within R while also using a
> weighting schema, I would be very interested.
>
> Note - this function supposes the data in X is time-sequenced, with the most
> recent (and thus heaviest weighted) data at the end of the vector
>
> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
> {
>  Xprime <- NA
>
>  for(i in 1:length(X))
>  {
>Xprime <- c(Xprime, rep(X[i], times=i))
>  }
>
>  print("Percentiles:")
>  print(quantile(X, pctile))
>  print("Weighted:")
>  print(Xprime)
>  print("Weighted Percentiles:")
>  print(quantile(Xprime, pctile, na.rm=TRUE))
> }
>
> WtPercentile(1:10)
> WtPercentile(rnorm(10))
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] frequency table for multiple variables

2009-02-17 Thread Hans Ekbrand
On Tue, Feb 17, 2009 at 10:00:40AM -0600, Marc Schwartz wrote:
> on 02/17/2009 09:06 AM Hans Ekbrand wrote:
> > Hi r-help!
> > 
> > Consider the following data-frame:
> > 
> >var1 var2 var3 
> > 1 314 
> > 2 223 
> > 3 223 
> > 4 44   NA 
> > 5 435 
> > 6 223 
> > 7 343 
> > 
> > How can I get R to convert this into the following?
> > 
> > Value 1  2  3  4  5 
> > var1  0  3  2  2  0
> > var2  1  3  1  2  0 
> > var3  0  0  4  1  1
> 
> 
> > t(sapply(DF, function(x) table(factor(x, levels = 1:5
>  1 2 3 4 5
> var1 0 3 2 2 0
> var2 1 3 1 2 0
> var3 0 0 4 1 1
> 
> 
> The key is to turn each column into a factor with explicitly defined
> common levels for tabulation. This enables the table result to have a
> consistent format across each column, allowing for a matrix to be
> created, rather than a list.

Thanks alot, Marc. Neat and efficient, just what I wanted.

BTW, before I saw that you actually included code, I tried on my own,
and wrote this:

my.count <- function(data.frame, levels) {
  result.df <- data.frame(matrix(nrow=length(data.frame),ncol=levels))
  for (i in 1:length(data.frame)) {
result.df[i,] <- table(factor(data.frame[[i]], levels = c(1:levels)))
  }
  result.df
}

which produces the same result. I take this to be a an instructive
example of unnecessary use of for-loops in R.

-- 
Hans Ekbrand (http://sociologi.cjb.net) 
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Python and R

2009-02-17 Thread Warren Young

Esmail Bonakdarian wrote:


I am just wondering if any of you are doing most of your scripting
with Python instead of R's programming language and then calling
the relevant R functions as needed?


No, but if I wanted to do such a thing, I'd look at Sage: 
http://sagemath.org/


It'll give you access to a lot more than just R.


Is there much of a performance hit either way? (as both are interpreted
languages)


Are you just asking, or do you have a particular execution time goal, 
which if exceeded would prevent doing this?  I ask because I suspect 
it's the former, and fast enough is fast enough.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Python and R

2009-02-17 Thread Barry Rowlingson
2009/2/17 Esmail Bonakdarian :
> Hello all,
>
> I am just wondering if any of you are doing most of your scripting
> with Python instead of R's programming language and then calling
> the relevant R functions as needed?

 I tend to use R in its native form for data analysis and modelling,
and python for all my other programming needs (gui stuff with PyQt4,
web stuff, text processing etc etc).

> And if so, what is your experience with this and what sort of
> software/library  do you use in combination with Python to be able
> to access R's functionality.

 When I need to use the two together, it's easiest with 'rpy'. This
lets you call R functions from python, so you can do:

 from rpy import r
 r.hist(z)

to get a histogram of the values in a python list 'z'. There are some
complications converting structured data types between the two but
they can be overcome, and apparently are handled better with the next
generation Rpy2 (which I've not got into yet). Google for rpy for
info.

> Is there much of a performance hit either way? (as both are interpreted
> languages)

 Not sure what you mean here. Do you mean is:

 R> sum(x)

faster than

Python> sum(x)

and how much worse is:

Python> from rpy import r
Python> r.sum(x)

?

 Knuth's remark on premature optimization applies, as ever

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Python and R

2009-02-17 Thread Esmail Bonakdarian
Hello!

On Tue, Feb 17, 2009 at 5:58 PM, Warren Young  wrote:
>
> Esmail Bonakdarian wrote:
>>
>> I am just wondering if any of you are doing most of your scripting
>> with Python instead of R's programming language and then calling
>> the relevant R functions as needed?
>
> No, but if I wanted to do such a thing, I'd look at Sage: http://sagemath.org/

ah .. thanks for the pointer, I had not heard of Sage, I was just
starting to look at
SciPy.

> It'll give you access to a lot more than just R.
>
>> Is there much of a performance hit either way? (as both are interpreted
>> languages)
>
> Are you just asking, or do you have a particular execution time goal, which 
> if exceeded
> would prevent doing this?  I ask because I suspect it's the former, and fast 
> enough is fast
> enough.

I put together a large'ish R program last year, but I think I would be
happier if I could code
it in say Python - but I would rather not do that at the expense of
execution time.

Thanks again for telling me about Sage.

Esmail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Saving data to a MS Access Database

2009-02-17 Thread Bill Cunliffe
I have the following dataframe:

 

ad <- data.frame(dates, av, sn$SectorName)

colnames(ad) <- c("Date", "Value", "Tag")

 

which has data (rows 10 to 20, for example) as follows:

 

 DateValue   Tag

10 2008-01-16-0.20875ConDisc

11 2008-01-17-0.21125ConDisc

12 2008-01-18-0.08875ConDisc

13 2008-01-21-0.10875ConDisc

14 2008-01-220.02125 ConDisc

15 2008-01-230.09500 ConDisc

16 2008-01-240.02500 ConDisc

17 2008-01-250.03500 ConDisc

18 2008-01-280.06625 ConDisc

19 2008-01-290.20500 ConDisc

20 2008-01-300.11625 ConDisc

 

I want to import this data in into an existing MS Access database which has
the following properties:

 

Name: tblAD

ID: Autonumber, primary key

Date: Date

Value: Double

Tag: String

 

I am trying this by using the following:

 

result <- sqlUpdate(channel, ad, "tblAD")

 

But I get the following error:

> result <- sqlUpdate(channel, ad, "tblAD")

Error in sqlUpdate(channel, ad, "tblAD") :

cannot update 'tblAD' without unique column

 

 

Can anyone suggest where I am going wrong?

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ylim in plot.ts?

2009-02-17 Thread Oliver Bandel

Hello,


I tried to use ylim=c(x,y) in a plot.ts().

This was ignored.

How can I achieve it to create such graphics?


Ciao,
   Oliver Bandel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] :How to insert a column in a data frame

2009-02-17 Thread Laura Rodriguez Murillo
Hi dear list,

I wonder if somebody can help me with this. I have a text file with
300 rows and around 30 columns and I need to insert a column that
has the number 1 in every row. This new column should be placed
between columns 6 and 7.

As an example: I would want to insert a column (consiting just of the
number 1 in every row, as in column 6) between columns 6 and 7.

847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T

So the new table would look like this:

847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T
847 847 0 0 2 1 1 G G T C A G T T C G A C C C G C A A A G G G G A C A T T

You can also look at this problem as duplicating column 6.

Thank you for your help!


Laura R Murillo
Columbia University
New York

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Python and R

2009-02-17 Thread Esmail Bonakdarian
On Tue, Feb 17, 2009 at 6:05 PM, Barry Rowlingson
 wrote:
> 2009/2/17 Esmail Bonakdarian :

>  When I need to use the two together, it's easiest with 'rpy'. This
> lets you call R functions from python, so you can do:
>
>  from rpy import r
>  r.hist(z)

wow .. that is pretty straight forward, I'll have to check out rpy for sure.

> to get a histogram of the values in a python list 'z'. There are some
> complications converting structured data types between the two but
> they can be overcome, and apparently are handled better with the next
> generation Rpy2 (which I've not got into yet). Google for rpy for
> info.

Will do!

>> Is there much of a performance hit either way? (as both are interpreted
>> languages)
>
>  Not sure what you mean here. Do you mean is:
>
>  R> sum(x)
>
> faster than
>
> Python> sum(x)
>
> and how much worse is:
>
> Python> from rpy import r
> Python> r.sum(x)
>

Well, I have a program written in R which already takes quite a while
to run. I was
just wondering if I were to rewrite most of the logic in Python - the
main thing I use
in R are its regression facilities - if it would speed things up. I
suspect not since
both of them are interpreted, and the bulk of the time is taken up by
R's regression
calls.

Esmail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Normal cdf modified function

2009-02-17 Thread Fernando Saldanha
I wonder if an R package would have a function that calculates the following.

Let Y be a normal multivariate function. For example, let Y have 4
dimensions. I want to calculate

P(Y1 < Z1, Y2 < Z2, Y3 > Z3, Y4 > Z4).

There are R functions to do the calculation if all the inequalities
are of the type "<" (the cdf). But is there an R function where the
two types of inequalities ("<" and ">") can be mixed? (The user would
have to specify the set of indexes with inequalities of the type ">")

Thanks for any suggestions.

FS

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >