Re: [R] Find sequence in vector

2012-04-06 Thread Petr Savicky
On Fri, Apr 06, 2012 at 10:25:15AM -0700, ens wrote:
> > a<-sample(1:6,100,replace=T)
> > a
>   [1] 2 4 3 4 5 1 3 2 4 3 6 6 2 6 2 1 5 5 3 4 6 1 6 6 3 4 6 6 4 4 5 4 6 5 6
> 3 4 5 6 3 4 1 6 6 6 4 2 1 1 3 1 5 3 2 2 6 2 5
>  [59] 2 6 1 6 1 1 6 4 4 2 2 3 4 5 6 1 6 4 6 1 5 1 1 2 1 3 4 4 6 3 1 4 1 1 1
> 5 5 2 4 6 5 1
> which(a<=3)
>  [1]   1   3   6   7   8  10  13  15  16  19  22  25  36  40  42  47  48  49 
> 50  51  53  54  55  57  59  61  63  64  68
> [30]  69  70  74  78  80  81  82  83  84  88  89  91  92  93  96 100
> 
> I want to know if the indices are sequential and, if so, how many of them are
> sequential in a row. Does anyone know the least clumsy way to do this? I am
> a C++ user by default, so my instinct is probably too messy for R.

Hi.

Try this.

  set.seed(12345)
  (a<-sample(1:6,100,replace=T))

  [1] 5 6 5 6 3 1 2 4 5 6 1 1 5 1 3 3 3 3 2 6 3 2 6 5 4 3 5 4 2 3 5 1 2 5 3 3 6
 [38] 6 4 1 5 3 6 5 2 2 1 1 1 4 6 5 2 2 5 3 5 1 3 2 5 2 6 5 6 2 6 1 4 6 5 4 3 3
 [75] 1 4 6 4 4 1 6 4 1 1 1 2 5 4 5 1 6 5 1 4 5 4 5 5 1 3

  out <- rle(a <= 3)
  out$lengths[out$values]

   [1] 3 2 6 2 1 2 2 2 1 1 5 2 1 3 1 1 1 3 1 4 1 1 2

The first 3 is due to (3 1 2),
the next 2 is due to (1 1),
and the next 6 is due to (1 3 3 3 3 2).

The starting indices of the blocks are

  c(0, cumsum(out$lengths))[which(out$values)] + 1

   [1]  5 11 14 21 26 29 32 35 40 42 45 53 56 58 62 66 68 73 80 83 90 93 99

The endings are

  cumsum(out$lengths)[which(out$values)]

  [1]   7  12  19  22  26  30  33  36  40  42  49  54  56  60  62  66  68  75  
80
  [20]  86  90  93 100
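
These can be combined into one table of runs, for example

  len <- out$lengths[out$values]
  beg <- c(0, cumsum(out$lengths))[which(out$values)] + 1
  data.frame(start = beg, end = beg + len - 1, length = len)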

Hope this helps.

Petr Savicky.



Re: [R] Building R on Solaris (OpenIndiana) with gcc 4.6.2 for amd64 target - relocation problems solved

2012-04-06 Thread Prof Brian Ripley

On 06/04/2012 19:38, Michael Figiel wrote:

Hello,
the "R Installation and Administration" handbook states in Section C 5.1:
For ‘amd64’ the builds have failed to complete in several different
ways, currently with relocation errors for libRblas.so.

To fix it: add '-shared' to the SHLIB_LDFLAGS, SHLIB_CXXLDFLAGS and
SHLIB_FCLDFLAGS before starting configure.

So the complete set of variables' values sufficient to build for the
amd64 target (as used on OpenIndiana 151a2 with gcc 4.6.2):
SHLIB_LDFLAGS=-shared
SHLIB_CXXLDFLAGS=-shared
SHLIB_FCLDFLAGS=-shared
CFLAGS=-m64
CXXFLAGS=-m64
FFLAGS=-m64
FCFLAGS=-m64
LDFLAGS=-m64 -L/usr/local/lib/amd64
CPPFLAGS=-I/usr/local/include

Additionally you'll need GNU libiconv, which probably resides in
/usr/gnu, so the CPPFLAGS will need an -I/usr/gnu/include and the
LDFLAGS a -L/usr/gnu/lib/amd64 (on my machines I keep everything
in /usr/local, therefore no reference to /usr/gnu).

If you prefer to use the Solaris linker (/usr/bin/ld), add
-fno-gnu-linker to the SHLIB_* variables and make sure your PATH
doesn't list /usr/gnu/bin before /usr/bin.

I've got only OpenIndiana machines, but it should work on Solaris 10/11,
too.


Your instructions are for the GNU linker: those in the manual are for 
the Solaris linker, which does not support --shared.




Kind regards
Michael Figiel




--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595



[R] newbie question: strategy

2012-04-06 Thread sysot1t
I'm new to R (less than a week) and have ordered some books about R, but I learn
better by example, and so far I can't find a good example of what I am
trying to do, which follows:

Assuming one is using intra-day data for some instrument, I want to:

open a file (let's name it signal) that will contain two fields,
date/time (MM/DD/YYYY HH:MM) and signal (1 = buy, -1 = sell);

open a file with real-time data for the instrument (I can't find anything that
will let me access intra-day data online directly from someone like IQFeed or
eSignal); the content of that file will look as follows:

"Date","Time","Open","High","Low","Close","Volume"
05/16/2007,10:15,74.800,74.850,74.550,74.725,123
05/16/2007,10:16,74.700,74.700,74.600,74.625,33
05/16/2007,10:17,74.675,74.725,74.600,74.600,21

I would like to be able to determine the start and end of the period to
test, and verify that signals exist within that period, or otherwise assume
the signals are 0 (no trades).

Then I would like to process the signal file: at the time of the
signal, whenever I see -1, sell the instrument; if I see 1, buy
it. To determine the buy price, I would like to make sure the signal time
coincides with the intraday data time, and then buy/sell at the next
minute's open. Then, given a set of variables (target, stop), I would
either sell at the target or at the stop. The other thought is to exit
after X target/stop bars. Also, if an opposite signal is
processed before target/stop are reached, the position would immediately
reverse.

The above assumes 1-minute bars for simplicity.

To view the results, I then want to chart the candles and the signals,
with the P&L below the chart, assuming a starting
portfolio of size X, where X is a variable set to 0.00.
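
A rough sketch of the reading/aligning step I have in mind (the file names and
the signal-file column names are just placeholders):

bars <- read.csv("bars.csv", stringsAsFactors = FALSE)              # intraday data, layout as above
bars$datetime <- as.POSIXct(paste(bars$Date, bars$Time), format = "%m/%d/%Y %H:%M")

sig <- read.csv("signal.csv", stringsAsFactors = FALSE)             # columns: datetime, signal (1/-1)
sig$datetime <- as.POSIXct(sig$datetime, format = "%m/%d/%Y %H:%M")

## restrict to the test period and line the signals up with the bars
keep <- bars$datetime >= as.POSIXct("2007-05-16 10:00") &
        bars$datetime <= as.POSIXct("2007-05-16 16:00")
m <- merge(bars[keep, ], sig, by = "datetime", all.x = TRUE)
m$signal[is.na(m$signal)] <- 0          # no signal at that minute means no trade
m <- m[order(m$datetime), ]
m$entry.price <- c(m$Open[-1], NA)      # fills happen at the next minute's open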

Any assistance at all would be greatly appreciated. If you can, please
document any code tidbits to assist with my learning process.

thanks!





[R] quadratic model with plateau

2012-04-06 Thread help ly
Dear All,

I would like to fit a quadratic-plateau model in R. Is there a
package in R for doing this? The bentcableAR package doesn't seem to work for this.

The link below describes what I am looking for in R exactly:
http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_nlin_sect033.htm
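
If the goal is that same kind of quadratic-plus-plateau curve, one option is to
fit it directly with nls(); a sketch with simulated data, assuming the plateau
starts at the join point x0 = -b/(2c) implied by a smooth transition:

quadplat <- function(x, a, b, c) {
  x0 <- -b / (2 * c)                      # join point where the slope reaches zero
  ifelse(x < x0, a + b * x + c * x^2, a + b * x0 + c * x0^2)
}

set.seed(1)
x <- runif(100, 0, 20)
y <- quadplat(x, a = 1, b = 0.5, c = -0.021) + rnorm(100, sd = 0.2)

fit <- nls(y ~ quadplat(x, a, b, c), start = list(a = 1, b = 0.4, c = -0.01))
summary(fit)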

-- 
Thanks so much!

Orange
help.ly2...@gmail.com



[R] Drawing a line in xyplot

2012-04-06 Thread wcheckle
I am trying to replicate the following graph using xyplot:

attach(x)
plot ( jitter(type), mortality, pch=16,  xlim = c(0.25, 3.75))
lines ( c(1-0.375,1.375) , c ( median(mortality[type==1]),
median(mortality[type==1])), lwd=5,col=2)
lines ( c(2-0.375,2.375) , c ( median(mortality[type==2]),
median(mortality[type==2])), lwd=5,col=2)
lines ( c(3-0.375,3.375) , c ( median(mortality[type==3]),
median(mortality[type==3])), lwd=5,col=2)
detach(x)

In the above graph, I draw a median line for "mortality" (range from 5 to
35) by "type" (1, 2 or 3). I now have an additional variable "attend" (0 or
1). Within each panel (three panels, one for each "type"), I would like to
draw the median "mortality" for each value of "attend". I have been able
to get as far as plotting everything but the median lines:

x11(height=8,width=11)
xyplot ( mortality ~ attend|type, 
panel=function(x,y,subscripts){panel.grid(lty=5);
panel.xyplot(x,y,pch=16,jitter.x=TRUE,col=1)},
strip=strip.custom(which.given=1, bg="orange"),data
=x,aspect=2:1,layout=c(3,1))

Any suggestions on how to add the median lines of "mortality" for each
value of "attend" within each panel of "type"?

thank you





Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Rui Barradas
Hello,

>
> I maybe missing something but this seems like an indexing problem
> which doesn't require a loop at all.
> 

Yes, but with 'order'.


# Original example
input <- as.matrix(data.frame(a=c(5,1,3,7), b=c(2,6,4,8)))
(input)
desired.result <- as.matrix(data.frame(a=c(100,0,100,0), b=c(0,100,0,100)))
(desired.result)
all.equal(f(input), desired.result)

# Two other examples
set.seed(123)
(x <- matrix(sample(10, 10), ncol=2))
f(x)

(y <- matrix(sample(40, 40), ncol=5))
f(y)


Hope this helps,

Rui Barradas




Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Rui Barradas
Hello,

Oops!

What happened to the function 'f'?
I forgot to copy it and pasted only the rest; it is now complete.


f <- function(x){
    nr <- nrow(x)
    result <- matrix(0, nrow=nr, ncol=ncol(x))
    colnames(result) <- colnames(x)

    inp.ord <- order(x)[1:nr] - 1           # Keep only one per row, must be zero based
    inx <- cbind(1:nr, inp.ord %/% nr + 1)  # Index matrix
    result[inx] <- 100
    result
}

# Original example
input <- as.matrix(data.frame(a=c(5,1,3,7), b=c(2,6,4,8)))
(input)
desired.result <- as.matrix(data.frame(a=c(100,0,100,0), b=c(0,100,0,100)))
(desired.result)
all.equal(f(input), desired.result)

# Two other examples
set.seed(123)
(x <- matrix(sample(10, 10), ncol=2))
f(x)

(y <- matrix(sample(40, 40), ncol=5))
f(y)

Note that there are no loops (or apply, which is also a loop).

Rui Barradas




Re: [R] effect size measure for dependent samples

2012-04-06 Thread jlbisson
Hi,
The compute.es package is great for calculating estimated effect size.
However, it is limited to tests that have independent groups (i.e.,
independent samples t-test, ANOVA). Are there any other packages available
for download that can handle estimating effect sizes for repeated-measures
ANOVAs or dependent samples t-tests?

-Nelly



Re: [R] FIML in R

2012-04-06 Thread dadrivr
I completely agree that the development of full-information maximum
likelihood (FIML) estimation for use in packages like lm, lme, lmer, etc.
would make R much more attractive.  FIML is better than other approaches to
missing data (e.g., multiple imputation; see Graham, Olchowski, & Gilreath,
2007), and it is much better than what R currently uses for most functions
(listwise deletion).  Right now, the best implementation of FIML is in Mplus
(http://statmodel.com/), which is useful in the structural equation model
(SEM) framework.  There is an R package that incorporates FIML in SEM called
OpenMx (http://openmx.psyc.virginia.edu/openmx-features), in which one could
run multiple regression with FIML.  It would be nice to see FIML estimation
incorporated into other functions as well, though, as FIML seems to be the way
way to go.  I second the call to get the R community talking about FIML.

Thank you Andrew for starting this -- it's long overdue.  Hopefully, other
people are interested, as well.



[R] creating the variable from a data Range

2012-04-06 Thread arunkumar1111
Hi

I have a dataset with a date variable.
There may be more than one date range. I have to create a variable that is 1
inside the date range(s) and 0 otherwise.

Can anyone please help with this?
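
If I understand correctly, something along these lines might do (a sketch;
'mydata', 'date', and the range endpoints are made-up names):

start1 <- as.Date("2012-01-01"); end1 <- as.Date("2012-01-31")   # one range
mydata$flag <- ifelse(mydata$date >= start1 & mydata$date <= end1, 1, 0)

## with several ranges, flag rows that fall inside any of them
ranges <- data.frame(from = as.Date(c("2012-01-01", "2012-03-01")),
                     to   = as.Date(c("2012-01-31", "2012-03-15")))
inAny <- rowSums(outer(mydata$date, ranges$from, ">=") &
                 outer(mydata$date, ranges$to, "<=")) > 0
mydata$flag <- as.integer(inAny)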



-
Thanks in Advance
Arun


[R] Building R on Solaris (OpenIndiana) with gcc 4.6.2 for amd64 target - relocation problems solved

2012-04-06 Thread Michael Figiel

 Hello,
the "R Installation and Administration" handbook states in Section C 5.1:
For ‘amd64’ the builds have failed to complete in several different 
ways, currently with relocation errors for libRblas.so.


To fix it: add '-shared' to the SHLIB_LDFLAGS, SHLIB_CXXLDFLAGS and 
SHLIB_FCLDFLAGS before starting configure.


So the complete set of variables' values sufficient to build for the 
amd64 target (as used on OpenIndiana 151a2 with gcc 4.6.2):

SHLIB_LDFLAGS=-shared
SHLIB_CXXLDFLAGS=-shared
SHLIB_FCLDFLAGS=-shared
CFLAGS=-m64
CXXFLAGS=-m64
FFLAGS=-m64
FCFLAGS=-m64
LDFLAGS=-m64 -L/usr/local/lib/amd64
CPPFLAGS=-I/usr/local/include

Additionally you'll need GNU libiconv, which probably resides in
/usr/gnu, so the CPPFLAGS will need an -I/usr/gnu/include and the
LDFLAGS a -L/usr/gnu/lib/amd64 (on my machines I keep everything
in /usr/local, therefore no reference to /usr/gnu).


If you prefer to use the Solaris linker (/usr/bin/ld), add
-fno-gnu-linker to the SHLIB_* variables and make sure your PATH
doesn't list /usr/gnu/bin before /usr/bin.


I've got only OpenIndiana machines, but it should work on Solaris 10/11, 
too.


Kind regards
Michael Figiel



Re: [R] Execution speed in randomForest

2012-04-06 Thread Jason & Caroline Shaw
The CPU time and elapsed time are essentially identical. (That is, the
system time is negligible.)

Using Rprof, I just ran the code twice.  The first time, while
randomForest is doing its thing, there are 850 consecutive lines which
read:
".C" "randomForest.default" "randomForest" "randomForest.formula" "randomForest"
Upon running it a second time, this time taking 285 seconds to
complete, there are 14201 such lines, with nothing intervening

There shouldn't be interference from elsewhere on the machine.  This
is the only memory- and CPU-intensive process.  I don't know how to
check what kind of paging is going on, but since the machine has 16GB
of memory and I am using maybe 3 or 4 at most, I hope paging is not an
issue.

I'm on a CentOS 5 box running R 2.15.0.
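
For reference, the checks were along these lines (a sketch; 'train' and 'class'
stand in for my real data):

library(randomForest)
Rprof("rf-profile.out")
tm <- system.time(rf <- randomForest(class ~ ., data = train, ntree = 500))
Rprof(NULL)
tm                                    # user + system = CPU time, elapsed = wall clock
head(summaryRprof("rf-profile.out")$by.self)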

On Fri, Apr 6, 2012 at 12:45 PM, jim holtman  wrote:
> Are you looking at the CPU or the elapsed time?  If it is the elapsed
> time, then also capture the CPU time to see if it is different.  Also
> consider the use of the Rprof function to see where time is being
> spent.  What else is running on the machine?  Are you doing any
> paging?  What type of system are you running on?  Use some of the
> system level profiling tools.  If on Windows, then use perfmon.
>
> On Fri, Apr 6, 2012 at 11:28 AM, Jason & Caroline Shaw
>  wrote:
>> I am using the randomForest package.  I have found that multiple runs
>> of precisely the same command can generate drastically different run
>> times.  Can anyone with knowledge of this package provide some insight
>> as to why this would happen and whether there's anything I can do
>> about it?  Here are some details of what I'm doing:
>>
>> - Data: ~80,000 rows, with 10 columns (one of which is the class label)
>> - I randomly select 90% of the data to use to build 500 trees.
>>
>> And this is what I find:
>>
>> - Execution times of randomForest() using the entire dataset (in
>> seconds): 20.65, 20.93, 20.79, 21.05, 21.00, 21.52, 21.22, 21.22
>> - Execution times of randomForest() using the 90% selection: 17.78,
>> 17.74, 126.52, 241.87, 17.56, 17.97, 182.05, 17.82 <-- Note the 3rd,
>> 4th, and 7th.
>> - When the speed is slow, it often stutters, with one or a few trees
>> being produced very quickly, followed by a slow build taking 10 or 20
>> seconds
>> - The oob results are indistinguishable between the fast and slow runs.
>>
>> I select the 90% of my data by using sample() to generate indices and
>> then subsetting, like: selection <- data[sample,].  I thought perhaps
>> this subsetting was getting repeated, rather than storing in memory a
>> new copy of all that data, so I tried circumventing this with
>> eval(data[sample,]).  Probably barking up the wrong tree -- it had no
>> effect, and doesn't explain the run-to-run variation (really, I'm just
>> not clear on what eval() is for).  I have also tried garbage
>> collecting with gc() between each run, and adding a Sys.sleep() for 5
>> seconds, but neither of these has helped either.
>>
>> Any ideas?
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.



Re: [R] count() function

2012-04-06 Thread Christopher R. Dolanc

Yes, I just tried the length() function and it worked beautifully.

Thank you.

On 4/5/2012 6:01 PM, William Dunlap wrote:

I think you are looking for the function called length().  I cannot recreate
your output, since I don't know what is in NZ_Conifers, but with the built-in
dataset mtcars I get:

   >  ddply(mtcars, .(cyl,gear,carb), summarize, MeanWt=mean(wt), N=length(wt))
      cyl gear carb  MeanWt N
   1    4    3    1 2.46500 1
   2    4    4    1 2.07250 4
   3    4    4    2 2.68375 4
   4    4    5    2 1.82650 2
   5    6    3    1 3.33750 2
   6    6    4    4 3.09375 4
   7    6    5    6 2.77000 1
   8    8    3    2 3.56000 4
   9    8    3    3 3.86000 3
   10   8    3    4 4.68580 5
   11   8    5    4 3.17000 1
   12   8    5    8 3.57000 1
   >  with(mtcars, sum(cyl==8 & gear==3 & carb==4)) # output line 10
   [1] 5

If all you want is the count of things in various categories, you can use table
instead of ddply and length:
   >  with(mtcars, table(cyl, gear, carb))
   , , carb = 1

          gear
      cyl  3 4 5
        4  1 4 0
        6  2 0 0
        8  0 0 0

   , , carb = 2

          gear
      cyl  3 4 5
        4  0 4 2
        6  0 0 0
        8  4 0 0
   ...

Using ftable on table's output gives a nicer looking printout, but table's 
output is easier
to use in a program.
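
If a plain data frame of counts is needed, the table can also be flattened
with as.data.frame, e.g.

   >  with(mtcars, as.data.frame(table(cyl, gear, carb)))

which gives one row per cyl/gear/carb combination plus a Freq column.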

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf
Of Christopher R. Dolanc
Sent: Thursday, April 05, 2012 12:16 PM
To: r-help@r-project.org
Subject: [R] count() function

I keep expecting R to have something analogous to the =count function in
Excel, but I can't find anything. I simply want to count the data for a
given category.

I've been using the ddply() function in the plyr package to summarize
means and st dev of my data, with this code:

ddply(NZ_Conifers,.(ElevCat, DataSource, SizeClass), summarise,
avgDensity=mean(Density), sdDensity=sd(Density), n=sum(Density))

and that gives me results that look like this:

  ElevCat DataSource SizeClass avgDensity  sdDensity           n
1   Elev1        FIA    Class1   38.67768 46.6673478   734.87598
2   Elev1        FIA    Class2   27.34096 23.3232470   820.22879
3   Elev1        FIA    Class3   15.38758  0.7088432    76.93790
4   Elev1        VTM    Class1   66.37897 70.2050817 24958.49284
5   Elev1        VTM    Class2   39.40786 34.9343269 11782.95152
6   Elev1        VTM    Class3   21.17839 12.3487600  1461.30895

But, instead of "sum(Density)", I'd really like counts of "Density", so
that I know the sample size of each group. Any suggestions?

--
Christopher R. Dolanc
Post-doctoral Researcher
University of California, Davis



--
Christopher R. Dolanc
Post-doctoral Researcher
University of California, Davis



[R] Repeated measures in BiodiversityR

2012-04-06 Thread ecardinal
Hi,
I intend to use the BiodiversityR package to perform community analyses. I
will do ordination analyses and diversity comparisons (e.g. accumulation
curves, etc.).  

I have repeated measures of plots and want to control for it when performing
analyses. I have not decided yet which analysis I will precisely use, but I
am concerned about using the right sample unit. All I can see right now
would be to have an 'environmental variable' standing for the plot name...

I can't find any mention of 'repeated measures' in the Tree Diversity
Analysis book or  R documentation provided for the package. Does anyone have
references where I could learn more about this?

Thanks a lot.
Cheers,
EC





Re: [R] Reading big files in chunks-ff package

2012-04-06 Thread Mav
Dear Jan,
Thank you for your answers. They are very useful. I will try the LaF
package.
Cheers,



[R] Find sequence in vector

2012-04-06 Thread ens
> a<-sample(1:6,100,replace=T)
> a
  [1] 2 4 3 4 5 1 3 2 4 3 6 6 2 6 2 1 5 5 3 4 6 1 6 6 3 4 6 6 4 4 5 4 6 5 6
3 4 5 6 3 4 1 6 6 6 4 2 1 1 3 1 5 3 2 2 6 2 5
 [59] 2 6 1 6 1 1 6 4 4 2 2 3 4 5 6 1 6 4 6 1 5 1 1 2 1 3 4 4 6 3 1 4 1 1 1
5 5 2 4 6 5 1
which(a<=3)
 [1]   1   3   6   7   8  10  13  15  16  19  22  25  36  40  42  47  48  49 
50  51  53  54  55  57  59  61  63  64  68
[30]  69  70  74  78  80  81  82  83  84  88  89  91  92  93  96 100

I want to know if the indices are sequential and, if so, how many of them are
sequential in a row. Does anyone know the least clumsy way to do this? I am
a C++ user by default, so my instinct is probably too messy for R.



Re: [R] system command and Perl confusion

2012-04-06 Thread Nathan McIntyre
This reads like a PATH problem. Specify the absolute path to the perl you
want to use: /usr/bin/perl or /opt/local/bin/perl. It looks like you already
have Statistics::Descriptive through MacPorts; you can use
system(paste("/opt/local/bin/perl myscript.pl", ..., sep = " ")). If you
plan on using the Mac OS X version, /usr/bin/perl, use /usr/bin/cpan to
install Statistics::Descriptive first as it appears you don't have that
installed in /System/Library/Perl.
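
Another option is to point perl at the MacPorts module directory for the
duration of the R session (a sketch; adjust the path to wherever
Descriptive.pm actually lives on your machine):

Sys.setenv(PERL5LIB = "/opt/local/lib/perl5/site_perl/5.8.9")
system(paste("perl myscript.pl", params, sep = " "))   # 'params' is a placeholder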

Thanks,
Nathan

>Hello,
>
>I have a question related to the system command within R: I try to
>invoke a perl script from within R with something like
>system(paste('perl myscript.pl', some parameters, sep=" "))
>
>This works just fine as long as I do not use any additionally installed
>Perl module, as this leads to a
>
>"Can't locate Statistics/Descriptive.pm in @INC ."
>
>This is most likely due to the fact that there are somehow two Perls on my
>Mac OS X (one must have come through my trials playing with Macports), one
>is the Perl that came with the system.
>
>What I do not get, is that I can run the script from the command line
>without a problem.
>
>How can I make R aware of "the other" @INC?
>
>In the terminal, I find Perl to be in
>/opt/local/bin/Perl
>
>the above module is accordingly in
>/opt/local/lib/perl5/site_perl/5.8.9/Statistics/Descriptive.pm
>
>In contrast, running the Perl script from within R, it tries to find the
>modules in /System/Library/Perl/5.10.0/...
>
>I see that this is not necessarily a pure R question, but as it occurs
>specifically using the R system command I thought there might be someone
>around here who came across similar problems!
>
>Best wishes
>Maxim
>


Re: [R] Legend based on levels of a variable

2012-04-06 Thread Kumar Mainali
Thank you every body for your suggestion. It does help.

On Fri, Apr 6, 2012 at 2:37 AM, windmagics_lsl  wrote:

> I think there may be 3 legends that should be added to your plot.
> The arguments col, pch and pt.cex should have the same length as the legend labels,
> but the objects col, pch
> and cex you defined earlier have length 16*3. I guess the following code may
> work:
>
> col <- rep(c("blue", "red", "darkgreen"), c(16, 16, 16))
> ## Choose different size of points
> cex <- rep(c(1, 1.2, 1), c(16, 16, 16))
> ## Choose the form of the points (square, circle, triangle and
> diamond-shaped
> pch <- rep(c(15, 16, 17), c(16, 16, 16))
>
> plot(axis1, axis2, main="My plot", xlab="Axis 1", ylab="Axis 2",
>  col=c(Category, col), pch=pch, cex=cex)
> legend(4, 12.5, c("NorthAmerica", "SouthAmerica", "Asia"), col =
> unique(col),
>   pch = unique(pch), pt.cex = unique(cex), title = "Region")
>



-- 
Section of Integrative Biology
University of Texas at Austin
Austin, Texas 78712, USA



Re: [R] How do I get a rough quick utility plot of a time series?

2012-04-06 Thread Hurr
A friend of mine added plot code to the function SimpleTS() as follows 
but it misses the single point between two missing points:

SimpleTS <- function() {
  fileStg <- "C:/ad/dta/TryRRead/Rcode/SHIYU/SimpleTS.dta"
  titleline <- readLines(fileStg, n = 1)
  print(titleline)
  dta <- read.table(fileStg, skip = 1, header = TRUE, sep = ",", colClasses = "character")
  print(dta)
  dta <- linearizeTime(dta)
  print(titleline)
  print(dta)
  # Plot code starts
  N <- ncol(dta) - 1                     # N function columns
  layout(matrix(seq(from = 0, to = N, by = 1), ncol = 1),
         heights = c(0.1, rep(0.8, N - 1), 1))
  for (i in 2:N) {                       # leave the first subplot blank
    par(mar = c(0, 4, 0, 1))
    plot(dta[, 1], dta[, i], xlab = "", ylab = colnames(dta)[i],
         axes = FALSE, pch = "*", type = "l")
    box()
  }
  par(mar = c(4, 4, 0, 1), las = 0)
  plot(dta[, 1], dta[, N + 1], xlab = "", ylab = colnames(dta)[N + 1],
       axes = FALSE, pch = "*", type = "l")
  axis(1)
  title(xlab = "time")
  box()
  # Plot code ends
} # end SimpleTS
SimpleTS()
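
The single point disappears because with type = "l" the pch is ignored and a
value whose neighbours are both NA has no line segment to draw. Using
type = "o" (or adding a points() call after each plot()) draws a marker at
every non-missing value, so isolated points show up; a sketch of the change
for one of the calls:

plot(dta[, 1], dta[, i], xlab = "", ylab = colnames(dta)[i],
     axes = FALSE, pch = "*", type = "o")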




Re: [R] integrate function - error -integration not occurring with last few rows

2012-04-06 Thread Peter Ehlers

On 2012-04-06 07:19, Navin Goyal wrote:

Thank you so much for your help Berend.
I did not see that my code had a typo and it was thus wrongly written (I
overlooked the i that was supposed to actually be 1).

instead of   for (q in *1*:length(comb1$ID))
  I had it as for (q in *i*:length(comb1$ID))


This is a good example to show that it's usually better to
use seq_along() or seq_len() instead of 1:x.
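
For example, when the upper limit happens to be zero:

n <- 0
1:n           # c(1, 0) -- the loop body still runs twice
seq_len(n)    # integer(0) -- the loop body is skipped, as intended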

Peter Ehlers



It works correctly as expected
Thanks again.


Navin






On Fri, Apr 6, 2012 at 9:56 AM, Berend Hasselman  wrote:



On 06-04-2012, at 13:14, Navin Goyal wrote:


Apologies for the lengthy code.
I tried a simple (and shorter) piece of code (pasted below) and it still

gives me the same error for last few rows. Is this a bug or am I doing
something totally wrong?  Could anyone please provide some help/pointers ?




You are not specifying what the  error is for the last few rows?
You are doing something completely wrong (AFAICT).
See below.


PS.  beta0 was fixed to 0.001 in the previous code. Also if I manually

estimated the integral isnt 0. If I scramble the row order, it is again
only the last few rows that dont integrate.

See below.


Thanks

data1<-expand.grid(1:5,0:6,10)
names(data1)<-c("ID","TIME", "DOSE")
data1<-data1[order(data1$DOSE,data1$ID,data1$TIME),]
ed<-data1[!duplicated(data1$ID) , c(1,3)]
ed$base=1
ed$drop=1
set.seed(5234123)
k<-0
for (i in 1:length(ed$ID))
{
k<-k+1
ed$base[k]<-100*exp(rnorm(1,0,0.2))
ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
}


Why are you not using i to index ed$XXX?
You are not using i.
Simplify to

for (k in 1:length(ed$ID))
{
ed$base[k]<-100*exp(rnorm(1,0,0.2))
ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
}


comb1<-merge(data1[, c("ID","TIME")], ed)
comb1$disprog<-comb1$base*exp(-comb1$drop*comb1$TIME)
comb1$integral=0
hz.func1<-function(t,bshz,beta1, change)
{ ifelse(t==0,bshz, bshz*exp(beta1*change)) }


Insert here

comb1
i
length(comb1$ID)

and you should see that i is 5 and that length(comb1$ID) is 35.


q<-0
for (m in i:length(comb1$ID))
{
q<-q+1
comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
   bshz=0.001,beta1=0.035,
  change=comb1$disprog[q])$value
}
comb1



1. Why does your for loop variable m start with i and not 1? (as I told
you in my first reply)
2. Why are you not using the for loop variable m?
3. So from the above m starts at 5 and stops at 35 (==> 31 steps).
4. So you are filling elements 1 to 31 of comb1 and items 32 to 35 are
unchanged.

5. why don't you do this

for (q in 1:length(comb1$ID))
{
comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
  bshz=0.001,beta1=0.035,
 change=comb1$disprog[q])$value
}
comb1

This avoids a useless variable m and fill all elements of comb1.
And you could just as well reuse variable k; there is no need for a new
variable (q) here.

Berend










Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread ilai
On Fri, Apr 6, 2012 at 4:02 PM, Dimitri Liakhovitski
 wrote:
 This works great:

Really? Surprising, given it is the EXACT same for-loop as in your
original problem with counter "i" replaced by "k" and reordered to
matrix[!100]<- 0 instead of matrix(0)[i]<- 100
You didn't even attempt to implement Carl's suggestion to use apply
family for looping (which I still think is completely unnecessary).

The only logical conclusion is N=nrow(input) was not large enough to
pose a problem in the first place. In the future please use some brain
power before wasting ours.

Cheers

>
> input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
> result<-input
> N<-nrow(input)
> for (k in 1:N){
>  foo <- which (input == k,arr.ind=T)
>  result[k,foo[2]] <-100
> }
> result[result !=100]<-0
>
> Dimitri
>
>
> On Fri, Apr 6, 2012 at 5:14 PM, Carl Witthoft  wrote:
>> I think the OP wants to fill values in an arbitrarily large matrix. Now,
>> first of all, I'd like to know what his real problem is, since this seems
>> like a very tedious and unproductive matrix to produce.  But in the
>> meantime,  since he also left out important information, let's assume the
>> input matrix is N rows by M columns, and that he wants therefore to end up
>> with N instances of "100", not counting the original value of 100 that is
>> one of his ranking values (a bad BAD move IMHO).
>>
>> Then either loop or lapply over an equation like (I've expanded things more
>> than necessary for clarity
>> result<-inmatrix
>> for (k in 1:N){
>> foo <- which (inmatrix == k,arr.ind=T)
>> result[k,foo[2]] <-100
>>
>> }
>>
>>
>>
>> I maybe missing something but this seems like an indexing problem
>> which doesn't require a loop at all. Something like this maybe?
>>
>> (input<-matrix(c(5,1,3,7,2,6,4,8),nc=2))
>> output <- matrix(0,max(input),2)
>> output[input[,1],1] <- 100
>> output[input[,2],2] <- 100
>> output
>>
>> Cheers
>>
>>
>> On Fri, Apr 6, 2012 at 1:49 PM, Dimitri Liakhovitski
>>  wrote:
>>> Hello, everybody!
>>>
>>> I have a matrix "input" (see example below) - with all unique entries
>>> that are actually unique ranks (i.e., start with 1, no ties).
>>> I want to assign a value of 100 to the first row of the column that
>>> contains the minimum (i.e., value of 1).
>>> Then, I want to assign a value of 100 to the second row of the column
>>> that contains the value of 2, etc.
>>> The results I am looking for are in "desired.results".
>>> My code (below) does what I need. But it's using a loop through all
>>> the rows of my matrix and searches for a matrix element every time.
>>> My actual matrix is very large. Is there a way to do it more efficiently?
>>> Thank you very much for the tips!
>>> Dimitri
>>>
>>> input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
>>> (input)
>>> desired.result<-as.matrix(data.frame(a=c(100,0,100,0),b=c(0,100,0,100)))
>>> (desired.result)
>>> result<-as.matrix(data.frame(a=c(0,0,0,0),b=c(0,0,0,0)))
>>> for(i in 1:nrow(input)){ # i<-1
>>>  mymin<-i
>>>  mycoords<-which(input==mymin,arr.ind=TRUE)
>>>  result[i,mycoords[2]]<-100
>>>  input[mycoords]<-max(input)
>>> }
>>> (result)
>>>
>> --
>>
>> Sent from my Cray XK6
>> "Quidvis recte factum, quamvis humile, praeclarum."
>>
>>
>
>
>
> --
> Dimitri Liakhovitski
> marketfusionanalytics.com
>


[R] system command and Perl confusion

2012-04-06 Thread Maxim
Hello,

I have a question related to the system command within R: I try to
invoke a perl script from within R with something like
system(paste('perl myscript.pl', some parameters, sep=" "))

This works just fine as long as I do not use any additionally installed
Perl module, as this leads to a

"Can't locate Statistics/Descriptive.pm in @INC ."

This is most likely due to the fact that there are somehow two Perls on my
Mac OS X (one must have come through my trials playing with Macports), one
is the Perl that came with the system.

What I do not get, is that I can run the script from the command line
without a problem.

How can I make R aware of "the other" @INC?

In the terminal, I find Perl to be in
/opt/local/bin/Perl

the above module is accordingly in
/opt/local/lib/perl5/site_perl/5.8.9/Statistics/Descriptive.pm

In contrast, running the Perl script from within R, it tries to find the
modules in /System/Library/Perl/5.10.0/...

I see that this is not necessarily a pure R question, but as it occurs
specifically using the R system command I thought there might be someone
around here who came across similar problems!

Best wishes
Maxim



Re: [R] how to control exact positions of axis

2012-04-06 Thread mlell08
Hello,

the graphical parameters xaxs and yaxs are for you.
par(xaxs="i")

Regards!



Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Dimitri Liakhovitski
Yes, that's correct - my matrix has N rows.
Thank you very much, Carl. This works great:

input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
result<-input
N<-nrow(input)
for (k in 1:N){
  foo <- which (input == k,arr.ind=T)
  result[k,foo[2]] <-100
}
result[result !=100]<-0

Dimitri


On Fri, Apr 6, 2012 at 5:14 PM, Carl Witthoft  wrote:
> I think the OP wants to fill values in an arbitrarily large matrix. Now,
> first of all, I'd like to know what his real problem is, since this seems
> like a very tedious and unproductive matrix to produce.  But in the
> meantime,  since he also left out important information, let's assume the
> input matrix is N rows by M columns, and that he wants therefore to end up
> with N instances of "100", not counting the original value of 100 that is
> one of his ranking values (a bad BAD move IMHO).
>
> Then either loop or lapply over an equation like (I've expanded things more
> than necessary for clarity
> result<-inmatrix
> for (k in 1:N){
> foo <- which (inmatrix == k,arr.ind=T)
> result[k,foo[2]] <-100
>
> }
>
>
>
> I maybe missing something but this seems like an indexing problem
> which doesn't require a loop at all. Something like this maybe?
>
> (input<-matrix(c(5,1,3,7,2,6,4,8),nc=2))
> output <- matrix(0,max(input),2)
> output[input[,1],1] <- 100
> output[input[,2],2] <- 100
> output
>
> Cheers
>
>
> On Fri, Apr 6, 2012 at 1:49 PM, Dimitri Liakhovitski
>  wrote:
>> Hello, everybody!
>>
>> I have a matrix "input" (see example below) - with all unique entries
>> that are actually unique ranks (i.e., start with 1, no ties).
>> I want to assign a value of 100 to the first row of the column that
>> contains the minimum (i.e., value of 1).
>> Then, I want to assign a value of 100 to the second row of the column
>> that contains the value of 2, etc.
>> The results I am looking for are in "desired.results".
>> My code (below) does what I need. But it's using a loop through all
>> the rows of my matrix and searches for a matrix element every time.
>> My actual matrix is very large. Is there a way to do it more efficiently?
>> Thank you very much for the tips!
>> Dimitri
>>
>> input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
>> (input)
>> desired.result<-as.matrix(data.frame(a=c(100,0,100,0),b=c(0,100,0,100)))
>> (desired.result)
>> result<-as.matrix(data.frame(a=c(0,0,0,0),b=c(0,0,0,0)))
>> for(i in 1:nrow(input)){ # i<-1
>>  mymin<-i
>>  mycoords<-which(input==mymin,arr.ind=TRUE)
>>  result[i,mycoords[2]]<-100
>>  input[mycoords]<-max(input)
>> }
>> (result)
>>
> --
>
> Sent from my Cray XK6
> "Quidvis recte factum, quamvis humile, praeclarum."
>
>



-- 
Dimitri Liakhovitski
marketfusionanalytics.com



Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Carl Witthoft

Ok, how's this:


Rgames> foo
     [,1] [,2] [,3] [,4]
[1,]    3    6    1   16
[2,]   10   14   12    5
[3,]   11    7   15    9
[4,]    8    4   13    2

Rgames> sapply(1:4, FUN=function(k){
+   foo[k, which(foo==k, arr.ind=T)[2]] <- 100; return(foo)}) -> bar

Rgames> bar
      [,1] [,2] [,3] [,4]
 [1,]    3    3    3    3
 [2,]   10   10   10   10
 [3,]   11   11  100   11
 [4,]    8    8    8    8
 [5,]    6    6    6    6
 [6,]   14   14   14   14
 [7,]    7    7    7    7
 [8,]    4    4    4  100
 [9,]  100    1    1    1
[10,]   12   12   12   12
[11,]   15   15   15   15
[12,]   13   13   13   13
[13,]   16   16   16   16
[14,]    5  100    5    5
[15,]    9    9    9    9
[16,]    2    2    2    2

Rgames> rab <- matrix(apply(bar, 1, max), 4, 4)
Rgames> rab
     [,1] [,2] [,3] [,4]
[1,]    3    6  100   16
[2,]   10   14   12  100
[3,]  100    7   15    9
[4,]    8  100   13    2


--

Sent from my Cray XK6
"Quidvis recte factum, quamvis humile, praeclarum."



[R] how to control exact positions of axis

2012-04-06 Thread Maxim
Hi,

I have to plot a heat map and next to it a lineplot. Unfortunately the
scale is not the same between the two plots (as the heatmap data is binned).

My problem is, that despite the fact the plotted areas (marked by the
heatmap and box of the the default line plot) are essentially very similar,
the coordinate system is slightly shifted.

The code below does not produce results quite as bad as what I have on the
screen right now, but it still shows a significant shift of the first and last
axis ticks, which in the heatmap/image are oriented far more towards the plot's
margins than in the default plot.

par(mfrow=c(2,1))
matrix(rnorm(3600),ncol=60,byrow=T)->x
image(x)
vals<-rnorm(6000)
plot(vals,type="l")


Is there a way to control the exact position of the first and last
axis tick more precisely? Of course I will have to adjust these issues
dynamically later on, as it is obviously the size of the grid that causes the
problem: with a smaller grid size the first and last axis ticks come
closer to the edges of the plot area.

Right now I'm a bit lost, but I still think I might be overlooking something
pretty simple!
Maxim



[R] RE filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Carl Witthoft
Apologies -- I meant to translate that code (which is what the OP 
provided, albeit in longer form) into a *apply one-liner.



--

Sent from my Cray XK6
"Quidvis recte factum, quamvis humile, praeclarum."



Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Carl Witthoft
I think the OP wants to fill values in an arbitrarily large matrix. 
Now, first of all, I'd like to know what his real problem is, since this 
seems like a very tedious and unproductive matrix to produce.  But in 
the meantime,  since he also left out important information, let's 
assume the input matrix is N rows by M columns, and that he wants 
therefore to end up with N instances of "100", not counting the original 
value of 100 that is one of his ranking values (a bad BAD move IMHO).


Then either loop or lapply over an equation like the following (I've expanded
things more than necessary for clarity):

result<-inmatrix
for (k in 1:N){
foo <- which (inmatrix == k,arr.ind=T)
result[k,foo[2]] <-100
}



I maybe missing something but this seems like an indexing problem
which doesn't require a loop at all. Something like this maybe?

(input<-matrix(c(5,1,3,7,2,6,4,8),nc=2))
output <- matrix(0,max(input),2)
output[input[,1],1] <- 100
output[input[,2],2] <- 100
output

Cheers


On Fri, Apr 6, 2012 at 1:49 PM, Dimitri Liakhovitski
 wrote:
> Hello, everybody!
>
> I have a matrix "input" (see example below) - with all unique entries
> that are actually unique ranks (i.e., start with 1, no ties).
> I want to assign a value of 100 to the first row of the column that
> contains the minimum (i.e., value of 1).
> Then, I want to assign a value of 100 to the second row of the column
> that contains the value of 2, etc.
> The results I am looking for are in "desired.results".
> My code (below) does what I need. But it's using a loop through all
> the rows of my matrix and searches for a matrix element every time.
> My actual matrix is very large. Is there a way to do it more efficiently?
> Thank you very much for the tips!
> Dimitri
>
> input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
> (input)
> desired.result<-as.matrix(data.frame(a=c(100,0,100,0),b=c(0,100,0,100)))
> (desired.result)
> result<-as.matrix(data.frame(a=c(0,0,0,0),b=c(0,0,0,0)))
> for(i in 1:nrow(input)){ # i<-1
>  mymin<-i
>  mycoords<-which(input==mymin,arr.ind=TRUE)
>  result[i,mycoords[2]]<-100
>  input[mycoords]<-max(input)
> }
> (result)
>
--

Sent from my Cray XK6
"Quidvis recte factum, quamvis humile, praeclarum."



Re: [R] How do I get a rough quick utility plot of a time series?

2012-04-06 Thread Hurr
linearizeTime() is from the rest of my own code, and the rest of my code is not
small.
That is why I gave you the printout and the code line that produced it.
The printout at the end of SimpleTS() is this:
[1] "`Simple Time Series"
   monoMn SBP DBP HRT
1  1057366710 117  53  54
2  1057369636 108  65  52
3  1057371157 107  NA  59
4  1057372649  NA  NA  58
5  1057374176  97  56  52
6  1057376790  NA  NA  NA
7  1057378307 108  65  52
8  1057381225 107  NA  59
9  1057382744  NA  57  58
10 1057384217  97  56  52
It's a simple plot, and if you know how to plot it should be easy for you.
I don't know how to do what you require, having used R very little.





Re: [R] Multivariate Multilevel Model: is R the right software for this problem

2012-04-06 Thread Andrew Miles
I recommend looking at chapter 6 of Paul Allison's *Fixed Effects
Regression Models*.  This chapter outlines how you can use a structural
equation modeling framework to estimate a multi-level model (a random
effects model).  This approach is slower than just using MLM software like
lmer() in the lme4 package, but has the advantage of being able to specify
correlations between errors across time, the ability to control for
time-invariant effects of time-invariant variables, and allows you to use
the missing data maximum likelihood that comes in structural equation
modeling packages.

Andrew Miles



Re: [R] Changing grid defaults

2012-04-06 Thread Paul Murrell

Hi

On 7/04/2012 2:43 a.m., Brett Presnell wrote:


I'm trying to use the vcd package to produce mosaic plots for my class
notes, written in Sweave and using the LaTeX's beamer document class.
For projecting the notes in class, I use a dark background with light
foreground colors.  It's easy enough to change the defaults for R's
standard graphics to match my color scheme (using the fg, col.axis,
col.lab, col.main, and col.sub parameter settings), but I can't figure
out how to do this with grid/strucplot/vcd.


From my experiments, I think that I might eventually figure out how to
change the colors of all the text in the mosaic plots to what I want
using arguments like 'gp_args = list(gp_labels = gpar(col = "yellow")'
and 'gp_varnames = gpar(col = "yellow")' (although I still haven't
figured out how to changed the color of the text on the legends), but
this is obviously not what I need.  I have been reading all the
documentation I can find, but I still haven't figured this out, so an
answer accompanied by a reference to some line or the other in some
piece of documentation would be greatly appreciated.


You could push a 'grid' viewport with the desired defaults and then 
hopefully 'vcd' will pick those up as its defaults.  The following 
example promises at least partial (if ugly) success ...


library(vcd)
pushViewport(viewport(gp=gpar(col="yellow")))
mosaic(Titanic, newpage=FALSE)

You can similarly set up font defaults.  See "Non-standard fonts in 
PostScript and PDF graphics" in 
http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf for an example, 
plus also possibly http://www.stat.auckland.ac.nz/~paul/R/CM/CMR.html.


Paul




--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



Re: [R] How do I get a rough quick utility plot of a time series?

2012-04-06 Thread R. Michael Weylandt
This still isn't reproducible -- when I said use dput() I meant it.

x <- data.frame(x = 1:5, y = letters[1:5], z = factor(sample(1:3,5,
TRUE))) # complicated
dput(x) # Easy to copy and paste.

Also, what package is the linearizeTime function from? I'm having
trouble finding it on CRAN.

If you can dput() the exact object you want to plot, we can help you.

Michael

On Fri, Apr 6, 2012 at 2:15 PM, Hurr  wrote:
> I believe I have made this posting simple enough to understand.
> The sample time-series has only 10 times.
> SimpleTS() is the R function to contain the call to do the plot.
>
> SimpleTS <- function() {
>  titleline <- readLines("SimpleTS.dta", n=1)
>  print(titleline)
>  dta <- read.table("SimpleTS.dta", skip = 1, header = TRUE, sep = ",",
> colClasses = "character")
>  print(dta)
>  dta <- linearizeTime(dta)
>  print(titleline)
>  print(dta)
>  # plotting call goes here
> } #end SimpleTS
>
> A printout in linearizeTime(dta) is this:
> [1] "`Simple Time Series"
>   Simpl CnYrMoDaHrMn SBP DBP HRT X
> 1  Simpl 201005251030 117  53  54
> 2  Simpl 201005271116 108  65  52
> 3  Simpl 201005281237 107      59
> 4  Simpl 201005291329          58
> 5  Simpl 201005301456  97  56  52
> 6  Simpl 201006011030
> 7  Simpl 201006021147 108  65  52
> 8  Simpl 201006041225 107      59
> 9  Simpl 201006051344      57  58
> 10 Simpl 201006061417  97  56  52
>
> The printout at the end of SimpleTS() is this:
> [1] "`Simple Time Series"
>       monoMn SBP DBP HRT
> 1  1057366710 117  53  54
> 2  1057369636 108  65  52
> 3  1057371157 107  NA  59
> 4  1057372649  NA  NA  58
> 5  1057374176  97  56  52
> 6  1057376790  NA  NA  NA
> 7  1057378307 108  65  52
> 8  1057381225 107  NA  59
> 9  1057382744  NA  57  58
> 10 1057384217  97  56  52
>
> We want a simple 3-tier plot with monoMn on the abscissa
> and the three functions SBP DBP HRT plotted as fct of monoMn.
> We will want more-beautiful plots later, but can't specify now.
>
>


Re: [R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread ilai
I may be missing something, but this seems like an indexing problem
which doesn't require a loop at all. Something like this maybe?

(input<-matrix(c(5,1,3,7,2,6,4,8),nc=2))
output <- matrix(0,max(input),2)
output[input[,1],1] <- 100
output[input[,2],2] <- 100
output

Cheers


On Fri, Apr 6, 2012 at 1:49 PM, Dimitri Liakhovitski
 wrote:
> Hello, everybody!
>
> I have a matrix "input" (see example below) - with all unique entries
> that are actually unique ranks (i.e., start with 1, no ties).
> I want to assign a value of 100 to the first row of the column that
> contains the minimum (i.e., value of 1).
> Then, I want to assign a value of 100 to the second row of the column
> that contains the value of 2, etc.
> The results I am looking for are in "desired.results".
> My code (below) does what I need. But it's using a loop through all
> the rows of my matrix and searches for a matrix element every time.
> My actual matrix is very large. Is there a way to do it more efficiently?
> Thank you very much for the tips!
> Dimitri
>
> input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
> (input)
> desired.result<-as.matrix(data.frame(a=c(100,0,100,0),b=c(0,100,0,100)))
> (desired.result)
> result<-as.matrix(data.frame(a=c(0,0,0,0),b=c(0,0,0,0)))
> for(i in 1:nrow(input)){ # i<-1
>  mymin<-i
>  mycoords<-which(input==mymin,arr.ind=TRUE)
>  result[i,mycoords[2]]<-100
>  input[mycoords]<-max(input)
> }
> (result)
>
> --
> Dimitri Liakhovitski
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting data frame to its object results in matrix of strings

2012-04-06 Thread R. Michael Weylandt
Try this:

x <- xts(as.character(1:10), Sys.Date() + 0:9)
storage.mode(x) <- "double"
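
If you want to track down which columns smuggled the characters in, something
like this should work (a sketch, assuming the original data frame is a, with
Date in column 1 as in your code below):

names(a)[-1][!sapply(a[, -1, drop = FALSE], is.numeric)]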

Michael

On Fri, Apr 6, 2012 at 1:13 PM, Noah Silverman  wrote:
> Hi,
>
> I have a rather large data frame (500 x 5000) that I want to convert to a 
> proper xts object.
>
> I am able to properly generate an xts object with the correct time index.  
> However, all of my numerical values are now strings.
>
> b <- as.xts(a[,2:dim(a)[2]], order.by=as.POSIXct(strptime(paste(a$Date), 
> '%m/%d/%Y')))
>
> My guess is that somewhere in the large data frame there are a few strings
> hiding that are causing the whole thing to be converted to strings.
>
> Is there some way to force the as.xts function to ignore the strings and keep 
> everything numeric?
>
>
> Thanks!
>
>
> --
> Noah Silverman
> UCLA Department of Statistics
> 8208 Math Sciences Building
> Los Angeles, CA 90095
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple values in one column

2012-04-06 Thread Sarah Goslee
To the best of my knowledge, you can't skip step #2, at least not without
using much more complicated work-arounds, like including a gsub() step
within the call to table() and in everything else you do with those
data.

Computers are generally better at dealing with normalized data, which
is what you're constructing in step #2.
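
For what it's worth, the normalization itself is easy to do in R rather than
Excel. A minimal sketch, assuming the data frame is called students and the
multi-valued column is major:

majors <- strsplit(as.character(students$major), ",")
students.long <- students[rep(seq_len(nrow(students)), sapply(majors, length)), ]
students.long$major <- unlist(majors)
table(students.long$sex, students.long$major)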

Sarah

On Fri, Apr 6, 2012 at 3:53 PM, John D. Muccigrosso
 wrote:
> On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:
>
>> I have some data files in which some fields have multiple values. For example
>>
>> first  last   sex   major
>> John   Smith  M     ANTH
>> Jane   Doe    F     HIST,BIOL
>>
>> What's the best R-like way to handle these data (Jane's major in my 
>> example), so that I can do things like summarize the other fields by them 
>> (e.g., sex by major)?
>>
>> Right now I'm processing the files (in excel since they're spreadsheets) by 
>> duplicating lines with two values in the major field, eliminating one value 
>> per row. I suspect there's a nifty R way to do this.
>
>
> I've gotten a few responses, for which I'm grateful, but either I don't quite 
> see how they answer my question, or I didn't phrase my question well, both of 
> which are equally possible. :-)
>
> So, given the data as above, let's call it "students", I have no problem 
> turning it into:
>
> first  last   sex   major
> John   Smith  M     ANTH
> Jane   Doe    F     HIST
> Jane   Doe    F     BIOL
>
> What I then do with this is things like
>
> table(students$sex, students$major)
>
> So, three steps:
>
> 1. Get data with multiple values per field.
> 2. Turn it into a data frame with only one value per field (by duplicating 
> lines).
> 3. Do things like table().
>
> I'd like to be able to skip #2.
>
> Thanks.
>
> John Muccigrosso
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple values in one column

2012-04-06 Thread John D. Muccigrosso
On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:

> I have some data files in which some fields have multiple values. For example
> 
> first  last   sex   major
> John   Smith  M     ANTH
> Jane   Doe    F     HIST,BIOL
> 
> What's the best R-like way to handle these data (Jane's major in my example), 
> so that I can do things like summarize the other fields by them (e.g., sex by 
> major)?
> 
> Right now I'm processing the files (in excel since they're spreadsheets) by 
> duplicating lines with two values in the major field, eliminating one value 
> per row. I suspect there's a nifty R way to do this.


I've gotten a few responses, for which I'm grateful, but either I don't quite 
see how they answer my question, or I didn't phrase my question well, both of 
which are equally possible. :-)

So, given the data as above, let's call it "students", I have no problem 
turning it into:

first  last   sex   major
John   Smith  M     ANTH
Jane   Doe    F     HIST
Jane   Doe    F     BIOL

What I then do with this is things like 

table(students$sex, students$major)

So, three steps:

1. Get data with multiple values per field.
2. Turn it into a data frame with only one value per field (by duplicating 
lines).
3. Do things like table().

I'd like to be able to skip #2.

Thanks.

John Muccigrosso

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] filling the matrix row by row in the order from lower to larger elements

2012-04-06 Thread Dimitri Liakhovitski
Hello, everybody!

I have a matrix "input" (see example below) - with all unique entries
that are actually unique ranks (i.e., start with 1, no ties).
I want to assign a value of 100 to the first row of the column that
contains the minimum (i.e., value of 1).
Then, I want to assign a value of 100 to the second row of the column
that contains the value of 2, etc.
The results I am looking for are in "desired.results".
My code (below) does what I need. But it's using a loop through all
the rows of my matrix and searches for a matrix element every time.
My actual matrix is very large. Is there a way to do it more efficiently?
Thank you very much for the tips!
Dimitri

input<-as.matrix(data.frame(a=c(5,1,3,7),b=c(2,6,4,8)))
(input)
desired.result<-as.matrix(data.frame(a=c(100,0,100,0),b=c(0,100,0,100)))
(desired.result)
result<-as.matrix(data.frame(a=c(0,0,0,0),b=c(0,0,0,0)))
for(i in 1:nrow(input)){ # i<-1
  mymin<-i
  mycoords<-which(input==mymin,arr.ind=TRUE)
  result[i,mycoords[2]]<-100
  input[mycoords]<-max(input)
}
(result)

-- 
Dimitri Liakhovitski

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bayesian 95% Credible interval

2012-04-06 Thread Brunero Liseo
This is a function I use when the posterior is unimodal.
I do not remember where, but I think I found it somewhere on the web.

brunero

hpd <- function(x, p){
  # generate an hpd set of level p, based
  # on a sample x from the posterior
  dx  <- density(x)                 # kernel density estimate of the posterior
  md  <- dx$x[dx$y == max(dx$y)]    # posterior mode (highest-density point)
  px  <- dx$y/sum(dx$y)             # normalise the densities so they sum to 1
  pxs <- -sort(-px)                 # densities sorted from largest to smallest
  ct  <- min(pxs[cumsum(pxs) < p])  # density cut-off capturing probability p
  list(hpdr = range(dx$x[px >= ct]), mode = md)
}
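
A usage sketch (the rnorm() draw below is just a stand-in for your own vector
of posterior samples):

x <- rnorm(10000)   # replace with your posterior sample
hpd(x, 0.95)        # $hpdr is the 95% HPD interval, $mode the posterior mode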

2012/4/6 Gyanendra Pokharel :
> Hi all,
> I have the data from the posterior distribution for some parameter. I want
> to find the 95% credible interval. I think "t.test(data)" is only for the
> confidence interval. I did not find a function for the Bayesian credible
> interval. Could someone suggest one?
>
> Thanks
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
=
Brunero Liseo
Dip. di metodi e modelli per il territorio, l'economia e la finanza
Sapienza Università di Roma
Tel. +39 06 49766973
Fax +39 06 4957606
http://geostasto.eco.uniroma1.it/utenti/liseo
=

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple values in one column

2012-04-06 Thread Nutter, Benjamin
This is a function I use for these kinds of situations.  Assuming the delimiter 
within the column is consistent and the spelling is consistent, it is pretty 
useful.

The function returns a vector of 0/1 values, 1 if the text in level is found, 0 
otherwise.
var=the variable
level=The value of interest in var

'split_levels' <- function(var, level, sep=","){

#*** identify level in var.
  f <- function(v){
v <- unlist(strsplit(v,sep))
ifelse(level %in% v, return(1), return(0))
  }

#*** split the variable
  new.var <- unlist(sapply(var,f))
  names(new.var) <- NULL

#*** assign NA's where they were in original variable
  new.var[is.na(var)] <- NA
  return(new.var)
}
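
A usage sketch on the example data from earlier in the thread (assuming the
data frame is called students and major is the comma-separated column):

students$HIST <- split_levels(as.character(students$major), "HIST")
students$BIOL <- split_levels(as.character(students$major), "BIOL")
table(students$sex, students$HIST)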


  Benjamin Nutter |  Biostatistician     |  Quantitative Health Sciences
  Cleveland Clinic    |  9500 Euclid Ave.  |  Cleveland, OH 44195  | (216) 
445-1365



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Mark Grimes
Sent: Friday, April 06, 2012 11:16 AM
To: John D. Muccigrosso
Cc: r-help@r-project.org
Subject: Re: [R] multiple values in one column

John

I have to deal with this kind of thing too for my class.

#   Some functions
# for ad$Full.name = "Mark Grimes"
get.first.name <- function(cell){
x<-unlist(strsplit(as.character(cell), " "))
return(x[1]) 
}
get.last.name <- function(cell){
x<-unlist(strsplit(as.character(cell), " "))
return(x[2]) 
}
# For roster$Name = "Grimes, Mark L"
get.first.namec <- function(cell){
x<-unlist(strsplit(as.character(cell), ", "))
y <- get.first.name(x[2])
return(y) 
}
get.last.namec <- function(cell){
x<-unlist(strsplit(as.character(cell), ", "))
return(x[1]) 
}
Use these functions with the apply family for processing class files. 
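
For example (a sketch, assuming ad and roster are your data frames, as in the
comments above):

ad$first <- sapply(ad$Full.name, get.first.name)
ad$last  <- sapply(ad$Full.name, get.last.name)
roster$first <- sapply(roster$Name, get.first.namec)
roster$last  <- sapply(roster$Name, get.last.namec)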

Hope this helps,

Mark

On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:

> I have some data files in which some fields have multiple values. For example
> 
> first  last   sex   major
> John   Smith  M     ANTH
> Jane   Doe    F     HIST,BIOL
> 
> What's the best R-like way to handle these data (Jane's major in my example), 
> so that I can do things like summarize the other fields by them (e.g., sex by 
> major)?
> 
> Right now I'm processing the files (in excel since they're spreadsheets) by 
> duplicating lines with two values in the major field, eliminating one value 
> per row. I suspect there's a nifty R way to do this.
> 
> Thanks in advance!
> 
> John Muccigrosso
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bayesian 95% Credible interval

2012-04-06 Thread Gyanendra Pokharel
Hi all,
I have the data from the posterior distribution for some parameter. I want
to find the 95% credible interval. I think "t.test(data)" is only for the
confidence interval. I did not find a function for the Bayesian credible
interval. Could someone suggest one?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving multiple plots using tiff function

2012-04-06 Thread John S
Sorry, I forgot to mention that I am using Windows 7; R session info:

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

Thanks,
John



On Fri, Apr 6, 2012 at 2:17 PM, John S  wrote:

> Dear R experts,
>
> I am trying to save three plots using tiff graphics devices; however the
> following code only produces two files (Rplot002.tif and Rplot003.tif)
> showing figures 1 and 3. Here is a simplified example code
>
>
>
> tiff(filename ="Rplot%03d.tif",width=24,height=20,units="cm",res=300,
> pointsize=10, compression = "lzw")
>
> plot(1)
>
> mtext("Fig 1",side=3,line=4,adj=0.50,padj=2,col="black",cex=1)
>
> plot(2)
>
> mtext("Fig 2",side=3,line=4,adj=0.50,padj=2,col="black",cex=1)
>
> plot(3)
>
> mtext("Fig 3",side=3,line=4,adj=0.50,padj=2,col="black",cex=1)
>
> dev.off()
>
>
>
> Using pdf() produces the correct 3 figures, but I want to use tiff images.
> Any clues why this occurs?
>
> I am opening a tiff graphics device and writing plots to files in a loop
> so I need to have a single call to tiff().
>
> Thanks,
> John
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Saving multiple plots using tiff function

2012-04-06 Thread John S
Dear R experts,

I am trying to save three plots using tiff graphics devices; however the
following code only produces two files (Rplot002.tif and Rplot003.tif)
showing figures 1 and 3. Here is a simplified example code



tiff(filename ="Rplot%03d.tif",width=24,height=20,units="cm",res=300,
pointsize=10, compression = "lzw")

plot(1)

mtext("Fig 1",side=3,line=4,adj=0.50,padj=2,col="black",cex=1)

plot(2)

mtext("Fig 2",side=3,line=4,adj=0.50,padj=2,col="black",cex=1)

plot(3)

mtext("Fig 3",side=3,line=4,adj=0.50,padj=2,col="black",cex=1)

dev.off()



Using pdf() produces the correct 3 figures, but I want to use tiff images.
Any clues why this occurs?

I am opening a tiff graphics device and writing plots to files in a loop so
I need to have a single call to tiff().

Thanks,
John

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I get a rough quick utility plot of a time series?

2012-04-06 Thread Hurr
I believe I have made this posting simple enough to understand.
The sample time-series has only 10 times.
SimpleTS() is the R function to contain the call to do the plot.

SimpleTS <- function() { 
  titleline <- readLines("SimpleTS.dta", n=1) 
  print(titleline) 
  dta <- read.table("SimpleTS.dta", skip = 1, header = TRUE, sep = ",",
colClasses = "character")
  print(dta) 
  dta <- linearizeTime(dta) 
  print(titleline) 
  print(dta) 
  # plotting call goes here
} #end SimpleTS 

A printout in linearizeTime(dta) is this: 
[1] "`Simple Time Series"
   Simpl CnYrMoDaHrMn SBP DBP HRT X
1  Simpl 201005251030 117  53  54
2  Simpl 201005271116 108  65  52
3  Simpl 201005281237 107      59
4  Simpl 201005291329          58
5  Simpl 201005301456  97  56  52
6  Simpl 201006011030
7  Simpl 201006021147 108  65  52
8  Simpl 201006041225 107      59
9  Simpl 201006051344      57  58
10 Simpl 201006061417  97  56  52

The printout at the end of SimpleTS() is this:
[1] "`Simple Time Series"
   monoMn SBP DBP HRT
1  1057366710 117  53  54
2  1057369636 108  65  52
3  1057371157 107  NA  59
4  1057372649  NA  NA  58
5  1057374176  97  56  52
6  1057376790  NA  NA  NA
7  1057378307 108  65  52
8  1057381225 107  NA  59
9  1057382744  NA  57  58
10 1057384217  97  56  52

We want a simple 3-tier plot with monoMn on the abscissa
and the three functions SBP DBP HRT plotted as fct of monoMn.
We will want more-beautiful plots later, but can't specify now.


--
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-get-a-rough-quick-utility-plot-of-a-time-series-tp4522709p4537994.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sincere inquiry about “subscript out of bounds” error in R

2012-04-06 Thread Henrik Bengtsson
str() is your number one friend in R.  Do str(A) and str(A2) after
allocating the matrices and you'll be "surprised".  My $.02 /Henrik
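
A minimal sketch of what str() shows in this case:

A2 <- matrix(4500, 500)
str(A2)
## num [1:500, 1] 4500 4500 4500 4500 4500 ...
## i.e. a 500 x 1 matrix filled with the value 4500 (the arguments were taken
## as data and nrow), not an empty 4500 x 500 matrix, hence the out-of-bounds
## subscript; the intended allocation was presumably matrix(NA, 4500, 500).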

On Fri, Apr 6, 2012 at 7:35 AM, 卢永芳  wrote:
> Hello, experts,
> I am working on a simulation of the effect of artificial selection on a certain
> population in animal breeding. I am a beginner in coding. I have already
> built a matrix A (500*500) based on this code
> A<-matrix(,500,500)
> for(i in 1:500){
> for(j in 1:500){
> ifelse(i==j,A[i,j]<-1,A[i,j]<-0)
> }
> }
> and I need to calculate A2
>
> based on A and X1 (4500*4500). Here are the codes
> A2<-matrix(4500,500)
> for(i in 1:4500){
> for(j in 1:500){
> A2[i,j]<-(A[X1[i,2],j]+A[X1[i,3],j])/2
> }
> }
> and the error happened like this: Error in A2[i, j] <- (A[X1[i, 2], j] + A[X1[i,
> 3], j])/2 : subscript out of bounds
> I checked the iteration counts in the for loop and they perfectly match the
> dimensions of matrices A and X1. I do not know how this error happened. And
> another important question: how can I build a matrix with a very large
> dimension that cannot be allocated in R? Error in matrix(, 45500, 45500) :
> cannot allocate vector of length 207025
>
>
> I am looking forward to hearing from you
> Kindest Regards
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fisher's LSD multiple comparisons in a two-way ANOVA

2012-04-06 Thread Richard M. Heiberger
Interpreting contrasts for a two-way interaction in the presence of a
significant
three-way interaction is dangerous.  They might not be interpretable.

I would start by examining the interaction2wt plot (in the HH package)

interaction2wt(activity ~ pH + I + f, data=yourdataframe)

Look at the vulcan example in ?interaction2wt

You might want to look at the set of two-way interactions, conditioned on
the value
of the third factor, for example similar to

require(HH)
interaction2wt(wear ~ filler + raw,
   data=vulcan,
   sub="all values of pretreat",
   ylim=c(50,600))
vulcan.pretreat <- split(vulcan, vulcan$pretreat)
lapply(1:length(vulcan.pretreat),
   function(i) {
 interaction2wt(wear ~ filler + raw,
data=vulcan.pretreat[[i]],
sub=paste("pretreat =", i),
ylim=c(50,600))
   })

On Fri, Apr 6, 2012 at 5:01 AM, Jinsong Zhao  wrote:

>  On 2012-04-05 10:49, Richard M. Heiberger wrote:
>
>> Here is your example.  The table you displayed in gigawiz ignored the
>> two-way factor structure
>> and interpreted the data as a single factor with 6 levels.  I created
>> the interaction of
>> a and b to get that behavior.
>> ## your example, with data stored in a data.frame
>> tmp <- data.frame(x=c(76, 84, 78, 80, 82, 70, 62, 72,
>> 71, 69, 72, 74, 66, 74, 68, 66,
>> 69, 72, 72, 78, 74, 71, 73, 67,
>> 86, 67, 72, 85, 87, 74, 83, 86,
>> 66, 68, 70, 76, 78, 76, 69, 74,
>> 72, 72, 76, 69, 69, 82, 79, 81),
>>   a=factor(rep(c("A1", "A2"), each = 24)),
>>   b=factor(rep(c("B1", "B2", "B3"), each=8, times=2)))
>> x.aov <- aov(x ~ a*b, data=tmp)
>> summary(x.aov)
>> ## your request
>> require(multcomp)
>> tmp$ab <- with(tmp, interaction(a, b))
>> xi.aov <- aov(x ~ ab, data=tmp)
>> summary(xi.aov)
>> xi.glht <- glht(xi.aov, linfct=mcp(ab="Tukey"))
>> confint(xi.glht)
>>
>> ## graphs
>> ## boxplots
>> require(lattice)
>> bwplot(x ~ ab, data=tmp)
>> ## interaction plot
>> ## install.packages("HH")  ## if you don't have HH yet
>> require(HH)
>> interaction2wt(x ~ a*b, data=tmp)
>>
>>
> Thank you very much for the demonstration.
>
> There is still a small difference between the results of glht() and the
> table displayed in gigawiz. I have tried my best to figure it out, but failed...
>
> By the way, I have a similar question. I built a ANOVA model:
>
> activity ~ pH * I * f
>
>
>              Df Sum Sq Mean Sq F value   Pr(>F)
> pH            1   1330    1330  59.752 2.15e-10 ***
> I             1    137     137   6.131   0.0163 *
> f             6  23054    3842 172.585  < 2e-16 ***
> pH:I          1    152     152   6.809   0.0116 *
> pH:f          6    274      46   2.049   0.0741 .
> I:f           6   5015     836  37.544  < 2e-16 ***
> pH:I:f        6    849     142   6.356 3.82e-05 ***
> Residuals    56   1247      22
>
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.
>
> Now, how can I do a multi-comparison on `pH:I'?
>
> Do I need to do separate ANOVA for each `pH' or `I', just as that in
> demo("MMC.WoodEnergy", "HH")? And then do multi-comparison on `I' or `pH'
> in each separate ANOVA?
>
> Thanks again.
>
> Regards,
> Jinsong
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Order sapply

2012-04-06 Thread Sarah Goslee
On Fri, Apr 6, 2012 at 10:27 AM, MSousa  wrote:
> Good Afternoon,
>
>   I have the following code, but it seems that I must be doing something
> wrong, because it is giving the results I want.

Assuming you don't really mean that, you can use order() and/or sort()
to put it back into order by val_user and pos. If you provide
reproducible data with dput(), someone might be inspired to write you
the code to do so.
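
Something along these lines, using the names from your code (a sketch only):

results_user <- results_user[order(results_user$user), ]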

Sarah

> The idea is to create segments while the value of Commutation is less than
> 1000.
> for example, from the small set of data below
>
> text="
> val_user  pos    v    v_star    v_end    commutation    v_source
> v_destine
> 1    1 96-96    1173438391    1173438391    0    96    96
> 3    2    126-126    1172501729    1172501532    197    126    126
> 3    3    126-35    1174404177    1172501909    1902268    126    35
> 3    4    35-56    1174404252    1174404221    31    35    56
> 3    5    56-99    1174404295    1174404295    0    56    99
> 3    6    99-92    1174404536    1174404535    1    99    92
> 3    7    92-99    1174404660    1174404658    2    92    99
> 3    8    99-43    1174405442    1174405442    0    99    43
> 3    9    43-99    1174405545    1174405544    1    43    99
> 3    10    99-43    1174405581    1174405581    0    99    43
> 3    11    43-99    1174405836    1174405836    0    43    99
> 3    12    99-43    1174405861    1174405861    0    99    43
> 3    13    43-99    1174405875    1174405875    0    43    99
> 3    18    101-113    1174410215    1174410214    1    101    113
> 3    19    113-36    1174410261    1174410261    0    113    36
> 3    20    36-60    1174410268    1174410268    0    36    60
> 3    21    60-101    1174660357    1174411020    249337    60    101
> 3    22    101-191    1174666205    1174662119    4086    101    191
> 3    23    191-196    1174666278    1174666265    13    191    196
> 3    24    196-9    1174666398    1174666366    32    196    9
> 3    25    9-101    1175154139    1174667144    486995    9    101
> 3    26    101-37    1175160182    1175159734    448    101    37
> 3    27    37-55    1175160256    1175160257    -1    37    55
> 4    1    11-11    1216304836    1216304127    709    11    11
> 4    2    11-11    1216370154    1216312995    57159    11    11
> 4    3    11-11    1216373234    1216372799    435    11    11
> 4    4    11-11    1216373974    1216373373    601    11    11
> 4    5    11-11    1216382659    1216379277    3382    11    11
> 4    6    11-11    1216397081    1216395201    1880    11    11
> 4    7    11-11    1216397339    1216397131    208    11    11
> 4    8    11-11    1216630649    1216399235    231414    11    11
> 4    9    11-11    1216637080    1216631541    5539    11    11
> 4    10    11-11    1216646563    1216640763    5800    11    11
> 4    11    11-11    1216656338    1216651635    4703    11    11
> "
> df1 <-read.table(textConnection(text), header=TRUE)
>
> inx <- df1$commutation > 1000
> comm1000 <- cumsum(inx)
>
> result <- split(df1[!inx, ], list(comm1000[!inx], df1$v_source[!inx],
> df1$v_destine[!inx]))
> result <- sapply(result, function(x) c(x$val_user[1], x$v_source[1],
> x$v_destine[1], nrow(x), mean(x$comm)))
> result <- na.exclude(t(result))
>
> rownames(result) <- 1:nrow(result)
> colnames(result) <- c("user", "v_source", "v_destine", "count", "average")
> attr(result, "na.action") <- NULL
> attr(result, "class") <- NULL
> results_user<-data.frame(result)
> View(results_user)
>
> This give:
>   user v_source v_destine count Min Max     average
>
>
> but the results I want:
> user v_source v_destine count Min Max     average
> 1       96      96      1       0       0       0.000
> 3       126     126     1       197     197     197.000
>  3              35             56     1         31      31      31.000
> ….
>
>
> I think there is a problem in the order of the different blocks; I don't
> understand how the data are organized.
> The idea is to keep the organization of the file close to the original.
>
> Thanks
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Converting data frame to its object results in matrix of strings

2012-04-06 Thread Noah Silverman
Hi,

I have a rather large data frame (500 x 5000) that I want to convert to a 
proper xts object.

I am able to properly generate an xts object with the correct time index.  
However, all of my numerical values are now strings.

b <- as.xts(a[,2:dim(a)[2]], order.by=as.POSIXct(strptime(paste(a$Date), 
'%m/%d/%Y')))

My guess is that somewhere in the large data frame there are a few strings
hiding that are causing the whole thing to be converted to strings.

Is there some way to force the as.xts function to ignore the strings and keep 
everything numeric?


Thanks!


--
Noah Silverman
UCLA Department of Statistics
8208 Math Sciences Building
Los Angeles, CA 90095

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help Using Spreadsheets

2012-04-06 Thread jim holtman
You might want to re-read the "Intro to R" and the section on
dataframes.  Your spreadsheet is read into R as a dataframe which is
very similar to an Excel spreadsheet.  Exactly what problem are you
having with it?  Is it trying to access the data?
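
A minimal end-to-end sketch, assuming the file is called mydata.xls, has
columns A and B, and is read with read.xls() from xlsReadWrite as you describe:

library(xlsReadWrite)
mydata <- read.xls("mydata.xls")   # the whole sheet becomes one data frame
str(mydata)                        # check that A and B came in as numeric
plot(mydata$A, mydata$B)
hist(mydata$A)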

2012/4/6 Pedro Henrique :
> Hi, Petr,
> Thanks for answering.
> Yes, I do read the file with the "read.xls" command but I do not know how to
> read it into an object.
> I read the R-intro document chapter on objects, but it is still not clear to
> me how to transform this kind of data into an object.
>
> Regards,
>
> Lämarao
>
>
> - Original Message - From: "Petr PIKAL" 
> To: "Pedro Henrique" 
> Cc: 
> Sent: Friday, April 06, 2012 6:27 AM
> Subject: Hi: [R] Help Using Spreadsheets
>
>
> Hi
>>
>> Hello,
>>
>> I am a new user of R and I am trying to use the data I am reading from a
>
>
>> spreadsheet.
>> I installed the xlsReadWrite package and I am able to read data from
>> this file, but how can I assign the columns to variables?
>> E.g:
>> as I read a spreadsheet like this one:
>
>
> Maybe with read.xls? Did you read it into an object?
>
>> A B
>> 1 2
>> 4 9
>>
>> I manually assign the values:
>> A<-c(1,4)
>> B<-c(2,9)
>
>
> Why? If you read in to an object (e.g. mydata)
>
>
>>
>> to plot it on a graph:
>> plot(A,B)
>
>
> plot(mydata$A, mydata$B)
>
>
>>
>> or make histograms:
>> hist(A)
>
>
> hist(mydata$A)
>
>>
>> But actually I am using very large columns; is there any other way to do
>
>
>> it automatically?
>
>
> Yes. But before that you shall automatically read some introductory
> documentation (like R-intro)
>
> Regards
> Petr
>
>>
>> Best Regards,
>>
>> Lämarăo
>>   [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>
> http://www.R-project.org/posting-guide.html
>>
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Best way to do temporal joins in R?

2012-04-06 Thread jim holtman
check out the 'sqldf' package.  In
http://code.google.com/p/sqldf/#Example_4._Join there is an example of
a temporal join.  Maybe this will work for you.
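
The flavour of it is something like the following hypothetical sketch (the
column names here are made up; it assumes each fix should pick up the most
recent tide reading at or before its own time stamp):

library(sqldf)
joined <- sqldf("select f.*, t.Height
                 from Fix f, TideH t
                 where t.dt = (select max(dt) from TideH tt where tt.dt <= f.dt)")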

On Fri, Apr 6, 2012 at 9:56 AM, Edith Mertz  wrote:
> Found the blunder, last line should have been:
>
> TideH$dt <- as.chron(paste(TideH$Date, TideH$Time), "%Y%m%d %H%M%S")
>
> After this I did:
>
> Fix <- read.csv("Fix times.csv")
> Fix[,"Station"] <- as.character(Fix[,"Station"])
> Fix[,"Date"] <- as.Date(Fix[,"Date"],format="%d/%m/%Y")
> Fix[,"Time"] <- as.character(Fix[,"Time"])
> Fix[,"Fix.Type"] <- as.character(Fix[,"Fix.Type"])
>
> Fix$DateTime<- as.chron(paste(Fix$Date, Fix$Time), "%Y%m%d %H%M%S")
>
> ds <- Fix$DateTime
> Fix$dt <- chron(sub(" .*", "", ds), gsub("[apm]+$|^.* ", "", ds)) +
>  (regexpr("pm", ds) > 0)/2
>
> Which gave an error list:
>
> Error in convert.dates(dates., format = format[[1]], origin. = origin.) :
>  format m/d/y may be incorrect
> In addition: Warning messages:
> 1: In unpaste(dates., sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
>  17955 entries set to NA due to wrong number of fields
> 2: In convert.dates(dates., format = format[[1]], origin. = origin.) :
>  NAs introduced by coercion
> 3: In convert.dates(dates., format = format[[1]], origin. = origin.) :
>  NAs introduced by coercion
> 4: In convert.dates(dates., format = format[[1]], origin. = origin.) :
>  NAs introduced by coercion
>
> Now I'm lost 4sure, help?
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Best-way-to-do-temporal-joins-in-R-tp885420p4537443.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] resampling syntax for caret package

2012-04-06 Thread Juliet Hannah
Max and List,

Could you advise me if I am using the proper caret syntax to carry out
leave-one-out cross validation. In the example below, I use example
data from the rda package. I use caret to tune over a grid and select
an optimal value. I think I am then using the optimal selection for
prediction.  So there are two rounds of resampling with the first one
taken care of by caret's train function.

My overall question is whether I really must carry out the outer resampling
plan manually.

On another note, I usually get the warning

1: In train.default(colon.x[-holdout, ], outcome[-holdout], method = "pam",  :
  At least one of the class levels are not valid R variables names;
This may cause errors if class probabilities are generated because the
variables names will be converted to: X1, X2
2: executing %dopar% sequentially: no parallel backend registered

When I change the variable names, caret gives me predictions as a
numeric value corresponding to the ordered level. Have I missed
something here?


Thanks,

Juliet

# start example

library(caret)
# to obtain data
library(rda)

data(colon)

#  add colnames
myind <- seq(1:ncol(colon.x))
mynames <- paste("A",myind,sep="")
colnames(colon.x) <- mynames

outcome  <- factor(as.character(colon.y),levels=c("1","2"))

cv_index <- 1:length(outcome)
predictions <- rep(-1,length(cv_index))

pamGrid <- seq(0.1,5,by=0.2)
pamGrid <- data.frame(.threshold=pamGrid)

# manual leave-one-out
for (holdout in cv_index) {
pamFit1 <- train(colon.x[-holdout,], outcome[-holdout],
 method = "pam",
 tuneGrid= pamGrid,
 trControl = trainControl(method = "cv"))

predictions[holdout] = predict(pamFit1,newdata =
colon.x[holdout,,drop=FALSE])

}



# end example


> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
 [1] pamr_1.54survival_2.36-12 e1071_1.6class_7.3-3
 [5] rda_1.0.2caret_5.15-023   foreach_1.3.5codetools_0.2-8
 [9] iterators_1.0.5  cluster_1.14.2   reshape_0.8.4plyr_1.7.1
[13] lattice_0.20-6

loaded via a namespace (and not attached):
[1] compiler_2.14.2 grid_2.14.2 tools_2.14.2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Execution speed in randomForest

2012-04-06 Thread jim holtman
Are you looking at the CPU or the elapsed time?  If it is the elapsed
time, then also capture the CPU time to see if it is different.  Also
consider the use of the Rprof function to see where time is being
spent.  What else is running on the machine?  Are you doing any
paging?  What type of system are you running on?  Use some of the
system level profiling tools.  If on Windows, then use perfmon.
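
For example (a sketch; the formula and trainData are placeholders for your own
randomForest() call):

library(randomForest)
Rprof("rf.prof")
tm <- system.time(rf <- randomForest(class ~ ., data = trainData))
Rprof(NULL)
tm                        # compare 'user.self' (CPU) with 'elapsed'
summaryRprof("rf.prof")   # shows where the time was spent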

On Fri, Apr 6, 2012 at 11:28 AM, Jason & Caroline Shaw
 wrote:
> I am using the randomForest package.  I have found that multiple runs
> of precisely the same command can generate drastically different run
> times.  Can anyone with knowledge of this package provide some insight
> as to why this would happen and whether there's anything I can do
> about it?  Here are some details of what I'm doing:
>
> - Data: ~80,000 rows, with 10 columns (one of which is the class label)
> - I randomly select 90% of the data to use to build 500 trees.
>
> And this is what I find:
>
> - Execution times of randomForest() using the entire dataset (in
> seconds): 20.65, 20.93, 20.79, 21.05, 21.00, 21.52, 21.22, 21.22
> - Execution times of randomForest() using the 90% selection: 17.78,
> 17.74, 126.52, 241.87, 17.56, 17.97, 182.05, 17.82 <-- Note the 3rd,
> 4th, and 7th.
> - When the speed is slow, it often stutters, with one or a few trees
> being produced very quickly, followed by a slow build taking 10 or 20
> seconds
> - The oob results are indistinguishable between the fast and slow runs.
>
> I select the 90% of my data by using sample() to generate indices and
> then subsetting, like: selection <- data[sample,].  I thought perhaps
> this subsetting was getting repeated, rather than storing in memory a
> new copy of all that data, so I tried circumventing this with
> eval(data[sample,]).  Probably barking up the wrong tree -- it had no
> effect, and doesn't explain the run-to-run variation (really, I'm just
> not clear on what eval() is for).  I have also tried garbage
> collecting with gc() between each run, and adding a Sys.sleep() for 5
> seconds, but neither of these has helped either.
>
> Any ideas?
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sincere inquiry about “subscript out of bounds” error in R

2012-04-06 Thread jim holtman
You need to do some basic debugging by putting

options(error=utils::recover)

in your startup of R (or just type it in) so that when the error
occurs you get control at the point of the error and can examine all
the variables.  You have some incorrect data that is causing the
subscript error, so you have to determine why.  You are picking up an
index from "X1[i,2]" or "X[i,3]" that is probably not legal.  Should
be easy to find.

On Fri, Apr 6, 2012 at 10:35 AM, 卢永芳  wrote:
> Hello, experts,
> I am working on a simulation of the effect of artificial selection on a certain
> population in animal breeding. I am a beginner in coding. I have already
> built a matrix A (500*500) based on this code
> A<-matrix(,500,500)
> for(i in 1:500){
> for(j in 1:500){
> ifelse(i==j,A[i,j]<-1,A[i,j]<-0)
> }
> }
> and I need to calculate A2
>
> based on A and X1 (4500*4500). Here are the codes
> A2<-matrix(4500,500)
> for(i in 1:4500){
> for(j in 1:500){
> A2[i,j]<-(A[X1[i,2],j]+A[X1[i,3],j])/2
> }
> }
> and the error happened like this: Error in A2[i, j] <- (A[X1[i, 2], j] + A[X1[i,
> 3], j])/2 : subscript out of bounds
> I checked the iteration counts in the for loop and they perfectly match the
> dimensions of matrices A and X1. I do not know how this error happened. And
> another important question: how can I build a matrix with a very large
> dimension that cannot be allocated in R? Error in matrix(, 45500, 45500) :
> cannot allocate vector of length 207025
>
>
> I am looking forward to hearing from you
> Kindest Regards
>
>[[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] avoiding for loops

2012-04-06 Thread R. Michael Weylandt
Usually you can just use cor() and it will do all the possibilities directly:

x <- matrix(rnorm(100), ncol = 10)
cor(x)

But that works on the columns, so you'll need to transpose things if
you want all possible row combinations: cor(t(x))

Hope this helps,
Michael

On Fri, Apr 6, 2012 at 9:57 AM, Cserháti Mátyás  wrote:
> Hello everyone,
>
> My name is Matthew and I'm new to this list. greetings to everyone.
> Sorry if I'm asking an old question, but I have an m x n matrix where the 
> rows are value profiles and the columns are conditions.
> What I want to do is calculate the correlation between all possible pairs of 
> rows.
>
> That is, if there are 10 rows in my matrix, then I want to calculate 10 x  10 
> = 100 correlation values (all against all).
> Now R is slow when I use two for loops.
> What kind of other function or tool can I use to get the job done more 
> speedily?
> I've heard of tapply, lapply, etc. and by and aggregate.
>
> Any kind of help is gladly appreciated.
>
> Thanks,
>
>
> Matthew
>

[Deleted the unnecessary digest]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] indexing in a function doesn't work?

2012-04-06 Thread Benjamin Caldwell
Josh

Apologies I haven't responded earlier. This looks great - I ended up doing
what I needed done piece-by-piece because of a looming deadline, but
understanding this code and your suggestions below will be a weekend
project.

Many thanks for all your help.
*
*
Best

Ben

On Tue, Apr 3, 2012 at 8:01 AM, Joshua Wiley  wrote:

> Hi Benjamin,
>
> You seem to have the right basic ideas, but a lot of your code had
typos and some logic flaws that I guess came from trying to move from
just code to a function.  I attached the changes I made.  What I
> would strongly encourage you to do, is work through each of the little
> functions I made and:
>
> 1) make sure you understand what it is doing
> 2) make sure each small function works properly---this means creating
> a *variety* of plausible test cases and trying them out
> 3) once all of the pieces work, then try to wrap them up in your
> overall plotter()
>
> The original function you had was large and had many errors, but it is
> difficult to debug something when there can be errors coming from
> multiple places.  Easier is to break your work into small, tractable
> chunks.  Plan in advance what your final goal is, and how each piece
> will fit in to that.  Then write each piece and ensure that it works.
> From this point, you will have a much easier time bundling all the
> pieces together (even still, there may be additional work, but it will
> be more doable because you can be reasonably certain all your code
> works, it just does not quite work together).
>
> A few functions I used may be new to you, so I would also suggest
> reading the documentation (I know this can be tedious, but it is
> valuable)
>
> ?match.arg
> ?switch
> ?on.exit
>
> note that "..." are arguments passed down from the final function
> plotter to internal ones.  They must be named if they are passed to
> "...".
>
> You are off to a good start, and I think with some more work, you can
> get this going fine.  In the long run you may have fewer headaches and stress
> if you take more time at the beginning to write clean code.
>
> I hope this helps,
>
> Josh
>
> On Sun, Apr 1, 2012 at 4:34 PM, Benjamin Caldwell
>  wrote:
> > Josh,
> >
> > Many thanks - here's a subset of the data and a couple examples:
> >
> >
> plotter(10,3,fram=rwb,framvec=rwb$prcnt.char.depth,obj=prcnt.char.depth,form1=
> > post.f.crwn.length~shigo.av,form2=post.f.crwn.length~shigo.av-1,
> > form3=leaf.area~(1/exp(shigo.av*x))*n,type=2,xlm=70,ylm=35)
> >
> > plotter(10,3,fram=rwb, framvec=rwb$prcnt.char.depth,
> obj=prcnt.char.depth,
> > form1= post.f.crwn.length~leaf.area,
> form2=post.f.crwn.length~leaf.area-1,
> > form3=leaf.area~(1/exp(shigo.av*x))*n,type=1, xlm=1500, ylm=35,
> > sx=.01,sn=25)
> >
> >
> >
> >
> > plotter<-function(a,b,fram,framvec,obj,form1,form2,form3, type=1, xlm,
> ylm,
> > sx=.01,sn=25){
> > g<-ceiling(a/b)
> > par(mfrow=c(b,g))
> > num<-rep(0,a)
> > sub.plotter<-function(i,fram,framvec,obj,form1,form2,form3,type,
> > xlm,ylm,var1,var2){
> > temp.i<-fram[framvec <=(i*.10),] #trees in the list that have an
> attribute
> > less than or equal to a progressively larger percentage
> > plot(form1, data=temp.i, xlim=c(0,xlm), ylim=c(0,ylm), main=((i-1)*.10))
> > if(type==1){
> > mod<-lm(form2,data=temp.i)
> > r2<-summary(mod)$adj.r.squared
> > num[i]<-r2
> > legend("bottomright", legend=signif(r2), col="black")
> > abline(mod)
> > num}
> > else{
> > if(type==2){
> > try(mod<-nls(form3, data=temp.i, start=list(x=sx,n=sn),
> > na.action="na.omit"), silent=TRUE)
> > try(x1<-summary(mod)$coefficients[1,1], silent=TRUE)
> > try(n1<-summary(mod)$coefficients[2,1], silent=TRUE)
> > try(lines((1/exp(c(0:70)*x1)*n1)), silent=TRUE)
> > try(num[i]<-AIC(mod), silent=TRUE)
> > try(legend("bottomright", legend=round(num[i],3) , col="black"),
> > silent=TRUE)
> > try((num), silent=TRUE)
> >   }
> > }}
> > for(i in 0:a+1){
> >  num<-sub.plotter(i,fram,framvec,obj,form1,form2,form3,type,xlm,ylm)
> > }
> > plot.cor<-function(x){
> > temp<-a+1
> > lengthx<-c(1:temp)
> > plot(x~c(1:temp))
> > m2<-lm(x~c(1:temp))
> > abline(m2)
> > n<-summary(m2)$adj.r.squared
> > legend("bottomright", legend=signif(n), col="black")
> > slope<-(coef(m2)[2])# slope
> > values<-(num)#values for aic or adj r2
> > r2ofr2<-(n) #r2 of r2 or AIC
> > output<-data.frame(lengthx,slope,values,r2ofr2)
> > }
> > plot.cor(num)
> > write.csv(plot.cor(num)$output,"output.csv") # can't seem to use
> > paste(substitute(form3),".csv",sep="") to name it at the moment
> > par(mfrow=c(1,1))
> > }
> >
> >
> >
> >
> > On Sun, Apr 1, 2012 at 3:25 PM, Joshua Wiley 
> wrote:
> >>
> >> Hi,
> >>
> >> Glancing through your code it was not immediately obvious to me why it
> >> does not work, but I can see a lot of things that could be simplified.
> >>  It would really help if you could give us a reproducible example.
> >> Find/upload/create (in R) some data, and examples of how you would use
> >> the function.  Right now, I can only guess what your data etc. are
> >> like 

Re: [R] Multivariate Multilevel Model: is R the right software for this problem

2012-04-06 Thread Andrew Miles
I recommend looking at chapter 6 of Paul Allison's Fixed Effects Regression 
Models.  This chapter outlines how you can use a structural equation modeling 
framework to estimate a multi-level model (a random effects model).  This 
approach is slower than just using MLM software like lmer() in the lme4 
package, but has the advantage of being able to specify correlations between 
errors across time, the ability to control for time-invariant effects of 
time-invariant variables, and allows you to use the missing data maximum 
likelihood that comes in structural equation modeling packages.
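
If you do go the lmer() route, the basic shape of the call is something like
this purely illustrative sketch (all variable names are placeholders for
long-format data, one row per person and measurement point):

library(lme4)
fit <- lmer(symptom ~ event1 + time + (1 | id), data = dat.long)
summary(fit)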

Andrew Miles
Department of Sociology
Duke University

On Apr 6, 2012, at 9:48 AM, Eiko Fried wrote:

> Hello,
> 
> I've been trying to answer a problem I have had for some months now and
> came across multivariate multilevel modeling. I know MPLUS and SPSS quite
> well but these programs could not solve this specific difficulty.
> 
> My problem:
> 9 correlated dependent variables (medical symptoms; categorical, 0-3), 5
> measurement points, 10 time-varying covariates (life events; dichotomous,
> 0-1), N ~ 900. Up to 35% missing values on some variables, especially at
> later measurement points.
> 
> My exploratory question is whether there is an interaction effect between
> life events and symptoms - and if so, what the effect is exactly. E.g. life
> event 1 could lead to more symptoms A B D whereas life event 2 could lead
> to more symptoms A C D and less symptoms E.
> 
> My question is: would MMM in R be a viable option for this? If so, could
> you recommend literature?
> 
> Thank you
> --T
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Order sapply

2012-04-06 Thread R. Michael Weylandt
On Fri, Apr 6, 2012 at 10:27 AM, MSousa  wrote:
> Good Afternoon,
>
>   I have the following code, but it seems that I must be doing something
> wrong, because it is giving the results I want.

Didn't someone else have that problem just a few weeks ago? :-P

Michael

> The idea is to create segments while the value of Commutation is less than
> 1000.
> for example, from the small set of data below
>
> text="
> val_user  pos    v    v_star    v_end    commutation    v_source
> v_destine
> 1    1 96-96    1173438391    1173438391    0    96    96
> 3    2    126-126    1172501729    1172501532    197    126    126
> 3    3    126-35    1174404177    1172501909    1902268    126    35
> 3    4    35-56    1174404252    1174404221    31    35    56
> 3    5    56-99    1174404295    1174404295    0    56    99
> 3    6    99-92    1174404536    1174404535    1    99    92
> 3    7    92-99    1174404660    1174404658    2    92    99
> 3    8    99-43    1174405442    1174405442    0    99    43
> 3    9    43-99    1174405545    1174405544    1    43    99
> 3    10    99-43    1174405581    1174405581    0    99    43
> 3    11    43-99    1174405836    1174405836    0    43    99
> 3    12    99-43    1174405861    1174405861    0    99    43
> 3    13    43-99    1174405875    1174405875    0    43    99
> 3    18    101-113    1174410215    1174410214    1    101    113
> 3    19    113-36    1174410261    1174410261    0    113    36
> 3    20    36-60    1174410268    1174410268    0    36    60
> 3    21    60-101    1174660357    1174411020    249337    60    101
> 3    22    101-191    1174666205    1174662119    4086    101    191
> 3    23    191-196    1174666278    1174666265    13    191    196
> 3    24    196-9    1174666398    1174666366    32    196    9
> 3    25    9-101    1175154139    1174667144    486995    9    101
> 3    26    101-37    1175160182    1175159734    448    101    37
> 3    27    37-55    1175160256    1175160257    -1    37    55
> 4    1    11-11    1216304836    1216304127    709    11    11
> 4    2    11-11    1216370154    1216312995    57159    11    11
> 4    3    11-11    1216373234    1216372799    435    11    11
> 4    4    11-11    1216373974    1216373373    601    11    11
> 4    5    11-11    1216382659    1216379277    3382    11    11
> 4    6    11-11    1216397081    1216395201    1880    11    11
> 4    7    11-11    1216397339    1216397131    208    11    11
> 4    8    11-11    1216630649    1216399235    231414    11    11
> 4    9    11-11    1216637080    1216631541    5539    11    11
> 4    10    11-11    1216646563    1216640763    5800    11    11
> 4    11    11-11    1216656338    1216651635    4703    11    11
> "
> df1 <-read.table(textConnection(text), header=TRUE)
>
> inx <- df1$commutation > 1000
> comm1000 <- cumsum(inx)
>
> result <- split(df1[!inx, ], list(comm1000[!inx], df1$v_source[!inx],
> df1$v_destine[!inx]))
> result <- sapply(result, function(x) c(x$val_user[1], x$v_source[1],
> x$v_destine[1], nrow(x), mean(x$comm)))
> result <- na.exclude(t(result))
>
> rownames(result) <- 1:nrow(result)
> colnames(result) <- c("user", "v_source", "v_destine", "count", "average")
> attr(result, "na.action") <- NULL
> attr(result, "class") <- NULL
> results_user<-data.frame(result)
> View(results_user)
>
> This give:
>   user v_source v_destine count Min Max     average
>
>
> but the results I want:
> user v_source v_destine count Min Max     average
> 1       96      96      1       0       0       0.000
> 3       126     126     1       197     197     197.000
>  3              35             56     1         31      31      31.000
> ….
>
>
> I think there is a problem in the order of the different blocks; I don't
> understand how the data are organized.
> The idea is to keep the organization of the file close to the original.
>
> Thanks
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Order-sapply-tp4537496p4537496.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Imputing missing values using "LSmeans" (i.e., population marginal means) - advice in R?

2012-04-06 Thread Jenn Barrett
Thanks Andy. I did read that posting, but didn't find that it answered my 
questions.

Ok - so I've confirmed that I can use popMeans in the doBy package to obtain 
the LSmeans as described in my e-mail below; however, the output has me puzzled.

Recall that my data consists of counts at various sites (n=93 this time) over 
about 35 years. Each site was only counted once per year; however, not all 
sites were counted in all years (i.e., data is unbalanced). 

Sample code is as follows:

> dat<-read.csv("CountData.csv")
> dat$COUNTYR_F<-as.factor(dat$COUNT_YR) # Convert year variable to factor
> lmMod<-lm(log(COUNT+0.5)~SITE + COUNTYR_F, data=dat)  # Run lm on log 
> transformed counts
> library(doBy)
> pM<-popMeans(lmMod, c("SITE","COUNTYR_F"))
> LMest<-exp(pM$Estimate)  # Transform LS estimates back to count and save to a 
> vector
> head(LMest,10)
 [1]   2.3006217   0.7012738   0.6707810   8.4331212   4.6810141   0.5902387   
1.2535870 903.2004994  31.7064744 351.7324390

# e.g., compare above output to actual counts: 2, 0, 0, NA, 14, 0, NA, 1031, 
NA, NA
# This output looks alright. However:

> pM.year<-popMeans(lmMod,"COUNTYR_F")
> LMest.year<-exp(pM.year$Estimate) 
> LMest.year
 [1] 35.94605 52.21187 38.26182 45.04494 48.26065 31.57805 38.20253 29.08914 
27.01732 32.25929 32.54706 25.29704 31.99606 35.86583 [...]
> range(LMest.year)
[1] 20.34141 52.21187
 
If I calculate the mean for 1975 (which corresponds to the first value in above 
vector --> 35.94605) using the ouput from popMeans(lmMod, 
c("SITE","COUNTYR_F")) above, I obtain 1530.253, which makes sense given my 
data. So why are the population marginal means for year (i.e., averaging across 
sites) so low in the vector immediately above?  I'm obviously missing something 
crucial here...(and I know I'll feel like a dimwit when it hits me). 

Note that if I use popMeans to obtain the marginal mean across years for each 
site the output looks just fine:
> pM.site<-popMeans(lmMod,"SITE") 
> LMest.site<-exp(pM.site$Estimate) 
> round(LMest.site,3)
 [1]     2.108     0.643     0.615     7.728     4.289     0.541     1.149   827.645
 [9]    29.054   322.309    73.696     1.067 27116.446    44.367     0.885    17.267
[17]     0.743  2529.955   114.254  5624.021     2.652   167.986  6181.059    32.175
[25]     0.728     0.685     0.590     6.184  2399.361     0.633  6943.247     0.902
[33]     0.740     3.934     0.831    11.362     0.843   733.442    18.123  1807.352
[41]  2361.726     2.260     0.650   226.013     1.037  3808.097   294.388     1.161
[49]     2.428    16.572  3006.224     0.776     0.946  4587.312    30.342     0.628
[57]     0.986     8.147     0.798   241.995    40.880   466.779   395.398     0.688
[65]    45.668    34.119   529.253     0.615    67.455     6.129   883.504  1487.803
[73]  2133.575   298.472    31.981   907.871     0.982     1.271     3.636     9.387
[81]   110.531  1129.330    31.332  2905.735 23512.168    88.917  8666.546  7276.974
[89]   215.724  1740.381    21.530    29.327     1.388

> mean(LMest.site)
[1] 1319.349 

My ultimate goal is to use the marginal means to impute the missing values; 
however, I'm concerned that I'm doing something wrong given the output for the 
marginal means for years (i.e., averaging across sites).
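
One thing I wondered about: popMeans presumably averages on the scale of the
model, i.e. on log(COUNT+0.5), and exponentiating an average of logs gives
something like a geometric mean, which for skewed counts can be far below the
arithmetic mean of the back-transformed values. A toy illustration (not my
data):

  x <- c(2, 5, 1031, 14, 3)      # a few skewed counts
  mean(x)                        # arithmetic mean: 211
  exp(mean(log(x + 0.5)))        # back-transformed mean of the logs: about 15

Could that be all it is?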

Thanks in advance for any input. 

Cheers,
Jenn

- Original Message 
From: "Andy Liaw" 
To: "Jenn Barrett" , r-help@r-project.org
Sent: Thursday, April 5, 2012 8:40:04 AM
Subject: RE: [R] Imputing missing values using "LSmeans" (i.e., population 
marginal means) - advice in R?

Don't know how you searched, but perhaps this might help:

https://stat.ethz.ch/pipermail/r-help/2007-March/128064.html 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Jenn Barrett
> Sent: Tuesday, April 03, 2012 1:23 AM
> To: r-help@r-project.org
> Subject: [R] Imputing missing values using "LSmeans" (i.e., 
> population marginal means) - advice in R?
> 
> Hi folks,
> 
> I have a dataset that consists of counts over a ~30 year 
> period at multiple (>200) sites. Only one count is conducted 
> at each site in each year; however, not all sites are 
> surveyed in all years. I need to impute the missing values 
> because I need an estimate of the total population size 
> (i.e., sum of counts across all sites) in each year as input 
> to another model. 
> 
> > head(newdat,40)
>SITE YEAR COUNT
> 1 1 1975 12620
> 2 1 1976 13499
> 3 1 1977 45575
> 4 1 1978 21919
> 5 1 1979 33423
> ...
> 37 2 1975     4
> 38 2 1978 40322
> 39 2 1979     7
> 40 2 1980 16244
> 
> 
> It was suggested to me by a statistician to use LSmeans to do 
> this; however, I do not have SAS, nor do I know anything much 
> about SAS. I have spent DAYS reading about these "LSmeans" 
> and while (I think) I understand what they are, I have 
> absolutely no idea how to a) calculate them in R and b) how 
> to use them to impute my missing values in R. Again, I'v

[R] Sincere inquiry about “subscript out of bounds” error in R

2012-04-06 Thread 卢永芳
Hello, experts,
I am working on a simulation of the effect of artificial selection on a certain 
population in animal breeding. I am a beginner in coding. I have already built 
a matrix A (500*500) based on this code:
A<-matrix(,500,500)
for(i in 1:500){
for(j in 1:500){
ifelse(i==j,A[i,j]<-1,A[i,j]<-0)
}
}
and I need to calculate A2
based on A and X1 (4500*4500). Here is the code:
A2<-matrix(4500,500)
for(i in 1:4500){
for(j in 1:500){
A2[i,j]<-(A[X1[i,2],j]+A[X1[i,3],j])/2
}
}
and this error happened: Error in A2[i, j] <- (A[X1[i, 2], j] + A[X1[i, 3], 
j])/2 : subscript out of bounds
I checked the iteration counts in the for loop and they match the dimensions of 
matrix A and X1 perfectly, so I do not understand how this error happened. Another 
important question is how I can build a matrix with a very large dimension that 
cannot be allocated in R: Error in matrix(, 45500, 45500) : cannot allocate 
vector of length 207025
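
What I intend is a 4500*500 matrix of zeros for A2; a minimal sketch of the
intended shapes (X1 left out, only the dimensions shown) would be:

  A  <- diag(500)                            # identity matrix, same as the double loop above
  A2 <- matrix(0, nrow = 4500, ncol = 500)   # note: matrix(4500, 500) would instead give a
                                             # 500 x 1 matrix filled with the value 4500
  ## then, assuming X1[, 2] and X1[, 3] hold indices between 1 and 500:
  ## for (i in 1:4500) A2[i, ] <- (A[X1[i, 2], ] + A[X1[i, 3], ]) / 2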
 
 
I am looking forward to hearing from you.
Kindest Regards
 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Order sapply

2012-04-06 Thread MSousa
Good Afternoon,

   I have the following code, but it seems that something must be going
wrong, because it is not giving the results I want.
The idea is to create segments while the value of Commutation is less than
1000.
for example, from the small set of data below

text="
val_user pos v       v_star     v_end      commutation v_source v_destine
1        1   96-96   1173438391 1173438391           0       96        96
3        2   126-126 1172501729 1172501532         197      126       126
3        3   126-35  1174404177 1172501909     1902268      126        35
3        4   35-56   1174404252 1174404221          31       35        56
3        5   56-99   1174404295 1174404295           0       56        99
3        6   99-92   1174404536 1174404535           1       99        92
3        7   92-99   1174404660 1174404658           2       92        99
3        8   99-43   1174405442 1174405442           0       99        43
3        9   43-99   1174405545 1174405544           1       43        99
3        10  99-43   1174405581 1174405581           0       99        43
3        11  43-99   1174405836 1174405836           0       43        99
3        12  99-43   1174405861 1174405861           0       99        43
3        13  43-99   1174405875 1174405875           0       43        99
3        18  101-113 1174410215 1174410214           1      101       113
3        19  113-36  1174410261 1174410261           0      113        36
3        20  36-60   1174410268 1174410268           0       36        60
3        21  60-101  1174660357 1174411020      249337       60       101
3        22  101-191 1174666205 1174662119        4086      101       191
3        23  191-196 1174666278 1174666265          13      191       196
3        24  196-9   1174666398 1174666366          32      196         9
3        25  9-101   1175154139 1174667144      486995        9       101
3        26  101-37  1175160182 1175159734         448      101        37
3        27  37-55   1175160256 1175160257          -1       37        55
4        1   11-11   1216304836 1216304127         709       11        11
4        2   11-11   1216370154 1216312995       57159       11        11
4        3   11-11   1216373234 1216372799         435       11        11
4        4   11-11   1216373974 1216373373         601       11        11
4        5   11-11   1216382659 1216379277        3382       11        11
4        6   11-11   1216397081 1216395201        1880       11        11
4        7   11-11   1216397339 1216397131         208       11        11
4        8   11-11   1216630649 1216399235      231414       11        11
4        9   11-11   1216637080 1216631541        5539       11        11
4        10  11-11   1216646563 1216640763        5800       11        11
4        11  11-11   1216656338 1216651635        4703       11        11
"
df1 <-read.table(textConnection(text), header=TRUE)

inx <- df1$commutation > 1000
comm1000 <- cumsum(inx)

result <- split(df1[!inx, ], list(comm1000[!inx], df1$v_source[!inx],
df1$v_destine[!inx]))
result <- sapply(result, function(x) c(x$val_user[1], x$v_source[1],
x$v_destine[1], nrow(x), mean(x$comm)))
result <- na.exclude(t(result))

rownames(result) <- 1:nrow(result)
colnames(result) <- c("user", "v_source", "v_destine", "count", "average")
attr(result, "na.action") <- NULL
attr(result, "class") <- NULL
results_user<-data.frame(result)
View(results_user)

This gives:
   user v_source v_destine count Min Max average


but the results I want are:
user  v_source  v_destine  count  Min  Max  average
   1        96         96      1    0    0    0.000
   3       126        126      1  197  197  197.000
   3        35         56      1   31   31   31.000
….


I think there is a problem with the order of the different blocks; I don't
understand how the data are organized.
The idea is to keep the organization close to that of the original file.
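
A toy example of what I think is happening (not the real data): split() seems
to arrange the pieces by the sorted factor levels rather than by first
appearance, so perhaps building a key whose levels follow the file order would
keep the original organization?

  d <- data.frame(src = c(96, 126, 35), dst = c(96, 126, 56), x = 1:3)
  names(split(d, list(d$src, d$dst)))   # all 9 level combinations, sorted, mostly empty
  key <- factor(paste(d$src, d$dst), levels = unique(paste(d$src, d$dst)))
  names(split(d, key))                  # "96 96" "126 126" "35 56" - original order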

Thanks


--
View this message in context: 
http://r.789695.n4.nabble.com/Order-sapply-tp4537496p4537496.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Changing grid defaults

2012-04-06 Thread ilai
You might want to check out the package {tikzDevice} and its
documentation. In essence you turn your R plots into TikZ/PGF code so they
can be incorporated naturally into a beamer presentation. Colors, background,
fonts etc. can then be controlled in your main LaTeX document. I find it
much more convenient, and nicer when the plot annotations use
the same LaTeX font rather than e.g. plotmath. Note that for grid graphics
you'll need to use print(yourplot) to send the plot to the device.
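
A minimal sketch (file and object names are only placeholders):

  library(tikzDevice)
  tikz("mosaic.tex", width = 4, height = 3)  # writes TikZ code instead of opening a screen device
  print(myplot)                              # grid graphics need an explicit print()
  dev.off()
  # then in the beamer document:  \input{mosaic.tex}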

Cheers

On Fri, Apr 6, 2012 at 8:43 AM, Brett Presnell  wrote:
>
> I'm trying to use the vcd package to produce mosaic plots for my class
> notes, written in Sweave and using the LaTeX's beamer document class.
> For projecting the notes in class, I use a dark background with light
> foreground colors.  It's easy enough to change the defaults for R's
> standard graphics to match my color scheme (using the fg, col.axis,
> col.lab, col.main, and col.sub parameter settings), but I can't figure
> out how to do this with grid/strucplot/vcd.
>
> From my experiments, I think that I might eventually figure out how to
> change the colors of all the text in the mosaic plots to what I want
> using arguments like 'gp_args = list(gp_labels = gpar(col = "yellow")'
> and 'gp_varnames = gpar(col = "yellow")' (although I still haven't
> figured out how to changed the color of the text on the legends), but
> this is obviously not what I need.  I have been reading all the
> documentation I can find, but I still haven't figured this out, so an
> answer accompanied by a reference to some line or the other in some
> piece of documentation would be greatly appreciated.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Best way to do temporal joins in R?

2012-04-06 Thread Edith Mertz
Found the blunder, last line should have been:

TideH$dt <- as.chron(paste(TideH$Date, TideH$Time), "%Y%m%d %H%M%S") 

After this I did:

Fix <- read.csv("Fix times.csv")
Fix[,"Station"] <- as.character(Fix[,"Station"])
Fix[,"Date"] <- as.Date(Fix[,"Date"],format="%d/%m/%Y")
Fix[,"Time"] <- as.character(Fix[,"Time"])
Fix[,"Fix.Type"] <- as.character(Fix[,"Fix.Type"])

Fix$DateTime<- as.chron(paste(Fix$Date, Fix$Time), "%Y%m%d %H%M%S")

ds <- Fix$DateTime
Fix$dt <- chron(sub(" .*", "", ds), gsub("[apm]+$|^.* ", "", ds)) +
 (regexpr("pm", ds) > 0)/2

Which gave an error list:

Error in convert.dates(dates., format = format[[1]], origin. = origin.) : 
  format m/d/y may be incorrect
In addition: Warning messages:
1: In unpaste(dates., sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  17955 entries set to NA due to wrong number of fields
2: In convert.dates(dates., format = format[[1]], origin. = origin.) :
  NAs introduced by coercion
3: In convert.dates(dates., format = format[[1]], origin. = origin.) :
  NAs introduced by coercion
4: In convert.dates(dates., format = format[[1]], origin. = origin.) :
  NAs introduced by coercion

Now I'm lost 4sure, help?
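
Is the whole parse-and-match step perhaps supposed to look something like this
(a sketch; it assumes the Date and Time columns are still plain text in
d/m/y and h:m:s form, as in the tables further down this thread)?

  library(chron)
  TideH$dt <- chron(dates. = as.character(TideH$Date),
                    times. = as.character(TideH$Time),
                    format = c(dates = "d/m/y", times = "h:m:s"))
  Fix$dt   <- chron(dates. = as.character(Fix$Date),
                    times. = as.character(Fix$Time),
                    format = c(dates = "d/m/y", times = "h:m:s"))
  # index of the tide reading closest in time to each fix
  nearest <- sapply(as.numeric(Fix$dt),
                    function(t) which.min(abs(as.numeric(TideH$dt) - t)))
  Fix$Tide.Height <- TideH$Tide.Height[nearest]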


--
View this message in context: 
http://r.789695.n4.nabble.com/Best-way-to-do-temporal-joins-in-R-tp885420p4537443.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Execution speed in randomForest

2012-04-06 Thread Jason & Caroline Shaw
I am using the randomForest package.  I have found that multiple runs
of precisely the same command can generate drastically different run
times.  Can anyone with knowledge of this package provide some insight
as to why this would happen and whether there's anything I can do
about it?  Here are some details of what I'm doing:

- Data: ~80,000 rows, with 10 columns (one of which is the class label)
- I randomly select 90% of the data to use to build 500 trees.

And this is what I find:

- Execution times of randomForest() using the entire dataset (in
seconds): 20.65, 20.93, 20.79, 21.05, 21.00, 21.52, 21.22, 21.22
- Execution times of randomForest() using the 90% selection: 17.78,
17.74, 126.52, 241.87, 17.56, 17.97, 182.05, 17.82 <-- Note the 3rd,
4th, and 7th.
- When the speed is slow, it often stutters, with one or a few trees
being produced very quickly, followed by a slow build taking 10 or 20
seconds
- The oob results are indistinguishable between the fast and slow runs.

I select the 90% of my data by using sample() to generate indices and
then subsetting, like: selection <- data[sample,].  I thought perhaps
this subsetting was getting repeated, rather than storing in memory a
new copy of all that data, so I tried circumventing this with
eval(data[sample,]).  Probably barking up the wrong tree -- it had no
effect, and doesn't explain the run-to-run variation (really, I'm just
not clear on what eval() is for).  I have also tried garbage
collecting with gc() between each run, and adding a Sys.sleep() for 5
seconds, but neither of these has helped either.
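
For reference, the timing experiment is essentially the following (with a
made-up stand-in for my data, so the absolute numbers will differ):

  library(randomForest)
  set.seed(1)
  ## stand-in: ~80,000 rows, 9 numeric predictors plus a class label
  dat <- data.frame(matrix(rnorm(80000 * 9), ncol = 9),
                    cls = factor(sample(c("a", "b"), 80000, replace = TRUE)))
  time.one.run <- function(d) {
    sel <- sample(nrow(d), 0.9 * nrow(d))      # fresh random 90% each run
    system.time(randomForest(cls ~ ., data = d[sel, ], ntree = 500))["elapsed"]
  }
  replicate(8, time.one.run(dat))              # slow: compares eight elapsed times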

Any ideas?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multivariate Multilevel Model: is R the right software for this problem

2012-04-06 Thread Eiko Fried
Hello,

I've been trying to answer a problem I have had for some months now and
came across multivariate multilevel modeling. I know MPLUS and SPSS quite
well but these programs could not solve this specific difficulty.

My problem:
9 correlated dependent variables (medical symptoms; categorical, 0-3), 5
measurement points, 10 time-varying covariates (life events; dichotomous,
0-1), N ~ 900. Up to 35% missing values on some variables, especially at
later measurement points.

My exploratory question is whether there is an interaction effect between
life events and symptoms - and if so, what the effect is exactly. E.g. life
event 1 could lead to more symptoms A B D whereas life event 2 could lead
to more symptoms A C D and less symptoms E.

My question is: would MMM in R be a viable option for this? If so, could
you recommend literature?

Thank you
--T

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Best way to do temporal joins in R?

2012-04-06 Thread Edith Mertz
Hi, I'm new to R-help mailing list and novice in R, so pls excuse 'silly
questions' and obvious blunders.

I have the same problem as Jon Greenberg (just different data) and have been
trying to use the code given above, but with some difficulty.

Pls Help?

My data:

Table 1 (TideH.csv)
Date        Time        Tide Height
03/02/2010  08:00:00    1.9
03/02/2010  09:00:00    1.49
03/02/2010  10:00:00    1.04

Table 2 (Fix times.csv)
Station  Date        Time        Fix Type
1        03/02/2010  09:20:30    Mn
1        03/02/2010  09:23:27    Mn
1        03/02/2010  09:32:05    Mn

Need to get tide height values /nearest/ to the fix time.
Result should be:

Station  Date        Time        Fix Type  Tide Height
1        03/02/2010  09:20:30    Mn        1.49
1        03/02/2010  09:23:27    Mn        1.49
1        03/02/2010  09:32:05    Mn        1.04

This is what I have tried so far:

library(chron)
TideH <- read.csv("TideH.csv")

TideH[,"Date"] <- as.character(TideH[,"Date"])
TideH[,"Date"] <- as.Date(TideH[,"Date"],format="%d/%m/%Y")
TideH[,"Time"] <- as.character(TideH[,"Time"])
TideH[,"Tide.Height"] <- as.numeric(TideH[,"Tide.Height"])

TideH$dt <- as.chron(paste(TideH$date, TideH$time), "%Y%m%d %H%M%S")

Error in `$<-.data.frame`(`*tmp*`, "dt", value = numeric(0)) : 
  replacement has 0 rows, data has 3

What went wrong?

--
View this message in context: 
http://r.789695.n4.nabble.com/Best-way-to-do-temporal-joins-in-R-tp885420p4537342.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help Using Spreadsheets

2012-04-06 Thread Pedro Henrique

Hi, Petr,
Thanks for answering.
Yes, I do read the file with the "read.xls" command, but I do not know how to 
read it into an object.
I read the chapter on objects in the R-intro document, but it is still not clear 
to me how to turn this kind of data into an object.
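
Is it just something like this (assuming the sheet has columns named A and B)?

  library(xlsReadWrite)
  mydata <- read.xls("C:/mydata.xls")   # file name is just an example
  str(mydata)                           # check the columns that came in
  plot(mydata$A, mydata$B)
  hist(mydata$A)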


Regards,

Lämarao


- Original Message - 
From: "Petr PIKAL" 

To: "Pedro Henrique" 
Cc: 
Sent: Friday, April 06, 2012 6:27 AM
Subject: Hi: [R] Help Using Spreadsheets


Hi

Hello,

I am a new user of R and I am trying to use the data I am reading from a



spreadsheet.
I installed the xlsReadWrite package and I am able to read data from

this

files, but how can I assign the colums into values?
E.g:
as I read a spreadsheet like this one:


Maybe with read.xls? Did you read it into an object?


A B
1 2
4 9

I manually assign the values:
A<-c(1,4)
B<-c(2,9)


Why? If you read in to an object (e.g. mydata)




to plot it on a graph:
plot(A,B)


plot(mydata$A, mydata$B)




or make histograms:
hist(A)


hist(mydata$A)



But actualy I am using very large colums, does exist any other way to do



it automatically?


Yes. But before that you shall automatically read some introduction
documentation like R-intro)

Regards
Petr



Best Regards,

Lämarăo
   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple values in one column

2012-04-06 Thread Ashish Agarwal
How about reading the lines and separating out the cases having more than one
major? For those cases, process the data to create duplicate rows - one for
each major.
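
For example, something along these lines might work (a sketch using the toy
data from the original post below):

  students <- data.frame(first = c("John", "Jane"),
                         last  = c("Smith", "Doe"),
                         sex   = c("M", "F"),
                         major = c("ANTH", "HIST,BIOL"),
                         stringsAsFactors = FALSE)
  majors <- strsplit(students$major, ",")              # split the multi-valued field
  long <- students[rep(seq_len(nrow(students)),
                       sapply(majors, length)), ]      # one row per major
  long$major <- unlist(majors)
  table(long$sex, long$major)                          # e.g. sex by major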

On Fri, Apr 6, 2012 at 8:39 PM, John D. Muccigrosso <
intern...@muccigrosso.org> wrote:

> I have some data files in which some fields have multiple values. For
> example
>
> first  last   sex   major
> John   Smith  M ANTH
> Jane   DoeF HIST,BIOL
>
> What's the best R-like way to handle these data (Jane's major in my
> example), so that I can do things like summarize the other fields by them
> (e.g., sex by major)?
>
> Right now I'm processing the files (in excel since they're spreadsheets)
> by duplicating lines with two values in the major field, eliminating one
> value per row. I suspect there's a nifty R way to do this.
>
> Thanks in advance!
>
> John Muccigrosso
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to get the confidence interval of the area under the time dependent ROC curve

2012-04-06 Thread 何湘湘
It is possible to calculate the c-index for time dependent outcomes (such as 
disease) using the survivalROC package in R. My question is: is it possible to 
produce a p-value for the c-index that is calculated (at a specific point in 
time)? And how can I get a confidence interval for the area under the time 
dependent ROC curve (for example by bootstrap)?
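
What I have in mind is something like the following sketch (made-up data and
column names, and a plain percentile bootstrap):

  library(survivalROC)
  set.seed(1)
  dat <- data.frame(time   = rexp(200, rate = 0.1),   # toy survival times
                    status = rbinom(200, 1, 0.7),     # toy event indicator
                    marker = rnorm(200))              # toy marker
  auc.at <- function(d, t0)
    survivalROC(Stime = d$time, status = d$status, marker = d$marker,
                predict.time = t0, method = "KM")$AUC
  boot.auc <- replicate(500, auc.at(dat[sample(nrow(dat), replace = TRUE), ],
                                    t0 = 5))
  quantile(boot.auc, c(0.025, 0.975), na.rm = TRUE)   # bootstrap CI for AUC at t = 5

Would that be a reasonable way to go about it?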


--

 --

何湘湘

 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Missing CRAN Mirror

2012-04-06 Thread David Winsemius


On Apr 6, 2012, at 9:45 AM, Tyler Rinker wrote:



Hello R community,
This isn't a technical question about R:
I have used
http://lib.stat.cmu.edu/R/CRAN/ as my mirror for some time now.  As  
of a few days now I don't seem to be able to use this mirror.  The  
link from CRAN to this repository does not work either.  Does anyone  
know the fate of this repository?


I don't know. (The R-repo was part of StatLib which also seems to be  
currently unresponsive to browser queries, so it's more likely a  
server-wide issue. I checked the CMU Stats Dept website: http://www.stat.cmu.edu/ 
 and can find no announcements of service issues.)  I used to use it,  
too, but it became less reliable in recent months and I switched over  
to using the repo at the Fred Hutchinson Cancer Center in Seattle: http://cran.fhcrc.org



Cheers,Tyler Rinker
If this was not the appropriate place for this question please feel  
free to direct me to a more appropriate place to ask this question.


[[alternative HTML version deleted]]


Tyler;

You should investigate whether Hotmail allows you to send plain text  
mail. I've checked Gmail and yahoo and Nabble and they all support  
plain text.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple values in one column

2012-04-06 Thread Mark Grimes
John

I have to deal with this kind of thing too for my class.

#   Some functions
# for ad$Full.name = "Mark Grimes"
get.first.name <- function(cell){
x<-unlist(strsplit(as.character(cell), " "))
return(x[1]) 
}
get.last.name <- function(cell){
x<-unlist(strsplit(as.character(cell), " "))
return(x[2]) 
}
# For roster$Name = "Grimes, Mark L"
get.first.namec <- function(cell){
x<-unlist(strsplit(as.character(cell), ", "))
y <- get.first.name(x[2])
return(y) 
}
get.last.namec <- function(cell){
x<-unlist(strsplit(as.character(cell), ", "))
return(x[1]) 
}
Use these functions with the apply family for processing class files. 

Hope this helps,

Mark

On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:

> I have some data files in which some fields have multiple values. For example
> 
> first  last   sex   major
> John   Smith  M ANTH
> Jane   DoeF HIST,BIOL
> 
> What's the best R-like way to handle these data (Jane's major in my example), 
> so that I can do things like summarize the other fields by them (e.g., sex by 
> major)?
> 
> Right now I'm processing the files (in excel since they're spreadsheets) by 
> duplicating lines with two values in the major field, eliminating one value 
> per row. I suspect there's a nifty R way to do this.
> 
> Thanks in advance!
> 
> John Muccigrosso
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple values in one column

2012-04-06 Thread John D. Muccigrosso
I have some data files in which some fields have multiple values. For example

first  last   sex   major
John   Smith  M ANTH
Jane   DoeF HIST,BIOL

What's the best R-like way to handle these data (Jane's major in my example), 
so that I can do things like summarize the other fields by them (e.g., sex by 
major)?

Right now I'm processing the files (in excel since they're spreadsheets) by 
duplicating lines with two values in the major field, eliminating one value per 
row. I suspect there's a nifty R way to do this.

Thanks in advance!

John Muccigrosso

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changing grid defaults

2012-04-06 Thread Brett Presnell

I'm trying to use the vcd package to produce mosaic plots for my class
notes, written in Sweave and using the LaTeX's beamer document class.
For projecting the notes in class, I use a dark background with light
foreground colors.  It's easy enough to change the defaults for R's
standard graphics to match my color scheme (using the fg, col.axis,
col.lab, col.main, and col.sub parameter settings), but I can't figure
out how to do this with grid/strucplot/vcd.

From my experiments, I think that I might eventually figure out how to
change the colors of all the text in the mosaic plots to what I want
using arguments like 'gp_args = list(gp_labels = gpar(col = "yellow")'
and 'gp_varnames = gpar(col = "yellow")' (although I still haven't
figured out how to changed the color of the text on the legends), but
this is obviously not what I need.  I have been reading all the
documentation I can find, but I still haven't figured this out, so an
answer accompanied by a reference to some line or the other in some
piece of documentation would be greatly appreciated.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read multiple files within one folder

2012-04-06 Thread Amen
Firstly, thanks a lot. Yes, the files are already binary (there are 360
binary files in one folder, each with dim 720*360).
What I am trying to do is to take the average of each 4 files and finally
get 40 files into a new folder.
The error message is: Error in Testarray[i, , ] <- readBin(conne, integer(),
size = 2, n = 360 *  : 
  incorrect number of subscripts

--
View this message in context: 
http://r.789695.n4.nabble.com/read-multiaple-files-within-one-folder-tp4537394p4537507.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] integrate function - error -integration not occurring with last few rows

2012-04-06 Thread Navin Goyal
Thank you so much for your help Berend.
I did not see that my code had a typo and was therefore wrongly written (I
overlooked the i that was supposed to be 1):

instead of   for (q in *1*:length(comb1$ID))
 I had it as for (q in *i*:length(comb1$ID))

It works correctly as expected
Thanks again.


Navin






On Fri, Apr 6, 2012 at 9:56 AM, Berend Hasselman  wrote:

>
> On 06-04-2012, at 13:14, Navin Goyal wrote:
>
> > Apologies for the lengthy code.
> > I tried a simple (and shorter) piece of code (pasted below) and it still
> gives me the same error for last few rows. Is this a bug or am I doing
> something totally wrong?  Could anyone please provide some help/pointers ?
> >
>
> You are not specifying what the  error is for the last few rows?
> You are doing something completely wrong (AFAICT).
> See below.
>
> > PS.  beta0 was fixed to 0.001 in the previous code. Also if I manually
> estimated the integral isnt 0. If I scramble the row order, it is again
> only the last few rows that dont integrate.
>
> See below.
>
> > Thanks
> >
> > data1<-expand.grid(1:5,0:6,10)
> > names(data1)<-c("ID","TIME", "DOSE")
> > data1<-data1[order(data1$DOSE,data1$ID,data1$TIME),]
> > ed<-data1[!duplicated(data1$ID) , c(1,3)]
> > ed$base=1
> > ed$drop=1
> > set.seed(5234123)
> > k<-0
> > for (i in 1:length(ed$ID))
> > {
> > k<-k+1
> > ed$base[k]<-100*exp(rnorm(1,0,0.2))
> > ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
> > }
>
> Why are you not using i to index ed$XXX?
> You are not using i.
> Simplify to
>
> for (k in 1:length(ed$ID))
> {
> ed$base[k]<-100*exp(rnorm(1,0,0.2))
> ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
> }
>
> > comb1<-merge(data1[, c("ID","TIME")], ed)
> > comb1$disprog<-comb1$base*exp(-comb1$drop*comb1$TIME)
> > comb1$integral=0
> > hz.func1<-function(t,bshz,beta1, change)
> > { ifelse(t==0,bshz, bshz*exp(beta1*change)) }
>
> Insert here
>
> comb1
> i
> length(comb1$ID)
>
> and you should see that i is 5 and that length(comb1$ID) is 35.
>
> > q<-0
> > for (m in i:length(comb1$ID))
> > {
> > q<-q+1
> > comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
> >   bshz=0.001,beta1=0.035,
> >  change=comb1$disprog[q])$value
> > }
> > comb1
> >
>
> 1. Why does your for loop variable m start with i and not 1? (as I told
> you in my first reply)
> 2. Why are you not using the for loop variable m?
> 3. So from the above, m starts at 5 and stops at 35 (==> 31 steps).
> 4. So you are filling elements 1 to 31 of comb1, and items 32 to 35 are
> unchanged.
>
> 5. why don't you do this
>
> for (q in 1:length(comb1$ID))
> {
> comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
>  bshz=0.001,beta1=0.035,
> change=comb1$disprog[q])$value
> }
> comb1
>
> This avoids a useless variable m and fills all elements of comb1.
> And you could just as well reuse variable k; there is no need for a new
> variable (q) here.
>
> Berend
>
>
>


-- 
Navin Goyal

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read multiple files within one folder

2012-04-06 Thread jim holtman
Why "didn't seem to be right"?  Are there error messages?  I assume
you have at least 6GB of real memory since you single copy of
Testarray requires 3GB.  Is your already in a 'binary' file?  If so,
why are you defining your matrix as numeric?  Should you be using
'array(0L, dim = c(1460, 720, 360))'?  This might be the cause of your
problem.

Alway start with a small subset.  Read a single file into an array and
see if the values are correct.  No telling what you are getting since
you are reading binary values in.
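
Once a single file reads back correctly, the loop might look something like
this (an untested sketch; the output folder and file names are made up, and
you need to make sure the input files sort into chronological order):

  in.dir  <- "C:\\PHD\\Climate Data\\Wind"
  out.dir <- "C:\\PHD\\Climate Data\\WindDaily"        # assumed output folder
  files <- list.files(in.dir, pattern = "\\.bin$", full.names = TRUE)  # expect 1460

  read.slice <- function(f) {
    con <- file(f, "rb")
    on.exit(close(con))
    matrix(readBin(con, integer(), n = 720 * 360, size = 2, signed = FALSE),
           nrow = 720, ncol = 360)
  }

  for (day in seq_len(length(files) %/% 4)) {          # 1460 / 4 = 365 days
    idx <- ((day - 1) * 4 + 1):(day * 4)
    avg <- Reduce(`+`, lapply(files[idx], read.slice)) / 4
    con <- file(file.path(out.dir, sprintf("day%03d.bin", day)), "wb")
    writeBin(as.vector(avg), con)                      # written as doubles
    close(con)
  }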

On Fri, Apr 6, 2012 at 9:25 AM, Amen  wrote:
> Suppose we have files in one folder file1.bin, file2.bin, ... , and
> 1460slice(file) with dim of 720 * 360 in directory C:\\PHD\\Climate
> Data\\Wind\\   and we want to read them and make a loop to go from 1 to 4
> and take the average,  then from 4 to 8 and so on till 1460. in the end we
> will get 365 files . I need those 365 files to be in one  new folder for
> later use in my model
> I tried using this code but didnt seem to be right :
> Testarray<-array(0, dim=c(1460,720,360))
> listfile<-dir("C:\\PHD\\Climate Data\\Wind\\")
> for (i in c(1:1460)) {
>     Testarray <- file(listfile[i], "rb")
>      Testarray[i,,]<- readBin(conne, integer(), size=2,  n=360*720,
> signed=F)
>             results <- mean(listfile[[(i*4):(i*4+3)]])
>              results1<-    writeBin(results)
> close(conne)
> }
> Thanks in advance
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/read-multiaple-files-within-one-folder-tp4537394p4537394.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] integrate function - error -integration not occurring with last few rows

2012-04-06 Thread Berend Hasselman

On 06-04-2012, at 13:14, Navin Goyal wrote:

> Apologies for the lengthy code.
> I tried a simple (and shorter) piece of code (pasted below) and it still 
> gives me the same error for last few rows. Is this a bug or am I doing 
> something totally wrong?  Could anyone please provide some help/pointers ?
>  

You are not specifying what the error is for the last few rows.
You are doing something completely wrong (AFAICT).
See below.

> PS.  beta0 was fixed to 0.001 in the previous code. Also if I manually 
> estimated the integral isnt 0. If I scramble the row order, it is again only 
> the last few rows that dont integrate.

See below.

> Thanks
> 
> data1<-expand.grid(1:5,0:6,10)
> names(data1)<-c("ID","TIME", "DOSE")
> data1<-data1[order(data1$DOSE,data1$ID,data1$TIME),]
> ed<-data1[!duplicated(data1$ID) , c(1,3)]
> ed$base=1
> ed$drop=1
> set.seed(5234123)
> k<-0
> for (i in 1:length(ed$ID))
> {
> k<-k+1
> ed$base[k]<-100*exp(rnorm(1,0,0.2))
> ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
> }

Why are you not using i to index ed$XXX?
You are not using i.
Simplify to

for (k in 1:length(ed$ID))
{
ed$base[k]<-100*exp(rnorm(1,0,0.2))
ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
}

> comb1<-merge(data1[, c("ID","TIME")], ed)
> comb1$disprog<-comb1$base*exp(-comb1$drop*comb1$TIME)
> comb1$integral=0
> hz.func1<-function(t,bshz,beta1, change)
> { ifelse(t==0,bshz, bshz*exp(beta1*change)) }

Insert here

comb1
i
length(comb1$ID)

and you should see that i is 5 and that length(comb1$ID) is 35.

> q<-0
> for (m in i:length(comb1$ID))
> {
> q<-q+1
> comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
>   bshz=0.001,beta1=0.035,
>  change=comb1$disprog[q])$value
> }
> comb1
> 

1. Why does your for loop variable m start with i and not 1? (as I told you in 
my first reply)
2. Why are you not using the for loop variable m?
3. So from the above, m starts at 5 and stops at 35 (==> 31 steps).
4. So you are filling elements 1 to 31 of comb1, and items 32 to 35 are 
unchanged.

5. why don't you do this

for (q in 1:length(comb1$ID))
{
comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
  bshz=0.001,beta1=0.035,
 change=comb1$disprog[q])$value
}
comb1

This avoids a useless variable m and fills all elements of comb1.
And you could just as well reuse variable k; there is no need for a new 
variable (q) here.

Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Missing CRAN Mirror

2012-04-06 Thread Tyler Rinker

Hello R community,
This isn't a technical question about R:
I have used 
http://lib.stat.cmu.edu/R/CRAN/ as my mirror for some time now.  As of a few 
days now I don't seem to be able to use this mirror.  The link from CRAN to 
this repository does not work either.  Does anyone know the fate of this 
repository?  
Cheers,
Tyler Rinker
If this was not the appropriate place for this question, please feel free to
direct me to a more appropriate place to ask it.
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] DESCRIPTION FILE in R Manuals

2012-04-06 Thread Duncan Murdoch

On 12-04-06 8:03 AM, Axel Urbiz wrote:

Dear List,

In building a package on a Mac, all the steps performed (build, check,
install) seem to be working fine (no warning messages or errors). The
manual for the package is created and everything looks good except for the
fact that the header of the document is not showing the info on the
DESCRIPTION file. It shows "R Documentation" and the path
"Users/name/..etc". Is there something I might be missing for the data in
the DESCRIPTION file not getting into the manual?


Currently the information in the DESCRIPTION file is inserted by the 
author, not automatically.  (Often the author uses package.skeleton() 
for the initial version, so it may appear to be automatic, but it will 
have to be updated manually later.)


To do the editing manually, find the .Rd file with alias "foo-package" 
for package foo.  Typically this will be foo-package.Rd, but it doesn't 
have to be.  If you don't have such a file, promptPackage() will create one.
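
E.g., assuming your package is called foo and is already installed:

  promptPackage("foo", filename = "man/foo-package.Rd")
  ## then edit man/foo-package.Rd by hand and check the information it contains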


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding text for written comments to bottom of graphs

2012-04-06 Thread Paul Miller
Hi Baptiste,

Thanks for your help with this. Sorry for being slow to express my 
appreciation. I had intended to put some more time into tweaking the graphs 
before responding. Recently have been reading Hadley Wickham's ggplot2 book and 
have also located some materials on the knitr package. Didn't get far enough in 
my reading to do the necessary tweaking in a timely fashion though.

The result I'm getting now isn't perfect but it's still pretty amazing. 
Certainly good enough for our internal coding exercise. Next steps will be to 
add little grey extensions onto the bars and to obtain better control over the 
sizing and placement of the graphs and comment text. The little grey extensions 
will represent the time that chemotherapies remain active following their final 
administration. 

So far, the best way I've found for controlling the size of the graph is to add 
to the number of comment lines. As it happens, we wanted more comment lines 
anyway and so this worked out well. Must be a better way to get the necessary 
control though. Will try to figure that out once I have more time.

Thanks again for your help with this.

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read multiple files within one folder

2012-04-06 Thread Amen
Suppose we have files in one folder, file1.bin, file2.bin, ..., 1460 slices
(files) each with dim 720 * 360, in the directory C:\\PHD\\Climate
Data\\Wind\\, and we want to read them and make a loop that goes from 1 to 4
and takes the average, then from 5 to 8, and so on up to 1460. In the end we
will get 365 files. I need those 365 files to be in one new folder for
later use in my model.
I tried using this code but it didn't seem to be right:
Testarray<-array(0, dim=c(1460,720,360))
listfile<-dir("C:\\PHD\\Climate Data\\Wind\\")
for (i in c(1:1460)) {
 Testarray <- file(listfile[i], "rb")
  Testarray[i,,]<- readBin(conne, integer(), size=2,  n=360*720,
signed=F)
 results <- mean(listfile[[(i*4):(i*4+3)]])
  results1<-writeBin(results)
close(conne)
}
Thanks in advance




--
View this message in context: 
http://r.789695.n4.nabble.com/read-multiaple-files-within-one-folder-tp4537394p4537394.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reclaiming lost memory in R

2012-04-06 Thread Liviu Andronic
On Fri, Apr 6, 2012 at 2:21 PM, Ramiro Barrantes
 wrote:
> Please let me know if you have any other suggestions or clues.
>
See this older post by Brian [1] and check ?"Memory-limits".
Otherwise, I remember someone suggesting that even if R releases the
memory internally, the OS may still keep it reserved for the R
process. (Unfortunately I cannot find the reference.)

Regards
Liviu

[1] http://tolstoy.newcastle.edu.au/R/e10/help/10/05/3851.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reclaiming lost memory in R

2012-04-06 Thread Ramiro Barrantes
Thank you for your feedback.

I think the problem is that, when nlme runs, it "hangs" when iterating.  I have 
a timeout of 5 minutes so that is how I get it to stop on the few cases where 
it goes over.  However, I think the memory doesn't get "cleared up" properly.

Please let me know if you have any other suggestions or clues.

Ramiro

From: William Dunlap [wdun...@tibco.com]
Sent: Thursday, April 05, 2012 9:21 PM
To: Drew Tyre; Ramiro Barrantes
Cc: r-help@r-project.org
Subject: RE: [R] reclaiming lost memory in R

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Drew Tyre
> Sent: Thursday, April 05, 2012 8:35 AM
> To: Ramiro Barrantes
> Cc: r-help@r-project.org
> Subject: Re: [R] reclaiming lost memory in R
>
> Ramiro
>
> I think the problem is the loop - R doesn't release memory allocated inside
> an expression until the expression completes. A for loop is an expression,
> so it duplicates fit and dataset on every iteration.

The above explanation is not true.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> An alternative
> approach that I have found successful in similar circumstances is to use
> sapply(), like this
>
> fits <- list()
> sapply(1:N,function(i){
>dataset <- generateDataset(i)
>fit[[i]] <- try( memoryHogFunction(dataset, otherParameters))
> })
>
> I'm assuming above that you want to save the result of memoryHogFunction
> from each iteration.
>
> hth
> Drew
> On Thu, Apr 5, 2012 at 8:35 AM, Ramiro Barrantes <
> ram...@precisionbioassay.com> wrote:
>
> > Dear list,
> >
> > I am trying to reclaim what I think is lost memory in R, I have been using
> > gc(), rm() and also using Rprof to figure out where all the memory is going
> > but I might be missing something.
> >
> > I have the following situation
> >
> > basic loop which calls memoryHogFunction:
> >
> > for i in (1:N) {
> >dataset <- generateDataset(i)
> >fit <- try( memoryHogFunction(dataset, otherParameters))
> > }
> >
> > and within
> >
> > memoryHogFunction <- function(dataset, params){
> >
> >fit <- try(nlme(someinitialValues)
> >...
> >fit <- try(updatenlme(otherInitialValues)
> >...
> >fit <- try(updatenlme(otherInitialValues)
> >  ...
> >ret <- fit ( and other things)
> >return a result "ret"
> > }
> >
> > The problem is that, memoryHogFunction uses a lot of memory, and at the
> > end returns a result (which is not big) but the memory used by the
> > computation seems to be still occupied.  The original loop continues, but
> > the memory used by the program grows and grows after each call to
> > memoryHogFunction.
> >
> > I have been trying to do gc() after each run in the loop, and have even
> > done:
> >
> > in memoryHogFunction()
> >  ...
> >ret <- fit ( and other things)
> >rm(list=ls()[-match("ret",ls())])
> >return a result "ret"
> > }
> >
> > ???
> >
> > A typical results from gc() after each loop iteration says:
> >  used (Mb) gc trigger (Mb) max used (Mb)
> > Ncells  326953 17.5 597831 32.0   597831 32.0
> > Vcells 1645892 12.63048985 23.3  3048985 23.3
> >
> > Which doesn't reflect that 340mb (and 400+mb in virtual memory) that are
> > being used right now.
> >
> > Even when I do:
> >
> > print(sapply(ls(all.names=TRUE), function(x) object.size(get(x
> >
> > the largest object is 8179808, which is what it should be.
> >
> > THe only thing that looked suspicious was the following within Rprof (with
> > memory=stats option), the tot.duplications might be a problem???:
> >
> > index: "with":"with.default"
> > vsize.small  max.vsize.small  vsize.large  max.vsize.large
> >   308416337820642   660787
> >   nodesmax.nodes duplications tot.duplications
> > 3446132  811501612395 61431787
> > samples
> >4956
> >
> > Any suggestions?  Is it something about the use of loops in R?  Is it
> > maybe the try's???
> >
> > Thanks in advance for any help,
> >
> > Ramiro
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Drew Tyre
>
> School of Natural Resources
> University of Nebraska-Lincoln
> 416 Hardin Hall, East Campus
> 3310 Holdrege Street
> Lincoln, NE 68583-0974
>
> phone: +1 402 472 4054
> fax: +1 402 472 2946
> email: aty...@unl.edu
> http://snr.unl.edu/tyre
> http://aminpractice.blogspot.com
> http://www.flickr.com/photos/atiretoo
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listin

Re: [R] R generated means are different from the boxplot!

2012-04-06 Thread Petr PIKAL
Hi
> 
> Hi R-listers, 
> 
> 1) I am having trouble understanding why the means I have calculated 
from
> Aeventexhumed (A, B, and C) are different from the means showing on the
> boxplot I generated (see attached).  I have added the script as to how 
my
> data is organized. 

Maybe the difference is that boxplots show medians, not means.
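
For example (with made-up data, not yours):

  set.seed(1)
  d <- data.frame(g = rep(c("A", "B", "C"), each = 20), y = rexp(60))
  boxplot(y ~ g, data = d)          # the line inside each box is the median
  points(1:3, tapply(d$y, d$g, mean), pch = 18, col = "red")   # overlay the means
  tapply(d$y, d$g, mean)            # means differ from the plotted medians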

> 
> 2) Also when I went through the data manually the means I calculated for
> each nesting event are slightly different than what I generated through 
R
> (see below). 
> A,  B,  C
> 0.2155051, 0.1288241, 0.1124618

Without the data it is difficult to guess. Usually when a manually computed 
value is different, the mistake is in the manual calculation. I would find it 
hard to believe that R is wrong in such a simple and extensively used function.

Regards
Petr

> 
> Thanks in advance,
> 
> Jean
> 
> ---
> > require(plyr)
> Loading required package: plyr
> > resp <- read.csv(file.choose())
> > envir <- read.csv(file.choose())
> > resp <- resp[!is.na(resp$Aeventexhumed), ]
> > resp$QuadratEvent <- paste(resp$QuadratID, resp$Aeventexhumed, sep="")
> > resp$QuadratEvent <- as.character(resp$QuadratEvent)
> > envir <- envir[!is.na(envir$Aeventexhumed), ]
> > envir$QuadratEvent <- paste(envir$QuadratID, envir$Aeventexhumed, 
sep="")
> > envir$QuadratEvent <- as.character(envir$QuadratEvent)
> > ExDate <- Sector <- Quadrat <- Aeventexhumed <- NULL
> > ST1 <- ST2 <- ST3 <- ST4 <- ST0 <- NULL
> > Shells <- Hatchlings <- MaxHatch <- DeadHatch <- NULL
> > Oldeggs <- TotalEggs <- QuadratEvent <- NULL
> > for (q in unique(as.character(resp$QuadratEvent))) {
> + s <- resp[as.character(resp$QuadratEvent) == q, ]
> + ExDate <- c(ExDate, as.character(s$ExDate[1]))
> + Sector <- c(Sector, as.character(s$Sector[1]))
> + Quadrat <- c(Quadrat, as.character(s$Quadrat[1]))
> + Aeventexhumed <- as.character(c(Aeventexhumed,
> as.character(s$Aeventexhumed[1])))
> + QuadratEvent<- c(QuadratEvent, q)
> + ST1 <- c(ST1, sum(s$ST1, na.rm=TRUE))
> + ST2 <- c(ST2, sum(s$ST2, na.rm=TRUE))
> + ST3 <- c(ST3, sum(s$ST3, na.rm=TRUE))
> + ST4 <- c(ST4, sum(s$ST4, na.rm=TRUE))
> + ST0 <- c(ST0, sum(s$ST0, na.rm=TRUE))
> + Shells <- c(Shells, sum(s$Shells, na.rm=TRUE))
> + Hatchlings <- c(Hatchlings, sum(s$Hatchlings, na.rm=TRUE))
> + MaxHatch <- c(MaxHatch, sum(s$MaxHatch, na.rm=TRUE))
> + DeadHatch <- c(DeadHatch, sum(s$DeadHatch, na.rm=TRUE))
> + Oldeggs <- c(Oldeggs, sum(s$Oldeggs, na.rm=TRUE))
> + TotalEggs <- c(TotalEggs, sum(s$TotalEggs, na.rm=TRUE))
> + }
> > responses <- data.frame(QuadratEvent, ExDate, Sector, Quadrat,
> + Aeventexhumed, ST0, ST1, ST2, ST3, ST4, 
Shells,
> + Hatchlings, MaxHatch, DeadHatch, Oldeggs,
> + TotalEggs, stringsAsFactors=FALSE)
> > responses$QuadratEvent <- as.character(responses$QuadratEvent)
> > data.to.analyze <- join(responses, envir, by="QuadratEvent")
> > data.to.analyze$NotHatched <- data.to.analyze$TotalEggs -
> > data.to.analyze$Shells
> > data.to.analyze$Rayos <- paste("Rayos", data.to.analyze$Rayos, 
sep=".")
> 
> > Hsuccess <- Shells/TotalEggs
> > tapply(Hsuccess, Aeventexhumed, mean, na.rm=TRUE)
> A B C 
> 0.2156265 0.1288559 0.1124327 
> > boxplot(HSuccess ~ Aeventexhumed, data = data.to.analyze, col = 
"blue",
> + main = "Hatching Success of Arribadas in 2010",
> + xlab = "Arribada Event",
> + ylab = "Hatching Success % (Shells / Total Eggs)")
> 
> http://r.789695.n4.nabble.com/file/n4536926/hatch_Aeventexhumed.png 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/R-generated-
> means-are-different-from-the-boxplot-tp4536926p4536926.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] DESCRIPTION FILE in R Manuals

2012-04-06 Thread Axel Urbiz
Dear List,

In building a package on a Mac, all the steps performed (build, check,
install) seem to be working fine (no warning messages or errors). The
manual for the package is created and everything looks good except for the
fact that the header of the document is not showing the info on the
DESCRIPTION file. It shows "R Documentation" and the path
"Users/name/..etc". Is there something I might be missing for the data in
the DESCRIPTION file not getting into the manual?

Thanks in advance,
Axel.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R generated means are different from the boxplot!

2012-04-06 Thread Liviu Andronic
Hello

On Fri, Apr 6, 2012 at 10:35 AM, Jhope  wrote:
> Hi R-listers,
>
> 1) I am having trouble understanding why the means I have calculated from
> Aeventexhumed (A, B, and C) are different from the means showing on the
> boxplot I generated (see attached).  I have added the script as to how my
> data is organized.
>
For starters, boxplots do not display means but medians. [1]
[1] http://en.wikipedia.org/wiki/Boxplot


> 2) Also when I went through the data manually the means I calculated  for
> each nesting event are slightly different than what I generated through R
> (see below).
> A,  B,  C
> 0.2155051, 0.1288241, 0.1124618
>
Just guessing, perhaps it is due to some missing values.

Regards
Liviu


> Thanks in advance,
>
> Jean
>
> ---
>> require(plyr)
> Loading required package: plyr
>> resp <- read.csv(file.choose())
>> envir <- read.csv(file.choose())
>> resp <- resp[!is.na(resp$Aeventexhumed), ]
>> resp$QuadratEvent <- paste(resp$QuadratID, resp$Aeventexhumed, sep="")
>> resp$QuadratEvent <- as.character(resp$QuadratEvent)
>> envir <- envir[!is.na(envir$Aeventexhumed), ]
>> envir$QuadratEvent <- paste(envir$QuadratID, envir$Aeventexhumed, sep="")
>> envir$QuadratEvent <- as.character(envir$QuadratEvent)
>> ExDate <- Sector <- Quadrat <- Aeventexhumed <- NULL
>> ST1 <- ST2 <- ST3 <- ST4 <- ST0 <- NULL
>> Shells <- Hatchlings <- MaxHatch <- DeadHatch <- NULL
>> Oldeggs <- TotalEggs <- QuadratEvent <- NULL
>> for (q in unique(as.character(resp$QuadratEvent))) {
> +     s <- resp[as.character(resp$QuadratEvent) == q, ]
> +     ExDate <- c(ExDate, as.character(s$ExDate[1]))
> +     Sector <- c(Sector, as.character(s$Sector[1]))
> +     Quadrat <- c(Quadrat, as.character(s$Quadrat[1]))
> +     Aeventexhumed <- as.character(c(Aeventexhumed,
> as.character(s$Aeventexhumed[1])))
> +     QuadratEvent<- c(QuadratEvent, q)
> +     ST1 <- c(ST1, sum(s$ST1, na.rm=TRUE))
> +     ST2 <- c(ST2, sum(s$ST2, na.rm=TRUE))
> +     ST3 <- c(ST3, sum(s$ST3, na.rm=TRUE))
> +     ST4 <- c(ST4, sum(s$ST4, na.rm=TRUE))
> +     ST0 <- c(ST0, sum(s$ST0, na.rm=TRUE))
> +     Shells <- c(Shells, sum(s$Shells, na.rm=TRUE))
> +     Hatchlings <- c(Hatchlings, sum(s$Hatchlings, na.rm=TRUE))
> +     MaxHatch <- c(MaxHatch, sum(s$MaxHatch, na.rm=TRUE))
> +     DeadHatch <- c(DeadHatch, sum(s$DeadHatch, na.rm=TRUE))
> +     Oldeggs <- c(Oldeggs, sum(s$Oldeggs, na.rm=TRUE))
> +     TotalEggs <- c(TotalEggs, sum(s$TotalEggs, na.rm=TRUE))
> + }
>> responses <- data.frame(QuadratEvent, ExDate, Sector, Quadrat,
> +                         Aeventexhumed, ST0, ST1, ST2, ST3, ST4, Shells,
> +                         Hatchlings, MaxHatch, DeadHatch, Oldeggs,
> +                         TotalEggs, stringsAsFactors=FALSE)
>> responses$QuadratEvent <- as.character(responses$QuadratEvent)
>> data.to.analyze <- join(responses, envir, by="QuadratEvent")
>> data.to.analyze$NotHatched <- data.to.analyze$TotalEggs -
>> data.to.analyze$Shells
>> data.to.analyze$Rayos <- paste("Rayos", data.to.analyze$Rayos, sep=".")
>
>> Hsuccess <- Shells/TotalEggs
>> tapply(Hsuccess, Aeventexhumed, mean, na.rm=TRUE)
>        A         B         C
> 0.2156265 0.1288559 0.1124327
>> boxplot(HSuccess ~ Aeventexhumed, data = data.to.analyze, col = "blue",
> +         main = "Hatching Success of Arribadas in 2010",
> +         xlab = "Arribada Event",
> +         ylab = "Hatching Success % (Shells / Total Eggs)")
>
> http://r.789695.n4.nabble.com/file/n4536926/hatch_Aeventexhumed.png
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/R-generated-means-are-different-from-the-boxplot-tp4536926p4536926.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R generated means are different from the boxplot!

2012-04-06 Thread Jhope
Hi R-listers, 

1) I am having trouble understanding why the means I have calculated from
Aeventexhumed (A, B, and C) are different from the means showing on the
boxplot I generated (see attached).  I have added the script as to how my
data is organized. 

2) Also when I went through the data manually the means I calculated  for
each nesting event are slightly different than what I generated through R
(see below). 
A,  B,  C
0.2155051, 0.1288241, 0.1124618

Thanks in advance,

Jean

---
> require(plyr)
Loading required package: plyr
> resp <- read.csv(file.choose())
> envir <- read.csv(file.choose())
> resp <- resp[!is.na(resp$Aeventexhumed), ]
> resp$QuadratEvent <- paste(resp$QuadratID, resp$Aeventexhumed, sep="")
> resp$QuadratEvent <- as.character(resp$QuadratEvent)
> envir <- envir[!is.na(envir$Aeventexhumed), ]
> envir$QuadratEvent <- paste(envir$QuadratID, envir$Aeventexhumed, sep="")
> envir$QuadratEvent <- as.character(envir$QuadratEvent)
> ExDate <- Sector <- Quadrat <- Aeventexhumed <- NULL
> ST1 <- ST2 <- ST3 <- ST4 <- ST0 <- NULL
> Shells <- Hatchlings <- MaxHatch <- DeadHatch <- NULL
> Oldeggs <- TotalEggs <- QuadratEvent <- NULL
> for (q in unique(as.character(resp$QuadratEvent))) {
+ s <- resp[as.character(resp$QuadratEvent) == q, ]
+ ExDate <- c(ExDate, as.character(s$ExDate[1]))
+ Sector <- c(Sector, as.character(s$Sector[1]))
+ Quadrat <- c(Quadrat, as.character(s$Quadrat[1]))
+ Aeventexhumed <- as.character(c(Aeventexhumed,
as.character(s$Aeventexhumed[1])))
+ QuadratEvent<- c(QuadratEvent, q)
+ ST1 <- c(ST1, sum(s$ST1, na.rm=TRUE))
+ ST2 <- c(ST2, sum(s$ST2, na.rm=TRUE))
+ ST3 <- c(ST3, sum(s$ST3, na.rm=TRUE))
+ ST4 <- c(ST4, sum(s$ST4, na.rm=TRUE))
+ ST0 <- c(ST0, sum(s$ST0, na.rm=TRUE))
+ Shells <- c(Shells, sum(s$Shells, na.rm=TRUE))
+ Hatchlings <- c(Hatchlings, sum(s$Hatchlings, na.rm=TRUE))
+ MaxHatch <- c(MaxHatch, sum(s$MaxHatch, na.rm=TRUE))
+ DeadHatch <- c(DeadHatch, sum(s$DeadHatch, na.rm=TRUE))
+ Oldeggs <- c(Oldeggs, sum(s$Oldeggs, na.rm=TRUE))
+ TotalEggs <- c(TotalEggs, sum(s$TotalEggs, na.rm=TRUE))
+ }
> responses <- data.frame(QuadratEvent, ExDate, Sector, Quadrat,
+ Aeventexhumed, ST0, ST1, ST2, ST3, ST4, Shells,
+ Hatchlings, MaxHatch, DeadHatch, Oldeggs,
+ TotalEggs, stringsAsFactors=FALSE)
> responses$QuadratEvent <- as.character(responses$QuadratEvent)
> data.to.analyze <- join(responses, envir, by="QuadratEvent")
> data.to.analyze$NotHatched <- data.to.analyze$TotalEggs -
> data.to.analyze$Shells
> data.to.analyze$Rayos <- paste("Rayos", data.to.analyze$Rayos, sep=".")

> Hsuccess <- Shells/TotalEggs
> tapply(Hsuccess, Aeventexhumed, mean, na.rm=TRUE)
A B C 
0.2156265 0.1288559 0.1124327 
> boxplot(HSuccess ~ Aeventexhumed, data = data.to.analyze, col = "blue",
+ main = "Hatching Success of Arribadas in 2010",
+ xlab = "Arribada Event",
+ ylab = "Hatching Success % (Shells / Total Eggs)")

http://r.789695.n4.nabble.com/file/n4536926/hatch_Aeventexhumed.png 
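
A quick way to see where the discrepancy comes from is to compute the group
means from the same data frame the boxplot uses and overlay them on the plot.
This is only a sketch: it assumes data.to.analyze really has a column named
HSuccess, which is a different object from the free-standing vector Hsuccess
used in the tapply() call above; note also that the heavy line in each box is
the median, not the mean.

  grp.means <- with(data.to.analyze,
                    tapply(HSuccess, Aeventexhumed, mean, na.rm = TRUE))
  grp.means
  boxplot(HSuccess ~ Aeventexhumed, data = data.to.analyze,
          xlab = "Arribada Event", ylab = "Hatching Success")
  points(seq_along(grp.means), grp.means, pch = 18, col = "red")  # group means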

--
View this message in context: 
http://r.789695.n4.nabble.com/R-generated-means-are-different-from-the-boxplot-tp4536926p4536926.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Legend based on levels of a variable

2012-04-06 Thread windmagics_lsl
I think three legend entries should be added to your plot.
The arguments col, pch and pt.cex should have the same length as the legend
labels, but the objects col, pch and cex you defined earlier have length
16*3. I guess the following code may work:

col <- rep(c("blue", "red", "darkgreen"), c(16, 16, 16))
## Choose different sizes for the points
cex <- rep(c(1, 1.2, 1), c(16, 16, 16))
## Choose the form of the points (square, circle, triangle)
pch <- rep(c(15, 16, 17), c(16, 16, 16))

plot(axis1, axis2, main="My plot", xlab="Axis 1", ylab="Axis 2",
     col=col, pch=pch, cex=cex)
legend(4, 12.5, c("NorthAmerica", "SouthAmerica", "Asia"), col = unique(col),
       pch = unique(pch), pt.cex = unique(cex), title = "Region")
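
Another way to keep the plot and the legend in sync is to derive the colours
and symbols from the factor itself; a sketch, assuming Category is the Region
factor from the original post:

cols <- c(NorthAmerica = "blue", SouthAmerica = "red", Asia = "darkgreen")
pchs <- c(NorthAmerica = 15, SouthAmerica = 16, Asia = 17)
plot(axis1, axis2, main = "My plot", xlab = "Axis 1", ylab = "Axis 2",
     col = cols[as.character(Category)], pch = pchs[as.character(Category)])
legend("topright", legend = names(cols), col = cols, pch = pchs,
       title = "Region")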

--
View this message in context: 
http://r.789695.n4.nabble.com/Legend-based-on-levels-of-a-variable-tp4536796p4536868.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] integrate function - error -integration not occurring with last few rows

2012-04-06 Thread Navin Goyal
Apologies for the lengthy code.
I tried a simple (and shorter) piece of code (pasted below) and it still
gives me the same error for the last few rows. Is this a bug or am I doing
something totally wrong? Could anyone please provide some help/pointers?

PS. beta0 was fixed to 0.001 in the previous code. Also, if I estimate it
manually, the integral isn't 0. If I scramble the row order, it is again
only the last few rows that don't integrate.

Thanks

data1<-expand.grid(1:5,0:6,10)
names(data1)<-c("ID","TIME", "DOSE")
data1<-data1[order(data1$DOSE,data1$ID,data1$TIME),]

ed<-data1[!duplicated(data1$ID) , c(1,3)]
ed$base=1
ed$drop=1
set.seed(5234123)
k<-0
for (i in 1:length(ed$ID))
{
k<-k+1
ed$base[k]<-100*exp(rnorm(1,0,0.2))
ed$drop[k]<-0.20*exp(rnorm(1,0,0.5))
}
comb1<-merge(data1[, c("ID","TIME")], ed)
comb1$disprog<-comb1$base*exp(-comb1$drop*comb1$TIME)
comb1$integral=0
hz.func1<-function(t,bshz,beta1, change)
{ ifelse(t==0,bshz, bshz*exp(beta1*change)) }
q<-0
for (m in i:length(comb1$ID))
{
q<-q+1
comb1$integral[q]<-integrate(hz.func1, lower=0, upper=comb1$TIME[q],
  bshz=0.001,beta1=0.035,
 change=comb1$disprog[q])$value
}
comb1
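
For reference, a minimal sketch of that last loop with the index corrected
(as Berend suggests in the quoted reply below, the stray `i` left over from
the earlier loop was presumably meant to be 1):

for (q in 1:nrow(comb1)) {
  comb1$integral[q] <- integrate(hz.func1, lower = 0, upper = comb1$TIME[q],
                                 bshz = 0.001, beta1 = 0.035,
                                 change = comb1$disprog[q])$value
}

With that change every row gets an integral; a vectorised call such as
sapply(seq_len(nrow(comb1)), ...) would avoid the stale counter altogether.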











On Fri, Apr 6, 2012 at 1:47 AM, Berend Hasselman  wrote:

>
> On 06-04-2012, at 00:55, Navin Goyal wrote:
>
> > Hi,
> > I am using the integrate function in some simulations in R (tried ver
> 2.12
> > and 2.15). The problem I have is that the last few rows do not integrate
> > correctly. I have pasted the code I used.
> > The column named "integral" shows the output from the integrate function.
> > The last few rows have no integration results. I tried increasing the
> > doses, number of subjects, etc this error occurs with the last few
> rows
> > only
> >
> > I am not sure why this is happening. Could someone please help me with
> this
> > issue ??
> > Thank you for your time
> >
> > dose<-c(0)
> > time<-(0:6)
> > id<-1:25
> >
> > data1<-expand.grid(id,time,dose)
> > names(data1)<-c("ID","TIME", "DOSE")
> > data1<-data1[order(data1$DOSE,data1$ID,data1$TIME),]
> >
> > 
> > basescore=95
> > basescore_sd=0.12
> > fall=0.15
> > fall_sd=0.5
> > slope=5
> > dose_slope1=0.045
> > dose_slope2=0.045
> > dose_slope3=0.002
> > rise_sd=0.5
> >
> > ed<-data1[!duplicated(data1$ID) , c(1,3)]
> > ed$base=1
> > ed$drop=1
> > ed$bshz<-1
> > ed$up<-1
> > ed
> >
> > set.seed(5234123)
> > k<-0
> >
> > for (i in 1:length(ed$ID))
> > {
> > k<-k+1
> > ed$base[k]<-basescore*exp(rnorm(1,0,basescore_sd))
> > ed$drop[k]<-fall*exp(rnorm(1,0,fall_sd))
> > ed$up[k]<-slope*exp(rnorm(1,0,rise_sd))
> > ed$bshz<-beta0
> > }
> >
> > comb1<-merge(data1[, c("ID","TIME")], ed)
> > comb1$disprog<-1
> > comb1$beta1<-0.035
> > comb1$beta21<-0.02
> > comb1$beta22<-0.45
> > comb1$beta23<-0085
> > comb1$beta31<-0.7
> > comb1$beta32<-0.05
> > comb1$exphz<-1
> >
> > comb2<-comb1
> >
> > p<-0
> > for(l in 1:length(comb2$ID))
> > {
> > p<-p+1
> > comb2$disprog[p]<-comb2$base[p]*exp(-comb2$drop[p]*comb2$TIME[p]) +
> >  comb2$up[p]*comb2$TIME[p]
> > comb2$frac[p]<-ifelse ( comb2$DOSE[p]==3,
> >   comb2$beta31[p]*comb2$TIME[p]^comb2$beta32[p],
> > exp(-comb2$beta21[p]*comb2$DOSE[p])*comb2$TIME[p]^comb2$beta22[p]   )
> > }
> >
> > hz.func1<-function(t,bshz,beta1, change,other)
> > {
> > ifelse(t==0,bshz, bshz*exp(beta1*change+other))
> > }
> >
> > comb3<-comb2
> > comb3$integral=0
> >
> > q<-0
> > for (m in i:length(comb3$ID))
> > {
> > q<-q+1
> > comb3$integral[q]<-integrate(hz.func1, lower=0, upper=comb3$TIME[q],
> >  bshz=comb3$bshz[q],beta1=comb3$beta1[q],
> > change=comb3$disprog[q], other=comb3$frac[q])$value
> > }
>
> Where is beta0 in the line   with  ed$bshz<-beta0 ?
>
> In the last for loop  for (m in i:length(comb3$ID))  could it be that i
> should be 1?
> When the i is changed to 1 then integrate results are <> 0, which might be
> what you expect?
>
> Berend
>
>


-- 
Navin Goyal

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with gsub function or a similar function

2012-04-06 Thread Sarah Goslee

You don't provide a reproducible example, or even str(), but I'd guess you need 
to match "^15" instead of just "15".

Sarah
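
For illustration, a tiny example of that suggestion, using the sample values
from the quoted question below (the replacement value "99" is made up):

x <- "15:.0234,10:.0157"
gsub("15", "99", x)   # replaces the leading 15 and the 15 inside .0157
gsub("^15", "99", x)  # "^" anchors the match to the start of the string,
                      # so only the leading 15 is changed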

On Apr 5, 2012, at 10:38 PM, ieatnapalm  wrote:

> Hey, sorry if this has been addressed before, but I'm really new to R and
> having trouble with the gsub function. I need a way to make this function
> exclude certain values from being substituted:
> ie my data looks something like (15:.0234,10:.0157) and I'm trying to
> replace the leading 15 with something else - but of course it replaces the
> second 15 with something else too. If there's a way to exclude anything with
> a leading decimal point or something like that, that would do the trick.
> Thanks yall.
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Help-with-gsub-function-or-a-similar-function-tp4536584p4536584.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Legend based on levels of a variable

2012-04-06 Thread Petr PIKAL
Thanks, 

anyway, using built-in R features is preferable for colours

with(data, plot(axis1, axis2, col= c("red", "blue", 
"green")[as.numeric(data$Region)]))
legend("topright", legend=levels(data$Region), fill= c("red", "blue", 
"green"))

although sometimes it can be preferable to take advantage of grid graphics

library(ggplot2)
p<-ggplot(data, aes(x=axis1, y=axis2, colour=Region))
p+geom_point()

Regards
Petr

> 
> He provided data, yet in an inconvenient way at the bottom of his post.
> 
> Kumar, please use dput() to provide data to the list, because its much
> easier to import:
> dput(data)## name data is made up by me
> 
> structure(list(Region = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 
> 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L), .Label = c("Asia", "NorthAmerica", 
> "SouthAmerica"), class = "factor"), axis1 = c(5L, 8L, 8L, 6L, 
> 5L, 8L, 7L, 7L, 8L, 6L, 7L, 6L, 7L, 5L, 4L), axis2 = c(14L, 13L, 
> 11L, 11L, 13L, 17L, 16L, 13L, 14L, 17L, 13L, 15L, 14L, 13L, 16L
> )), .Names = c("Region", "axis1", "axis2"), class = "data.frame", 
row.names = c(NA, 
> -15L))
> 
> 
> 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Time series - year on year growth rate

2012-04-06 Thread Berend Hasselman

On 06-04-2012, at 10:27, jpm miao wrote:

> Hello,
> 
>   Is there a function in R that calculates the year-on-year growth rate of
> some time series?
> 
>   In EView the function is @pchy.

This might do what you need

pchy <- function(x) {
if(!is.ts(x)) stop("x is not a timeseries")

x.freq <- tsp(x)[3]
if(!(x.freq %in% c(1,2,4,12))) stop("Invalid frequency of timeseries x (must be 1, 2, 4, 12)")

y <- diff(x,lag=x.freq)/lag(x,-x.freq)
return(y)
}

Berend
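
For illustration, Berend's pchy() can be tried on built-in series (UKgas is
quarterly, AirPassengers is monthly):

pchy(UKgas)          # year-on-year growth of quarterly UK gas consumption
pchy(AirPassengers)  # year-on-year growth of monthly airline passenger counts
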
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to do piecewise linear regression in R?

2012-04-06 Thread Petr PIKAL
Hi

Your post is rather garbled.

> 
> [R] how to do piecewise linear regression in R?

Maybe segmented?

Regards
Petr
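
For the continuity restriction in the quoted question below, one standard
reparameterisation is a hinge (linear-spline) term, which builds the
constraint in by construction. A minimal sketch with made-up data (rmkt and
rarb stand for the market and risk-arbitrage excess returns, thr for the
chosen threshold):

set.seed(1)
rmkt <- rnorm(200, 0, 0.05)                 # made-up market excess returns
thr  <- 0                                   # made-up threshold
rarb <- 0.002 + 0.3*rmkt + 0.4*pmax(rmkt - thr, 0) + rnorm(200, 0, 0.01)

fit <- lm(rarb ~ rmkt + pmax(rmkt - thr, 0))
coef(fit)
## slope below the threshold: coef(fit)[2]
## slope above the threshold: coef(fit)[2] + coef(fit)[3]
## the two segments meet at rmkt = thr, so the fit is continuous there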

> 
> 
> Dear all,
> I want to do piecewise CAPM linear regression in R:
> R_RiskArb − R_f = (1−δ)[α_MktLow + β_MktLow (R_Mkt − R_f)]
>                   + δ[α_MktHigh + β_MktHigh (R_Mkt − R_f)]
> 
> where δ is a dummy variable equal to one if the excess return on the
> value-weighted CRSP index is above a threshold level and zero otherwise,
> and at the same time add the restriction:
> 
> α_MktLow + β_MktLow · Threshold = α_MktHigh + β_MktHigh · Threshold
> 
> to ensure continuity.
> But I do not know how to add this restriction in R. Could you help me with
> this?
> Thanks a lot!
> Eunice 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Legend based on levels of a variable

2012-04-06 Thread mlell08
He did provide data, but in an inconvenient way at the bottom of his post.

Kumar, please use dput() to provide data to the list, because it's much
easier to import:
dput(data)   ## the name "data" is made up by me

structure(list(Region = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L), .Label = c("Asia", "NorthAmerica", 
"SouthAmerica"), class = "factor"), axis1 = c(5L, 8L, 8L, 6L, 
5L, 8L, 7L, 7L, 8L, 6L, 7L, 6L, 7L, 5L, 4L), axis2 = c(14L, 13L, 
11L, 11L, 13L, 17L, 16L, 13L, 14L, 17L, 13L, 15L, 14L, 13L, 16L
)), .Names = c("Region", "axis1", "axis2"), class = "data.frame", row.names = 
c(NA, 
-15L))



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hi: Help Using Spreadsheets

2012-04-06 Thread Petr PIKAL
Hi
> Hello,
> 
> I am a new user of R and I am trying to use the data I am reading from a
> spreadsheet.
> I installed the xlsReadWrite package and I am able to read data from
> these files, but how can I assign the columns to variables?
> E.g.:
> as I read a spreadsheet like this one:

Maybe with read.xls? Did you read it into an object?

> A B
> 1 2
> 4 9
> 
> I manually assign the values:
> A<-c(1,4)
> B<-c(2,9)

Why? If you read it into an object (e.g. mydata), you can use the columns directly:


> 
> to plot it on a graph:
> plot(A,B)

plot(mydata$A, mydata$B)


> 
> or make histograms:
> hist(A)

hist(mydata$A)

> 
> But actually I am using very large columns; does there exist any other way
> to do it automatically?

Yes. But before that you should automatically read some introductory
documentation, such as "An Introduction to R" (R-intro).

Regards
Petr
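
For completeness, a minimal sketch of the whole round trip (the file name is
made up; read.xls() is taken from the xlsReadWrite package mentioned in the
question):

library(xlsReadWrite)
mydata <- read.xls("myfile.xls")   # one data frame holding columns A and B
str(mydata)                        # check what was imported
plot(mydata$A, mydata$B)
hist(mydata$A)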

> 
> Best Regards,
> 
> Lämarăo
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Legend based on levels of a variable

2012-04-06 Thread Petr PIKAL
Hi

> 
> I have a bivariate plot of axis2 against axis1 (data below). I would like
> to use a different size, type and color for the points in the plot for
> points coming from different regions. For some reason, I cannot get it
> done. Below is my code.
> 
> col <- rep(c("blue", "red", "darkgreen"), c(16, 16, 16))
> ## Choose different size of points
> cex <- rep(c(1, 1.2, 1), c(16, 16, 16))
> ## Choose the form of the points (square, circle, triangle and
> diamond-shaped
> pch <- rep(c(15, 16, 17), c(16, 16, 16))
> 
> plot(axis1, axis2, main="My plot", xlab="Axis 1", ylab="Axis 2",
>  col=c(Category, col), pch=pch, cex=cex)
> legend(4, 12.5, c("NorthAmerica", "SouthAmerica", "Asia"), col = col,
>pch = pch, pt.cex = cex, title = "Region")
> 
> I also prefer a control on what kind of point I want to use for different
> levels of Region. Something like this:
> legend(4,12.5, col(levels(Category), Asia="red", NorthAmerica="blue",
> SouthAmerica="green"))

So why don't you use Region and/or Category for automatic point
colouring/size/type?

Without data I can only use built in one.

with(iris, plot(Sepal.Length, Sepal.Width, pch=19, col= as.numeric(Species)))
legend("topright", legend=levels(iris$Species), pch=19, col=1:3)

Regards
Petr

> 
> Thanks,
> Kumar
> 
>          Region axis1 axis2
>    NorthAmerica     5    14
>    NorthAmerica     8    13
>    NorthAmerica     8    11
>    NorthAmerica     6    11
>    NorthAmerica     5    13
>    SouthAmerica     8    17
>    SouthAmerica     7    16
>    SouthAmerica     7    13
>    SouthAmerica     8    14
>    SouthAmerica     6    17
>            Asia     7    13
>            Asia     6    15
>            Asia     7    14
>            Asia     5    13
>            Asia     4    16
> 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fisher's LSD multiple comparisons in a two-way ANOVA

2012-04-06 Thread Jinsong Zhao

On 2012-04-05 10:49, Richard M. Heiberger wrote:

Here is your example.  The table you displayed in gigawiz ignored the two-way
factor structure and interpreted the data as a single factor with 6 levels.
I created the interaction of a and b to get that behavior.
## your example, with data stored in a data.frame
tmp <- data.frame(x=c(76, 84, 78, 80, 82, 70, 62, 72,
 71, 69, 72, 74, 66, 74, 68, 66,
 69, 72, 72, 78, 74, 71, 73, 67,
 86, 67, 72, 85, 87, 74, 83, 86,
 66, 68, 70, 76, 78, 76, 69, 74,
 72, 72, 76, 69, 69, 82, 79, 81),
   a=factor(rep(c("A1", "A2"), each = 24)),
   b=factor(rep(c("B1", "B2", "B3"), each=8, times=2)))
x.aov <- aov(x ~ a*b, data=tmp)
summary(x.aov)
## your request
require(multcomp)
tmp$ab <- with(tmp, interaction(a, b))
xi.aov <- aov(x ~ ab, data=tmp)
summary(xi.aov)
xi.glht <- glht(xi.aov, linfct=mcp(ab="Tukey"))
confint(xi.glht)

## graphs
## boxplots
require(lattice)
bwplot(x ~ ab, data=tmp)
## interaction plot
## install.packages("HH")  ## if you don't have HH yet
require(HH)
interaction2wt(x ~ a*b, data=tmp)



Thank you very much for the demonstration.

There is still a small difference between the results of glht() and the
table displayed in gigawiz. I have tried my best to figure out why, but failed...


By the way, I have a similar question. I built an ANOVA model:

activity ~ pH * I * f

            Df Sum Sq Mean Sq F value   Pr(>F)    
pH           1   1330    1330  59.752 2.15e-10 ***
I            1    137     137   6.131   0.0163 *  
f            6  23054    3842 172.585  < 2e-16 ***
pH:I         1    152     152   6.809   0.0116 *  
pH:f         6    274      46   2.049   0.0741 .  
I:f          6   5015     836  37.544  < 2e-16 ***
pH:I:f       6    849     142   6.356 3.82e-05 ***
Residuals   56   1247      22                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now, how can I do a multiple comparison on `pH:I'?

Do I need to do a separate ANOVA for each level of `pH' or `I', as in
demo("MMC.WoodEnergy", "HH"), and then do multiple comparisons on `I' or
`pH' in each separate ANOVA?


Thanks again.

Regards,
Jinsong
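
Regarding the `pH:I' question above, the same device Richard used (collapsing
the factors of interest into one interaction factor and running Tukey
comparisons on it) can be applied. A sketch, assuming the data frame is
called dat; like the ab example above, it ignores the remaining factorial
structure, and whether that is sensible given the significant pH:I:f
interaction is a separate question (the per-level analyses in
demo("MMC.WoodEnergy", "HH") are the usual alternative):

require(multcomp)
dat$pHI <- with(dat, interaction(pH, I))
fit.pHI <- aov(activity ~ pHI, data = dat)
confint(glht(fit.pHI, linfct = mcp(pHI = "Tukey")))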

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Time series - year on year growth rate

2012-04-06 Thread jpm miao
Hello,

   Is there a function in R that calculates the year-on-year growth rate of
some time series?

   In EView the function is @pchy.

   Thanks,

miao

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.