date:20120110

Re: [R] Extracting Data from SQL Server

2012-01-10 Thread dthomas

Thanks for your help, guys. Your suggestion, Ajay, worked fine. Cheers.

Dyfed Thomas
Population Health Analyst
DDI: 07 858 5967
Mob: 021 409 800

Midlands Health Network
WEL House
711 Victoria Street
PO Box 983
Hamilton 3240
Phone: 07 839 2888
Fax: 07 834 9242

From: Ajay Askoolum [via R] [mailto:ml-node+s789695n4281558...@n4.nabble.com]
Sent: Tuesday, 10 January 2012 10:40 p.m.
To: Dyfed Thomas
Subject: Re: Extracting Data from SQL Server

try:

SELECT a.UNIQUE_ID,
   a.diag01
  from LoadPUS a
left join CVD_ICD10 b
on a.diag01 = b.[ICD-10 Codes]
   or a.diag02 = b.[ICD-10 Codes]
   or a.diag03 = b.[ICD-10 Codes]

I am not sure why your table name CVD_ICD10 has a suffix $.
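
A note on the query above, offered as a sketch rather than a verified fix: a LEFT JOIN keeps every row of LoadPUS whether or not a code matches, and it repeats a row once per matching code, so it can return more rows than intended. The toy data frames below are invented for illustration (they are not the poster's tables) and mimic the intended filter in plain R. The query string at the end is one way to say the same thing in SQL with DISTINCT and EXISTS; it assumes the table and column names quoted in this thread and has not been run against a server.

```r
# Toy stand-ins for LoadPUS and the code list (all values are made up)
LoadPUS <- data.frame(
  UNIQUE_ID = 1:5,
  diag01 = c("I21", "J45", "I21", "K35", "E11"),
  diag02 = c("I50", "I50", NA,    "E11", "J45"),
  diag03 = c(NA,    NA,    "I21", NA,    NA),
  stringsAsFactors = FALSE
)
cvd_codes <- c("I21", "I50")

# Keep a row if ANY of the three diagnosis columns holds a listed code.
# Each row is tested once, so a row cannot be returned twice.
keep <- LoadPUS$diag01 %in% cvd_codes |
        LoadPUS$diag02 %in% cvd_codes |
        LoadPUS$diag03 %in% cvd_codes
result <- LoadPUS[keep, c("UNIQUE_ID", "diag01")]
result

# The same idea as a query string (assumption: untested against a real server)
query <- paste(
  "SELECT DISTINCT a.UNIQUE_ID, a.diag01",
  "FROM LoadPUS a",
  "WHERE EXISTS (SELECT 1 FROM CVD_ICD10 b",
  "              WHERE b.[ICD-10 Codes] IN (a.diag01, a.diag02, a.diag03))",
  sep = "\n"
)
```

With RODBC the string would then be passed to sqlQuery(channel, query).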




From: Jeff Newmiller <[hidden email]>
To: dthomas <[hidden email]>; [hidden email]
Sent: Tuesday, 10 January 2012, 8:00
Subject: Re: [R] Extracting Data from SQL Server

This is off-topic here. However, you might want to investigate the DISTINCT
keyword in the SQL Server documentation for SELECT.
---
Jeff Newmiller
DCN: <[hidden email]>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

dthomas <[hidden email]> wrote:

>Hi,
>
>I am new to R (and rusty on SQL!) and I'm trying to extract records from a
>SQL Server database. I have a table of patient records (LoadPUS) which has
>three code columns which I want to evaluate against a list of particular
>codes (the CVD_ICD10$ table). Given the size of the patient table, I want to
>restrict the data I pull into R to only the data I want to analyse, so I am
>using SQL to do this. The code I have is as follows:
>
>library(RODBC)
>channel<-odbcConnect("NatCollections")
>query<-"SELECT UNIQUE_ID, diag01 from LoadPUS
>WHERE (diag01 IN (SELECT [ICD-10 Codes] From CVD_ICD10$)) OR (diag02 IN
>(SELECT [ICD-10 Codes] From CVD_ICD10$))
>OR (diag03 IN (SELECT [ICD-10 Codes] From CVD_ICD10$))"
>
>This returns duplicate values. I don't want to hardcode the values because
>it is quite a long list. Running the "IN" condition just for "diag01" returns
>the correct number of records; however, when combining it with another "IN"
>condition it doesn't return the correct number of records. Can you see where
>my SQL is incorrect, or is there another way of doing this?
>
>Much appreciated,
>D
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Extracting-Data-from-SQL-Server-tp4281000p4281000.html
>Sent from the R help mailing list archive at Nabble.com.
>
>__
>[hidden email] mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

Re: [R] Finding percentile of a value from an empirical distribution

2012-01-10 Thread Robert A'gata

Thank you. That's easier than I thought.

On Wed, Jan 11, 2012 at 12:06 AM, Jorge I Velez
 wrote:
> Hi Robert,
>
> Try
>
> set.seed(123)
> x <- seq(100)
> x <- sample(x, 1000, replace = TRUE)
> f <- ecdf(x)
> f(10)
> # [1] 0.099
> f(71)
> # [1] 0.716
>
> See ?ecdf for more information.
>
> HTH,
> Jorge.-
>
>
> On Tue, Jan 10, 2012 at 11:52 PM, Robert A'gata <> wrote:
>>
>> Hello,
>>
>> I am not sure how to do this in R. Any suggestion would be
>> appreciated. I have a vector of values from where I build an empirical
>> CDF. For example:
>>
>> > x <- seq(1,100)
>> > x <- sample(x,1000,replace=T)
>> > quantile(x,probs=seq(0,1,.05))
>>     0%     5%    10%    15%    20%    25%    30%    35%    40%    45%    50%    55%
>>   1.00   5.00  10.00  16.00  20.00  25.00  31.00  36.00  41.00  45.55  50.00  56.00
>>    60%    65%    70%    75%    80%    85%    90%    95%   100%
>>  60.00  65.00  70.00  74.00  80.00  85.00  91.00  95.05 100.00
>>
>> I would like to write a function that takes in a number z and vector x
>> (i.e. the raw vector).It returns percentile of z wrt x. E.g.
>>
>> > f(71,x)
>>
>> Should return something around 0.708 or 0.709. I am wondering if there
>> is any pre-packaged functions that do this in R? If not, how can I
>> write such a function. Any suggestion would be appreciated.
>>
>> Robert
>>

Re: [R] Finding percentile of a value from an empirical distribution

2012-01-10 Thread Jorge I Velez

Hi Robert,

Try

set.seed(123)
x <- seq(100)
x <- sample(x, 1000, replace = TRUE)
f <- ecdf(x)
f(10)
# [1] 0.099
f(71)
# [1] 0.716

See ?ecdf for more information.
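
As a small sketch of what ecdf() computes: the value of the empirical CDF at z is simply the proportion of sample values less than or equal to z. The helper name percentile_of below is made up for illustration. Exact numbers depend on the R version, because the default algorithm behind sample() changed in R 3.6.0, so no output is shown.

```r
set.seed(123)
x <- sample(seq(100), 1000, replace = TRUE)

f <- ecdf(x)   # step function built from the sample

# Hypothetical helper: proportion of sample values that are <= z
percentile_of <- function(z, x) mean(x <= z)

f(71)
percentile_of(71, x)   # same value as f(71)
```

The ecdf() form is convenient when many values of z are looked up against the same sample, since f is vectorised: f(c(10, 71)).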

HTH,
Jorge.-


On Tue, Jan 10, 2012 at 11:52 PM, Robert A'gata wrote:

> Hello,
>
> I am not sure how to do this in R. Any suggestion would be
> appreciated. I have a vector of values from where I build an empirical
> CDF. For example:
>
> > x <- seq(1,100)
> > x <- sample(x,1000,replace=T)
> > quantile(x,probs=seq(0,1,.05))
>     0%     5%    10%    15%    20%    25%    30%    35%    40%    45%    50%    55%
>   1.00   5.00  10.00  16.00  20.00  25.00  31.00  36.00  41.00  45.55  50.00  56.00
>    60%    65%    70%    75%    80%    85%    90%    95%   100%
>  60.00  65.00  70.00  74.00  80.00  85.00  91.00  95.05 100.00
>
> I would like to write a function that takes in a number z and vector x
> (i.e. the raw vector).It returns percentile of z wrt x. E.g.
>
> > f(71,x)
>
> Should return something around 0.708 or 0.709. I am wondering if there
> is any pre-packaged functions that do this in R? If not, how can I
> write such a function. Any suggestion would be appreciated.
>
> Robert
>

Re: [R] Finding percentile of a value from an empirical distribution

2012-01-10 Thread R. Michael Weylandt

Look at ?ecdf 

Michael

On Jan 10, 2012, at 11:52 PM, "Robert A'gata"  wrote:

> Hello,
> 
> I am not sure how to do this in R. Any suggestion would be
> appreciated. I have a vector of values from where I build an empirical
> CDF. For example:
> 
>> x <- seq(1,100)
>> x <- sample(x,1000,replace=T)
>> quantile(x,probs=seq(0,1,.05))
>     0%     5%    10%    15%    20%    25%    30%    35%    40%    45%    50%    55%
>   1.00   5.00  10.00  16.00  20.00  25.00  31.00  36.00  41.00  45.55  50.00  56.00
>    60%    65%    70%    75%    80%    85%    90%    95%   100%
>  60.00  65.00  70.00  74.00  80.00  85.00  91.00  95.05 100.00
> 
> I would like to write a function that takes in a number z and vector x
> (i.e. the raw vector).It returns percentile of z wrt x. E.g.
> 
>> f(71,x)
> 
> Should return something around 0.708 or 0.709. I am wondering if there
> is any pre-packaged functions that do this in R? If not, how can I
> write such a function. Any suggestion would be appreciated.
> 
> Robert
> 

[R] Finding percentile of a value from an empirical distribution

2012-01-10 Thread Robert A'gata

Hello,

I am not sure how to do this in R. Any suggestion would be
appreciated. I have a vector of values from where I build an empirical
CDF. For example:

> x <- seq(1,100)
> x <- sample(x,1000,replace=T)
> quantile(x,probs=seq(0,1,.05))
    0%     5%    10%    15%    20%    25%    30%    35%    40%    45%    50%    55%
  1.00   5.00  10.00  16.00  20.00  25.00  31.00  36.00  41.00  45.55  50.00  56.00
   60%    65%    70%    75%    80%    85%    90%    95%   100%
 60.00  65.00  70.00  74.00  80.00  85.00  91.00  95.05 100.00

I would like to write a function that takes a number z and a vector x
(i.e. the raw vector) and returns the percentile of z with respect to x. E.g.

> f(71,x)

should return something around 0.708 or 0.709. Is there a pre-packaged
function that does this in R? If not, how can I write such a function?
Any suggestion would be appreciated.

Robert


[R] try() with silent=TRUE not preventing printing of error message

2012-01-10 Thread Benjamin Tyner


Hello,

We're curious to know why, on builds lacking Tcl/Tk,

   try(library(tcltk),silent=TRUE)

still allows the message to be printed:

   Error in firstlib(which.lib.loc, package) :
 Tcl/Tk support is not available on this system

though it does succeed in trapping the error condition?
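
One possible explanation, stated here as an assumption rather than something confirmed in this thread: if the code being called already wraps the failing step in its own try() without silent = TRUE, that inner call prints the message before the outer try() ever sees an error. The function below is invented to mimic that situation; it does not reproduce what library() does internally.

```r
# Hypothetical loader: traps a failure with a non-silent try(), then re-signals
load_component <- function() {
  res <- try(stop("support is not available on this system"))  # message printed here
  if (inherits(res, "try-error")) stop("component could not be loaded")
  invisible(TRUE)
}

# The outer try() is silent only about the second error;
# the message from the inner try() has already been printed.
out <- try(load_component(), silent = TRUE)
inherits(out, "try-error")
```

If that is what is happening, the error condition is trapped as expected and only the printing comes from deeper in the call stack.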

Thanks,
Ben


Re: [R] help

2012-01-10 Thread R. Michael Weylandt

That sort of name is allowed but not advised because it can lead to confusion 
in certain non-standard evaluation functions like subset(). If you really want 
the name like that add the check.names = FALSE argument to read.table()
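
A minimal sketch of the difference, using three rows typed inline through the text argument instead of the poster's file:

```r
txt <- "0 0.96 0.21
0 0.45 0.4
1 0.63 0.43"

# Default: names go through make.names(), so "1G" becomes "X1G"
d1 <- read.table(text = txt, header = FALSE,
                 col.names = c("class", "P", "1G"))
names(d1)

# check.names = FALSE keeps the name exactly as given
d2 <- read.table(text = txt, header = FALSE,
                 col.names = c("class", "P", "1G"), check.names = FALSE)
names(d2)

# A non-syntactic name needs backticks (or [[ ]]) when used with $
d2$`1G`
d2[["1G"]]
```

The need for backticks is part of why such names are discouraged.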

Michael Weylandt

On Jan 10, 2012, at 5:57 PM, Anna Olofsson  wrote:

> Thank you! The c was missing. I don't know if it's ok to continue on this
> thread, but I also had another question about reading data. I have this
> file containing 3 columns and 19 rows. 
> 
> 0  0.96  0.21
> 0  0.45  0.4
> 0  0.87  0.1
> 0  0.56  0.04
> 0  0.57  0.04
> 0  0.2   0.7
> 0  0.45  0.43
> 0  0.35  0.21
> 0  0.75  0.56
> 1  0.63  0.43
> 1  0.95  0.32
> 1  0.42  0.2
> 1  0.12  0.05
> 1  0.56  0.06
> 1  0.34  0.3
> 1  0.1   0.7
> 1  0.11  0.75
> 1  0.2   0.21
> 1  0.95  0.37
> 
> I tried to read it into R, but I'm not exactly sure exactly what to use as
> input. This is my input line using read.table:
> 
> data1 <- read.table(file = "filename.txt", header=FALSE, col.names =
> c("class", "P", "1G"))
> 
> but in the output I get an X infront of "1G", which disappears when I run
> it with the name 'G' instead of '1G'. Am I not allowed to use numerical
> values? 
> 
> Best,
> Anna
> 
> 
> 
> 
> On Tue, 10 Jan 2012 23:02:04 +0100, Anna Olofsson 
> wrote:
>> Hi,
>> I'm pretty new at programming and with the R language. I'm just trying to
>> get familiar with R and wrote a script in gedit (should I use emacs
>> instead?),
>> 
>> x <- [10.4  5.6  3.1  6.4 21.7]
>> y <- [12,5.6, 7.2, 1.0, 9.3]
>> plot(x,y)
>> 
>> then I went to the command window in the terminal (I'm using unix) to run
>> this with source("name_of_file"), but it doesn't work. Shouldn't a plot
>> come up automatically when I run it? What am I doing wrong? It knows what x
>> and y is, but I don't get an error of what might be wrong.
>> 
>>> source("name_of_file")
>>> x
>> [1] 10.4  5.6  3.1  6.4 21.7
>> 
>> 
>> Best,
>> Anna
> 

Re: [R] help

2012-01-10 Thread Anna Olofsson

Thank you! The c was missing. I don't know if it's ok to continue on this
thread, but I also had another question about reading data. I have this
file containing 3 columns and 19 rows. 

0  0.96  0.21
0  0.45  0.4
0  0.87  0.1
0  0.56  0.04
0  0.57  0.04
0  0.2   0.7
0  0.45  0.43
0  0.35  0.21
0  0.75  0.56
1  0.63  0.43
1  0.95  0.32
1  0.42  0.2
1  0.12  0.05
1  0.56  0.06
1  0.34  0.3
1  0.1   0.7
1  0.11  0.75
1  0.2   0.21
1  0.95  0.37

I tried to read it into R, but I'm not sure exactly what to use as
input. This is my input line using read.table:

data1 <- read.table(file = "filename.txt", header=FALSE, col.names =
c("class", "P", "1G"))

but in the output I get an X in front of "1G", which disappears when I run
it with the name 'G' instead of '1G'. Am I not allowed to use numerical
values?

Best,
Anna

On Tue, 10 Jan 2012 23:02:04 +0100, Anna Olofsson 
wrote:
> Hi,
> I'm pretty new at programming and with the R language. I'm just trying to
> get familiar with R and wrote a script in gedit (should I use emacs
> instead?),
> 
> x <- [10.4  5.6  3.1  6.4 21.7]
> y <- [12,5.6, 7.2, 1.0, 9.3]
> plot(x,y)
> 
> then I went to the command window in the terminal (I'm using unix) to run
> this with source("name_of_file"), but it doesn't work. Shouldn't a plot
> come up automatically when I run it? What am I doing wrong? It knows what x
> and y is, but I don't get an error of what might be wrong.
> 
>> source("name_of_file")
>> x
> [1] 10.4  5.6  3.1  6.4 21.7
> 
> 
> Best,
> Anna


[R] Problems with constrOptim

2012-01-10 Thread cianders

I am having problems with constrOptim. I am trying to optimize
a function which must be concave. When I put the linear constraints in,
constrOptim always returns the original values of the parameters entered,
after several iterations. However, if I set ui to the zero matrix, it will
optimize the function. Does this imply the constraints are preventing the
algorithm from exploring the parameter space correctly?
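
For comparison, here is a minimal sketch of a call that does move away from its starting value. The objective and constraint are invented for illustration and are not the poster's problem. constrOptim documents the feasible region as ui %*% theta - ci >= 0 and requires the starting value to lie strictly inside it, so checking that condition by hand is a reasonable first step. When maximizing a concave function rather than minimizing, control = list(fnscale = -1) is the usual switch.

```r
# Minimize (x - 2)^2 + (y - 1)^2 subject to x + y <= 2
f <- function(p) (p[1] - 2)^2 + (p[2] - 1)^2
g <- function(p) c(2 * (p[1] - 2), 2 * (p[2] - 1))   # analytic gradient

# x + y <= 2 rewritten as  -x - y - (-2) >= 0
ui <- matrix(c(-1, -1), nrow = 1)
ci <- -2

start <- c(0, 0)
ui %*% start - ci   # must be strictly positive for a valid starting value

fit <- constrOptim(theta = start, f = f, grad = g, ui = ui, ci = ci)
fit$par             # close to the constrained optimum (1.5, 0.5)
```

If the starting value check gives a value at or very near zero for some row, the search starts on the boundary, which is one situation worth ruling out.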


[R] 64bit R under 32bit winxp

2012-01-10 Thread 孟欣

Hi all:


My OS is 32-bit Windows XP, but I want to install 64-bit R 2.14.1.

The following website says "You can also go back and add 64-bit
components to a 32-bit install, or vice versa":
http://cran.r-project.org/bin/windows/rw-FAQ.html#Can-both-32_002d-and-64_002dbit-R-be-installed-on-the-same-machine_003f

Does this mean that I can install and run 64-bit R 2.14.1 under 32-bit
Windows XP? If so, how can I "add 64-bit components to a 32-bit install"?




Many thanks for your help.




My best

Re: [R] different results from fligner.test

2012-01-10 Thread gaiarrido

Yes, it works! I can tell from the degrees of freedom. This is what I
get now:


> g <- interaction(ano, edadysexo, zona, estacion)
> fligner.test(rojos ~ g)

Fligner-Killeen test of homogeneity of variances

data:  rojos by g 


Fligner-Killeen:med chi-squared = 249.7591, df = 87, p-value < 2.2e-16
I hadn't noticed the d.f. before.
I will pay more attention next time.
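
The degrees-of-freedom check mentioned above can be sketched on a dataset that ships with R (warpbreaks, not the poster's data): the test has one degree of freedom fewer than the number of groups, so df = 87 corresponds to 88 groups.

```r
# warpbreaks: 2 wool types x 3 tension levels
g <- interaction(warpbreaks$wool, warpbreaks$tension)
nlevels(g)   # 6 groups

res <- fligner.test(x = warpbreaks$breaks, g = g)
res$parameter   # df = number of groups - 1
```

A df that is smaller than expected is a quick sign that the grouping factor is not the full interaction.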

Thanks a lot,


-
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca

Re: [R] short-hand to avoid use of length() in subsetting vectors?

2012-01-10 Thread Duncan Murdoch


On 12-01-10 6:04 PM, Eric Rupley wrote:


> Hi--
>
> I suspect this is a frequently considered (and possibly asked) question, but I
> haven't thus far found an answer:
>
> For slicing a vector with x[…], is there a symbol for length(x)?


No, there isn't.  You don't need it much, but if you do you'll have to 
calculate it.  You'll often save a noticeable amount of execution time 
if you store it in a local variable rather than calling length(x) 
repeatedly.
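
Not part of the reply above, but as a sketch of two ways to write the example from the question: head() accepts a negative n and then drops the last |n| elements, and the length can be stored once and reused.

```r
test.frame <- data.frame(matrix(1:100, ncol = 10, nrow = 10, byrow = TRUE))
names(test.frame) <- names(islands)[1:10]

x <- test.frame$Baffin

# Explicit form from the question
a <- x[1:(length(x) - 3)]

# head() with a negative n drops the last 3 elements
b <- head(x, -3)

# Storing the length once and reusing it
n <- length(x)
d <- x[seq_len(n - 3)]

identical(a, b) && identical(a, d)
```

tail(x, -3) works the same way from the other end, dropping the first three elements.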


Duncan Murdoch



> I'm seeking a short-hand for referring to the length of the vector one is
> trying to index.
>
> E.g., for a data.frame, say,
>
> test.frame<-data.frame(matrix(c(1:100),ncol=10,nrow=10,byrow=T))
> names(test.frame)<- names(islands)[1:10]
>
> is there a short-hand for subsetting
>
> test.frame$Baffin[1:(length(test.frame$Baffin)-3)]
> [1]  6 16 26 36 46 56 66
>
> that would allow one to avoid the "(length(some.dataframe$variable)-offset)"?
>
> I was thinking something paralleling the use of negative indices in […] might
> exist with seq(from,to), e.g. for the above
>
> test.frame$Baffin[seq(,7)]
> [1]  6 16 26 36 46 56 66
>
> works.  But the fantasy
>
> test.frame$Baffin[seq(,-3)]
>
> obviously doesn't…
>
> Any suggestions will be gratefully appreciated…
>
> As always, many thanks to the patient list members who help with these simple
> questions...
>
> Best,
> Eric
>
> --
>   Eric Rupley
>   University of Michigan, Museum of Anthropology
>   1109 Geddes Ave, Rm. 4013
>   Ann Arbor, MI 48109-1079
>
>   erup...@umich.edu
>   +1.734.276.8572


Re: [R] runif with condition

2012-01-10 Thread Carl Witthoft


AlanM said,
"Consider the pair {X, 1-X} where X is sampled from a uniform(0,1) 
distribution. The quantity 1- X also comes from a uniform(0,1) 
distribution and therefore is probabilistic and not deterministic. The 
sum of independent random variables is itself a random variable. If X1, 
X2 & X3 are uniformly distributed, then the distribution of Y = X1 + X2 
+ X3 can be determined (i.e. Y is probabilistic and NOT deterministic). 
Y is a random variable, but it is correlated with X1, X2 and X3. The set 
{X1, X2, X3, 100 - (X1 + X2 + X3) } contains 4 random variables, however 
they are neither independent or identically distributed.


 If you are curious, check this out.

Deriving the Probability Density for Sums of Uniform Random Variables 
Edward J. Lusk and Haviland Wright

The American Statistician
Vol. 36, No. 2 (May, 1982), pp. 128-130"
(endquote)


Be *very* careful about how you state the problem.  The quantity (1-X), 
once you know the value of X, is deterministic.  The confusion arises, I 
think, in that if you do NOT know the value of X, then both X and (1-X) 
are unknown and probabilistic.  But, rather like entangled photons :-), 
once you know one of the values, you immediately know the other.
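
A short simulation sketch of that point: taken alone, 1 - X looks just as uniform as X, but the pair carries only one random quantity.

```r
set.seed(1)
x <- runif(10000)
y <- 1 - x

# Marginally, y behaves like a uniform(0, 1) sample, just as x does
c(mean_x = mean(x), mean_y = mean(y))

# Jointly the pair is degenerate: y is an exact function of x
cor(x, y)                              # -1 up to floating point
all.equal(x + y, rep(1, length(x)))    # the sum is always 1
```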


I'll sign off with the reminder that, back when Marilyn vos Savant first 
published the Monty Hall problem, a number of people with PhDs in math, 
some of whom claimed to be statisticians, angrily supported the 
incorrect answer.


Carl

--

Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."


[R] Vegan(ordistep) error: Error in if (aod[1, 5] <= Pin) { : missing value where TRUE/FALSE needed

2012-01-10 Thread Nevil Amos

I am getting the following error message in ordistep. I run ordistep in a
loop over a number of similarly structured datasets, and the message
only occurs for some of the datasets.

I cannot include a reproducible sample: the specific datasets where this
is occurring are fairly large and there are several pcnm terms in the rhs of
the formula.

Thanks for any pointers that may allow me to track down the cause of the
error.
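
The error text itself is what R reports whenever an if() condition evaluates to NA, so the p-value being compared with Pin was missing for those datasets; why it was missing is not established here. As a sketch of how the failing datasets could be identified inside the loop, the base R code below wraps each fit in tryCatch(). The function fit_one and the inputs are made up and stand in for the real ordistep call.

```r
# Reproduce the error text: an NA condition inside if()
p_value <- NA_real_
msg <- tryCatch(
  if (p_value <= 0.05) "enter" else "stay",
  error = function(e) conditionMessage(e)
)
msg

# Hypothetical stand-in for the per-dataset model fit; fails on NA input
fit_one <- function(p) if (p <= 0.05) "enter" else "stay"

datasets <- list(a = 0.01, b = NA_real_, c = 0.20)   # made-up inputs
results <- lapply(datasets, function(d) {
  tryCatch(fit_one(d),
           error = function(e) paste("failed:", conditionMessage(e)))
})

# Names of the datasets whose fit raised an error
failed <- names(results)[grepl("^failed:", unlist(results))]
failed
```

With the failing datasets isolated, their models can be inspected one at a time, for example by looking at which terms were aliased.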


Error in if (aod[1, 5] <= Pin) { : missing value where TRUE/FALSE needed
In addition: There were 50 or more warnings (use warnings() to see the
first 50)
traceback()
9: ordistep(myrda0, scope = formula(myrda1), direction = "both",
       Pin = 0.05, Pout = 0.1) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
8: eval(expr, envir, enclos) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
7: eval(expr, pf) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
6: withVisible(eval(expr, pf)) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
5: evalVis(expr) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
4: capture.output(ordistep(myrda0, scope = formula(myrda1), direction = "both",
       Pin = 0.05, Pout = 0.1)) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
3: eval.with.vis(expr, envir, enclos)
2: eval.with.vis(ei, envir)
1: source("~/Documents/Dropbox/thesis/CH3/Analysis/RDAPARTIALSexandAgeConnectandGEOGraphy2.R")




print(myrda1)
Call: rda(formula = mygenind@tab ~ pcnmTRE_25_100_CS25 + pcnmTRE_25_10_CS25
+ pcnmTRE_25_2_CS25 +
pcnmTRE_25_5_CS25 + mydata$TreeCov + mydata$Hab_Config +
pcnmEYR_EO_100_CS25 +
pcnmEYR_EO_5000_CS25 + pcnmEYR_TH_10_CS25 + pcnmEYR_TH_2_CS25 +
mydata$Site_No + mydata$Landscape
+ Condition(pcnmCS_NULL + mydata$LAT.x + mydata$LONG.x), na.action =
"na.omit")

              Inertia Proportion Rank
Total          1.8110 1.
Conditional    0.8681 0.4793   32
Constrained    0. 0.0
Unconstrained  0.9429 0.5207   29
Inertia is variance
Some constraints were aliased because they were collinear (redundant)

Eigenvalues for unconstrained axes:
    PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8
0.16008 0.14733 0.12183 0.09054 0.07380 0.06971 0.05578 0.04215
(Showed only 8 of all 29 unconstrained eigenvalues)


-- 
Nevil Amos
Molecular Ecology Research Group
Australian Centre for Biodiversity
Monash University
CLAYTON VIC 3800
Australia


Re: [R] How to make this for() loop memory efficient?

2012-01-10 Thread Steve Lianoglou

Yeah -- just fired off an apology email before this landed in my inbox.

Sometimes I'm better off not trying to help at all -- this was one of
those cases ;-)

Whatever I was trying to do clearly was going down the wrong trail

Thankfully, you're on top of it though.

Sorry for the spam,
-steve

On Tue, Jan 10, 2012 at 7:33 PM, Ray Brownrigg
 wrote:
> Steve:
>
> I don't understand why you couldn't get the original code working.  You just
> have to notice that one comment overflows its line.
>
> However I couldn't get your code to match the output of the original -
> almost, but not quite!
>
> Ray
>
> On Wed, 11 Jan 2012, Steve Lianoglou wrote:
>> I'm having a really difficult time understanding what you're trying to
>> get -- copy and pasting your code is failing to run, and your question
>> isn't clear, ie:
>>
>> "For each phone call that BEGINS with the module which is denoted by 81
>> (i.e. of the form 81X,XXX), what is the expected number of modules in these
>> calls?"
>>
>> How does one calculate the expected number of "modules" in this
>> module? What does that even mean?
>>
>> Anyway, here's some using your `data` data.frame that calculates the
>> number of unique calls and other statistics on the "call id" within
>> each module prefix. I'm using both data.table and plyr ... there are
>> no for loops.
>>
>> You will want to do `whatever it is you really want to do` inside the
>> "blocks" below.
>>
>> ## R code
>> data <- transform(data, module.prefix=substring(modules, 1, 2))
>>
>> ## take a look at `data` now
>>
>> ## calculate "stuff" inside each module.prefix using data.table
>> library(data.table)
>> xx <- data.table(data, key="module.prefix")
>>
>> ans <- xx[, {
>>   ## the columns of the particular subset of your data.table
>>   ## are "injected" into the scope for this expression block
>>   ## which is where the `calls` variable below comes from
>>   tabled <- table(as.character(calls))
>>   list(unique.calls=length(tabled), min=min(tabled),
>> median=as.numeric(median(tabled)), max=max(tabled))
>>   ## you will want to return your own list of "stuff"
>> }, by='module.prefix']
>>
>>
>> ## with plyr
>> library(plyr)
>> ans <- ddply(data, "module.prefix", function(x) {
>>   ## `x` is a data.frame that all share the same module.prefix
>>   ## do whatever you want with it here
>>   tabled <- table(as.character(x$calls))
>>   c(unique.calls=length(tabled), min=min(tabled),
>> median=median(tabled), max=max(tabled))
>> })
>>
>> You'll have to read up on the particulars of data.table and plyr. Both
>> are really powerful packages ... you should get familiar with at least
>> one.
>>
>> plyr is a bit more flexible in some ways.
>>
>> data.table is a bit more strict (cf. the need for
>> `as.numeric(median(tabled))`), but also tends to be (much) faster when
>> working over large datasets
>>
>> HTH,
>> -steve
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] How to make this for() loop memory efficient?

2012-01-10 Thread Steve Lianoglou

Let me just reply to myself.

Sorry, it's funny how much I don't get this, but it appears Ray is
following you and provides an answer -- scratch my email, it seems to
be way off

(you should still learn plyr and/or data.table if you haven't yet, tho ;-)

Apologies,
-steve

On Tue, Jan 10, 2012 at 7:18 PM, Steve Lianoglou
 wrote:
> I'm having a really difficult time understanding what you're trying to
> get -- copy and pasting your code is failing to run, and your question
> isn't clear, ie:
>
> "For each phone call that BEGINS with the module which is denoted by 81
> (i.e. of the form 81X,XXX), what is the expected number of modules in these
> calls?"
>
> How does one calculate the expected number of "modules" in this
> module? What does that even mean?
>
> Anyway, here's some using your `data` data.frame that calculates the
> number of unique calls and other statistics on the "call id" within
> each module prefix. I'm using both data.table and plyr ... there are
> no for loops.
>
> You will want to do `whatever it is you really want to do` inside the
> "blocks" below.
>
> ## R code
> data <- transform(data, module.prefix=substring(modules, 1, 2))
>
> ## take a look at `data` now
>
> ## calculate "stuff" inside each module.prefix using data.table
> library(data.table)
> xx <- data.table(data, key="module.prefix")
>
> ans <- xx[, {
>  ## the columns of the particular subset of your data.table
>  ## are "injected" into the scope for this expression block
>  ## which is where the `calls` variable below comes from
>  tabled <- table(as.character(calls))
>  list(unique.calls=length(tabled), min=min(tabled),
> median=as.numeric(median(tabled)), max=max(tabled))
>  ## you will want to return your own list of "stuff"
> }, by='module.prefix']
>
>
> ## with plyr
> library(plyr)
> ans <- ddply(data, "module.prefix", function(x) {
>  ## `x` is a data.frame that all share the same module.prefix
>  ## do whatever you want with it here
>  tabled <- table(as.character(x$calls))
>  c(unique.calls=length(tabled), min=min(tabled),
> median=median(tabled), max=max(tabled))
> })
>
> You'll have to read up on the particulars of data.table and plyr. Both
> are really powerful packages ... you should get familiar with at least
> one.
>
> plyr is a bit more flexible in some ways.
>
> data.table is a bit more strict (cf. the need for
> `as.numeric(median(tabled))`), but also tends to be (much) faster when
> working over large datasets
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] How to make this for() loop memory efficient?

2012-01-10 Thread Ray Brownrigg

Steve:

I don't understand why you couldn't get the original code working.  You just
have to notice that one comment overflows its line.

However I couldn't get your code to match the output of the original - almost,
but not quite!

Ray

On Wed, 11 Jan 2012, Steve Lianoglou wrote:
> I'm having a really difficult time understanding what you're trying to
> get -- copy and pasting your code is failing to run, and your question
> isn't clear, ie:
> 
> "For each phone call that BEGINS with the module which is denoted by 81
> (i.e. of the form 81X,XXX), what is the expected number of modules in these
> calls?"
> 
> How does one calculate the expected number of "modules" in this
> module? What does that even mean?
> 
> Anyway, here's some using your `data` data.frame that calculates the
> number of unique calls and other statistics on the "call id" within
> each module prefix. I'm using both data.table and plyr ... there are
> no for loops.
> 
> You will want to do `whatever it is you really want to do` inside the
> "blocks" below.
> 
> ## R code
> data <- transform(data, module.prefix=substring(modules, 1, 2))
> 
> ## take a look at `data` now
> 
> ## calulate "stuff" inside each module.prefix using data.table
> xx <- data.table(data, key="module.prefix")
> 
> ans <- xx[, {
>   ## the columns of the particular subset of your data.table
>   ## are "injected" into the scope for this expression block
>   ## which is where the `calls` variable below comes from
>   tabled <- table(as.character(calls))
>   list(unique.calls=length(tabled), min=min(tabled),
> median=as.numeric(median(tabled)), max=max(tabled))
>   ## you will want to return your own list of "stuff"
> }, by='module.prefix']
> 
> 
> ## with plyr
> library(plyr)
> ans <- ddply(data, "module.prefix", function(x) {
>   ## `x` is a data.frame that all share the same module.prefix
>   ## do whatever you want with it here
>   tabled <- table(as.character(x$calls))
>   c(unique.calls=length(tabled), min=min(tabled),
> median=median(tabled), max=max(tabled))
> })
> 
> You'll have to read up on the particulars of data.table and plyr. Both
> are really powerful packages ... you should get familiar with at least
> one.
> 
> plyr is a bit more flexible in some ways.
> 
> data.table is a bit more strict (cf. the need for
> `as.numeric(median(tabled))`), but also tends to be (much) faster when
> working over large datasets
> 
> HTH,
> -steve


Re: [R] How to make this for() loop memory efficient?

2012-01-10 Thread Ray Brownrigg

On Wed, 11 Jan 2012, Ray Brownrigg wrote:
> On Wed, 11 Jan 2012, iliketurtles wrote:
> > ##I have 2 columns of data. The first column is unique "event IDs" that
> > represent a phone call made to a customer.
> > ###So, if you see 3 entries together in the first column like follows:
> > 
> > matrix(c("call1a","call1a","call1a") )
> > 
> > ##then this means that this particular phone call  (the first call that's
> > logged in the data set) was transferred
> > ##between 3 different "modules" before the call was terminated.
> > 
> > ##The second column is a numerical description of the module the call
> > started with and then got transferred to prior to ##call termination.
> > Now, I'll construct a ##representative array of the type of data I'm
> > dealing with (the real data set goes ##on for X00,000s of rows):
> > ##(Ignore how I construct the following array, it’s completely unrelated
> > to how the actual data set was constructed).
> > 
> > 
> > a<-sapply(1:50,function(i){paste("call",i,sep="",collapse="")})
> > development.a<-seq(1,40,3)
> > development.a2<-seq(1,40,5)
> > a[development.a]<-a[development.a+1]
> > a[development.a2]<-a[development.a2+1]
> > a[1:2]<-"call2a";a[3]<-"call3a";a[4:5]<-"call5a";a[6:8]<-"call8a";a[9]<-"call9a"
> > b<-c(920010,960010,820009,920010,960500,970050,930010,920010,960500,970050,
> > 930900,870010,840010,960500,920010,970050,930010,960500,920010,970050,
> > 930010,960010,920010,940010,960010,970010,960500,920010,970050,930010,
> > 960500,920010,970050,930010,960500,920010,970050,930010,920010,960500,
> > 970050,930010,920009,960500,970050,930009,940010,960500,960500,960500)
> > data<-as.data.frame(cbind(a,b))
> > colnames(data)<-c("phone calls","modules")
> > dim(data)
> > print(data[1:10,]) #sample of 10 rows
> > 
> > # Note that in the real data set, data[,2] ranges from 810,000 to
> > 999,999. I've been tasked with the following:
> > # "For each phone call that BEGINS with the module which is denoted by 81
> > (i.e. of the form 81X,XXX), what is the expected number of modules in
> > these calls?"
> > #Then it's the same question for each module beginning with 82, 83,
> > 84. all the way until 99.
> > #I've created code that I think works for this, but I can't actually run
> > it on the whole data set. I left it for 30 minutes and it only had about
> > #5% of the task completed (I clicked "STOP" then checked my output to
> > see if I did it properly, and it seems correct).
> > #I know the apply() family specializes in vector operations, but I can't
> > figure out how to complete the above question in any way other than
> > #loops.
> > 
> > L<-data
> > 
> > A<-array(0,dim=c(19,2));rownames(A)<-seq(81,99,1)
> > A<-data.frame(A)
> > 
> >  for(i in 1:(nrow(L)-1))
> >  {
> >   if(L[(i+1),1]!=L[i,1])
> >   {
> >    #aggregate number of modules in the calls that begin with XX (not yet averaged).
> >    A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),1]<- {
> >     A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),1]+length(grep(as.character(L[i+1,1]),L[,1],value=FALSE))
> >    }
> >
> >    A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),2]<- {
> >     A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),2]+1 }
> >   }
> >
> >  }
> > 
> > #If I can get this code to be more memory efficient such that I can do it
> > on a 400,000 row data set, I can do, for example,
> > 
> > A[17,1]/A[17,2]
> > 
> > #and I'll arrive at the mean number of modules per call where the call
> > starts with a module that starts with 97.
> > 
> > A[17,1]
> > #is 10, which means that, out of every single call that started with a
> > module of 97X,XXX,
> > #they went through 10 modules in total.
> > 
> > A[17,2]
> > #is 6, which means that there was 6 calls in total that began with a
> > 97X,XXX module.
> > 
> > #Hence,
> > 
> > 
> > A[17,1]/A[17,2]
> > 
> > #is the average number of modules that were executed in all the calls
> > that began with a 97X,XXX module.
> > 
> > 
> > -
> > 
> > 
> > Isaac
> > Research Assistant
> > Quantitative Finance Faculty, UTS
> 
> I don't see any need for you to use data frames.
> 
> If you make A and data (not a good use of a variable name) just matrices,
> you get the same answers at about 10 times the speed (using your example).
> 
Further, you should calculate your rowname, namely:
paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse="")
only once per loop iteration, instead of 4 times. This saves another 25-30% of CPU time.

And you can combine the two updates into a single assignment.

So using the code:

L <- as.matrix(data)
A <- array(0, dim=c(19, 2)); rownames(A) <- seq(81, 99, 1)
# A <- data.frame(A)

 for(i in 1:(nrow(L)-1))
 {
  if(L[(i+1),1]!=L[i,1])
  {
  myrow <- paste(strsplit(as.character(L[i+1, 2]), "")[[1]][1:2], sep="",
collapse="")
  A[myrow, ] <- A[myrow, ] +
c(length(g

Re: [R] How to make this for() loop memory efficient?

2012-01-10 Thread Steve Lianoglou

I'm having a really difficult time understanding what you're trying to
get -- copy and pasting your code is failing to run, and your question
isn't clear, ie:

"For each phone call that BEGINS with the module which is denoted by 81
(i.e. of the form 81X,XXX), what is the expected number of modules in these
calls?"

How does one calculate the expected number of "modules" in this
module? What does that even mean?

Anyway, here's some code using your `data` data.frame that calculates the
number of unique calls and other statistics on the "call id" within
each module prefix. I'm using both data.table and plyr ... there are
no for loops.

You will want to do `whatever it is you really want to do` inside the
"blocks" below.

## R code
library(data.table)

## NOTE: assumes the call-ID column of `data` is named `calls`
## (the original post names it "phone calls")
data <- transform(data, module.prefix=substring(modules, 1, 2))

## take a look at `data` now

## calculate "stuff" inside each module.prefix using data.table
xx <- data.table(data, key="module.prefix")

ans <- xx[, {
  ## the columns of the particular subset of your data.table
  ## are "injected" into the scope for this expression block
  ## which is where the `calls` variable below comes from
  tabled <- table(as.character(calls))
  list(unique.calls=length(tabled), min=min(tabled),
       median=as.numeric(median(tabled)), max=max(tabled))
  ## you will want to return your own list of "stuff"
}, by='module.prefix']


## with plyr
library(plyr)
ans <- ddply(data, "module.prefix", function(x) {
  ## `x` is a data.frame whose rows all share the same module.prefix
  ## do whatever you want with it here
  tabled <- table(as.character(x$calls))
  c(unique.calls=length(tabled), min=min(tabled),
    median=median(tabled), max=max(tabled))
})

You'll have to read up on the particulars of data.table and plyr. Both
are really powerful packages ... you should get familiar with at least
one.

plyr is a bit more flexible in some ways.

data.table is a bit more strict (cf. the need for
`as.numeric(median(tabled))`), but also tends to be (much) faster when
working over large datasets.

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


[R] Rpad.org down? Searching latest R-Reference card

2012-01-10 Thread Jonas Stein

Hi,

I'd like to update my R-Reference card and commit some edits,
but I could not get the source from rpad.org.

Did it move?

kind regards,

-- 
Jonas Stein 


Re: [R] short-hand to avoid use of length() in subsetting vectors?

2012-01-10 Thread Nordlund, Dan (DSHS/RDA)

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Steve Lianoglou
> Sent: Tuesday, January 10, 2012 3:28 PM
> To: Eric Rupley
> Cc: r-help@r-project.org
> Subject: Re: [R] short-hand to avoid use of length() in subsetting
> vectors?
> 
> Hi,
> 
> On Tue, Jan 10, 2012 at 6:04 PM, Eric Rupley  wrote:
> >
> > Hi--
> >
> > I suspect this is a frequently considered (and possibly asked)
> question, but I haven't thus far found an answer:
> >
> >        For slicing a vector with x[…], is there a symbol for
> length(x)?
> >
> > I'm seeking a short-hand for referring to the length of vector one is
> trying to index.
> >
> > E.g., for a data.frame, say,
> >
> >> test.frame <-data.frame(matrix(c(1:100),ncol=10,nrow=10,byrow=T))
> >> names(test.frame) <- names(islands)[1:10]
> >
> > is there a short-hand for subsetting
> >
> >> test.frame$Baffin[1:(length(test.frame$Baffin)-3)]
> > [1]  6 16 26 36 46 56 66
> >>
> >>
> >
> > that would allow one to avoid the "(length(some.dataframe$variable)-
> offset)"?
> 
> Sadly there is no shorthand of the (exact) type you are looking for.
> 
> You can access the data like so, however:
> 
> R> head(test.frame$Baffin, -3)
> [1]  6 16 26 36 46 56 66
> 
> I think that's as good as you're going to get, but let's see what
> other suggestions pop up.
> 
> -steve
> 

Steve is probably right, but here is one suggestion.  Since every column of a
data frame is the same length, you could try something like

test.frame$Baffin[1:(nrow(test.frame)-3)]

However, it is only a little bit shorter.


Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


Re: [R] How to make this for() loop memory efficient?

2012-01-10 Thread Ray Brownrigg

On Wed, 11 Jan 2012, iliketurtles wrote:
> ##I have 2 columns of data. The first column is unique "event IDs" that
> represent a phone call made to a customer.
> ###So, if you see 3 entries together in the first column like follows:
> 
> matrix(c("call1a","call1a","call1a") )
> 
> ##then this means that this particular phone call  (the first call that's
> logged in the data set) was transferred
> ##between 3 different "modules" before the call was terminated.
> 
> ##The second column is a numerical description of the module the call
> started with and then got transferred to prior to ##call termination. Now,
> I'll construct a ##representative array of the type of data I'm dealing
> with (the real data set goes ##on for X00,000s of rows):
> ##(Ignore how I construct the following array, it’s completely unrelated to
> how the actual data set was constructed).
> 
> 
> a<-sapply(1:50,function(i){paste("call",i,sep="",collapse="")})
> development.a<-seq(1,40,3)
> development.a2<-seq(1,40,5)
> a[development.a]<-a[development.a+1]
> a[development.a2]<-a[development.a2+1]
> a[1:2]<-"call2a";a[3]<-"call3a";a[4:5]<-"call5a";a[6:8]<-"call8a";a[9]<-"call9a"
> b<-c(920010,960010,820009,920010,960500,970050,930010,920010,960500,970050,
> 930900,870010,840010,960500,920010,970050,930010,960500,920010,970050,
> 930010,960010,920010,940010,960010,970010,960500,920010,970050,930010,
> 960500,920010,970050,930010,960500,920010,970050,930010,920010,960500,
> 970050,930010,920009,960500,970050,930009,940010,960500,960500,960500)
> data<-as.data.frame(cbind(a,b))
> colnames(data)<-c("phone calls","modules")
> dim(data)
> print(data[1:10,]) #sample of 10 rows
> 
> # Note that in the real data set, data[,2] ranges from 810,000 to 999,999.
> I've been tasked with the following:
> # "For each phone call that BEGINS with the module which is denoted by 81
> (i.e. of the form 81X,XXX), what is the expected number of modules in these
> calls?"
> #Then it's the same question for each module beginning with 82, 83, 84.
> all the way until 99.
> #I've created code that I think works for this, but I can't actually run it
> on the whole data set. I left it for 30 minutes and it only had about #5%
> of the task completed (I clicked "STOP" then checked my output to see if I
> did it properly, and it seems correct).
> #I know the apply() family specializes in vector operations, but I can't
> figure out how to complete the above question in any way other than #loops.
> 
> L<-data
> 
> A<-array(0,dim=c(19,2));rownames(A)<-seq(81,99,1)
> A<-data.frame(A)
> 
>  for(i in 1:(nrow(L)-1))
>  {
>   if(L[(i+1),1]!=L[i,1])
>   {
>    #aggregate number of modules in the calls that begin with XX (not yet averaged).
>    A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),1]<- {
>     A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),1]+length(grep(as.character(L[i+1,1]),L[,1],value=FALSE))
>    }
>
>    A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),2]<- {
>     A[paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse=""),2]+1 }
>   }
>
>  }
> 
> #If I can get this code to be more memory efficient such that I can do it
> on a 400,000 row data set, I can do, for example,
> 
> A[17,1]/A[17,2]
> 
> #and I'll arrive at the mean number of modules per call where the call
> starts with a module that starts with 97.
> 
> A[17,1]
> #is 10, which means that, out of every single call that started with a
> module of 97X,XXX,
> #they went through 10 modules in total.
> 
> A[17,2]
> #is 6, which means that there was 6 calls in total that began with a
> 97X,XXX module.
> 
> #Hence,
> 
> 
> A[17,1]/A[17,2]
> 
> #is the average number of modules that were executed in all the calls that
> began with a 97X,XXX module.
> 
> 
> -
> 
> 
> Isaac
> Research Assistant
> Quantitative Finance Faculty, UTS

I don't see any need for you to use data frames.

If you make A and data (not a good use of a variable name) just matrices, you 
get the same 
answers at about 10 times the speed (using your example).
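The loop can probably be avoided altogether. What follows is only a minimal sketch, not code from the original post. It assumes that the rows of each call are contiguous and that module codes are six-digit numbers; `calls`, `modules`, `runs`, `first`, `prefix` and `res` are made-up names on toy data. Unlike the original loop, it also counts the first call in the data set.

```r
## toy data: one row per module visited, rows of a call are contiguous
calls   <- c("c1", "c1", "c1", "c2", "c3", "c3")
modules <- c(920010, 960010, 820009, 970050, 930010, 920010)

runs   <- rle(calls)                             # one run per call
first  <- cumsum(c(1, head(runs$lengths, -1)))   # row where each call starts
prefix <- modules[first] %/% 10000               # leading two digits of the first module

## mean number of modules per call, grouped by the starting prefix
res <- tapply(runs$lengths, prefix, mean)
res
```

Here `rle()` gives the number of modules per call in one pass, so nothing is searched repeatedly with grep().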

Hope this helps,
Ray Brownrigg


Re: [R] short-hand to avoid use of length() in subsetting vectors?

2012-01-10 Thread Steve Lianoglou

Hi,

On Tue, Jan 10, 2012 at 6:04 PM, Eric Rupley  wrote:
>
> Hi--
>
> I suspect this is a frequently considered (and possibly asked) question, but 
> I haven't thus far found an answer:
>
>        For slicing a vector with x[…], is there a symbol for length(x)?
>
> I'm seeking a short-hand for referring to the length of vector one is trying 
> to index.
>
> E.g., for a data.frame, say,
>
>> test.frame <-data.frame(matrix(c(1:100),ncol=10,nrow=10,byrow=T))
>> names(test.frame) <- names(islands)[1:10]
>
> is there a short-hand for subsetting
>
>> test.frame$Baffin[1:(length(test.frame$Baffin)-3)]
> [1]  6 16 26 36 46 56 66
>>
>>
>
> that would allow one to avoid the "(length(some.dataframe$variable)-offset)"?

Sadly there is no shorthand of the (exact) type you are looking for.

You can access the data like so, however:

R> head(test.frame$Baffin, -3)
[1]  6 16 26 36 46 56 66

I think that's as good as you're going to get, but let's see what
other suggestions pop up.
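For reference, a small sketch of how negative `n` behaves in head() and tail(); the vector `x` below is just a stand-in with the same values as test.frame$Baffin.

```r
x <- seq(6, 96, by = 10)   # stand-in for test.frame$Baffin

head(x, -3)   # everything except the last 3 elements
tail(x, -3)   # everything except the first 3 elements
head(x, 3)    # the first 3 elements
```

Both functions work on vectors and on the rows of a data frame, so the same idiom covers either case.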

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Can "prototype" and "initialize" coexist?

2012-01-10 Thread Martin Morgan

On 01/10/2012 11:47 AM, Keith Weintraub wrote:

Folks,

My object oriented background is in Java and C++. I am a novice to using 
S4/object-oriented coding in R but not to R. Below is an example that I found 
that I have expanded on.

I am not getting how "prototype" and "initialize" work together, if at all.

Here is output from a short "session"

setClass("xx",

+  representation(a="numeric", b ="numeric"),
+  prototype(a=33, b = 333)
+ )
[1] "xx"

#setMethod("initialize", "xx", function(.Object){.Object})

setMethod("initialize", "xx",

+   function(.Object, b) {
+ .Object@b<-b
+ .Object@a<-b^2
+ .Object
+   }
+ )
[1] "initialize"

new("xx", 3)

An object of class "xx"
Slot "a":
[1] 9

Slot "b":
[1] 3

new("xx")

Error in .local(.Object, ...) : argument "b" is missing, with no default

Hi Keith --

This might be a good place to start, setting aside prototype for the moment.

You'd like new("xx") to work. This means that all arguments to 
initialize need to have a default value. You'd also like new("xx", 
new("xx")) to work, because it's advertised as a copy constructor (a 
slightly loose interpretation of ?new, which says for the ... argument 
'Unnamed arguments must be objects from classes that this class 
extends'). An appropriate initialize method is

setMethod(initialize, "xx", function(.Object, ..., b=2) {
    callNextMethod(.Object, ..., b=b)
})

For instance,

> setClass("xx", representation(a="numeric", b="numeric"))
> new("xx")
An object of class "xx"
Slot "a":
numeric(0)

Slot "b":
[1] 2

> new("xx", new("xx", a=1))
An object of class "xx"
Slot "a":
[1] 1

Slot "b":
[1] 2

> new("xx", new("xx", a=1), a=2, b=3)
An object of class "xx"
Slot "a":
[1] 2

Slot "b":
[1] 3

With the basic initializer out of the way, you could introduce a prototype

setClass("xx",
    representation(a="numeric", b="numeric"),
    prototype(a=-1, b=-2))

> new("xx")
An object of class "xx"
Slot "a":
[1] -1

Slot "b":
[1] 2

which shows that the prototype and initializer are playing well together 
(for slot 'a') and that the initialize method is over-riding the 
prototype (for slot 'b').

One could provide a default value to b in the initializer that is 
derived from the prototype, which is used to populate .Object

setMethod(initialize, "xx", function(.Object, ..., b=.Object@b) {
    callNextMethod(.Object, ..., b=b)
})

with

> new("xx", a=1)
An object of class "xx"
Slot "a":
[1] 1

Slot "b":
[1] -2

e.g., maybe you want to check user input and the check is expensive...

setMethod(initialize, "xx", function(.Object, ..., b=.Object@b) {
    if (!missing(b))
        ## check user input, expensive
        Sys.sleep(2)
    callNextMethod(.Object, ..., b=b)
})

Maybe a slightly more common case is to provide a prototype for one slot 
(e.g., for internal business, not the user) with initialize taking care 
of others.

It is often the case that you'd like a constructor

xx <- function(a=-1, b=-2, ...)
    new("xx", a=a, b=b, ...)

which is nicer for the end user, documents the necessary arguments, 
frees one to separately implement the constructor vs. copy-constructor, 
etc. Neither prototype nor initialize method would be implemented here, 
leaving all the cleverness to the defaults.
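A self-contained sketch of that constructor pattern, using a hypothetical class "Point" (not part of the thread) so it does not interfere with the "xx" definitions above:

```r
library(methods)

## hypothetical class, for illustration only: no prototype, no initialize method
setClass("Point", representation(a="numeric", b="numeric"))

## plain-function constructor: the defaults live in its signature
Point <- function(a=-1, b=-2, ...)
    new("Point", a=a, b=b, ...)

p <- Point(b=5)
p@a   # default taken from the constructor
p@b   # value supplied by the caller
```

The user only ever calls Point(), so the defaults are visible in one place and new() keeps its default behaviour.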

One other thing is that the prototype must create a valid object; this 
prototype allows R to produce invalid instances:

setClass("Abs", representation(x="numeric"),
    prototype(x=-1),
    validity=function(object) if (any(object@x < 0)) "oops" else TRUE)

> (a = new("Abs"))
An object of class "Abs"
Slot "x":
[1] -1

> validObject(a)
Error in validObject(a) : invalid class "Abs" object: oops
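A short sketch of why the invalid instance slips through: as far as I know, the default initialize method only runs validObject() when slots (or other arguments) are supplied to new(), so a bare new("Abs") returns the prototype unchecked. The variable `rejected` is just an illustrative name.

```r
library(methods)

setClass("Abs", representation(x="numeric"),
    prototype(x=-1),
    validity=function(object) if (any(object@x < 0)) "oops" else TRUE)

## no slots supplied: the prototype comes back without a validity check
a <- new("Abs")

## a slot supplied: the default initialize method calls validObject()
rejected <- tryCatch({ new("Abs", x=-1); FALSE },
                     error=function(e) TRUE)
rejected
```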

Not sure whether that helps, or is too much information...

Martin

Any help you can provide would be greatly appreciated,
Thanks,
KW


--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


[R] short-hand to avoid use of length() in subsetting vectors?

2012-01-10 Thread Eric Rupley


Hi--

I suspect this is a frequently considered (and possibly asked) question, but I 
haven't thus far found an answer: 

For slicing a vector with x[…], is there a symbol for length(x)?

I'm seeking a short-hand for referring to the length of vector one is trying to 
index.

E.g., for a data.frame, say,

> test.frame <-data.frame(matrix(c(1:100),ncol=10,nrow=10,byrow=T))
> names(test.frame) <- names(islands)[1:10]

is there a short-hand for subsetting

> test.frame$Baffin[1:(length(test.frame$Baffin)-3)]
[1]  6 16 26 36 46 56 66
> 
> 

that would allow one to avoid the "(length(some.dataframe$variable)-offset)"?

I was thinking something paralleling the use of negative indices in […] might 
exist with seq(from,to), e.g. for the above

> test.frame$Baffin[seq(,7)]
[1]  6 16 26 36 46 56 66
> 

works.  But the fantasy

test.frame$Baffin[seq(,-3)]

obviously doesn't…

Any suggestions will be gratefully appreciated…

As always, many thanks to the patient list members who help with these simple
questions...

Best,
Eric



--
 Eric Rupley
 University of Michigan, Museum of Anthropology
 1109 Geddes Ave, Rm. 4013
 Ann Arbor, MI 48109-1079
 
 erup...@umich.edu
 +1.734.276.8572


Re: [R] Aggregate by minimum

2012-01-10 Thread Hadley Wickham

On Mon, Jan 9, 2012 at 8:00 PM, jim holtman  wrote:
> try this:
>
>> x <- structure(list(speed = c(3,9,14,8,7,6), result = c(0.697, 0.011, 0.015, 
>> 0.012, 0.018, 0.019), house = c(1,
> + 1, 1, 1, 1, 1), date = c(719, 1027, 1027, 1027, 1030, 1030),
> +    id = c("1000", "1",
> +    "10001", "10002", "10003", "10004")), .Names = c("speed",
> + "result", "house", "date", "id"), class = "data.frame", row.names = 
> c("1000",
> + "1", "10001", "10002", "10003", "10004"))
>>
>> require(plyr)
>> ddply(x, .(date), .fun = function(a){
> +     a[which.min(a$speed), ]
> + })

Or even more succinctly:

ddply(x, .(date), subset, speed == min(speed))
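Note that the two versions differ on ties: which.min() keeps one row per date, while `speed == min(speed)` keeps every row that attains the minimum. For anyone without plyr, here is a base R sketch of the second behaviour (my own addition, not from the thread), using only the `speed` and `date` columns of the example data:

```r
x <- data.frame(speed = c(3, 9, 14, 8, 7, 6),
                date  = c(719, 1027, 1027, 1027, 1030, 1030))

## ave() returns the per-date minimum aligned with the rows of x,
## so the comparison keeps every row that attains its group's minimum
res <- x[x$speed == ave(x$speed, x$date, FUN = min), ]
res
```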

Hadley


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


Re: [R] different results from fligner.test

2012-01-10 Thread peter dalgaard


On Jan 10, 2012, at 21:30 , gaiarrido wrote:

> I've made fligner test with the same data, changing the orders of the
> variables, and this what i get
> 
>> fligner.test(rojos~edadysexo*zona*ano*estacion)
> 
>Fligner-Killeen test of homogeneity of variances
> 
> data:  rojos by edadysexo by zona by ano by estacion 
> Fligner-Killeen:med chi-squared = 15.7651, df = 2, p-value = 0.0003773
> 
>> fligner.test(rojos~ano*edadysexo*zona*estacion)
> 
>Fligner-Killeen test of homogeneity of variances
> 
> data:  rojos by ano by edadysexo by zona by estacion 
> Fligner-Killeen:med chi-squared = 86.5317, df = 3, p-value < 2.2e-16
> 
> Different results with the same variables!!! Why? i try to find an answer,
> but i really surprised

You are assuming that you can put interactions on the rhs of formulas, but
nowhere in the documentation does it say that that should work. All examples
have a single grouping variable. So the behavior is undefined. As far as I can
see, only the _first_ variable is actually being used for grouping.

Arguably, there is potential for better argument checking in fligner.test(), 
but in the meantime, I suspect that you need something like

g <- interaction(ano, edadysexo, zona, estacion)
fligner.test(rojos ~ g)
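Since the original data are not available, here is the same idea sketched on the built-in warpbreaks data set (an assumption for illustration only), where the two factors `wool` and `tension` are combined into one grouping factor:

```r
## built-in data: 2 wool types x 3 tension levels
g  <- with(warpbreaks, interaction(wool, tension))
ft <- fligner.test(warpbreaks$breaks, g)

nlevels(g)     # number of groups actually compared
ft$parameter   # degrees of freedom: number of groups minus one
```

With a single combined factor the result no longer depends on the order in which the variables are listed.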

> 
> -
> Mario Garrido Escudero
> PhD student
> Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
> Universidad de Salamanca
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/different-results-from-fligner-test-tp4283312p4283312.html
> Sent from the R help mailing list archive at Nabble.com.
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Correlograms

2012-01-10 Thread Kevin Wright

If I understand your question correctly, install the corrgram package from
CRAN.

Then,

library(corrgram)
cm <- cor(iris[ , 1:4])
corrgram(cm, type="corr")

Also, see help for the "vote" data:
?vote

Kevin Wright

On Tue, Jan 10, 2012 at 2:06 PM, Natbyah  wrote:

> I would like to make a correlogram in which I also have a correlation
> matrix
> instead of one of the panels.
> Is that possible?
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Correlograms-tp4283245p4283245.html
> Sent from the R help mailing list archive at Nabble.com.
>

-- 
Kevin Wright


Re: [R] help

2012-01-10 Thread Peter Alspach

Tena koe Anna

[ is for subsetting, you need c():

x <- c(10.4, 5.6, 3.1, 6.4, 21.7)
y <- c(12, 5.6, 7.2, 1.0, 9.3)
plot(x, y)

HTH 

Peter Alspach

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Anna Olofsson
Sent: Wednesday, 11 January 2012 11:02 a.m.
To: r-help@r-project.org
Subject: [R] help

Hi,
I'm pretty new at programming and with the R language. I'm just trying to
get familiar with R and wrote a script in gedit (should I use emacs
instead?),

x <- [10.4  5.6  3.1  6.4 21.7]
y <- [12,5.6, 7.2, 1.0, 9.3]
plot(x,y)

then I went to the command window in the terminal (I'm using unix) to run
this with source("name_of_file"), but it doesn't work. Shouldn't a plot
come up automatically when I run it? What am I doing wrong? It knows what x
and y is, but I don't get an error of what might be wrong.

> source("name_of_file")
> x
[1] 10.4  5.6  3.1  6.4 21.7


Best,
Anna


Re: [R] help

2012-01-10 Thread Sarah Goslee

There might be an x in your R session, but not from that script. Try it by
pasting those three lines at the command line:

> x <- [10.4  5.6  3.1  6.4 21.7]
Error: unexpected '[' in "x <- ["
> y <- [12,5.6, 7.2, 1.0, 9.3]
Error: unexpected '[' in "y <- ["
> plot(x,y)
Error in plot(x, y) : object 'x' not found

You should get an error message when you try to source the script, I
would think.
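A quick way to check this without touching your own file is the sketch below. It writes the three lines to a temporary script (the names `script` and `result` are just for illustration) and records whether source() fails:

```r
## write the three lines from the original post to a temporary script
script <- tempfile(fileext = ".R")
writeLines(c("x <- [10.4  5.6  3.1  6.4 21.7]",
             "y <- [12,5.6, 7.2, 1.0, 9.3]",
             "plot(x,y)"), script)

## source() parses the whole file before running it,
## so a syntax error means none of the lines are executed
result <- tryCatch({ source(script); "sourced" },
                   error = function(e) "error")
result
```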

Try instead:
x <- c(10.4,  5.6,  3.1, 6.4, 21.7)
y <- c(12, 5.6, 7.2, 1.0, 9.3)
plot(x, y)
and rereading your introductory materials.

If that still doesn't work then we'll need more information, like OS
and version of R.

Sarah

On Tue, Jan 10, 2012 at 5:02 PM, Anna Olofsson  wrote:
> Hi,
> I'm pretty new at programming and with the R language. I'm just trying to
> get familiar with R and wrote a script in gedit (should I use emacs
> instead?),
>
> x <- [10.4  5.6  3.1  6.4 21.7]
> y <- [12,5.6, 7.2, 1.0, 9.3]
> plot(x,y)
>
> then I went to the command window in the terminal (I'm using unix) to run
> this with source("name_of_file"), but it doesn't work. Shouldn't a plot
> come up automatically when I run it? What am I doing wrong? It knows what x
> and y is, but I don't get an error of what might be wrong.
>
>> source("name_of_file")
>> x
> [1] 10.4  5.6  3.1  6.4 21.7
>
>
> Best,
> Anna
>

-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] Correlograms

2012-01-10 Thread Natbyah

I would like to make a correlogram in which I also have a correlation matrix
instead of one of the panels. 
Is that possible?

--
View this message in context: 
http://r.789695.n4.nabble.com/Correlograms-tp4283245p4283245.html
Sent from the R help mailing list archive at Nabble.com.


[R] Can "prototype" and "initialize" coexist?

2012-01-10 Thread Keith Weintraub

Folks,

My object oriented background is in Java and C++. I am a novice to using 
S4/object-oriented coding in R but not to R. Below is an example that I found 
that I have expanded on. 

I am not getting how "prototype" and "initialize" work together, if at all.

Here is output from a short "session"

> setClass("xx",
+  representation(a="numeric", b ="numeric"),
+  prototype(a=33, b = 333)
+ )
[1] "xx"
> 
> #setMethod("initialize", "xx", function(.Object){.Object})
> 
> setMethod("initialize", "xx", 
+   function(.Object, b) {
+ .Object@b<-b
+ .Object@a<-b^2
+ .Object
+   }
+ )
[1] "initialize"
> 
> new("xx", 3)
An object of class "xx"
Slot "a":
[1] 9

Slot "b":
[1] 3

> 
> new("xx")
Error in .local(.Object, ...) : argument "b" is missing, with no default

Any help you can provide would be greatly appreciated,
Thanks,
KW
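
One way the two can coexist, offered as a sketch rather than the canonical answer: give initialize() a "..." argument, let callNextMethod() run the default initialization (which keeps the prototype values), and only overwrite the slots when b is actually supplied. The object names obj1 and obj2 are made up for illustration.

```r
library(methods)

setClass("xx",
         representation(a = "numeric", b = "numeric"),
         prototype(a = 33, b = 333))

setMethod("initialize", "xx",
  function(.Object, b, ...) {
    # Default initialization first; .Object already carries the prototype.
    .Object <- callNextMethod(.Object, ...)
    # Only override the prototype values when 'b' was supplied.
    if (!missing(b)) {
      .Object@b <- b
      .Object@a <- b^2
    }
    .Object
  }
)

obj1 <- new("xx", 3)  # slots set from the argument
obj2 <- new("xx")     # slots left at the prototype values
```

With this method new("xx") no longer fails, because the missing argument is tested with missing() instead of being used unconditionally.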


[R] help

2012-01-10 Thread Anna Olofsson

Hi,
I'm pretty new at programming and with the R language. I'm just trying to
get familiar with R and wrote a script in gedit (should I use emacs
instead?),

x <- [10.4  5.6  3.1  6.4 21.7]
y <- [12,5.6, 7.2, 1.0, 9.3]
plot(x,y)

then I went to the command window in the terminal (I'm using unix) to run
this with source("name_of_file"), but it doesn't work. Shouldn't a plot
come up automatically when I run it? What am I doing wrong? It knows what x
and y is, but I don't get an error of what might be wrong.

> source("name_of_file")
> x
[1] 10.4  5.6  3.1  6.4 21.7


Best,
Anna


[R] different results from fligner.test

2012-01-10 Thread gaiarrido

I ran the Fligner test on the same data, changing the order of the
variables, and this is what I get:

> fligner.test(rojos~edadysexo*zona*ano*estacion)

Fligner-Killeen test of homogeneity of variances

data:  rojos by edadysexo by zona by ano by estacion 
Fligner-Killeen:med chi-squared = 15.7651, df = 2, p-value = 0.0003773

> fligner.test(rojos~ano*edadysexo*zona*estacion)

Fligner-Killeen test of homogeneity of variances

data:  rojos by ano by edadysexo by zona by estacion 
Fligner-Killeen:med chi-squared = 86.5317, df = 3, p-value < 2.2e-16

Different results with the same variables! Why? I have tried to find an answer,
but I am really surprised.
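
The degrees of freedom (2 and 3) hint that in each call only the first term on the right-hand side acted as the grouping factor, although that is a reading of the output and not something confirmed here. A sketch of an order-independent alternative is to build one grouping factor with interaction() and pass it explicitly. The data are simulated and the names d, f1, f2 are made up.

```r
set.seed(1)
# Simulated data: a response and two crossed factors (3 x 4 = 12 cells).
d <- data.frame(y  = rnorm(120),
                f1 = gl(3, 40),
                f2 = gl(4, 1, 120))

# One grouping factor per combination of levels; the order of the
# arguments changes only the labels, not the groups themselves.
g12 <- interaction(d$f1, d$f2)
g21 <- interaction(d$f2, d$f1)

t12 <- fligner.test(d$y, g12)
t21 <- fligner.test(d$y, g21)

print(t12$parameter)  # degrees of freedom: number of groups minus one
print(all.equal(unname(t12$statistic), unname(t21$statistic)))
```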

-
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca
--
View this message in context: 
http://r.789695.n4.nabble.com/different-results-from-fligner-test-tp4283312p4283312.html
Sent from the R help mailing list archive at Nabble.com.


[R] How to make this for() loop memory efficient?

2012-01-10 Thread iliketurtles

## I have 2 columns of data. The first column is unique "event IDs" that
## represent a phone call made to a customer.
## So, if you see 3 entries together in the first column like follows:

matrix(c("call1a","call1a","call1a") )

## then this means that this particular phone call (the first call that's
## logged in the data set) was transferred
## between 3 different "modules" before the call was terminated.

## The second column is a numerical description of the module the call
## started with and then got transferred to prior to call termination. Now,
## I'll construct a representative array of the type of data I'm dealing with
## (the real data set goes on for X00,000s of rows):
## (Ignore how I construct the following array, it's completely unrelated to
## how the actual data set was constructed).


a<-sapply(1:50,function(i){paste("call",i,sep="",collapse="")})
development.a<-seq(1,40,3)
development.a2<-seq(1,40,5)
a[development.a]<-a[development.a+1]
a[development.a2]<-a[development.a2+1]
a[1:2]<-"call2a";a[3]<-"call3a";a[4:5]<-"call5a";a[6:8]<-"call8a";a[9]<-"call9a"
b<-c(920010,960010,820009,920010,960500,970050,930010,920010,960500,970050,930900,870010,840010,960500,920010,970050,930010,960500,920010,970050,930010,960010,920010,940010,960010,970010,960500,920010,970050,930010,960500,920010,970050,930010,960500,920010,970050,930010,920010,960500,970050,930010,920009,960500,970050,930009,940010,960500,960500,960500)
data<-as.data.frame(cbind(a,b))
colnames(data)<-c("phone calls","modules")
dim(data)
print(data[1:10,]) #sample of 10 rows

# Note that in the real data set, data[,2] ranges from 810,000 to 999,999.
# I've been tasked with the following:
# "For each phone call that BEGINS with the module which is denoted by 81
# (i.e. of the form 81X,XXX), what is the expected number of modules in these
# calls?"
# Then it's the same question for each module beginning with 82, 83, 84,
# all the way until 99.
# I've created code that I think works for this, but I can't actually run it
# on the whole data set. I left it for 30 minutes and it only had about 5% of
# the task completed (I clicked "STOP" then checked my output to see if I did
# it properly, and it seems correct).
# I know the apply() family specializes in vector operations, but I can't
# figure out how to complete the above question in any way other than loops.

L<-data

A<-array(0,dim=c(19,2));rownames(A)<-seq(81,99,1)
A<-data.frame(A)

 for(i in 1:(nrow(L)-1))
 {
  if(L[(i+1),1]!=L[i,1])
  {
   # first two digits of the module that starts the next call
   key<-paste(strsplit(as.character(L[i+1,2]),"")[[1]][1:2],sep="",collapse="")
   # aggregate number of modules in the calls that begin with XX (not yet averaged).
   A[key,1]<-A[key,1]+length(grep(as.character(L[i+1,1]),L[,1],value=FALSE))
   # number of calls that begin with XX
   A[key,2]<-A[key,2]+1
  }
 }

# If I can get this code to be more memory efficient such that I can do it on
# a 400,000 row data set, I can do, for example,

A[17,1]/A[17,2]

# and I'll arrive at the mean number of modules per call where the call
# starts with a module that starts with 97.

A[17,1]
# is 10, which means that, out of every single call that started with a
# module of 97X,XXX,
# they went through 10 modules in total.

A[17,2]
# is 6, which means that there were 6 calls in total that began with a 97X,XXX
# module.

# Hence,

A[17,1]/A[17,2]

# is the average number of modules that were executed in all the calls that
# began with a 97X,XXX module.
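
A loop-free sketch of the same calculation follows. It is not the poster's code: it assumes every call ID is unique to one call, uses a small made-up data frame (calls) in place of the real data, and takes the two-digit prefix by integer division, which avoids converting large numbers to text.

```r
# Made-up stand-in for the real data: one row per module visited.
calls <- data.frame(
  call   = c("c1", "c1", "c1", "c2", "c3", "c3", "c4"),
  module = c(970050, 930010, 920010, 810001, 970010, 960500, 810500),
  stringsAsFactors = FALSE
)

# First row of each call: the module the call started with.
first_row    <- !duplicated(calls$call)
start_prefix <- calls$module[first_row] %/% 10000   # 970050 -> 97

# Number of modules per call, in order of first appearance.
n_modules <- as.vector(table(factor(calls$call, levels = unique(calls$call))))

# Mean number of modules per call, grouped by starting prefix.
mean_modules <- tapply(n_modules, start_prefix, mean)
print(mean_modules)
```

Each step touches every row once, so the running time should grow roughly linearly with the number of rows instead of rescanning the whole first column with grep() for every call.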


-


Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-make-this-for-loop-memory-efficient-tp4283594p4283594.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] problem installing packages

2012-01-10 Thread R. Michael Weylandt

What lists are you referring to when you state: "there are many packages that 
do not show up in the list of binaries. They do in the list of sources"? CRAN? 
To see all packages installed on your machine try 

rownames(installed.packages())

I think available.packages() will give packages available from your local CRAN 
mirror. 

Just a hunch, but if you are seeing different behaviors between home and work 
it may well be that a work-firewall is blocking packages which contain 
pre-compiled DLLs/SOs while allowing the source code through. 

Also, if you aren't on Windows the vast majority of packages are quite easy to 
compile yourself.
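
As a rough sketch of the check described above, the installed packages can be compared against the ones you need. The vector wanted is made up for illustration ("vegan" is the package named in the question), and the install.packages() call is left commented out because it needs network access and build tools.

```r
# Packages we would like to have (illustrative list).
wanted <- c("stats", "vegan")

# Names of everything installed in the current library paths.
installed <- rownames(installed.packages())

# Packages from 'wanted' that are not installed yet.
missing_pkgs <- setdiff(wanted, installed)
print(missing_pkgs)

# To build the missing ones from source (requires a compiler toolchain):
# install.packages(missing_pkgs, type = "source")
```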

Michael

On Jan 10, 2012, at 4:01 PM, natalia norden  wrote:

> Thank you very much for your answers. I could do it by downloading the
> package I needed manually and then installing it through the Terminal. Yet
> the fundamental problem remains. I downloaded R 2.14.1 several times from
> different mirrors and there are many packages that do not show up in the
> list of binaries. They do in the list of sources, but then I have a
> problem compiling the package... I did not have this problem from my home
> computer when I installed R 2.14.0.
> 
> Best,
> Natalia
> 
> 
> El 10/01/12 14:52, "Ken Hutchison"  escribió:
> 
>> Maybe check your proxy settings in your browser and make sure you're
>> connecting to the mirror.
>>   Ken
>> 
>> Sent from my iPhone
>> 
>> On Jan 10, 2012, at 8:35 AM, natalia norden  wrote:
>> 
>>> Hello, 
>>> 
>>> I was using version 2.13.2 and I have just downloaded the latest version
>>> 2.14.1. However, I'm trying to install the packages I was using and
>>> when I
>>> look for them in the packages list, I can't find many in the CRAN
>>> binaries
>>> (e.g. "vegan"). I do find them in the CRAN sources but the installation
>>> fails. I tried downloading the version 2.14.0 and I had the same
>>> problem. I
>>> re-installed the old version, and now it works again. Is this a problem
>>> with
>>> 2.14?
>>> Thank you for your help.
>>> Natalia Norden
>>> 
>>> 
>>> 
>>> Natalia Norden
>>> Profesor Asistente
>>> Departamento de Ecología y Territorio
>>> Facultad de Estudios Ambientales y Rurales
>>> Pontificia Universidad Javeriana
>>> Bogotá, Colombia
>>> Tel: 320 83 20  Ext: 2448
>>> www.phylodiversity.net/nnorden/
>>> 
>>> 
>>> 

Re: [R] problem installing packages

2012-01-10 Thread natalia norden

Thank you very much for your answers. I could do it by downloading the
package I needed manually and then installing it through the Terminal. Yet
the fundamental problem remains. I downloaded R 2.14.1 several times from
different mirrors and there are many packages that do not show up in the
list of binaries. They do in the list of sources, but then I have a
problem compiling the package... I did not have this problem from my home
computer when I installed R 2.14.0.

Best,
Natalia

El 10/01/12 14:52, "Ken Hutchison"  escribió:

>Maybe check your proxy settings in your browser and make sure you're
>connecting to the mirror.
>Ken
>
>Sent from my iPhone
>
>On Jan 10, 2012, at 8:35 AM, natalia norden  wrote:
>
>> Hello, 
>> 
>> I was using version 2.13.2 and I have just downloaded the latest version
>> 2.14.1. However, I'm trying to install the packages I was using and
>>when I
>> look for them in the packages list, I can't find many in the CRAN
>>binaries
>> (e.g. "vegan"). I do find them in the CRAN sources but the installation
>> fails. I tried downloading the version 2.14.0 and I had the same
>>problem. I
>> re-installed the old version, and now it works again. Is this a problem
>>with
>> 2.14?
>> Thank you for your help.
>> Natalia Norden
>> 
>> 
>> 
>> Natalia Norden
>> Profesor Asistente
>> Departamento de Ecología y Territorio
>> Facultad de Estudios Ambientales y Rurales
>> Pontificia Universidad Javeriana
>> Bogotá, Colombia
>> Tel: 320 83 20  Ext: 2448
>> www.phylodiversity.net/nnorden/
>> 
>> 
>> 

Re: [R] Converting BY to a data.frame

2012-01-10 Thread David Winsemius



On Jan 10, 2012, at 1:36 PM, Ramiro Barrantes wrote:


Hello,

I am trying to convert BY to a data frame, consider the following  
example:


exampleDF<-data.frame(a=c(1,2),b=c(10,20),name=c("first","second"))
exampleBY<-by(exampleDF,with(exampleDF,paste(a,b,sep="_")),
  function(x) {
data.frame(
name=as.character(x$name),
a=x$a,
b=x$b,
c=x$a + x$b)
  }
)



It would be less confusing when the time comes around to coercing  
vectors if you added stringsAsFactors =FALSE


exampleBY<-by(exampleDF,with(exampleDF,paste(a,b,sep="_")),
  function(x) {
data.frame(
name=as.character(x$name),
a=x$a,
b=x$b,
c=x$a + x$b, stringsAsFactors=FALSE)
  }
)

Then just:

do.call(rbind, exampleBY)
       name a  b  c
1_10  first 1 10 11
2_20 second 2 20 22
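
Put together as one self-contained run, the suggestion above looks roughly like this. The result name combined is made up, and stringsAsFactors = FALSE is also set on exampleDF here so that the name column is character from the start.

```r
exampleDF <- data.frame(a = c(1, 2), b = c(10, 20),
                        name = c("first", "second"),
                        stringsAsFactors = FALSE)

# One small data frame per group, keeping character columns as character.
exampleBY <- by(exampleDF, with(exampleDF, paste(a, b, sep = "_")),
  function(x) {
    data.frame(name = x$name, a = x$a, b = x$b, c = x$a + x$b,
               stringsAsFactors = FALSE)
  }
)

# rbind the pieces; column types are preserved.
combined <- do.call(rbind, exampleBY)
str(combined)
```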



I made this function:

convertByToDataFrame <- function( byObject ) {
data.frame(matrix(unlist(byObject),nrow=length(byObject),ncol=length(byObject[[1]]),byrow=TRUE,dimnames=list(NULL,names(byObject[[1]]))))
}

but when I run it:
convertByToDataFrame(exampleBY)

I either (1) lose the types of the different values in the BY, or  
(2) get factors instead of the values


I tried the following but it doesn't scale:

exampleSUMRY <- data.frame(a =
as.vector(unlist(lapply(strsplit(names(exampleBY),split="_"),function(x)
x[1]))),
   b =
as.vector(unlist(lapply(strsplit(names(exampleBY),split="_"),function(x)
x[2]))),
   name=as.vector(unlist(lapply(exampleBY,function(x) x[[1]]))))


Any suggestions?

Thank you,
Ramiro


David Winsemius, MD
West Hartford, CT


[R] plotOHLC(alpha3): Error in plotOHLC(alpha3) : x is not a open/high/low/close time series

2012-01-10 Thread Ted Byers

R version 2.12.0, 64 bit on Windows.

 

Here is a short script that illustrates the problem:

 

library(tseries)

library(xts)

setwd('C:\\cygwin\\home\\Ted\\New.Task\\NKs-01-08-12\\NKs\\tests')

x = read.table("quotes_h.2.dat", header = FALSE, sep="\t", skip=0)

str(x)

y <- data.frame(as.POSIXlt(paste(x$V2,substr(x$V4,4,8),sep=" "),format='%Y-%m-%d %H:%M'),x$V5)

colnames(y) <- c("tickdate","price")

str(y)

plot(y)

z <- as.irts(y)

str(z)

plot(z)

str(alpha3)

List of 2

$ time : POSIXt[1:98865], format: "2010-06-30 15:47:00" "2010-06-30
15:53:00" "2010-06-30 17:36:00" ...

$ value: num [1:98865, 1:4] 9215 9220 9205 9195 9195 ...

  ..- attr(*, "dimnames")=List of 2

  .. ..$ : NULL

  .. ..$ : chr [1:4] "z.Open" "z.High" "z.Low" "z.Close"

- attr(*, "class")= chr "ts"

- attr(*, "tsp")= num [1:3] 1 2 1

alpha3 <- as.xts(to.minutes3(z,OHLC = TRUE))

plotOHLC(alpha3)

Error in plotOHLC(alpha3) : x is not a open/high/low/close time series

 

The file quotes_h.2.dat contains real time tick data for futures contracts,
so the above manipulation is my attempt to just get a time series with one
column being a date/time and the other being tick price.  I believe I have
to use read.table to make a data frame, and then the manipulations to
combine the date and time fields from that feed, along with the price.

 

My first attempt at using to.minutes3 (and I am interested in the other
'to.period' functions too) is meant to get a regular time series to which I can
apply rollapply, along with a function in which I use various autoregression
methods and forecast for as long as the 95% confidence interval
is reasonably close. I want to know how far into the future the forecast
contains useful information. Then I want to create a plot in which I
do the autoregression and plot the actual and forecast prices (along
with the confidence interval) as a function of time, and embed that in a
function which rollapply works with, so I can have a plot comprised of all
those individual plots (plotting only the comparison of actual and forecast
values).

 

It seems everything works adequately until I try the plotOHLC function
itself, which gives me the error in the subject line.
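
One possible reading of the error, to be checked against ?plotOHLC and not taken as confirmed: the function may expect a multivariate "ts" object whose first four columns are named exactly Open, High, Low and Close, whereas alpha3 is an xts object with columns z.Open, z.High, z.Low, z.Close. The base-R sketch below only shows how such an object could be prepared; the prices are made up and the names ohlc and ohlc_ts are hypothetical.

```r
# Made-up OHLC bars with the prefixed column names seen in str(alpha3).
ohlc <- matrix(c(9215, 9220, 9205, 9195,
                 9195, 9200, 9190, 9198),
               ncol = 4, byrow = TRUE,
               dimnames = list(NULL,
                               c("z.Open", "z.High", "z.Low", "z.Close")))

# Drop the "z." prefix so the columns read Open, High, Low, Close.
colnames(ohlc) <- sub("^z\\.", "", colnames(ohlc))

# Convert to a multivariate ts object (regular spacing is assumed here).
ohlc_ts <- ts(ohlc)

print(is.mts(ohlc_ts))
print(colnames(ohlc_ts))
```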

 

I would ask for two things: 

 

1) what the fix is to get rid of that error plotOHLC gives me

2) some tips on the 'walk-forward' method I am looking at using.

 

Thanks

 

Ted



Re: [R] glmmPQL and predict

2012-01-10 Thread Prof Brian Ripley

The whole idea of 'level' in mixed models is confusing to some. 
Professor Snijders (who teaches our students) and Professor Bates label 
from opposite ends.


But, assuming this is my work in package MASS (Master Harwood: it is 
childish, to put it mildly, to fail to give due credit), it follows lme 
in package nlme: glmmPQL's predict method just passes this on to nlme, 
and the documentation is identical.


So not only was it unfair to fail to mention which package and whose 
work this was, it was even more unfair to attribute personal lack of 
understanding to my work and not package nlme.


Just because R and many contributed packages are free does not entitle 
you to treat them as zero-value: very much to the contrary.


On 10/01/2012 16:38, Ben Bolker wrote:

Mike Harwood  gmail.com>  writes:


Is the labeling/naming of levels in the documentation for the
predict.glmmPQL function "backwards"?  The documentation states "Level
values increase from outermost to innermost grouping, with level zero
corresponding to the population predictions".  Taking the sample in
the documentation:

fit<- glmmPQL(y ~ trt + I(week>  2), random = ~1 |  ID,
family = binomial, data = bacteria)


head(predict(fit, bacteria, level = 0, type="response"))

[1] 0.9680779 0.9680779 0.8587270 0.8587270 0.9344832 0.9344832

head(predict(fit, bacteria, level = 1, type="response"))

      X01       X01       X01       X01       X02       X02
0.9828449 0.9828449 0.9198935 0.9198935 0.9050782 0.9050782

head(predict(fit, bacteria, type="response")) ## population prediction

      X01       X01       X01       X01       X02       X02
0.9828449 0.9828449 0.9198935 0.9198935 0.9050782 0.9050782

The returned values for level=1 and level=  match, which
is not what I expected based upon the documentation.


   Well, the documentation says: "Defaults to the highest or innermost level of
grouping", which is level 1 in this case -- right?


Exponentiating
the intercept coefficients from the fitted regression, the level=0
values match when the random effect intercept is included


   Do you mean "is NOT included" here?

   0.9680779 (no random effect, below) matches the level=0 prediction above
   0.9828449 (include random effect, below) matches the level=1 prediction,
which is also the default prediction, above.




1/(1+exp(-3.412014)) ## only the fixed effect

[1] 0.9680779

1/(1+exp(-1*(3.412014+0.63614382))) ## fixed and random effect intercepts

[1] 0.9828449
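
The same back-transformation can be written with plogis(), the inverse logit in base R. This is just a restatement of the arithmetic above; the variable names are made up and the two coefficients are the ones quoted in this thread.

```r
fixed_intercept  <- 3.412014    # fixed-effect intercept quoted above
random_intercept <- 0.63614382  # random intercept quoted above (subject X01)

# plogis(q) is 1 / (1 + exp(-q)).
p_level0 <- plogis(fixed_intercept)                     # population level
p_level1 <- plogis(fixed_intercept + random_intercept)  # subject level

print(c(level0 = p_level0, level1 = p_level1))
```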


   This all matches my expectations.  If your expectations still go
in the other direction, could you explain in more detail?

   By the way, I recommend r-sig-mixed-mod...@r-project.org for
mixed model questions in general ...

   Ben Bolker




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] grplasso

2012-01-10 Thread Scott Raynaud

I want to use the grplasso package on a data set where I want to fit a linear 
model.  My interest is in identifying significant beta coefficients.  The 
documentation is a bit cryptic so I'd appreciate some help.
 
I know this is a strategy for large numbers of variables but consider a simple 
case for pedagogical purposes.  Say I have two 3 category predictors (2 dummies 
each), a binary predictor and a continuous predictor with a continuous outcome:
 
y  x1  x2  x3  x4 x5 x6
rows of data here
..
..
 
Naturally, I want to select x1 and x2 as a group and x3 and x4 as another 
group.  The documentation has a couple of examples but it's not clear how they 
translate to the current problem.  How do I specify my groups and run the lasso 
regression?
 
Looks like this is the grouping part:
 
index<-c(NA,)
 
but I'm not sure how to specify the df for the variables past the NA for the 
intercept.
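
One way to build such an index without counting columns by hand, offered as a sketch: let model.matrix() create the dummy columns and reuse its "assign" attribute, which records the term each column came from. The data frame d and its column names are made up, and the assumption (to be checked against the grplasso help page) is that columns sharing a number form a group and NA marks an unpenalized column such as the intercept.

```r
# Made-up data: two 3-level factors, one binary and one continuous predictor.
d <- data.frame(f1   = gl(3, 2),
                f2   = gl(3, 1, 6),
                bin  = rep(0:1, 3),
                cont = seq(0.5, 3, by = 0.5))

# Design matrix with an intercept and 2 dummies per factor.
x <- model.matrix(~ f1 + f2 + bin + cont, data = d)

# "assign" maps each column to its term: 0 = intercept, 1 = f1, 2 = f2, ...
index <- attr(x, "assign")
index[index == 0] <- NA   # leave the intercept unpenalized

print(colnames(x))
print(index)
```

For the layout in the question this yields one entry per column: NA for the intercept, a shared number for each pair of dummies, and separate numbers for the binary and continuous predictors.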
 
Once that's defined the penalty can be specified:
 
lambda <- lambdamax(x, y = y, index = index, penscale = sqrt,
model = LogReg()) * 0.5^(0:5) 
In my case I'd use LinReg for the model.  
 
Then the model:
 
fit <- grplasso(x, y = y, index = index, lambda = lambda, model = LogReg(),
penscale = sqrt, control = grpl.control(update.hess = "lambda", trace = 0))
 
again using LinReg for the model.

This can be plotted against lambda, but when I do lasso regression 
in other software I end up with a plot of the coefficients against the 
tuning parameter with a cutpoint or a table and graph that tells me 
what to include in the model based on some selected criterion.  
It's not clear from the example if there's a cross-validation or some 
other procedure to determine what variables to include.  plot(fit) 
produces a graph of coefficients against lambda but nothing to indicate 
what to include.  What is used in the package, if anything, to make that 
determination?


[R] Restricting R session

2012-01-10 Thread Antonio Rodriges

Hello,

Is it possible to use R on public server where each user has its own
restricted R session?
In particular, how to prohibit some set of functions, for example,
from "base" package?
How to limit session operating memory and CPU time? What additional
security considerations must be taken care of?
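
For the time-limit part only, base R offers setTimeLimit(). The sketch below wraps it in a helper, run_with_limit, which is a made-up name and not an existing function. It does not restrict which functions a user may call, and memory limits are usually imposed at the operating-system level, so this covers only a small part of the question.

```r
# Evaluate an expression with an elapsed-time limit (in seconds).
# Returns the value, or the error message if the limit is reached.
run_with_limit <- function(expr, seconds) {
  setTimeLimit(elapsed = seconds, transient = TRUE)
  # Always clear the limit again when the helper exits.
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  tryCatch(expr, error = function(e) conditionMessage(e))
}

quick <- run_with_limit(1 + 1, seconds = 5)                 # finishes in time
slow  <- run_with_limit(repeat Sys.sleep(0.01), seconds = 0.2)  # is cut off

print(quick)
print(slow)
```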

-- 
Kind regards,
Antonio Rodriges


Re: [R] Fwd: Sum of a couple of variables of which a few have NA values

2012-01-10 Thread PetraOpic

Dear Ivan,

Thank you very much for your help.

How do I use rowSums if I need to "skip" a variable from summing? (example:
sum var1, var2, var3, var5, var34 only).

Thanks in advance,
Petra Opic
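
A small sketch of the idea: select the columns to sum by name and pass only those to rowSums(). The data frame and its values are made up, with var4 playing the role of the variable to skip.

```r
# Made-up data; var4 should be left out of the sum.
df <- data.frame(var1 = c(1, NA),
                 var2 = c(2, 2),
                 var3 = c(3, 3),
                 var4 = c(100, 100),
                 var5 = c(5, NA))

# Columns to include in the sum.
keep <- c("var1", "var2", "var3", "var5")

# na.rm = TRUE sums the available values and ignores NAs.
df$total <- rowSums(df[, keep], na.rm = TRUE)
print(df)
```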

--
View this message in context: 
http://r.789695.n4.nabble.com/Sum-of-a-couple-of-variables-of-which-a-few-have-NA-values-tp4282448p4282969.html
Sent from the R help mailing list archive at Nabble.com.


[R] rpart vs. tree and deviance calculations

2012-01-10 Thread Josh Browning

Hi Everyone,

 

I'm working on building some classification trees, and up to this point
I've been using rpart.  However, I recently discovered the tree package,
and found that it had some useful functions (in particular deviance(),
which I would really like to use for my project).  I can't seem to find
an equivalent function for rpart.  I've considered using tree() in place
of rpart(), but I read an old post on this list that recommended using
rpart() as tree() was built for bug-checking in S.  So, to summarize:

 

Is there a way to compute deviance for an rpart object?

If not, would it be a bad idea to use tree() in place of rpart()?  My
rpart models are currently quite large and take ~ 15 minutes to run...
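
As a sketch only, a multinomial deviance can be computed by hand from the class probabilities that predict() returns for an rpart fit. It is an assumption, not something checked here, that this definition (minus twice the summed log probability of each observed class) corresponds to what deviance() reports in the tree package. The iris data are used purely for illustration.

```r
library(rpart)

# Illustrative classification tree on a built-in data set.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Matrix of fitted class probabilities, one column per class level.
p <- predict(fit, type = "prob")

# Pick, for each row, the probability assigned to the observed class.
obs <- cbind(seq_len(nrow(iris)), as.integer(iris$Species))

# Deviance: -2 * sum of log fitted probabilities of the observed classes.
dev <- -2 * sum(log(p[obs]))
print(dev)
```

If an observed class has fitted probability zero in its leaf, the result is Inf, so that case would need separate handling.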

 

Thanks everyone!



Re: [R] Error when using foreach package for parralelization

2012-01-10 Thread Mikko Korpela

On 01/03/2012 03:19 PM, Julien Textoris wrote:

> I'm trying to parallelize the following R code :
> 
> pk2test =
> c(1:16,(12*16+1):(12*16+16),(16*16+1):(16*16+16),(20*16+1):(20*16+16))
> score.mat = matrix(nc=16*4,nr=16*4)
> for(i in 1:(16*4)) {
> 
>   for(j in i:(16*4)) {
>   score.mat[i,j] = score.mat[j,i] =
> computeScore(pk[[pk2test[i]]],pk[[pk2test[j]]],10,5)$score
>   }
> }
> 
> pk is a list of Object of type MassPeak from MALDIquant library. Each
> object is composed with a mass vector (@mass) an intensity vector
> (@intensity) and a metaData field (another list)
> 
> score.mat is a matrix with scores (reals)
> pk2test is just a vector to know which objects in pk i want to deal with
> computeScore is the function i wrote to compute the score, it calls
> another function called filterSpectra
> 
> 
> I write the function like that, and i got the error below, and i can't
> figure out why ?
> 
> pk2test =
> c(1:16,(12*16+1):(12*16+16),(16*16+1):(16*16+16),(21*16+1):(21*16+16))
> score.mat = matrix(nc=16*4,nr=16*4)
> for(i in 1:4) {
>   score.mat[i,i:4] =
>   foreach(filterSpectra=filterSpectra,
>   computeScore=computeScore,
>   pk=pk,pk2test=pk2test,i=i,j=c(i:4),
>   .combine="c", .packages="MALDIquant" ) %dopar% {
>   computeScore(pk[[pk2test[i]]],pk[[pk2test[j]]],10,5)$score
>   }
> }
> 
> Error: trying to get slot "mass" from an object of a basic class
> ("integer") with no slots.

Hi Julien!

Are these two code sequences supposed to produce the same result? The
two definitions of pk2test are slightly different. Also, in the
attempted parallelized version, you are only assigning to a small part
of score.mat. Is that intentional?

The real error in this case seems to be that you mistakenly redefine
some variables in the foreach() call. As far as I can tell, you should
not redefine the variables 'filterSpectra', 'computeScore', 'pk',
'pk2test' or 'i'. In the foreach() call, you should only define
iteration variables, i.e. variables whose value changes from one
iteration to another (like 'j'). Now you actually accidentally iterate
over some data structures. For example, 'pk' inside the %dopar% loop is
a single element of the original 'pk' list (which may get overwritten,
depending on whether the loop is actually run in parallel). This is
probably not what you want.
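
A minimal sketch of that advice: only j appears as an iteration variable in foreach(), and everything else is picked up from the calling environment. The function score and the list pk below are made-up stand-ins for computeScore()$score and the MassPeaks list, %do% is used so no parallel backend is needed, and the code is guarded in case foreach is not installed.

```r
# Hypothetical stand-ins for the real scoring function and data.
score <- function(a, b) sum(a * b)
pk <- list(1:3, 4:6, 7:9, 10:12)
n  <- length(pk)
score.mat <- matrix(NA_real_, nrow = n, ncol = n)

if (requireNamespace("foreach", quietly = TRUE)) {
  library(foreach)
  for (i in seq_len(n)) {
    # Only 'j' varies between iterations; 'i', 'pk' and 'score' are
    # ordinary variables found in the enclosing environment.
    row_i <- foreach(j = i:n, .combine = "c") %do% score(pk[[i]], pk[[j]])
    score.mat[i, i:n] <- row_i
    score.mat[i:n, i] <- row_i
  }
}
print(score.mat)
```

After registering a parallel backend, %do% can be swapped for %dopar%, and packages the workers need can be listed in .packages as in the original call.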

- Mikko Korpela


[R] Converting BY to a data.frame

2012-01-10 Thread Ramiro Barrantes

Hello,

I am trying to convert BY to a data frame, consider the following example:

exampleDF<-data.frame(a=c(1,2),b=c(10,20),name=c("first","second"))
exampleBY<-by(exampleDF,with(exampleDF,paste(a,b,sep="_")),
  function(x) {
    data.frame( 
    name=as.character(x$name),
    a=x$a,
    b=x$b,
    c=x$a + x$b)
  }
    )

I made this function:

convertByToDataFrame <- function( byObject ) {  
data.frame(matrix(unlist(byObject),nrow=length(byObject),ncol=length(byObject[[1]]),byrow=TRUE,dimnames=list(NULL,names(byObject[[1]]))))
}

but when I run it:
convertByToDataFrame(exampleBY)

I either (1) lose the types of the different values in the BY, or (2) get 
factors instead of the values

I tried the following but it doesn't scale:

exampleSUMRY <- data.frame(a = 
as.vector(unlist(lapply(strsplit(names(exampleBY),split="_"),function(x) 
x[1]))),
   b = 
as.vector(unlist(lapply(strsplit(names(exampleBY),split="_"),function(x) 
x[2]))),
   name=as.vector(unlist(lapply(exampleBY,function(x) 
x[[1]]))))

Any suggestions?

Thank you,
Ramiro

Re: [R] Calculating rolling mean by group

2012-01-10 Thread Sam Albers

Thanks for getting me on the right path Gabor! I have one outstanding
issue though.

On Mon, Jan 9, 2012 at 4:21 PM, Gabor Grothendieck
 wrote:
> On Mon, Jan 9, 2012 at 6:39 PM, Sam Albers  wrote:
>> Hello all,
>>
>> I am trying to determine how to calculate rolling means in R using a
>> grouping variable. Say I have a dataframe like so:
>>
>> dat1 <- data.frame(x = runif(2190, 0, 125), year=rep(1995:2000,
>> each=365), jday=1:365, site="here")
>> dat2 <- data.frame(x = runif(2190, 0, 200), year=rep(1995:2000,
>> each=365), jday=1:365, site="there")
>> dat <- rbind(dat1,dat2)
>>
>> ## What I would like to do is calculate a rolling 7 day mean
>> separately for each site. I have looked at both
>> ## rollmean() in the zoo package and running.mean() in the igraph
>> package but neither seem to have led
>> ## me to calculating a rolling mean by group. My first thought was to
>> use the plyr package but I am confused
>> ## by this output:
>>
>> library(plyr)
>> library(zoo)
>>
>> ddply(dat, c("site"), function(df) return(c(roll=rollmean(df$x, 7))))
>>
>> ## Can anyone recommend a better way to do this or shed some light on
>> this output?
>>
>
> Using dat in the question, try this:
>
> library(zoo)
> z <- read.zoo(dat, index = 2:3, split = 4, format = "%Y %j")
> zz <- rollmean(z, 7)
>
> The result, zz, is a multivariate zoo series with one column per group.

Using the zoo approach works well except that a wrinkle in my dataset
not reflected in the sample data caused some problems. I am actually
dealing with a situation where there is an unequal number of
observations in each group, like the data set below:

library(zoo)

dat1 <- data.frame(x = runif(2190, 0, 125), year=rep(1995:2000,
each=365), jday=1:365, site="here")
dat2 <- data.frame(x = runif(4380, 0, 200), year=rep(1989:2000,
each=365), jday=1:365, site="there")
dat <- rbind(dat1,dat2)

## When I use read.zoo everything is read in fine
z <- read.zoo(dat, index = 2:3, split = 4, format = "%Y %j")

## But when I use rollmean to get a 7 day average for both the 'here'
## and 'there' columns, only the 'there' column 7 day average is
## calculated
zz <- rollmean(z, 7)

Any thoughts on how I can then calculate a rolling mean on groups
where there is an unequal number of observations in each group?
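
One possible workaround, sketched here in base R rather than zoo so that it has no package dependencies: compute the centred 7-day mean within each site with ave() and stats::filter(), so that groups of different lengths never have to be merged into one series. This assumes the rows are already in date order within each site; the names roll7 and dat$roll7 are made up for illustration.

```r
set.seed(1)
dat1 <- data.frame(x = runif(2190, 0, 125), year = rep(1995:2000, each = 365),
                   jday = 1:365, site = "here")
dat2 <- data.frame(x = runif(4380, 0, 200), year = rep(1989:2000, each = 365),
                   jday = 1:365, site = "there")
dat <- rbind(dat1, dat2)

# Centred 7-point moving average; the first and last 3 values of a
# series are NA because the window is incomplete there.
roll7 <- function(v) as.numeric(stats::filter(v, rep(1 / 7, 7), sides = 2))

# ave() applies roll7 separately within each site and returns a vector
# aligned with the rows of dat, whatever the group sizes are.
dat$roll7 <- ave(dat$x, dat$site, FUN = roll7)

head(dat[dat$site == "here", ], 8)
```

Because each site is smoothed on its own, the shorter 'here' record is not padded with NA for the years it lacks, which may be what upsets the merged-series approach.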

Thanks for the previous post and in advance.

Sam

>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 2 sample wilcox.test != kruskal.test

2012-01-10 Thread Łukasz Ręcławowicz

2012/1/10 syrvn 

> And why does kruskal.test(x~y) differ from kruskal.test(f~d)??
>

Your formula is wrong, but the function doesn't detect the error.
"formula
a formula of the form lhs ~ rhs where lhs gives the data values and rhs the
corresponding groups."
And that leads to kruskal.test(d~as.factor(f)) which is fine.
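
To make the comparison concrete, here is a small sketch using the x and y from the original post. It assumes the intended comparison is the stacked values d grouped by the factor f. The arguments exact = FALSE and correct = FALSE are chosen so that wilcox.test() uses the plain normal approximation, which is the version that agrees with the Kruskal-Wallis chi-squared test for two groups.

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)

d <- c(x, y)                               # data values (lhs of the formula)
f <- factor(rep(c("a", "b"), each = 7))    # group labels (rhs of the formula)

kw <- kruskal.test(d ~ f)
# Normal approximation without continuity correction, to match kruskal.test
wt <- wilcox.test(d ~ f, exact = FALSE, correct = FALSE)

kw$p.value
wt$p.value
```

With the default arguments wilcox.test() may use an exact p-value or a continuity correction, so its result need not be identical to kruskal.test() even when the formula is right.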



-- 
Have a nice day


Re: [R] problem installing packages

2012-01-10 Thread Uwe Ligges




On 10.01.2012 14:35, natalia norden wrote:

Hello,

I was using version 2.13.2 and I have just downloaded the latest version
2.14.1. However, I'm trying to install the packages I was using and when I
look for them in the packages list, I can't find many in the CRAN binaries
(e.g. "vegan"). I do find them in the CRAN sources but the installation
fails. I tried downloading the version 2.14.0 and I had the same problem. I
re-installed the old version, and now it works again. Is this a problem with
2.14?


No. According to
http://cran.r-project.org/web/checks/check_results_vegan.html
the package works fine with R-2.14.1 (aka R-release).

Uwe Ligges



Thank you for your help.
Natalia Norden



Natalia Norden
Profesor Asistente
Departamento de Ecología y Territorio
Facultad de Estudios Ambientales y Rurales
Pontificia Universidad Javeriana
Bogotá, Colombia
Tel: 320 83 20  Ext: 2448
www.phylodiversity.net/nnorden/




[R] 2 sample wilcox.test != kruskal.test

2012-01-10 Thread syrvn

Hello,


I think I am right in saying that a 2 sample wilcox.test is equal to a 2
sample kruskal.test and

a 2 sample t.test is equal to a 2 sample anova. This is also stated in the
?kruskal.test man page:

The Wilcoxon rank sum test (wilcox.test) as the special case for two
samples; lm together with anova for performing one-way location analysis
under normality assumptions; with Student's t test (t.test) as the special
case for two samples.


From this example it seems that they are not equal, but I cannot figure out
what I am doing wrong.


x <- c(10,11,15,8,16,12,20)
y <- c(10,14,18,25,28,30,35)
f <- c(rep("a",7), rep("b",7))
d <- c(x,y)

wilcox.test(x,y)
kruskal.test(x,y)
kruskal.test(x~y)
kruskal.test(f~d)

t.test(x,y)
anova(lm(x~y))
summary(aov(lm(x~y)))


And why does kruskal.test(x~y) differ from kruskal.test(f~d)??
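
For the t test half of the question, a sketch under the assumption that the values are stacked in d and the groups are in f: the classical equal-variance t test and the one-way anova then give the same p-value, while the default Welch version of t.test() does not assume equal variances and so generally will not match.

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)

d <- c(x, y)                               # all observations
f <- factor(rep(c("a", "b"), each = 7))    # which sample each one came from

tt <- t.test(d ~ f, var.equal = TRUE)      # pooled-variance t test
av <- anova(lm(d ~ f))                     # one-way anova on the same data

tt$p.value
av[["Pr(>F)"]][1]

# The squared t statistic equals the F statistic
unname(tt$statistic)^2
av[["F value"]][1]
```

Note that anova(lm(x~y)) in the post regresses one sample on the other, which is a different model from comparing the two samples.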


Cheers

--
View this message in context: 
http://r.789695.n4.nabble.com/2-sample-wilcox-test-kruskal-test-tp4282888p4282888.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Adding Institution-Affiliation to Description File of R Package

2012-01-10 Thread Duncan Murdoch


On 10/01/2012 1:02 PM, Ben Ganzfried wrote:

Hi,

I'm just finishing up an R package and I was wondering if anyone knows how
to include institution name in the "Description File."  That is, my current
Description File looks like:

Package: curatedCancerData
Type: Package
Title: Cancer Gene Expression Analysis
Version: 1.0
Date: 2011-12-24
Author: Benjamin F. Ganzfried, et al.
Maintainer: Benjamin F. Ganzfried
Description: The curatedCancerData package provides relevant functions and
data for gene expression analysis.
License: GPL (>= 3)


Ideally I would like the final PDF to have footnotes showing the university
affiliations of all the authors (more will be listed than currently are
above).  I would greatly appreciate any suggestions!


Write a curatedCancerData-package.Rd topic, and put the information in 
there.  It won't show up on the title page of the reference manual, but 
it will be the first topic.  See "Documenting Packages" (section 2.1.4, 
I think) in the "Writing R Extensions" manual.
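
As a rough illustration of what such a topic could contain, the sketch below writes a minimal curatedCancerData-package.Rd to a temporary directory and checks that it parses. The affiliation text is a placeholder, the choice of sections is an assumption, and in a real package the file would live in the man/ directory.

```r
rd_lines <- c(
  "\\name{curatedCancerData-package}",
  "\\alias{curatedCancerData-package}",
  "\\alias{curatedCancerData}",
  "\\docType{package}",
  "\\title{Cancer Gene Expression Analysis}",
  "\\description{",
  "  The curatedCancerData package provides relevant functions and",
  "  data for gene expression analysis.",
  "}",
  "\\author{",
  "  Benjamin F. Ganzfried (affiliation placeholder), et al.",
  "}"
)

# Written to tempdir() only so the sketch is self-contained
rd_file <- file.path(tempdir(), "curatedCancerData-package.Rd")
writeLines(rd_lines, rd_file)

# parse_Rd() fails if the Rd markup is malformed
rd <- tools::parse_Rd(rd_file)
class(rd)
```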


Duncan Murdoch


Re: [R] colored outliers

2012-01-10 Thread Justin Haynes

woops! see inline.

Hope that helps, and enjoy R.

Justin

On Tue, Jan 10, 2012 at 8:40 AM, Geophagus
wrote:

> Hi Justin,
> thanks a lot for your quick answer.
> If I use your code, all points become red.
> How do you include the sorted and separated four values into the "points"
> argument?
> The variable in your script is called "circ" but it is not used
> anywhere afterwards.
> Here the script again:
>
>
> TOC_NI<-read.csv2("C:/Users/hilliges/Desktop/Master/Daten/Statistik/TOC-NI.csv",
> sep=";", dec=",", encoding="UTF-8")
>

This line just needs trimming; not sure how I missed that in my copy...
Anyway, order() gives the ordering of the given vector, which is used here to
sort the rows of the data.frame. It sorts in ascending order unless you specify
decreasing=TRUE.

circ<-TOC_NI[order(TOC_NI$NI,decreasing=T),][1:4,]
>

and it should work

> plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))
> abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)
> points(NI~TOC,data=TOC_NI,col='red',pch=1,size=3)
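
A self-contained sketch of the whole idea, using simulated data because TOC-NI.csv is not available here (the numbers are invented). Two details differ from the script above and are assumptions about what was intended: points() is given the four-row subset circ rather than the full TOC_NI, and the circle size is set with cex, since size is not a base graphics parameter.

```r
set.seed(1)
# Simulated stand-in for the real TOC-NI.csv data
TOC <- runif(50, 0, 450)
NI  <- 0.1 * TOC + rnorm(50, sd = 5)
TOC_NI <- data.frame(TOC = TOC, NI = NI)

# Rows with the four highest NI values
circ <- TOC_NI[order(TOC_NI$NI, decreasing = TRUE), ][1:4, ]

plot(NI ~ TOC, data = TOC_NI, col = "blue", pch = 16, xlim = c(0, 450))
abline(lm(NI ~ TOC, data = TOC_NI), col = "red", lwd = 3)

# Draw open red circles around only those four points
points(NI ~ TOC, data = circ, col = "red", pch = 1, cex = 3)
```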
>
> Thanks a lot for your help!
> GeO
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/colored-outliers-tp4282207p4282481.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>


[R] Adding Institution-Affiliation to Description File of R Package

2012-01-10 Thread Ben Ganzfried

Hi,

I'm just finishing up an R package and I was wondering if anyone knows how
to include institution name in the "Description File."  That is, my current
Description File looks like:

Package: curatedCancerData
Type: Package
Title: Cancer Gene Expression Analysis
Version: 1.0
Date: 2011-12-24
Author: Benjamin F. Ganzfried, et al.
Maintainer: Benjamin F. Ganzfried 
Description: The curatedCancerData package provides relevant functions and
data for gene expression analysis.
License: GPL (>= 3)


Ideally I would like the final PDF to have footnotes showing the university
affiliations of all the authors (more will be listed than currently are
above).  I would greatly appreciate any suggestions!

Thanks!

Ben


[R] Online 'Beginner's Guide to R' course (with video)

2012-01-10 Thread Highland Statistics Ltd


Apologies for cross-posting


We would like to announce an on-line   'Beginner's Guide to R'  course
With video presentations of theory and solutions



For details:  http://www.highstat.com/statscourse.htm

Kind regards,

Alain Zuur


Re: [R] runif with condition

2012-01-10 Thread AlanM

I have to disagree with what's been posted, but I think some very interesting
points have been addressed.  I'd like to add my two cents.  

Consider the pair {X, 1-X} where X is sampled from a uniform(0,1)
distribution.  The quantity 1- X also comes from a uniform(0,1) distribution
and therefore is probabilistic and not deterministic.

The sum of independent random variables is itself a random variable.  If X1,
X2 & X3 are uniformly distributed, then the distribution of Y = X1 + X2 + X3
can be determined (i.e. Y is probabilistic and NOT deterministic).  Y is a
random variable, but it is correlated with X1, X2 and X3.  The set {X1, X2,
X3, 100 - (X1 + X2 + X3) } contains 4 random variables, however they are
neither independent nor identically distributed.
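
A quick simulation can illustrate both points. The range uniform(0, 25) for X1, X2 and X3 is an assumption made only for this sketch; the thread does not fix it.

```r
set.seed(42)
n <- 1e5

# X and 1 - X: each is uniform(0, 1), but together they are perfectly
# negatively correlated
x <- runif(n)
summary(1 - x)
cor(x, 1 - x)

# Three independent uniforms and the remainder needed to reach 100
x1 <- runif(n, 0, 25)
x2 <- runif(n, 0, 25)
x3 <- runif(n, 0, 25)
x4 <- 100 - (x1 + x2 + x3)

cor(x1, x4)     # negative: x4 is random but not independent of x1
range(x4)       # x4 does not share the 0-25 range of the others
```

In theory the correlation between X1 and the remainder is -1/sqrt(3), since the remainder has three times the variance of X1 and covariance equal to minus the variance of X1.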

 If you are curious, check this out.

Deriving the Probability Density for Sums of Uniform Random Variables
Edward J. Lusk and Haviland Wright
The American Statistician
Vol. 36, No. 2 (May, 1982), pp. 128-130 

Thanks to the OP.  This has become an interesting thread.  

-Alan Mitchell

--
View this message in context: 
http://r.789695.n4.nabble.com/runif-with-condition-tp4278704p4282600.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Sum of a couple of variables of which a few have NA values

2012-01-10 Thread Filoche

x = runif(10)
x[4] = NA
sum(x, na.rm = T)

--
View this message in context: 
http://r.789695.n4.nabble.com/Sum-of-a-couple-of-variables-of-which-a-few-have-NA-values-tp4282448p4282483.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] colored outliers

2012-01-10 Thread Geophagus

Hi Justin,
thanks a lot for your quick answer.
If I use your code, all points become red.
How do you include the sorted and separated four values into the "points"
argument?
The variable in your script is called "circ" but it is not used
anywhere afterwards.
Here the script again:

TOC_NI<-read.csv2("C:/Users/hilliges/Desktop/Master/Daten/Statistik/TOC-NI.csv",
sep=";", dec=",", encoding="UTF-8")
circ<-TOC_NI[order(TOC_NI$NI,decreasing=T),]
plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))
abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)
points(NI~TOC,data=TOC_NI,col='red',pch=1,size=3) 

Thanks a lot for your help!
GeO



--
View this message in context: 
http://r.789695.n4.nabble.com/colored-outliers-tp4282207p4282481.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] question about R 2.15.0

2012-01-10 Thread Duncan Murdoch


On 10/01/2012 11:31 AM, Wang, Jing wrote:

Dear Sir/Madam,

I want to download the source code of the R development version 2.15.0, but I
could only find the versions for Windows and Mac OS. Could you please tell me
how I can download the R 2.15.0 source code? Thank you very much for your
help!


Go to your local mirror of CRAN, and look in the second box "Source Code 
for all Platforms" on the main page.  There is no 2.15.0 (it hasn't been 
released yet); the closest we have is the daily snapshot of the 
development version (R-devel).  You might instead want R-patched, which 
contains bug fixes to 2.14.1, and which will eventually be released as 
2.14.2.


Duncan Murdoch


Re: [R] Sum of a couple of variables of which a few have NA values

2012-01-10 Thread David Winsemius



On Jan 10, 2012, at 11:25 AM, Petra Opic wrote:


Dear everyone,

I have looked all over the internet but I cannot find a way to solve  
my problem.


? rowSums   # has an na.rm argument



In my data I want to sum a couple of variables. Some of these
variables have NA values, and when I add them together, the result is
NA

snip

attach(dat)


You would be well-advised to forget `attach`. Use with(dat, ...)  
instead. It will prevent frustration and embarrassing postings to rhelp.




dat$sum <- var2 + var3 + var4


The plus infix operator does not have an na.rm argument

dat$sum <- rowSums(dat[ , 3:5] , na.rm=TRUE)  # columns 3:5 are var2, var3, var4
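
Selecting the columns by name avoids depending on their positions. A minimal sketch using a data frame built the same way as in the question (the seed is added only to make it reproducible):

```r
set.seed(1)
dat <- data.frame(
  id   = gl(5, 1),
  var1 = rnorm(5, 10),
  var2 = rnorm(5, 7),
  var3 = rnorm(5, 6),
  var4 = rnorm(5, 3),
  var5 = rnorm(5, 8)
)
dat[3, 3] <- NA   # var2, row 3
dat[4, 5] <- NA   # var4, row 4

# Sum only var2, var3 and var4, skipping NA values within each row
dat$sum <- rowSums(dat[, c("var2", "var3", "var4")], na.rm = TRUE)
dat
```

Note that with na.rm=TRUE a row in which all three values are NA gets a sum of 0 rather than NA.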

--

David Winsemius, MD
West Hartford, CT


[R] Fwd: Sum of a couple of variables of which a few have NA values

2012-01-10 Thread Ivan Calandra


Hi Petra,

Try this:
dat$sums <- rowSums(dat[3:5], na.rm=TRUE)

I think this should do what you're looking for
HTH,
Ivan

-------- Original Message --------
Subject: [R] Sum of a couple of variables of which a few have NA values
Date:    Tue, 10 Jan 2012 17:25:21 +0100
From:    Petra Opic
To:      r-help@r-project.org



Dear everyone,

I have looked all over the internet but I cannot find a way to solve my problem.

In my data I want to sum a couple of variables. Some of these
variables have NA values, and when I add them together, the result is
NA

dat<- data.frame(
id = gl(5,1),
var1 = rnorm(5, 10),
var2 = rnorm(5, 7),
var3 = rnorm(5, 6),
var4 = rnorm(5, 3),
var5 = rnorm(5, 8)
)
dat[3,3]<- NA
dat[4,5]<- NA


> dat
  id      var1     var2     var3     var4     var5
1  1  9.371328 7.830814 5.032541 3.491053 7.626418
2  2 10.413516 7.333630 6.557178 1.465597 8.591770
3  3 10.967073       NA 6.674079 3.946451 7.251263
4  4  9.900380 7.727111 5.059698       NA 6.632962
5  5  9.191068 7.901271 6.652410 2.734856 8.484757

attach(dat)

# I think I'm doing this wrong, but I don't know what command to use
dat$sum<- var2 + var3 + var4

> dat
  id      var1     var2     var3     var4     var5      sum
1  1  9.371328 7.830814 5.032541 3.491053 7.626418 16.35441
2  2 10.413516 7.333630 6.557178 1.465597 8.591770 15.35640
3  3 10.967073       NA 6.674079 3.946451 7.251263       NA
4  4  9.900380 7.727111 5.059698       NA 6.632962       NA
5  5  9.191068 7.901271 6.652410 2.734856 8.484757 17.28854

I would like to omit the values of NA and just sum the rest.

I tried to use rowSums() but that sums an entire row and I only need a
few variables.

Does anyone know how to do this?

Thanks in advance,
Petra





--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS 5561 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
ivan.calan...@u-bourgogne.fr


Re: [R] glmmPQL and predict

2012-01-10 Thread Ben Bolker

Mike Harwood writes:

> Is the labeling/naming of levels in the documentation for the
> predict.glmmPQL function "backwards"?  The documentation states "Level
> values increase from outermost to innermost grouping, with level zero
> corresponding to the population predictions".  Taking the sample in
> the documentation:
> 
> fit <- glmmPQL(y ~ trt + I(week > 2), random = ~1 |  ID,
>family = binomial, data = bacteria)
> 
> > head(predict(fit, bacteria, level = 0, type="response"))
> [1] 0.9680779 0.9680779 0.8587270 0.8587270 0.9344832 0.9344832
> > head(predict(fit, bacteria, level = 1, type="response"))
>   X01   X01   X01   X01   X02   X02
> 0.9828449 0.9828449 0.9198935 0.9198935 0.9050782 0.9050782
> > head(predict(fit, bacteria, type="response")) ## population prediction
>   X01   X01   X01   X01   X02   X02
> 0.9828449 0.9828449 0.9198935 0.9198935 0.9050782 0.9050782
> 
> The returned values for level=1 and with level left unspecified match, which
> is not what I expected based upon the documentation.

  Well, the documentation says: "Defaults to the highest or innermost level of
grouping", which is level 1 in this case -- right?

> Exponentiating
> the intercept coefficients from the fitted regression, the level=0
> values match when the random effect intercept is included

  Do you mean "is NOT included" here?

  0.9680779 (no random effect, below) matches the level=0 prediction above
  0.9828449 (include random effect, below) matches the level=1 prediction,
which is also the default prediction, above.

> 
> > 1/(1+exp(-3.412014)) ## only the fixed effect
> [1] 0.9680779
> > 1/(1+exp(-1*(3.412014+0.63614382))) ## fixed and random effect intercepts
> [1] 0.9828449

  This all matches my expectations.  If your expectations still go
in the other direction, could you explain in more detail?
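
The hand calculations above are the inverse logit of the linear predictor, which base R provides as plogis(). A small sketch using only the two coefficients quoted in this thread (3.412014 for the fixed intercept and 0.63614382 for the random intercept of the first subject); the variable names are made up here.

```r
fixed_intercept  <- 3.412014     # fixed-effect intercept quoted above
random_intercept <- 0.63614382   # random intercept for subject X01 quoted above

# level = 0: population prediction, fixed effects only
p_level0 <- plogis(fixed_intercept)

# level = 1: subject-level prediction, fixed plus random intercept
p_level1 <- plogis(fixed_intercept + random_intercept)

round(c(level0 = p_level0, level1 = p_level1), 7)
```

These should reproduce the 0.9680779 and 0.9828449 reported for the first row at level 0 and level 1 respectively.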

  By the way, I recommend r-sig-mixed-mod...@r-project.org for
mixed model questions in general ...

  Ben Bolker


[R] Problem with segmented

2012-01-10 Thread Filoche

Hi everyone.

I'm trying to use the segmented function with the following data:


For instance, I use the segmented package as follows:

myreg2 = lm(xy$y ~ xy$x)
mysegmented = segmented(myreg2, seg.Z=~x, psi=c(245000), control =
seg.control(display=FALSE))

Which gives me the following error:


As a break point, a starting guess of 245000 seems fair.

Does anyone have an idea why I'm getting this error?

Regards,
Phil

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-segmented-tp4282398p4282398.html
Sent from the R help mailing list archive at Nabble.com.


[R] question about R 2.15.0

2012-01-10 Thread Wang, Jing

Dear Sir/Madam,

I want to download the source code of the R development version 2.15.0, but I
could only find the versions for Windows and Mac OS. Could you please tell me
how I can download the R 2.15.0 source code? Thank you very much for your
help!

Best,
Jing Wang



[R] Sum of a couple of variables of which a few have NA values

2012-01-10 Thread Petra Opic

Dear everyone,

I have looked all over the internet but I cannot find a way to solve my problem.

In my data I want to sum a couple of variables. Some of these
variables have NA values, and when I add them together, the result is
NA

dat <- data.frame(
id = gl(5,1),
var1 = rnorm(5, 10),
var2 = rnorm(5, 7),
var3 = rnorm(5, 6),
var4 = rnorm(5, 3),
var5 = rnorm(5, 8)
)
dat[3,3] <- NA
dat[4,5] <- NA

> dat
  id      var1     var2     var3     var4     var5
1  1  9.371328 7.830814 5.032541 3.491053 7.626418
2  2 10.413516 7.333630 6.557178 1.465597 8.591770
3  3 10.967073       NA 6.674079 3.946451 7.251263
4  4  9.900380 7.727111 5.059698       NA 6.632962
5  5  9.191068 7.901271 6.652410 2.734856 8.484757

attach(dat)

# I think I'm doing this wrong, but I don't know what command to use
dat$sum <- var2 + var3 + var4

> dat
  id      var1     var2     var3     var4     var5      sum
1  1  9.371328 7.830814 5.032541 3.491053 7.626418 16.35441
2  2 10.413516 7.333630 6.557178 1.465597 8.591770 15.35640
3  3 10.967073       NA 6.674079 3.946451 7.251263       NA
4  4  9.900380 7.727111 5.059698       NA 6.632962       NA
5  5  9.191068 7.901271 6.652410 2.734856 8.484757 17.28854

I would like to omit the values of NA and just sum the rest.

I tried to use rowSums() but that sums an entire row and I only need a
few variables.

Does anyone know how to do this?

Thanks in advance,
Petra


[R] Error message in vegan ordistep

2012-01-10 Thread Nevil Amos

I am getting the following error message in ordistep.  I have a number of
similarly structured datasets that I run through ordistep in a loop, and the
message only occurs for some of the datasets.

I cannot include a reproducible sample: the specific datasets where this is
occurring are fairly large and there are several PCNMs in the rhs of the
formula.

Thanks for any pointers that may help me track down the cause of the error.

Nevil Amos

Error in if (aod[1, 5] <= Pin) { : missing value where TRUE/FALSE needed
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> traceback()
9: ordistep(myrda0, scope = formula(myrda1), direction = "both", 
   Pin = 0.05, Pout = 0.1) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
8: eval(expr, envir, enclos) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
7: eval(expr, pf) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
6: withVisible(eval(expr, pf)) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
5: evalVis(expr) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
4: capture.output(ordistep(myrda0, scope = formula(myrda1), direction = "both", 
   Pin = 0.05, Pout = 0.1)) at RDAPARTIALSexandAgeConnectandGEOGraphy2.R#86
3: eval.with.vis(expr, envir, enclos)
2: eval.with.vis(ei, envir)
1: source("~/Documents/Dropbox/thesis/CH3/Analysis/RDAPARTIALSexandAgeConnectandGEOGraphy2.R")
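
The message itself is what R raises whenever the condition of an if() evaluates to NA. The sketch below reproduces that without vegan; p_value is a hypothetical stand-in for aod[1, 5], and why that value is NA in the failing datasets is not something this snippet can determine.

```r
p_value <- NA_real_   # hypothetical stand-in for aod[1, 5]
Pin <- 0.05

# Comparing NA with a number gives NA, and if (NA) is an error
res <- tryCatch(
  if (p_value <= Pin) "enter term" else "skip term",
  error = function(e) conditionMessage(e)
)
res

# isTRUE() treats NA as "not TRUE", so the branch can be taken safely
guarded <- if (isTRUE(p_value <= Pin)) "enter term" else "skip term"
guarded
```

Checking for NA in the permutation test results of the affected datasets before they reach the comparison may therefore be a useful first step.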



> 
> print(myrda1)
Call: rda(formula = mygenind@tab ~ pcnmTRE_25_100_CS25 + pcnmTRE_25_10_CS25 + 
pcnmTRE_25_2_CS25 +
pcnmTRE_25_5_CS25 + mydata$TreeCov + mydata$Hab_Config + pcnmEYR_EO_100_CS25 +
pcnmEYR_EO_5000_CS25 + pcnmEYR_TH_10_CS25 + pcnmEYR_TH_2_CS25 + mydata$Site_No 
+ mydata$Landscape
+ Condition(pcnmCS_NULL + mydata$LAT.x + mydata$LONG.x), na.action = "na.omit")

              Inertia Proportion Rank
Total          1.8110     1.0000
Conditional    0.8681     0.4793   32
Constrained    0.0000     0.0000    0
Unconstrained  0.9429     0.5207   29
Inertia is variance 
Some constraints were aliased because they were collinear (redundant)

Eigenvalues for unconstrained axes:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 
0.16008 0.14733 0.12183 0.09054 0.07380 0.06971 0.05578 0.04215 
(Showed only 8 of all 29 unconstrained eigenvalues) 
 
 


Re: [R] colored outliers

2012-01-10 Thread Justin Haynes

# find top 4 points
circ
<- 
TOC_NI[order(TOC_NI$NI,decreasing=T),][1:4,]TOC_NI[order(TOC_NI$NI,decreasing=T),][1:4,]

# add them to your plot!
plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))
abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)
points(NI~TOC,data=TOC_NI,col='red',pch=1,size=3)



Justin

On Tue, Jan 10, 2012 at 7:11 AM, Geophagus
wrote:

> Hi @ all,
> I have question how to mark significant outliers in R.
> This is my very simple script to plot a regression:
>
> TOC_NI<-read.csv2("C:/Users/XYZ/Desktop/Master/Daten/Statistik/TOC-NI.csv",
> sep=";", dec=",", encoding="UTF-8")
> plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))
> abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)
> summary(lm(NI~TOC,data=TOC_NI))
>
> The result is the following pic:
> http://r.789695.n4.nabble.com/file/n4282207/nickel_TOC_5f.png
> nickel_TOC_5f.png
>
> Now I want to make small red circles around the four highest values of Ni.
> Does anyone has an idea how to do that?
> Thanks a lot!
>
> Best Regards
> Geophagus
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/colored-outliers-tp4282207p4282207.html
> Sent from the R help mailing list archive at Nabble.com.
>
>


[R] rjags installation trouble

2012-01-10 Thread Ben Bolker

  Trying to install latest rjags (3-5) from CRAN with JAGS 3.2.0
installed on Ubuntu 10.04, with r-devel ... the bottom line is that it
fails while loading with

 /libs/rjags.so: undefined symbol: _ZN7Console15checkAdaptationERb

  Has anyone else seen this or is it a glitch somewhere in my system?

  thanks
Ben Bolker

==
bolker@ubuntu-10-new:~/R/pkgs/rjags$ jags
Welcome to JAGS 3.2.0 on Tue Jan 10 10:38:31 2012
JAGS is free software and comes with ABSOLUTELY NO WARRANTY
Loading module: basemod: ok
Loading module: bugs: ok
.

===

install.packages("rjags") starts out fine:

Installing package(s) into
‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'http://probability.ca/cran/src/contrib/rjags_3-5.tar.gz'
Content type 'application/x-gzip' length 66429 bytes (64 Kb)
opened URL
==
downloaded 64 Kb

* installing *source* package ‘rjags’ ...
** package ‘rjags’ successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... /usr/bin/jags

  [snip snip]

** libs
g++ -I/usr/local/lib/R/include -DNDEBUG -I/usr/include/JAGS
-I/usr/local/include-fpic  -g -O2  -c jags.cc -o jags.o
g++ -I/usr/local/lib/R/include -DNDEBUG -I/usr/include/JAGS
-I/usr/local/include-fpic  -g -O2  -c parallel.cc -o parallel.o
g++ -shared -L/usr/local/lib -o rjags.so jags.o parallel.o -L/usr/lib -ljags
installing to /mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs
** R

  [snip snip]

  but fails at:

** building package indices
Error : .onLoad failed in loadNamespace() for 'rjags', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object
'/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs/rjags.so':

/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs/rjags.so:
undefined symbol: _ZN7Console15checkAdaptationERb
ERROR: installing package indices failed
* removing ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags’
* restoring previous
‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags’

The downloaded source packages are in
‘/tmp/RtmpSKHIz5/downloaded_packages’
Warning message:
In install.packages("rjags") :
  installation of package ‘rjags’ had non-zero exit status
> sessionInfo()
R Under development (unstable) (2012-01-01 r58032)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_CA.utf8   LC_NUMERIC=C
 [3] LC_TIME=en_CA.utf8LC_COLLATE=en_CA.utf8
 [5] LC_MONETARY=en_CA.utf8LC_MESSAGES=en_CA.utf8
 [7] LC_PAPER=CLC_NAME=C
 [9] LC_ADDRESS=C  LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.0


[R] colored outliers

2012-01-10 Thread Geophagus

Hi @ all,
I have question how to mark significant outliers in R.
This is my very simple script to plot a regression:

TOC_NI<-read.csv2("C:/Users/XYZ/Desktop/Master/Daten/Statistik/TOC-NI.csv",
sep=";", dec=",", encoding="UTF-8")
plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))
abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)
summary(lm(NI~TOC,data=TOC_NI))

The result is the following pic:
http://r.789695.n4.nabble.com/file/n4282207/nickel_TOC_5f.png
nickel_TOC_5f.png 

Now I want to make small red circles around the four highest values of Ni.
Does anyone has an idea how to do that?
Thanks a lot!

Best Regards 
Geophagus




--
View this message in context: 
http://r.789695.n4.nabble.com/colored-outliers-tp4282207p4282207.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] problem installing packages

2012-01-10 Thread David Winsemius



On Jan 10, 2012, at 8:45 AM, Gavin Blackburn wrote:



The packages you require might not have been updated yet. You could  
contact the package admin.


That would not be the first option. The first step would be to check whether
your mirror is deficient by looking at another mirror. You should also look
at the Package Checks page: http://cran.r-project.org/web/checks/check_summary.html


 There is a version of vegan on the mirror I am using and I have a  
relatively recent version of R 2.14.1 Patched (with a Mac). And there  
are no errors reported for current versions. So leave the package  
maintainer in peace, since he has been doing his job.




Gavin.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On Behalf Of natalia norden

Sent: 10 January 2012 13:35
To: r-help@r-project.org
Subject: [R] problem installing packages

Hello,

I was using version 2.13.2 and I have just downloaded the latest  
version 2.14.1. However, I'm trying to install the packages I was  
using and when I look for them in the packages list, I can't find  
many in the CRAN binaries (e.g. "vegan"). I do find them in the CRAN  
sources but the installation fails. I tried downloading the version  
2.14.0 and I had the same problem. I re-installed the old version,  
and now it works again. Is this a problem with 2.14?

Thank you for your help.
Natalia Norden




--
David Winsemius, MD
West Hartford, CT


[R] Lapack routine dgesv: system is exactly singular

2012-01-10 Thread Terry Therneau

I was sent a copy of the data on request.  A quick look shows that 

> range(days.alive[censored==0])
[1]    0 1825
> range(days.alive[censored==1])
[1] 1826 1826

 The original call of survdiff(Surv(days.alive, censored) ~ group) will
assume that censored=1 corresponds to deaths and 0 to alive; from the
above this is almost certainly backwards.  If instead we force 0 to be
the "yes they're dead" group the results look better.

> survdiff(Surv(days.alive, censored==0) ~ group, f2)
Call:
survdiff(formula = Surv(days.alive, censored == 0) ~ group, data = f2)

                          N Observed Expected (O-E)^2/E (O-E)^2/V
group=PRI_CAS_5_NODU   3326     1129     1745     217.7       281
group=SEC_CAS_5_NODUP 13469     6731     6115      62.1       281

 Chisq= 281  on 1 degrees of freedom, p= 0 

In the original setup the test statistic had value 0 + roundoff error
and std = 0 + roundoff error due to all the "deaths" being tied on
exactly the same day, and you were getting the matrix version of a 0/0
error.
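
A minimal sketch of the recoding step, on made-up data: the column names
days.alive, censored and group follow the post, but the values and the helper
column died are hypothetical. Surv() treats 1/TRUE in its second argument as
"the event happened", so a censoring flag has to be inverted before use.

```r
library(survival)

# Hypothetical toy data; 1826 days is the end of follow-up, as in the post.
f2 <- data.frame(
  days.alive = c(120, 400, 1826, 1826, 75, 1826, 900, 1826),
  censored   = c(0, 0, 1, 1, 0, 1, 0, 1),  # 1 = still alive at end of follow-up
  group      = rep(c("A", "B"), each = 4)
)

# Sanity check: the censored subjects should all sit at the maximum time.
tapply(f2$days.alive, f2$censored, range)

# Invert the flag so that 1 means "died" (hypothetical helper column).
f2$died <- as.integer(f2$censored == 0)

# Log-rank test with the event indicator coded the way Surv() expects.
fit <- survdiff(Surv(days.alive, died) ~ group, data = f2)
fit
```

Looking at the range of follow-up times within each status value, as above, is
a quick way to see which code marks the censored observations.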

  Terry T

--- begin included message 
I have a problem with this error. I have searched the archives and found
previous discussion about this, but I cannot understand how the
explanations apply to what I am trying to do.

 

I am trying to do a log-rank survival analysis. I have included tables and
the str command. Is it a factor/integer problem? If so, how do I correct
this? All my attempts to recode the data have failed.

 

> survdiff(Surv(f2$days.alive , f2$censored)~group, data=f2)
Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) : 
  Lapack routine dgesv: system is exactly singular


Re: [R] strange Sys.Date() side effect

2012-01-10 Thread Duncan Murdoch


On 12-01-10 8:04 AM, Czerminski, Ryszard wrote:

Any ideas what is the problem with this code?


N<- 2; c(Sys.Date(), sprintf('N = %d', N))

[1] "2012-01-10" NA
Warning message:
In as.POSIXlt.Date(x) : NAs introduced by coercion


You are trying to create a vector combining a Date object and a 
character object.  R is trying to coerce both objects to dates, and that 
fails.


You probably want two strings; so convert the date explicitly:

c(as.character(Sys.Date()), sprintf('N = %d', N))

(or use format or some other function to convert the date to a string.)
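
A short sketch of the format() route, for illustration only; the variable
names today and out and the chosen date format are assumptions, not part of
the original question.

```r
N <- 2
today <- Sys.Date()

# Turn the Date into a string first, so c() only ever sees character values.
out <- c(format(today, "%Y-%m-%d"), sprintf("N = %d", N))
out

# paste() does the conversion implicitly if a single string is enough.
label <- paste(today, sprintf("N = %d", N))
label
```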

Duncan Murdoch



Best regards,
Ryszard

Ryszard Czerminski
AstraZeneca Pharmaceuticals LP
35 Gatehouse Drive
Waltham, MA 02451
USA
781-839-4304
ryszard.czermin...@astrazeneca.com


--
Confidentiality Notice: This message is private and may ...{{dropped:11}}


Re: [R] problem installing packages

2012-01-10 Thread Gavin Blackburn


The packages you require might not have been updated yet. You could contact the 
package admin.

Gavin.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of natalia norden
Sent: 10 January 2012 13:35
To: r-help@r-project.org
Subject: [R] problem installing packages

Hello, 

I was using version 2.13.2 and I have just downloaded the latest version 
2.14.1. However, I'm trying to install the packages I was using and when I look 
for them in the packages list, I can't find many in the CRAN binaries (e.g. 
"vegan"). I do find them in the CRAN sources but the installation fails. I 
tried downloading the version 2.14.0 and I had the same problem. I re-installed 
the old version, and now it works again. Is this a problem with 2.14?
Thank you for your help.
Natalia Norden



Natalia Norden
Profesor Asistente
Departamento de Ecología y Territorio
Facultad de Estudios Ambientales y Rurales Pontificia Universidad Javeriana 
Bogotá, Colombia
Tel: 320 83 20  Ext: 2448
www.phylodiversity.net/nnorden/




Re: [R] Propensity score matching in R using Classification tree method

2012-01-10 Thread Frank Harrell

A single tree will undermatch subjects.
Frank

ardsiiitmg wrote
> 
> I am able to calculate the propensity score using the classification tree
> method, but when I try to do the PS matching I get an error (Error in
> Match(Y = Y, Tr = Tr, X = ps0) : length(Y) != length(Tr))
> 
> Propensity score matching:
> library("rgenoud")
> library("Matching")
> data("Passport")
> attach("Passport")
> 
> Y<-Passport$A3
> Tr<-Passport$fserv_cd
> rr1<-Match(Y=Y, Tr = Tr, X= glm1$fitted)
> [in above command i am getting Error in Match(Y = Y, Tr = Tr, X = ps0) :
> length(Y) != length(Tr)]
> 
> Matchbalance(fserv_cd~ A4,match.out=rr1,nboots=1000,data=Passport)
> 
> Please suggest the right approach for propensity score matching using
> classification trees (with code, if possible).
> ps0 holds the propensity score values.
> 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Propensity-score-matching-in-R-using-Classification-tree-method-tp4281605p4281981.html
Sent from the R help mailing list archive at Nabble.com.


[R] Propensity score matching in R using Classification tree method

2012-01-10 Thread ardsiiitmg

I am able to calculate the propensity score using the classification tree
method, but when I try to do the PS matching I get an error (Error in
Match(Y = Y, Tr = Tr, X = ps0) : length(Y) != length(Tr))

Propensity score matching:
library("rgenoud")
library("Matching")
data("Passport")
attach("Passport")

Y<-Passport$A3
Tr<-Passport$fserv_cd
rr1<-Match(Y=Y, Tr = Tr, X= glm1$fitted)
[in above command i am getting Error in Match(Y = Y, Tr = Tr, X = ps0) :
length(Y) != length(Tr)]

Matchbalance(fserv_cd~ A4,match.out=rr1,nboots=1000,data=Passport)

Please suggest the right approach for propensity score matching using
classification trees (with code, if possible).
ps0 holds the propensity score values.
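
One thing that may be worth checking first, sketched under assumptions: a
length mismatch like this can appear when a column name is misspelled, because
`$` on a data frame returns NULL (length 0) for a column that does not exist.
The data, the column names x1, x2, treat and y, and the use of rpart for the
tree are all hypothetical stand-ins, not the poster's setup.

```r
library(rpart)

# Hypothetical data: two covariates, a binary treatment, a continuous outcome.
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$treat <- rbinom(n, 1, plogis(0.8 * dat$x1))
dat$y <- 1 + 0.5 * dat$treat + dat$x1 + rnorm(n)

# A misspelled column silently gives NULL, which has length 0.
length(dat$no_such_column)

# Propensity scores from a single classification tree.
tree <- rpart(factor(treat) ~ x1 + x2, data = dat, method = "class")
ps <- predict(tree, newdata = dat, type = "prob")[, "1"]

# Check the three inputs line up before handing them to a matching routine.
Y  <- dat$y
Tr <- dat$treat
stopifnot(!is.null(Y), !is.null(Tr),
          length(Y) == length(Tr), length(Tr) == length(ps))
```

If the checks pass, Y, Tr and ps have matching lengths and could be passed on
as the Y, Tr and X arguments of Match().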

--
View this message in context: 
http://r.789695.n4.nabble.com/Propensity-score-matching-in-R-using-Classification-tree-method-tp4281605p4281605.html
Sent from the R help mailing list archive at Nabble.com.


[R] problem installing packages

2012-01-10 Thread natalia norden

Hello, 

I was using version 2.13.2 and I have just downloaded the latest version
2.14.1. However, I'm trying to install the packages I was using and when I
look for them in the packages list, I can't find many in the CRAN binaries
(e.g. "vegan"). I do find them in the CRAN sources but the installation
fails. I tried downloading the version 2.14.0 and I had the same problem. I
re-installed the old version, and now it works again. Is this a problem with
2.14?
Thank you for your help.
Natalia Norden



Natalia Norden
Profesor Asistente
Departamento de Ecología y Territorio
Facultad de Estudios Ambientales y Rurales
Pontificia Universidad Javeriana
Bogotá, Colombia
Tel: 320 83 20  Ext: 2448
www.phylodiversity.net/nnorden/




[R] strange Sys.Date() side effect

2012-01-10 Thread Czerminski, Ryszard

Any ideas what is the problem with this code?

> N <- 2; c(Sys.Date(), sprintf('N = %d', N))
[1] "2012-01-10" NA
Warning message:
In as.POSIXlt.Date(x) : NAs introduced by coercion

Best regards,
Ryszard

Ryszard Czerminski
AstraZeneca Pharmaceuticals LP
35 Gatehouse Drive
Waltham, MA 02451
USA
781-839-4304
ryszard.czermin...@astrazeneca.com


--
Confidentiality Notice: This message is private and may ...{{dropped:11}}


[R] rworldmap: xlim, ylim do not change plotting region

2012-01-10 Thread Dan Bebber

Specifying xlim or ylim in the mapCountryData function of the rworldmap package 
does not alter the plotting region on my system.

#Using the example from rworldmap
library(rworldmap)
mapCountryData() #uses the sample dataset
mapCountryData(ylim = c(-45,45)) #makes no difference to the plot

R 2.14.1 on Linux Mint 11 64bit.

Thanks,
Dan

Dr Dan Bebber
 
Head of Climate Change Research, Earthwatch Institute, 256 Banbury Road, Oxford 
OX2 7DE, UK

Research Fellow in Biology, St. Peter's College, University of Oxford
 
Tel. +44 (0)1865 318842, Mob. +44 (0)7729 167502, Fax. +44 (0)1865 318824, 
skype danbebber, email dbeb...@earthwatch.org.uk, web 
http://www.earthwatch.org/europe
 


[R] "tau + h > 1: error in summary.rq"

2012-01-10 Thread Julia Lira


Dear all,

I am doing a simulation for my model that works when I use only the rq() 
command. However, since I need to use the varcov matrix for my Wald test, I 
need to compute summary(rq(), cov=TRUE). But the simulation does not work 
because of the error: tau + h > 1:  error in summary.rq

I tried to use: 

if (tau + h > 1) 
  stop("tau + h > 1:  error in summary.rq")

But the Hall-Sheather bandwidth is very high because I also vary the number of 
observations from 40 to 300.
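
For reference, a rough sketch of why the smallest samples are the ones that
trip this check. The function hs_bandwidth is a hypothetical base-R
transcription of the Hall-Sheather rule, written here as an assumption;
quantreg's own bandwidth.rq should be treated as authoritative. The value
tau = 0.9 is only an example.

```r
# Hall-Sheather bandwidth (assumed form; compare with quantreg::bandwidth.rq).
hs_bandwidth <- function(tau, n, alpha = 0.05) {
  x0 <- qnorm(tau)   # normal quantile at tau
  f0 <- dnorm(x0)    # normal density at that quantile
  n^(-1/3) * qnorm(1 - alpha / 2)^(2/3) *
    ((1.5 * f0^2) / (2 * x0^2 + 1))^(1/3)
}

tau <- 0.9
n_values <- c(40, 100, 300)
h <- hs_bandwidth(tau, n_values)

# tau + h is the quantity that summary.rq compares against 1.
data.frame(n = n_values, h = h, tau_plus_h = tau + h, too_wide = tau + h > 1)
```

Under this rule h shrinks like n^(-1/3), so the check is most likely to fail
for the smallest sample sizes in the simulation.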

Is there anyone that could help me?

Thanks in advance!

All the best,

Julia



 
  

Re: [R] [R-sig-Geo] Spatial data, rpoispp, using window with fixed radius?

2012-01-10 Thread Mathieu Rajerison

Hi Adrian,

I cannot see any reference to the scanmeasure and deviation functions in
the current spatstat manual: are these included in the newest version of
spatstat?

Mathieu

2012/1/9 

>
> The following message appeared on R-help but this discussion should be
> moved to R-sig-geo
>
> On 07/01/12 02:17, herbert8...@gmx.de wrote:
>
> > I was searching through the spatstat manual in order to find a function
> > to simulate a Poisson pattern only within a fixed radius (circular moving
> > window) around individual points. If points are distributed
> > heterogeneously over a large area this may help to only assess deviation
> > from CSR within the window and thus does not require additional
> > information on a covariate.
> > I could not find such a function in spatstat. Can please anyone help?
>
> What do you want to happen if two of the circles overlap? Should the
> density of random points be twice as high?
>
> If the answer is 'yes' then do the following (where X is your original
> point pattern of centres, and 'r' is the radius of the circles, and
> 'lambda' is the intensity of random points per unit area in each circle)
>
>   V <- scanmeasure(X, r)
>   V <- eval.im(lambda * V)
>   Y <- rpoispp(V)
>
> If the answer is 'no' then do
>   W <- dilation(X, r)
>   Y <- rpoispp(lambda, win=W)
>
> Adrian Baddeley
>
> ___
> R-sig-Geo mailing list
> r-sig-...@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>


Re: [R] Extracting Data from SQL Server

2012-01-10 Thread Ajay Askoolum

try:

SELECT a.UNIQUE_ID, 
   a.diag01 
  from LoadPUS a
left join CVD_ICD10 b
on a.diag01 = b.[ICD-10 Codes] 
   or a.diag02 = b.[ICD-10 Codes] 
   or a.diag03 = b.[ICD-10 Codes]

I am not sure why your table name CVD_ICD10 has a suffix $.




 From: Jeff Newmiller 
To: dthomas ; r-help@r-project.org 
Sent: Tuesday, 10 January 2012, 8:00
Subject: Re: [R] Extracting Data from SQL Server

This is OT here. However, you might want to investigate the UNIQUE keyword in 
the SQL Server documentation for SELECT.
---
Jeff Newmiller                        The     .       .  Go Live...
DCN:        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

dthomas  wrote:

>Hi, 
>
>I am new to R (and rusty on SQL!) and I'm trying to extract records
>from a
>SQL server database. I have a table of patient records (LoadPUS) which
>have
>three code columns which i want to evaluate against a list of
>particular
>codes (CVD_ICD$ table). Given the size of the patient table I want to
>restrict the data I pull into R to the data I only want to analyse so I
>am
>using SQL to do this. The code i have is as follows:
>
>library(RODBC)
>channel<-odbcConnect("NatCollections")
>query<-"SELECT UNIQUE_ID, diag01 from LoadPUS 
>WHERE (diag01 IN (SELECT [ICD-10 Codes] From CVD_ICD10$)) OR (diag02 IN
>(SELECT [ICD-10 Codes] From CVD_ICD10$))
>OR (diag03 IN (SELECT [ICD-10 Codes] From CVD_ICD10$))"
>
>This returns duplicate values, I don't want to hardcode the values
>because
>it is quite a long list. Running the "IN" function just for "diag01"
>returns
>the correct number of records, however when combining with another "IN"
>function it doesn't return the correct number of records. Can you see
>where
>my SQL is incorrect or is there another way of doing this?
>
>Much appreciated,
>D
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Extracting-Data-from-SQL-Server-tp4281000p4281000.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Unexpected results using the oneway_test in the coin package

2012-01-10 Thread Mark Difford

On Jan 09, 2012 at 11:48am Christoph Liedtke wrote:

> I should be detecting some non-significance between groups I and III at
> least, but the test comes back with 
> extremely low p-values.  Where am I going wrong?

Nowhere, I think. This does seem to be an error in coin. You should send
your example to the maintainers of the package. Apart from the visual you
have provided, other reasons for thinking that this is an error are the
following.

First, if you redo the analysis excluding habitat II, then the contrast is
not significant, as expected. Secondly, if you repeat the full analysis
using package nparcomp then you get the results you are expecting, based on
the graphical representation of the data. See the examples below.

## drop habitat == II
library(coin)      # oneway_test(), trafo(), approximate(), pvalue()
library(multcomp)  # contrMat()
NDWD <- oneway_test(breeding ~ habitat,
    data = droplevels(subset(mydata, habitat != "II")),
    ytrafo = function(data) trafo(data, numeric_trafo = rank),
    xtrafo = function(data) trafo(data, factor_trafo = function(x)
        model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))),
    teststat = "max", distribution = approximate(B = 90))

print(NDWD)
print(pvalue(NDWD, method = "single-step"))

## use nparcomp
library(nparcomp)
npar <- nparcomp(breeding ~ habitat, data = mydata, type = "Tukey")
npar

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/Unexpected-results-using-the-oneway-test-in-the-coin-package-tp4278371p4281329.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] error in Recursive

2012-01-10 Thread Berend Hasselman


arunkumar wrote
> 
> Hi
>  
> I need help in the recursive problem.  this is my code
> 
> #Generate two random Numbers
> minval=20
> maxval=100
> cutoffValue=50
> 
> optVal<- function(cutoffValue,minval,maxval)
> {
>   x=runif(2)
>   x=x*cutoffValue
>   for( i in 1:2)
>   {
>     if(x[i] < 30 || x[i] >60)   # checking it falls between the range
>     {
>       optVal(cutoffValue,minval,maxval)
>     }
>   }
>   return(x)
> }
> 
> optVal(40,20,60)
> 
> I'm getting Error like this
> *Error: evaluation nested too deeply: infinite recursion /
> options(expressions=)?*
> 

You can easily find out what is wrong by inserting a print(x) just before
the if(...).
Why are you not using the minval and maxval arguments in your function
optVal?

Change the line with the if to

  if(x[i] < minval || x[i] > maxval)   # checking it falls between the range

and call optVal with a smaller value for minval 

optVal(40,20,60) 

and you will avoid the error message. Your minval of 30 is simply too high
i.e. you are rejecting an x too quickly.
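
As an alternative sketch that avoids recursion altogether, the draw can be
repeated in a loop until both values fall in range. This is not the original
poster's code: the max_tries argument and the up-front sanity check are
assumptions added for illustration.

```r
optVal <- function(cutoffValue, minval, maxval, max_tries = 10000) {
  # Draws lie in [0, cutoffValue], so nothing can be accepted if minval is larger.
  stopifnot(minval <= cutoffValue)
  for (k in seq_len(max_tries)) {
    x <- runif(2) * cutoffValue
    # Accept only when both values are inside [minval, maxval].
    if (all(x >= minval & x <= maxval)) return(x)
  }
  stop("no acceptable draw after ", max_tries, " tries")
}

set.seed(1)
x <- optVal(40, 20, 60)
x
```

Unlike the recursive version, this returns the accepted draw itself, and the
call depth stays constant however many draws are rejected.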

Berend





--
View this message in context: 
http://r.789695.n4.nabble.com/error-in-Recursive-tp4281052p4281449.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Extracting Data from SQL Server

2012-01-10 Thread Jeff Newmiller

This is OT here. However, you might want to investigate the UNIQUE keyword in 
the SQL Server documentation for SELECT.
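
In SQL Server the keyword that drops duplicate rows from a SELECT is DISTINCT.
A sketch of how the query from the question could be assembled in R with it,
assuming the duplicates are repeated (UNIQUE_ID, diag01) rows. Table and
column names are taken from the post; the R variable names are hypothetical,
and the database calls are left commented out because they need a live
connection.

```r
# Sub-select that lists the codes of interest (names as given in the post).
codes_subquery <- "SELECT [ICD-10 Codes] FROM CVD_ICD10$"
diag_cols <- c("diag01", "diag02", "diag03")

# One "column IN (sub-select)" condition per diagnosis column, joined by OR.
where <- paste(sprintf("%s IN (%s)", diag_cols, codes_subquery),
               collapse = "\n    OR ")

# DISTINCT collapses rows that are identical in every selected column.
query <- paste0("SELECT DISTINCT UNIQUE_ID, diag01\n  FROM LoadPUS\n WHERE ",
                where)
cat(query, "\n")

# With a working ODBC data source one would then run something like:
# library(RODBC)
# channel <- odbcConnect("NatCollections")
# res <- sqlQuery(channel, query)
```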
---
Jeff Newmiller                        The     .       .  Go Live...
DCN:        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

dthomas  wrote:

>Hi, 
>
>I am new to R (and rusty on SQL!) and I'm trying to extract records
>from a
>SQL server database. I have a table of patient records (LoadPUS) which
>have
>three code columns which i want to evaluate against a list of
>particular
>codes (CVD_ICD$ table). Given the size of the patient table I want to
>restrict the data I pull into R to the data I only want to analyse so I
>am
>using SQL to do this. The code i have is as follows:
>
>library(RODBC)
>channel<-odbcConnect("NatCollections")
>query<-"SELECT UNIQUE_ID, diag01 from LoadPUS 
>WHERE (diag01 IN (SELECT [ICD-10 Codes] From CVD_ICD10$)) OR (diag02 IN
>(SELECT [ICD-10 Codes] From CVD_ICD10$))
>OR (diag03 IN (SELECT [ICD-10 Codes] From CVD_ICD10$))"
>
>This returns duplicate values, I don't want to hardcode the values
>because
>it is quite a long list. Running the "IN" function just for "diag01"
>returns
>the correct number of records, however when combining with another "IN"
>function it doesn't return the correct number of records. Can you see
>where
>my SQL is incorrect or is there another way of doing this?
>
>Much appreciated,
>D
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Extracting-Data-from-SQL-Server-tp4281000p4281000.html
>Sent from the R help mailing list archive at Nabble.com.
>
