[R] Re: Superposing mean line to xyplot

2011-10-11 Thread Petr PIKAL
Hi
 
 Dear R-users,
 I'm using lattice package and function xyplot for the first time so
 you will excuse me for my inexperience. I'm facing quite a simple
 problem but I'm having troubles on how to solve it, I've read tons of
 old mails in the archives and looked at some slides from Deepayan
 Sarkar but still can not get the point.
 
 This is the context. I've got data on 9 microRNAs, each miRNA has been
 measured on three different arrays and on each array I have 4
 replicates for each miRNA, which sums up to a total of 108
 measurements. I suspect that measurements on the first array are
 systematically lower than the others, so I wanted to draw a line
 plot where each panel corresponds to a miRNA, and each line corresponds
 to one of the four replicates (that is: first replicate of miRNA A on
 array 1 must be connected to first replicate of miRNA A on array 2 and
 so on), so that for each panel there are 4 series of three points
 connected by a line/segment. I've done this easily with lattice doing
 this:
 
 library(lattice)
 array = rep(c("A","B","C"),each = 36) # array replicate
 spot =  rep(1:4,27) # miRNA replicate on each array
 miRNA = rep(rep(paste("miRNA",1:9,sep="."),each=4),3) # miRNA label
 exprs = rnorm(mean=2.8,n = 108) # intensity
 data = data.frame(miRNA,array,spot,exprs)
 xyplot(exprs ~ array|miRNA,data=data,type="b",groups=spot,xlab="Array",
        ylab="Intensity",col="black",lty=2:5,
        scales = list(y = list(relation = "free")))
 
 Now, I want to superpose to each panel an other series of three points
 connected by a line, where each point represent the mean of the four
 replicates of the miRNA on each array, a sort of mean line. I've tried
 using the following, but it's not working as expected:
 
 xyplot(exprs ~ array|miRNA,data=data,type="b",groups=spot,xlab="Array",
        ylab="Intensity",col="black",lty=2:5,
        scales = list(y = list(relation = "free")),
        panel = function(x,y,groups,subscripts){
          panel.xyplot(x,y,groups=groups,subscripts=subscripts)
          panel.superpose(x,y,panel.groups=panel.average,
                          groups=groups,subscripts=subscripts)
        })
 
 This is maybe a silly question and possibly there's a trivial way to
 do it, but I can not figure it out.

With some help I made function addLine

# based on Gabor Grothendieck's code suggestion
# adds straight lines to panels in lattice plots

addLine <- function(a=NULL, b=NULL, v = NULL, h = NULL, ..., once=FALSE) {
  tcL <- trellis.currentLayout()
  k <- 0
  # visit every cell of the layout; cells with value 0 hold no panel
  for(i in 1:nrow(tcL))
    for(j in 1:ncol(tcL))
      if (tcL[i,j] > 0) {
        k <- k+1
        trellis.focus("panel", j, i, highlight = FALSE)
        # once=TRUE: use the k-th value for the k-th panel;
        # otherwise draw the same line(s) in every panel
        if (once) panel.abline(a=a[k], b=b[k], v=v[k], h=h[k], ...) else
          panel.abline(a=a, b=b, v=v, h=h, ...)
        trellis.unfocus()
      }
}


addLine(h=tapply(data$exprs, data$miRNA, mean), once=TRUE)
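
As called above, addLine() draws one horizontal line per panel at the overall mean of each miRNA. If what is wanted is the line joining the per-array means inside each panel, lattice's panel.average() may be closer to the question. A minimal sketch, assuming the simulated data from the question; the data frame name dat, the seed and the red colour are my own choices and not from the thread:

```r
library(lattice)

set.seed(1)
dat <- data.frame(
  miRNA = rep(rep(paste("miRNA", 1:9, sep = "."), each = 4), 3),
  array = factor(rep(c("A", "B", "C"), each = 36)),
  spot  = rep(1:4, 27),
  exprs = rnorm(108, mean = 2.8)
)

p <- xyplot(exprs ~ array | miRNA, data = dat, groups = spot,
            xlab = "Array", ylab = "Intensity",
            scales = list(y = list(relation = "free")),
            panel = function(x, y, ...) {
              # one dashed line per replicate (spot); groups and
              # subscripts arrive through ...
              panel.superpose(x, y, ..., type = "b", col = "black", lty = 2:5)
              # solid line joining the mean of the replicates on each array
              panel.average(x, y, fun = mean, horizontal = FALSE,
                            col = "red", lwd = 2)
            })
print(p)

# the values the red line passes through, useful for checking the plot
array_means <- aggregate(exprs ~ miRNA + array, data = dat, FUN = mean)
```

The panel function first draws the four replicate series and then overlays the averaged series, so the mean line ends up on top.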

Regards
Petr


 
 Thanx for any help.
 
 niccolò
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[R] extra digits added to data

2011-10-11 Thread Mark Harrison
I am having a problem with extra digits being added to my data which I think
is a result of how I am converting my data.frame data to xts.

I see the same issue in R v2.13.1 and RStudio version 0.94.106.

I am loading historical foreign exchange data in via csv files or from a sql
server database.  In both cases there are no extra digits and the original
data looks like the following:

        Date   Open   High    Low  Close
1 2001-01-03 1.5021 1.5094 1.4883 1.4898
2 2001-01-04 1.4897 1.5037 1.4882 1.5020
3 2001-01-05 1.5020 1.5074 1.4952 1.5016
4 2001-01-08 1.5035 1.5104 1.4931 1.4964
5 2001-01-09 1.4964 1.4978 1.4873 1.4887
6 2001-01-10 1.4887 1.4943 1.4856 1.4866

So for 2001-01-03 the Open value is 1.5021 with only 4 digits after the
decimal place - i.e. .5021.

I then proceed to do the following in R to convert the 'british pound' data
above from data.frame to xts:

library(quantmod)
rownames(gbp) <- gbp$Date
head(gbp)

             Open   High    Low  Close
2001-01-03 1.5021 1.5094 1.4883 1.4898
2001-01-04 1.4897 1.5037 1.4882 1.5020
2001-01-05 1.5020 1.5074 1.4952 1.5016
2001-01-08 1.5035 1.5104 1.4931 1.4964
2001-01-09 1.4964 1.4978 1.4873 1.4887
2001-01-10 1.4887 1.4943 1.4856 1.4866

gbp <- as.xts(gbp[,2:5])
class(gbp)

[1] "xts" "zoo"

The data at this point looks ok until you look closer or output the data to
excel at which point you see the following for the 'Open' 2001-01-03:
1.5020084473

It is not just the above 'Open' or the first value but all the data points
contain the extra digits which I think is the original date data and/or row
numbers that are being tacked on.

My problem is the extra digits being added or whatever I am doing wrong in R
to cause the extra digits to be added.  I need 1.5021 to be 1.5021 and not
1.5020084473.

Thanks for the help.



Re: [R] SLOW split() function

2011-10-11 Thread Joshua Wiley
As another followup, given that you are doing numerous regression
models and (I presume) working with finance/stock data that is
strictly numeric (no need for special contrast coding, etc.), you can
substantially reduce the time spent estimating the coefficients.  A
simple way is to use lm.fit directly instead of lm.  For lm.fit, you
pass the y and x (design) matrices directly.  This skips a good deal
of overhead.  Here is one naive way; I imagine more speedups could be
gained by incorporating the intercept (1 vector) into d instead of
cbind()ing it.  The catch is that lm.fit requires matrices, not data
tables, so what you gain may be lost in having to do an extra
conversion.  In any case, here are the times on my system for the two
options (note I used N = 1000 * 100 because I am presently on a
glorified netbook).

 print(system.time(all.2b <- lapply(si, function(.indx) {
   coef(lm(y ~ x, data=d[.indx,])) })))
   user  system elapsed
  69.00    0.00   69.56

 print(system.time(all.2c <- lapply(si, function(.indx) {
   coef(lm.fit(y = d[.indx, y], x = cbind(1, d[.indx, x]))) })))
   user  system elapsed
  37.83    0.03   38.36

The column names for the coefficients will not be the same as from lm,
but the estimates should be identical.  While this is not recommended
in typical usage, in an application like regressions on rolling time
windows, etc., where you know the data are not changing, I think it
makes sense to bypass the clever "determine your data and best methods
to use" step and go straight to passing the design matrix.  Since you do
not need residuals, variances, etc., it may be possible to speed this
up even more, perhaps bypassing dqrls altogether.
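
To make the "estimates should be identical" point concrete, here is a small self-contained sketch on simulated data. The sample size, seed and variable names are arbitrary choices for illustration and are not the benchmark data from this thread:

```r
set.seed(42)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- 1 + 2 * d$x + rnorm(n)

# formula interface: lm() builds the model frame and design matrix itself
b_lm <- coef(lm(y ~ x, data = d))

# lm.fit(): we hand over the design matrix, including the intercept column
X <- cbind(1, d$x)
b_fit <- coef(lm.fit(x = X, y = d$y))

print(b_lm)    # named "(Intercept)" and "x"
print(b_fit)   # named "x1" and "x2", since X has no column names

# same numbers once the differing names are dropped
same <- isTRUE(all.equal(unname(b_lm), unname(b_fit)))
print(same)
```

Only the labels differ; the numerical work in both cases goes through the same QR-based least-squares fit.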

Cheers,

Josh

On Mon, Oct 10, 2011 at 9:56 PM, ivo welch ivo.we...@gmail.com wrote:
 thank you, everyone.  this was very helpful to my specific task and
 understanding.  for the benefit of future googlers, I thought I would
 post some experiments and results here.

 ultimately, I need to do a by() on an irregular matrix, and I now know
 how to speed up by() on a single-core, and then again on a multi-core
 machine.

 library(data.table)
 N <- 1000*1000
 d <- data.table(data.frame( key= as.integer(runif(N, min=1,
 max=N/10)), x=rnorm(N), y=rnorm(N) ))  # irregular
 setkey(d, key); gc() ## sort and force a garbage collection


 cat("N=", N, ".  Size of d=", object.size(d)/1024/1024, "MB\n")

 cat("\nStandard by() Function:\n")
 print(system.time( all.1 <- by( d, d$key, function(d) coef(lm(y ~ x,
 data=d)))))


 cat("\n\nPreSplit Function [aka Jim H]\n\t(a) Splitting Operation:\n")
 print(system.time(si <- split(seq(nrow(d)), d$key)))
 cat("\n\t(b) Regressions:\n")
 print(system.time(all.2 <- lapply(si, function(.indx) {
 coef(lm(d$y[.indx] ~ d$x[.indx])) })))
 print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~
 x, data=d[.indx,])) })))

 cat("\n\nNaive Split Data Frame\n\t(a) Splitting Operation:\n")
 print(system.time(ds <- split(d, d$key)))
 cat("\n\t(b) Regressions:\n")
 print(system.time(all.3a <- lapply(ds, function(ds) { coef(lm(ds$y ~ ds$x))
 })))
 print(system.time(all.3b <- lapply(ds, function(ds) { coef(lm(y ~ x,
 data=ds)) })))

 the first and the last ways (all.1 and all.3) are naive ways of
 doing this, and take about 400-500 seconds on a Mac Air, core i5.
 Jim's suggestion (all.2) cuts this roughly into half by speeding up
 the split to take almost no time.

 and now,

 library(multicore)
 print(system.time(all.4 <- mclapply(si, function(.indx) { coef(lm(y ~
 x, data=d[.indx,])) })))

 on my dual-core (quad-thread) i5, all four pseudo cores become busy,
 and the time roughly halves again from 230 seconds to 120 seconds.


 maybe the by() function should use Jim's approach, and multicore
 should provide mcby().  of course, knowing how to do this myself fast
 now by hand, this is not so important for me.  but it may help some
 other novices.

 thanks again everybody.

 regards,

 /iaw

 
 Ivo Welch (ivo.we...@gmail.com)




 On Mon, Oct 10, 2011 at 9:31 PM, William Dunlap wdun...@tibco.com wrote:
 The following avoids the overhead of data.frame methods
 (and assumes the data.frame doesn't include matrices
 or other data.frames) and relies on split(vector,factor)
 quickly splitting a vector into a list of vectors.
 For a 10^6 row by 10 column data.frame split in 10^5
 groups this took 14.1 seconds while split took 658.7 s.
 Both returned the same thing.

 Perhaps something based on this idea would help your
 parallelized by().

 mysplit.data.frame <-
 function (x, f, drop = FALSE, ...)
 {
     f <- as.factor(f)
     # split every column as a plain vector
     tmp <- lapply(x, function(xi) split(xi, f, drop = drop, ...))
     rn <- split(rownames(x), f, drop = drop, ...)
     # regroup the pieces by factor level instead of by column
     tmp <- unlist(unname(tmp), recursive = FALSE)
     tmp <- split(tmp, factor(names(tmp), levels = unique(names(tmp))))
     # turn each group back into a data.frame by hand
     tmp <- lapply(setNames(seq_along(tmp), names(tmp)), function(i) {
         t <- tmp[[i]]
         names(t) <- names(x)
         attr(t, "row.names") <- rn[[i]]
         class(t) <- "data.frame"
         t
     })
     tmp
 }
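
The idea behind the function above (split each column as a plain vector, then reassemble per group) can be tried on a toy data frame. This is a simplified sketch and not Bill's function itself: it rebuilds the groups with as.data.frame(), which is slower than setting the attributes by hand, and the data and names are made up for illustration:

```r
x <- data.frame(id = c(1, 2, 1, 2, 3, 3),
                a  = 1:6,
                b  = letters[1:6],
                stringsAsFactors = FALSE)
f <- as.factor(x$id)

# split every column as an ordinary vector; no data.frame methods involved
cols <- lapply(x, function(col) split(col, f))

# group g is made of the g-th piece of every column
by_group <- lapply(setNames(levels(f), levels(f)), function(g) {
  as.data.frame(lapply(cols, function(pieces) pieces[[g]]),
                stringsAsFactors = FALSE)
})

# compare column contents with what split() on the data frame returns
reference <- split(x, f)
same <- identical(lapply(by_group, as.list), lapply(reference, as.list))
print(same)
```

The comparison looks at column contents only; row names are handled differently in this sketch than in split() or in the function above.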

 Bill 

[R] Labels in ICLUST

2011-10-11 Thread Steve Powell
Dear all,
I can't get the labels slot in ICLUST to accept a character vector.
library(psych)
test.data <- Harman74.cor$cov
ic.out <- ICLUST(test.data, nclusters=4,
                 labels=letters[1:ncol(test.data)]) ## Error in !labels : invalid argument type
ic.out <- ICLUST(test.data, nclusters=4, labels=1:ncol(test.data)) ## OK

Any ideas?



[R] binding all elements of list (character vectors) to a matrix as rows

2011-10-11 Thread Marion Wenty
dear r-users,

i have got a problem which i am trying to solve:

i have got the following commands:

Mymatrix <- matrix(1:9,ncol=3)
Z <- list(V1=c("a","",""),V2=c("b","",""),V3=c("c","",""),V4=c("d","",""))
Mymatrix <- rbind(Mymatrix,Z[[1]],Z[[2]],Z[[3]],Z[[4]])

now this is working, but i would like to substitute

Z[[1]],Z[[2]],Z[[3]],Z[[4]]

for a command with which i could also use another list with a different
number of elements, e.g. 5 or 6 elements.

does anyone know the solution to this problem?
thank you very much in advance!

marion



Re: [R] binding all elements of list (character vectors) to a matrix as rows

2011-10-11 Thread Gerrit Eichner

Marion,

try

rbind( Mymatrix, do.call( rbind, Z))
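
A quick check, using the objects from the question (with the string quotes that the archive stripped put back), that this gives the same matrix as listing the elements by hand and does not depend on the length of Z:

```r
Mymatrix <- matrix(1:9, ncol = 3)
Z <- list(V1 = c("a", "", ""), V2 = c("b", "", ""),
          V3 = c("c", "", ""), V4 = c("d", "", ""))

# do.call() passes every element of Z as a separate argument to rbind()
stacked <- do.call(rbind, Z)
result  <- rbind(Mymatrix, stacked)

# the hand-written version from the question, for comparison
by_hand <- rbind(Mymatrix, Z[[1]], Z[[2]], Z[[3]], Z[[4]])

dim(result)
identical(unname(result), unname(by_hand))
```

The row names differ (do.call keeps V1 to V4), which is why the comparison drops dimnames first. Note that the result is a character matrix, because the numbers are coerced when bound to strings.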

Hth  --  Gerrit




[R] Package/Function for Blending Images

2011-10-11 Thread Kay Cichini
Hi there,

Does anyone know of a package/function for blending two pictures or for
adding transparency?

Thanks in advance,
KC

-

Kay Cichini
Postgraduate student
Institute of Botany
Univ. of Innsbruck




[R] warning with cut2 function

2011-10-11 Thread taby gathoni
Dear r user,

Please find attached a sample of the dataset I am using to create a
cross table and eventually plot a histogram from the output.

I am using the cut2 function to create bins, about 7 of them, with the
following code after reading the data:
cluster <- cut2(cross_val$value, g=7)

I get the warning:
Warning message:
In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf



Additionally, the CrossTable function shows 6 bins instead of 7:
cross1 <- CrossTable(cross_val$factor,
cluster,prop.chisq=FALSE,prop.r=FALSE,prop.t=FALSE)


Please help me get my 7 bins.

How can I plot the output of the cross table as a histogram of factor rate vs
bins?

Any help will be highly appreciated.

Kind regards,
Taby


 


An idea not coupled with action will never get any bigger than the brain cell 
it occupied.
Arnold Glasgow
..
Attempt something large enough that failure is guaranteed…unless God steps in!


Re: [R] binding all elements of list (character vectors) to a matrix as rows

2011-10-11 Thread David Winsemius


On Oct 11, 2011, at 2:47 AM, Marion Wenty wrote:


dear r-users,

i have got a problem which i am trying to solve:

i have got the following commands:

Mymatrix <- matrix(1:9,ncol=3)
Z <- list(V1=c("a","",""),V2=c("b","",""),V3=c("c","",""),V4=c("d","",""))


rbind(Mymatrix, t(as.data.frame(Z)))

The next method could be used if you had more lists:

do.call(rbind, list(Mymatrix, t(as.data.frame(Z))))



Mymatrix <- rbind(Mymatrix,Z[[1]],Z[[2]],Z[[3]],Z[[4]])

now this is working, but i would like to substitute

Z[[1]],Z[[2]],Z[[3]],Z[[4]]

for a command with which i could also use another list with a different
number of elements, e.g. 5 or 6 elements.

--

David Winsemius



[R] Vegan: Anova.CCA accessing original data using option by=margin

2011-10-11 Thread Steve Pawson
Hello,

I am attempting to use the anova.cca function with the by="margin" option.
The process works fine using the by="terms" option, and I note in the vegan
manual that Jari suggests that an error may occur if the anova does not have
access to the data on the original constraints.

This is the error that I get:

Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent

My question is, does anyone know if this error relates to what Jari is
referring to (or is it a different problem), and if it is, how do I link the
anova to the original constraints?

Many thanks for any help provided.

Regards

Steve



[R] How to add a double quote to a string

2011-10-11 Thread arunkumar1111
Hi 

I want to add a double quote to a string

eg


Expected output = DROP TABLE IF EXISTS "abc"

My code

tab=c("abc")
query = paste("DROP TABLE IF EXISTS ",tab,sep="")

Please help me to solve this problem




Re: [R] How to add a double quote to a string

2011-10-11 Thread Mario Valle



On 11-Oct-11 09:23, arunkumar wrote:

Hi

I want to add a double quote to a string

eg


Expected output = DROP TABLE IF EXISTS "abc"

My code

tab=c("abc")
query = paste("DROP TABLE IF EXISTS ",tab,sep="")


query = paste("DROP TABLE IF EXISTS \"",tab,"\"", sep="")

or
query = paste('DROP TABLE IF EXISTS "',tab,'"', sep="")

and tab="abc". no need for c()
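
A short runnable version of the two suggestions, plus an sprintf() variant that is my own addition and not part of the reply. It also shows why the result can look wrong at the console: print() displays the embedded quotes escaped, while cat() shows the actual characters.

```r
tab <- "abc"

# escape the double quote inside a double-quoted string
q1 <- paste("DROP TABLE IF EXISTS \"", tab, "\"", sep = "")

# or use single quotes as the outer delimiters
q2 <- paste('DROP TABLE IF EXISTS "', tab, '"', sep = "")

# sprintf() variant (not from the thread)
q3 <- sprintf('DROP TABLE IF EXISTS "%s"', tab)

print(q1)        # shown with \" because print() escapes special characters
cat(q1, "\n")    # shows the string as it will be sent: DROP TABLE IF EXISTS "abc"
identical(q1, q2) && identical(q1, q3)
```

The backslashes are only part of how R displays the string; they are not stored in it.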
 
Ciao!

mario



Please help me to solve this problem




--
Ing. Mario Valle
Data Analysis and Visualization Group| http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82



Re: [R] Type of Graph to use

2011-10-11 Thread Jim Lemon

On 10/10/2011 09:49 PM, Jurgens de Bruin wrote:

Hi,

Please advise on what type of graph can be used to display the following
data set.

I have the following:

Name  Class
a     Class1
a     Class4
b     Class2
b     Class1
d     Class3
d     Class5
e     Class4
e     Class2

So each entry in Name can belong to more than one class. I want to represent
the data so as to see where overlaps occur, that is, which names are in the same
class and also which names are unique to a class. I thought a Venn
diagram would work, but this can only present numerical values for each
class; I would like each name to be represented by a dot or *.


Hi Jurgens,
Have a look at the intersectDiagram function in the plotrix package. 
This only plots the number of cases in each intersection, but it would 
be possible to plot dots or asterisks or even the lower case letters as 
long as there are not too many cases.
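
Before plotting, it may help to look at the overlaps as a plain name-by-class membership table. The sketch below uses base R only and is not the plotrix function; the object names are made up and the data are typed in from the question:

```r
memb <- data.frame(
  Name  = c("a", "a", "b", "b", "d", "d", "e", "e"),
  Class = c("Class1", "Class4", "Class2", "Class1",
            "Class3", "Class5", "Class4", "Class2"),
  stringsAsFactors = FALSE
)

# rows are names, columns are classes, 1 marks membership
tab <- table(memb$Name, memb$Class)
print(tab)

# which names share each class
shared <- lapply(split(memb$Name, memb$Class), sort)

# classes that contain a single name only
unique_classes <- names(shared)[lengths(shared) == 1]
print(unique_classes)
```

A table like this could also serve as the basis for drawing one symbol per name in each class.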


Jim



[R] Perform 20 x one-way anova in 1 go

2011-10-11 Thread Joshua Wong
Hi Guys,

I have about 20 continuous predictors and I want to do a one-way ANOVA to check
the significance of each variable against the dependent variable.
Apart from running the ANOVA 20 times, is there a faster way?

Thanks,
Joshua


Re: [R] How to test if two C statistics are significantly different?

2011-10-11 Thread Eik Vettorazzi
Hi Yujie,
there is still a lot of work in progress, I think. As
http://faculty.washington.edu/heagerty/Software/SurvROC/RisksetROC/risksetROCdiscuss.pdf

states: "[...] for inference and variance estimation, we now suggest
bootstrapping [...]".
I recently caught a glimpse of roc.test from the pROC package; they
implemented, amongst others, a bootstrap algorithm - maybe this is a
start for your own work?

Hth.

On 10.10.2011 21:35, Yujie Wang wrote:
 Hey all,
 
 In order to test if a marker is a risk factor, I built two models (using the Cox
 proportional hazards model). One model included this marker, and the other
 did not.
 
 Then I used the R package risksetROC to test how much predictive value the
 marker added to the model. I got two C statistics by passing the linear
 predictors of the two models to this package.
 
 The question is: how do I test whether the two C statistics are significantly different?
 
 Your help will be greatly appreciated!
 
 Yujie
 


-- 
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790

--
Mandatory disclosures under the German Act on Electronic Commercial Registers,
Registers of Cooperatives and the Company Register (EHUG):

Universitätsklinikum Hamburg-Eppendorf; corporation under public law;
place of jurisdiction: Hamburg

Members of the board: Prof. Dr. Guido Sauter (deputy for the chairman), Dr.
Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus



Re: [R] extra digits added to data

2011-10-11 Thread Jim Holtman
FAQ 7.31
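
For readers without the FAQ at hand: entry 7.31 explains that most decimal fractions cannot be stored exactly in binary floating point, so extra digits show up once more digits are requested than R prints by default. A small sketch of that effect and of rounding for output follows; whether it fully accounts for a difference as large as the one reported above is not settled in this thread.

```r
x <- 1.5021

print(x)                 # default display, 7 significant digits
print(x, digits = 17)    # more digits reveal the stored binary approximation
sprintf("%.20f", x)

# exact comparison of computed doubles can fail ...
exact_equal <- (0.1 + 0.2) == 0.3
# ... so compare with a tolerance instead
close_enough <- isTRUE(all.equal(0.1 + 0.2, 0.3))

# control the digits when writing values out, e.g. for a spreadsheet
out <- format(round(x, 4), nsmall = 4)
print(out)
```

Rounding or formatting at export time changes only what is written out; the stored value keeps its full double precision.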



[R] need help on read.spss

2011-10-11 Thread Smart Guy
Hi,
  I have a question about one of the parameters of 'read.spss()' from the
'foreign' package.
Here is the syntax:

read.spss ( file,
use.value.labels = TRUE,
to.data.frame = FALSE,
max.value.labels = Inf,
trim.factor.names = FALSE,
trim_values = TRUE,
reencode = NA,
use.missings = to.data.frame )


In the above syntax, when I pass 'to.data.frame = FALSE' it gives me the missing
values from the SPSS file (which I try to read using read.spss()). But when I
pass 'to.data.frame = TRUE' it does not give me the missing values, and I
need to get the missing values.

According to the read.spss() documentation:

to.data.frame :  return a data frame?

I am curious to know: if we pass 'to.data.frame = TRUE', is it going to
cause some issue or affect something? I didn't understand the read.spss()
documentation correctly.
Please explain.

Thanks in Advance

-- 
SG



[R] Is it possible to generate an ExpressionSet object that contain duplicate row names?

2011-10-11 Thread nqueralt
Dear all,

 

I am facing a problem that comes up when creating an ExpressionSet object
from an expression matrix that has duplicate row names:

 try(myExpressionSet <- new("ExpressionSet", exprs = myexprsunique, phenoData
 = myphenoData, annotation = myannotation, check.names=FALSE))
Error in data.frame(numeric(n), row.names = nms) :
  duplicate row.names: blu

Is there a way to create this ExpressionSet object even though the
expression matrix contains duplicate row names? Many thanks in advance.

Kind regards,

Núria

 




Re: [R] Fwd: WHO Anthro growth curve macros and R

2011-10-11 Thread Gustaf Rydevik
On Tue, Oct 11, 2011 at 1:21 AM, David Winsemius dwinsem...@comcast.net wrote:


 On Oct 10, 2011, at 4:48 PM, Gustaf Rydevik wrote:

  Hi all,
 some years ago, I sent a question to the mailing list regarding the WHO
 anthro macros. Since I've now received three mails asking how I solved it,
 I
 thought I'd cc R-help in for future reference. Attaching a zip file
 with  the relevant code parts that
 I used that I'm not sure gets through (if anyone has recommendations on
 how
 to manage such files for the list, I'd be grateful.
  What I ended up doing was importing the data in SPSS format, and
 adapting the Splus function igrowup.standard slightly.
 igrowup.standard2.R is the adapted function, while the ssc files are
 original splus functions. Let me know if anyone gets problems in figuring
 out how to use the files.


 The only files that reach the readership are .pdf and .txt files. I do not
 know how carefully these get inspected, so it is possible that a zip file
 named something.txt might make it through.


  best regards,
 Gustaf

  \
 David Winsemius, MD
 West Hartford, CT



Hi all again,

I noticed (and suspected) that, as David said, zip files do not get
through.
Here's a Google Docs link for the Anthro example.zip file that won't
change in the foreseeable future:

https://docs.google.com/viewer?a=vpid=explorerchrome=truesrcid=0B77NeAmIHMaQMjJkZTQ0OTQtNTRkYy00ZWMzLThhNTUtMzg1ZDY5MjljOGQxhl=en_US

(if the link is problematic due to its length, try
http://tinyurl.com/625vod6 instead)
The most interesting files are igrowup.standard2.R (which is a modified
version of igrowup.standard) and anthro-example.R.
Hope this comes in useful for someone in the future!

Regards,
Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +44(0)704 253 760 42
address:St John's hill 18/5  EH8 9UQ Edinburgh, UK
skype:gustaf_rydevik



Re: [R] need help on read.spss

2011-10-11 Thread Eik Vettorazzi
Hi,
if you specify to.data.frame=TRUE, then use.missings is implicitly set to
TRUE as well, which causes different results for (user-defined) missing values.
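
The coupling comes from the default value of the argument, which can be inspected directly. The sketch below only looks at the function's formals; the file name in the commented-out call is hypothetical, and setting use.missings = FALSE explicitly is my reading of how to keep the user-defined missing values, not something stated in this thread:

```r
library(foreign)

# the default for use.missings is the unevaluated expression `to.data.frame`,
# so it follows whatever is passed for to.data.frame
default_expr <- formals(read.spss)$use.missings
print(default_expr)

# to get a data frame but keep user-defined missing values as ordinary values,
# one can override the default explicitly (hypothetical file name):
# dat <- read.spss("survey.sav", to.data.frame = TRUE, use.missings = FALSE)
```

With the override, codes that SPSS declares as missing should come through as regular values instead of NA.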

cheers.

 


-- 
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790



Re: [R] pmml for random forest rules

2011-10-11 Thread Graham Williams
Hi Patrick,

Thanks for the detailed report. See comments below.

On 11 October 2011 05:57, Patrick McCann patmmcc...@gmail.com wrote:
[...]
 I am having some trouble using R 2.13.1 for generating a pmml object
 of class c('randomForest.formula', 'randomForest')
[...]
 Random Forest (and randomSurvivalForest)
 — randomForest (Breiman and Cutler. R
 port by A. Liaw and M. Wiener, 2009) and randomSurvivalForest
 (Ishwaran and Kogalur ,
 2009): PMML export of a randomSurvivalForest rsf object. This
 function gives the user
 the ability to export PMML containing the geometry of a forest.
[...]
 Error in UseMethod("pmml") :
  no applicable method for 'pmml' applied to an object of class
 c('randomForest.formula', 'randomForest')

Sorry for the ambiguity there. It tries to say in the paper that pmml supports
PMML export of a randomSurvivalForest rsf object. It mentions
randomForest but does not say it can export randomForest. There is
some experimental code for pmml.randomForest but it has not yet
been completed.
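
The "no applicable method" message quoted above is what R's S3 dispatch reports when a generic has no method for any class of its argument. A minimal base-R sketch of that situation, using a made-up generic export_model and made-up empty objects (this is not code from the pmml package):

```r
# Hypothetical generic with a method for class "rsf" only.
export_model <- function(model, ...) UseMethod("export_model")
export_model.rsf <- function(model, ...) "exported rsf"

# Empty stand-in objects carrying the relevant class attributes.
rsf_obj <- structure(list(), class = "rsf")
rf_obj  <- structure(list(), class = c("randomForest.formula", "randomForest"))

# Dispatch succeeds for the class that has a method ...
ok <- export_model(rsf_obj)

# ... and fails for the classes that have none; capture the error message.
msg <- tryCatch(export_model(rf_obj), error = function(e) conditionMessage(e))
print(ok)
print(msg)
```

For a real package, methods("pmml") lists the classes the generic can handle in the installed version.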

 Also, if I run these lines of code
 data(Adult)
 ## Mine association rules.
 rules <- apriori(Adult,
                  parameter = list(supp = 0.5, conf = 0.9,
                                   target = "rules"))
 pmml(rules)


 I get this error:
 pmml(rules)
 Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function "size", for
 signature "itemMatrix"
[...]
   standardGeneric("size"), <environment>)
 3: size(is.unique)
 2: pmml.rules(rules)
 1: pmml(rules)

That's odd. Not quite sure yet what is causing that. On my system it
works just fine:

 library(pmml)
 library(arules)
 data(Adult)
 rules <- apriori(Adult,
                  parameter = list(supp = 0.5, conf = 0.9,
                                   target = "rules"))
 pmml(rules)
<PMML version="3.2" ...>
 <Header copyright="Copyright (c) 2011 gjw"...>
  <Extension name="user" value="gjw" extender="Rattle/PMML"/>
  <Application name="Rattle/PMML" version="1.2.27"/>
  <Timestamp>2011-10-11 21:50:40</Timestamp>
 </Header>
[...]

My system:

 rattleInfo()
Rattle: version 2.6.11 cran 2.6.11
R: version 2.13.2 (2011-09-30) (Revision 57111)

Sysname: Linux
Release: 2.6.38-12-generic
Version: #51-Ubuntu SMP Wed Sep 28 14:27:32 UTC 2011
[...]
pmml: version 1.2.27
[...]
arules: version 1.0-6

I'm using R 2.13.2 - could that be an issue - you have 2.13.1?

Regards,
Graham



Re: [R] Import/convert PMML to R model

2011-10-11 Thread Graham Williams
Not possible (at least with the pmml package) at this time. There is
some experimental code for reading PMML (and converting into
standalone executable C code) but importing into an R object needs
quite a bit of work to re-create the kmeans object before it would be
worth releasing.

Regards,
Graham




On 1 June 2011 18:40, Raji raji.sanka...@gmail.com wrote:
 Hi R-helpers,

  Can you please let me know if it is possible to import a PMML in R? If yes,
 can you give me the command to do the same? If not, can you tell me the
 reason why?

 Many thanks,
 Raji

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Import-convert-PMML-to-R-model-tp3332772p3565260.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Is it possible to generate an ExpressionSet object that contain duplicate row names?

2011-10-11 Thread Ben Bolker
 nqueralt at clinic.ub.es writes:

 I am facing the problem that comes up when an ExpressionSet 
 object is intended to be created parsing a matrix
 expression data with duplicate row names:

  You might have more luck with this question on the BioConductor mailing
list ...



Re: [R] variable scope for deltavar function from emdbook

2011-10-11 Thread Ben Bolker
adad adad at gmx.at writes:

 Working example:
 --
 library(emdbook)
 
 fn <- function()
 {
 browser()
 y <- 2
 print(deltavar(y*b2, meanval=c(b2=3), Sigma=1) )
 }
 
 x <- 2
 print(deltavar(x*b1, meanval=c(b1=3), Sigma=1) )
 y <- 3
 
 fn()
 
 
 running this returns 4 for the first function call, which is fine.
 
 For the call of deltavar in fn(), I get 9, i.e. the function uses the global y <- 3
 instead of the local y <- 2. If the global assignment to y is commented out, deltavar returns an error.
 
 So why is the function not using the local variable and how do I make it 
 use it?

  The real problem is that I (the author) don't understand scoping in R, and
how to manipulate it, as well as I'd like to. I will work on this (any
tips from the R-helpers appreciated).  In the meantime, you could
try out one of the other available delta-method calculators, such
as the one in the msm package (library(sos); findFn("{delta method}")).
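
One way a global y can win over a local one is when an expression is evaluated in the global environment rather than in the caller's frame. The sketch below only illustrates that general mechanism with two hypothetical helpers, eval_global and eval_caller; it makes no claim about how deltavar is actually implemented.

```r
# Hypothetical helper: evaluates the unevaluated expression in the global environment.
eval_global <- function(expr) eval(substitute(expr), envir = globalenv())

# Hypothetical helper: evaluates the expression in the frame of whoever called it.
eval_caller <- function(expr) eval(substitute(expr), envir = parent.frame())

y <- 3  # global value

fn <- function() {
  y <- 2  # local value
  c(global = eval_global(y * 3), caller = eval_caller(y * 3))
}

res <- fn()
print(res)
```

Run at top level, the first element is computed from the global y and the second from the local y inside fn(), which mirrors the 9-versus-expected behaviour described in the question.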

  More text to try to make gmane happy

  Ben Bolker



Re: [R] Vegan: Anova.CCA accessing original data using option by=margin

2011-10-11 Thread Gavin Simpson
On Mon, 2011-10-10 at 23:51 -0700, Steve Pawson wrote:
 Hello,
 
 I am attempting to use the anova.cca function with the by="margin" option.
 The process works fine using the by="terms" option and I note in the Vegan
 manual that Jari suggests that an error may occur if the anova does not have
 access to the data on the original constraints.
 
 This is the error that I get:
 
 Error in dimnames(x) <- dn : 
   length of 'dimnames' [2] not equal to array extent
 
 My question is, does anyone know if this error relates to what Jari is
 referring to (or is it a different problem), and if it is, how do I link the
 anova to the original constraints?

It is almost impossible to answer that without a lot more information.
For starters, what does traceback() say when run immediately *after* you
get the error?

G

 Many thanks for any help provided.
 
 Regards
 
 Steve
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Vegan-Anova-CCA-accessing-original-data-using-option-by-margin-tp3893005p3893005.html
 Sent from the R help mailing list archive at Nabble.com.
 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] An issue regarding to gradient

2011-10-11 Thread luke1022
The following code gives me a curve plot:
cutoff <- seq(1,7,0.25)
Sensitivity <- 1 - pnorm(cutoff, 5, 0.8)
Specificity <- pnorm(cutoff, 3, 1.2)
plot(1-Specificity,Sensitivity,main = "ROC curve",type = "o")

How do I get the gradient at a particular point on that curve?
Do any packages/functions allow me to do that?
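
For this particular curve no extra package is needed, because both coordinates are smooth functions of the cutoff. A sketch under that assumption: by the chain rule the slope dSensitivity/d(1-Specificity) at a cutoff is the ratio of the two normal densities, and the slopes of the plotted segments follow from diff(). The names roc_slope, fpr and seg_slope are introduced here for illustration only.

```r
cutoff <- seq(1, 7, 0.25)
sens <- 1 - pnorm(cutoff, 5, 0.8)   # Sensitivity
fpr  <- 1 - pnorm(cutoff, 3, 1.2)   # 1 - Specificity

# Analytic slope at cutoff c0:
# d(sens)/dc = -dnorm(c0, 5, 0.8) and d(fpr)/dc = -dnorm(c0, 3, 1.2),
# so the minus signs cancel in the ratio.
roc_slope <- function(c0) dnorm(c0, 5, 0.8) / dnorm(c0, 3, 1.2)

# Slopes of the straight segments between consecutive plotted points.
seg_slope <- diff(sens) / diff(fpr)

# Numerical check of the analytic slope by a central finite difference at cutoff 4.
h <- 1e-5
num_slope <- (pnorm(4 - h, 5, 0.8) - pnorm(4 + h, 5, 0.8)) /
             (pnorm(4 - h, 3, 1.2) - pnorm(4 + h, 3, 1.2))

print(c(analytic = roc_slope(4), numeric = num_slope))
```

For an empirical ROC curve with no known distributions, only the segment slopes (or a smoothed version of them) are available.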

Thank you

--
View this message in context: 
http://r.789695.n4.nabble.com/An-issue-regarding-to-gradient-tp3893401p3893401.html
Sent from the R help mailing list archive at Nabble.com.



[R] Parallel processing for R loop

2011-10-11 Thread Sandeep Patil
I have an R script that consists of a for loop
that repeats a process for many different files.


I want to run this in parallel on a machine with
multiple cores. Is there a package for it?
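
One option is the parallel package, which ships with R from version 2.14.0 on. A minimal sketch, assuming each file can be processed independently; process_file is a hypothetical stand-in for the body of the loop, and the temporary CSV files exist only to make the example self-contained:

```r
library(parallel)

# Hypothetical per-file task: read a CSV file and return its number of rows.
process_file <- function(path) {
  nrow(read.csv(path))
}

# Create a few small example files (10, 20, 30 and 40 rows).
files <- vapply(1:4, function(i) {
  f <- tempfile(fileext = ".csv")
  write.csv(data.frame(x = seq_len(i * 10)), f, row.names = FALSE)
  f
}, character(1))

# Start two worker processes, apply the task to every file, then shut down.
cl <- makeCluster(2)
results <- parLapply(cl, files, process_file)
stopCluster(cl)

print(unlist(results))
```

The for loop becomes a function applied to a vector of file names; anything the function needs beyond its argument has to be available on the workers (see clusterExport() and clusterEvalQ()).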

Thanks
-- 
Sandeep R Patil



[R] How to run Rcmdr with OS 10.4?

2011-10-11 Thread rocher . f

I've installed the Rcmdr package and it doesn't run.
Here is the error message:


R version 2.9.2 (2009-08-24)
[R.app GUI 1.29 (5464) powerpc-apple-darwin8.11.1]

[Workspace restored from /Users/jfc/Documents/TravauxFR/.RData]

Loading required package: tcltk
Loading Tcl/Tk interface ... done
Loading required package: car
Error in structure(.External("dotTclObjv", objv, PACKAGE = "tcltk"), class = 
"tclObj") : 
  [tcl] invalid command name "font".

In addition: Warning message:
In fun(...) : couldn't connect to display ":0"
Error : .onAttach failed in 'attachNamespace'
Error: package/namespace load failed for 'Rcmdr'


I've tried another version, 2.10.2, with Rcmdr and its dependencies, and it returns 
the same warnings!
I feel that something is missing on my computer. X11 works and I've installed Tcl/Tk 
8.5.5-x11.

What else to do?



[R] help to ... import the data from Excel

2011-10-11 Thread Sarah_R_edu
Hi everyone,

I have a problem importing data from Excel into R.

I have done the following:

1. install.packages("xlsReadWrite")

2. library(xlsReadWrite)

3. z <- read.xls(ReadXls,LTS,colNames=FALSE,sheet,type,form,rowNames=FALSE)

and I got this result:

Error in read.xls(ReadXls, LTS, colNames = FALSE, rowNames = FALSE) :

 object 'LTS' not found

I also tried data(LTS, package = "xlsReadWrite")

and got: Warning message: In data(LTS, package = "xlsReadWrite") :
data set 'LTS' not found

How do I get LTS into the list of objects?

Note: LTS is the name of my data in Excel.

I also tried another way, as follows:

mydata <- read.table("C:\Users\user\Desktop\LTS.xls")

but it's not working. How can I do it?
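
Two things are probably going wrong in the read.table() attempt: R treats the backslash as an escape character inside strings, and read.table() expects plain text rather than a binary .xls file. A minimal sketch of a workaround, assuming the sheet has first been saved from Excel as CSV; the path and the small temporary file below are placeholders for illustration:

```r
# Backslashes must be doubled in R strings, or replaced by forward slashes.
path_a <- "C:\\Users\\user\\Desktop\\LTS.csv"
path_b <- "C:/Users/user/Desktop/LTS.csv"

# Both spellings describe the same Windows path.
same_path <- identical(gsub("\\", "/", path_a, fixed = TRUE), path_b)

# Stand-in for a sheet exported from Excel with "Save as CSV".
csv <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,3.5", "2,4.1"), csv)

# read.csv() reads the text file into a data frame.
mydata <- read.csv(csv, header = TRUE)
print(mydata)
```

The same idea explains the "object 'LTS' not found" error: a file name has to be passed as a quoted string, otherwise R looks for an object with that name.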

My regards

--
View this message in context: 
http://r.789695.n4.nabble.com/help-to-import-the-data-from-Excel-tp3893382p3893382.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to map current Europe?

2011-10-11 Thread fub2011
hi,

see here
http://r.789695.n4.nabble.com/Create-a-map-td3689877.html#a3893581

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-map-current-Europe-tp3715709p3893588.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Create a map

2011-10-11 Thread fub2011
hi,

I've just figured out how to plot (more) up-to-date political borders in R
in an easy way:

You can get the outline file Borders_MWDB3 from the NASA Panoply site
http://www.giss.nasa.gov/tools/panoply/overlays/

Then read it into R, get the indices where the outline jumps across the
-180/180 E line, remove the first point of the part on the other side, and
finally plot the borders as lines:

bord <- read.table("Borders_MWDB3.cno",sep=",",na.strings="",fill=T)
around <- which(abs(diff(bord[,1]))>180)+1
bord[around,] <- NA
plot(bord,type="l",xlab="degrees east",ylab="degrees north")

hope this works.
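
The trick in the code above is that lines drawn with type="l" are broken wherever a row is NA. A small self-contained sketch of just that step, using made-up coordinates instead of the Borders_MWDB3 file:

```r
# Made-up outline that crosses the -180/180 meridian between rows 3 and 4.
bord <- data.frame(lon = c(170, 175, 179, -179, -175, -170),
                   lat = c(10, 11, 12, 13, 14, 15))

# Index of the first point after each jump of more than 180 degrees in longitude.
around <- which(abs(diff(bord[, 1])) > 180) + 1

# Replace that point by NA so that no line is drawn across the whole map.
bord[around, ] <- NA

plot(bord, type = "l", xlab = "degrees east", ylab = "degrees north")
```

Note that this drops one real point per jump, which is usually invisible at map scale.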

--
View this message in context: 
http://r.789695.n4.nabble.com/Create-a-map-tp3689877p3893581.html
Sent from the R help mailing list archive at Nabble.com.



[R] Mean or mode imputation fro missing values

2011-10-11 Thread francesca casalino
Dear R experts,

I have a large database made up of mixed data types (numeric,
character, factor, ordinal factor) with missing values, and I am
looking for a package that would help me impute the missing values
using  either the mean if numerical or the mode if character/factor.

I could maybe use replacement like this:
df$var[is.na(df$var)] <- mean(df$var, na.rm = TRUE)
And go through all the many different variables of the datasets using
mean or mode for each, but I was wondering if there was a faster way,
or if a package existed to automate this (by doing 'mode' if it is a
factor or character or 'mean' if it is numeric)?

I have tried the package dprep because I wanted to use the function
ce.mimp, but unfortunately it is not available anymore.

Thank you for your help,
-francy



Re: [R] map question

2011-10-11 Thread fub2011
maybe this helps:
http://r.789695.n4.nabble.com/Create-a-map-td3689877.html#a3893581

--
View this message in context: 
http://r.789695.n4.nabble.com/map-question-tp795873p3893593.html
Sent from the R help mailing list archive at Nabble.com.




[R] rpanel

2011-10-11 Thread Pascal A. Niklaus

Dear all,

I am struggling to align textentry fields in a Tcl/Tk widget. In the 
example below, I'd like to have the boxes aligned.


library(rpanel)
panel <- rp.control(title="title",size=c(100,100))

rp.textentry(panel,var=a,labels="Variable A",
 initval=1,pos=list(row=0,column=0))
rp.textentry(panel,var=b,labels="Var. B",
 initval=1,pos= list(row=1,column=0))


Thanks for your help

Pascal


--

Pascal A. Niklaus
Institute of Evolutionary Biology and Environmental Studies
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich / Switzerland



Re: [R] How to test if two C statistics are significantly different?

2011-10-11 Thread alanm (Alan Mitchell)
?Hmisc::rcorrp.cens

-Alan



-Original Message-
From: Eik Vettorazzi [mailto:e.vettora...@uke.de]
Sent: Tue 10/11/2011 2:25 AM
To: Yujie Wang
Cc: r-help@r-project.org
Subject: Re: [R] How to test if two C statistics are significantly different?
 
Hi Yujie,
there is still a lot of work in progress, I think. As
http://faculty.washington.edu/heagerty/Software/SurvROC/RisksetROC/risksetROCdiscuss.pdf

states: "[...] for inference and variance estimation, we now suggest
bootstrapping [...]".
Recently I caught a glimpse of roc.test from the pROC package; they
implemented, amongst others, a bootstrap algorithm - maybe this is a
start for your own work?

Hth.

On 10.10.2011 21:35, Yujie Wang wrote:
 Hey all,
 
 In order to test whether a marker is a risk factor, I built two models (using the Cox
 proportional hazards model). One model included this marker, and the other did
 not.
 
 Then I used the R package risksetROC to test how much predictive value the
 marker added to the model. I got two C statistics by passing the linear
 predictors of the two models to this package.
 
 The question is: how do I test whether two C statistics are significantly different?
 
 Your help will be greatly appreciated!
 
 Yujie
 


-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790



Re: [R] How to test if two C statistics are significantly different?

2011-10-11 Thread Frank Harrell
Thanks for mentioning rcorrp.cens, which is much more powerful than testing
for differences in C.  Likelihood ratio tests would be even more powerful. 
Ordinary differences in the C index yield a test with power that is too low.
Frank

alanm (Alan Mitchell) wrote:
 
 ?Hmisc::rcorrp.cens
 
 -Alan
 
 
 
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-test-if-two-C-statistics-are-significantly-different-tp3891857p3894430.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Mean or mode imputation fro missing values

2011-10-11 Thread Weidong Gu
In your case, it may not be sensible to simply fill missing values by
mean or mode, as multiple imputation has become the norm these days. For
your specific question, na.roughfix in the randomForest package would do
the job.

Weidong Gu



[R] Problem executing function

2011-10-11 Thread Divyam
Hello All,

I have a series of steps that need to be run many times, so I put them
all into a function. There is no problem in creating the function, but when I
call it, the steps are not executed, or only the first step
is executed. What could be the reason?

Sample Function and the result:

fun <- function ()
{
 # Package load into R;
a <- c(library(RODBC),library(e1071));
b <- read.csv("Path of the csv file", header=TRUE,sep=",",quote="");
c <- b[,1];
d <- b[,2];
e <- b[,3];
rm(b);
 # Establishing ODBC connection;
conn <- odbcConnect(c,uid=d,pwd=e);
}
 
fun()

Warning messages:
1: package 'RODBC' was built under R version 2.13.1
2: package 'e1071' was built under R version 2.13.1

The subsequent CSV fetch and ODBC connection establishment are not getting
executed. Why is the function not executed fully? Even if I create a
separate function for the CSV file fetch, it is not executed. But if I
simply type b <- read.csv("Path of the csv file", header=TRUE, sep=",", quote="");
directly at the command prompt, it works. Why is that?
I am not able to figure out the mistake.

Any help would be very useful. I have been struggling with this for quite some
time now.

Thanks
Divya 

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-executing-function-tp3894359p3894359.html
Sent from the R help mailing list archive at Nabble.com.



[R] controling text in facets (ggplot2)

2011-10-11 Thread Thomthom
Hi R-helpers!

Here is my problem: 

I have a graph with 3 facets, each with its own regression line.
My goal is to show, separately in each facet, the equation
that describes that facet's line.

So far, I have managed to add a line and the same equation to all my facets, but
unfortunately that's not what I want.

Is there a way to do that? Any suggestion would be gladly welcome!

Thanks for your help!

Thomas

--
View this message in context: 
http://r.789695.n4.nabble.com/controling-text-in-facets-ggplot2-tp3894148p3894148.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] correlation matrix

2011-10-11 Thread 1Rnwb
Thank you all for your suggestions.
Sharad

--
View this message in context: 
http://r.789695.n4.nabble.com/correlation-matrix-tp3891085p3894329.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem executing function

2011-10-11 Thread R. Michael Weylandt michael.weyla...@gmail.com
Sounds like what you were hoping for does happen in the function environment 
but isn't returned to the global environment. The proper fix is to put the 
values in a list and return() them as the function output. 
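
A minimal sketch of that suggestion, assuming a one-row CSV file whose three columns hold the data source name, user id and password; the function name read_credentials, the column names and the temporary file are made up for illustration, and the actual odbcConnect() call is left out so the example runs without a database:

```r
# Hypothetical helper: read the connection details and return them as a list.
read_credentials <- function(csv_path) {
  b <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
  # The value of the last expression is what the function returns.
  list(dsn = b[1, 1], uid = b[1, 2], pwd = b[1, 3])
}

# Stand-in for the real CSV file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("dsn,uid,pwd", "mydb,alice,secret"), csv_path)

# Assign the result; objects created inside the function vanish when it exits.
creds <- read_credentials(csv_path)
str(creds)
```

The returned list can then be used at top level, for example conn <- odbcConnect(creds$dsn, uid = creds$uid, pwd = creds$pwd).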

Michael



[R] filtering rows

2011-10-11 Thread Samir Benzerfa
Hi everyone,

 

I've got two data sets as below. My question now is: how can I use Dataset2
as a filter for Dataset1? My goal is just to keep the rows of Dataset1 where
the first column (Date) matches the Dates in Dataset2.

 

I would appreciate any solutions to this issue.

 

Many thanks!

S.B.

 

Dataset1:

  Date  A  B  C  D
1 1977 10 11 12 13
2 1978 14 15 16 17
3 1979 18 19 20 21
4 1980 22 23 24 25
5 1981 26 27 28 29

Dataset2:

  Date
1 1977
2 1978
3 1979
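
A minimal sketch of one way to do this in base R, assuming both objects are data frames with a column named Date as shown above; %in% marks the rows of Dataset1 whose Date also occurs in Dataset2:

```r
# The two data sets from the question, rebuilt as data frames.
Dataset1 <- data.frame(Date = 1977:1981,
                       A = c(10, 14, 18, 22, 26),
                       B = c(11, 15, 19, 23, 27),
                       C = c(12, 16, 20, 24, 28),
                       D = c(13, 17, 21, 25, 29))
Dataset2 <- data.frame(Date = 1977:1979)

# Keep only the rows of Dataset1 whose Date appears in Dataset2.
filtered <- Dataset1[Dataset1$Date %in% Dataset2$Date, ]
print(filtered)
```

merge(Dataset1, Dataset2, by = "Date") gives the same rows here, but it would repeat rows if Dataset2 contained duplicate dates.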

 




Re: [R] Text Mining with Facebook Reviews (XML and FQL)

2011-10-11 Thread Duncan Temple Lang

Hi Kenneth

  First off, you probably don't need to use xmlParseDoc(), but rather
  xmlParse().  (Both are fine, but xmlParseDoc() allows you to control many of
  the options in the libxml2 parser, which you don't need here.)

  xmlParse() has some capabilities to fetch the content of URLs. However,
 it cannot deal with HTTPS requests which this call to facebook is.
 The approach to this is to
i) make the request
   ii) parse the resulting string via xmlParse(txt, asText = TRUE)

 As for i), there are several ways to do this, but the RCurl
 package allows you to do it entirely within R and gives you
 more control over the request than you would ever want.

   library(RCurl)
   txt = getForm('https://api.facebook.com/method/fql.query', query = QUERY)

   mydata.xml = xmlParse(txt, asText = TRUE)

However, you are most likely going to have to login / get a token
before you make this request. And then, if you are using RCurl,
you will want to use the same curl object with the token or cookies, etc.

D.

On 10/10/11 3:52 PM, Kenneth Zhang wrote:
 Hello,
 
 I am trying to use XML package to download Facebook reviews in the following
 way:
 
 require(XML)
 mydata.vectors <- character(0)
 Qword <- URLencode('#IBM')
 QUERY <- paste('SELECT review_id, message, rating from review where message
 LIKE "%',Qword,'%"',sep='')
 Facebook_url =  paste('https://api.facebook.com/method/fql.query?query=
 ',QUERY,sep='')
 mydata.xml <- xmlParseDoc(Facebook_url, asText=F)
 mydata.vector <- xpathSApply(mydata.xml, '//s:entry/s:title', xmlValue,
 namespaces =c('s'='http://www.w3.org/2005/Atom'))
 
 mydata.xml is NULL, so no further step can be executed. I am not very
 familiar with XML or FQL. Any suggestion will be appreciated. Thank you!
 
 Best regards,
 Kenneth
 


Re: [R] Mean or mode imputation fro missing values

2011-10-11 Thread francesca casalino
Yes, thank you Gu…
I am just trying to do this as a rough first step and will try other,
more appropriate imputation methods later.
I am just learning R, and was trying to write the for loop and
if-statement by hand, but something is going wrong…

This is what I have until now:

*fake array:
age- c(5,8,10,12,NA)
a- factor(c(aa, bb, NA, cc, cc))
b- c(banana, apple, pear, grape, NA)
df_test - data.frame(age=age, a=a, b=b)
df_test$b- as.character(df_test$b)

for (var in 1:ncol(df_test)) {
if (class(df_test$var)==numeric) {
df_test$var[is.na(df_test$var)] - mean(df_test$var, na.rm = 
TRUE)
} else if (class(df_test$var)==character) {
Mode(df_test$var[is.na(df_test$var)], na.rm = TRUE)
}
}

Where 'Mode' is the function:

function (x, na.rm)
{
xtab <- table(x)
xmode <- names(which(xtab == max(xtab)))
if (length(xmode) > 1)
xmode <- ">1 mode"
return(xmode)
}


It seems as if it is just ignoring the statements, though, without giving
any error… Does anybody have any idea what is going on?
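
A likely reason the loop appears to do nothing: inside the loop, var holds a column number, but df_test$var looks for a column literally named "var". No such column exists, so the expression returns NULL, neither class test is TRUE, and no error is raised. The character branch also never assigns its result. A minimal sketch of a working version follows, assuming ties for the mode can be broken by taking the first value; mode_value is a hypothetical helper, not the Mode function quoted above.

```r
# Toy data from the post
df_test <- data.frame(age = c(5, 8, 10, 12, NA),
                      a = factor(c("aa", "bb", NA, "cc", "cc")),
                      b = c("banana", "apple", "pear", "grape", NA),
                      stringsAsFactors = FALSE)

# Hypothetical helper: most frequent non-missing value (first one on ties)
mode_value <- function(x) {
  tab <- table(x)                 # table() drops NA by default
  names(tab)[which.max(tab)]
}

for (var in names(df_test)) {
  col  <- df_test[[var]]          # [[ ]] uses the value of var; $ would not
  miss <- is.na(col)
  if (is.numeric(col)) {
    df_test[[var]][miss] <- mean(col, na.rm = TRUE)
  } else if (is.character(col) || is.factor(col)) {
    df_test[[var]][miss] <- mode_value(col)
  }
}

df_test
```

The key change is indexing with df_test[[var]] and assigning the imputed value back into the data frame.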

Thank you very much for all the great help!
-f

2011/10/11 Weidong Gu anopheles...@gmail.com:
 In your case, it may not be sensible to simply fill missing values by
 mean or mode, as multiple imputation is the norm these days. For
 your specific question, na.roughfix in the randomForest package would do
 the work.

 Weidong Gu

 On Tue, Oct 11, 2011 at 8:11 AM, francesca casalino
 francy.casal...@gmail.com wrote:
 Dear R experts,

 I have a large database made up of mixed data types (numeric,
 character, factor, ordinal factor) with missing values, and I am
 looking for a package that would help me impute the missing values
 using  either the mean if numerical or the mode if character/factor.

 I could maybe use replacement like this:
 df$var[is.na(df$var)] <- mean(df$var, na.rm = TRUE)
 And go through all the many different variables of the datasets using
 mean or mode for each, but I was wondering if there was a faster way,
 or if a package existed to automate this (by doing 'mode' if it is a
 factor or character or 'mean' if it is numeric)?

 I have tried the package dprep because I wanted to use the function
 ce.mimp, but unfortunately it is not available anymore.

 Thank you for your help,
 -francy



Re: [R] extra digits added to data

2011-10-11 Thread Mark Harrison
Thanks for the quick response.

I have read the FAQ.  If I want to keep the values in R the same as when they
were input, should I be converting the data to a different type, i.e. not numeric?
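
For reference, R FAQ 7.31 concerns binary floating point: most decimal fractions cannot be stored exactly in a double, so R holds the nearest representable number. That error is tiny (roughly the 16th significant digit), so converting to a non-numeric type is usually not needed. A small sketch, assuming the goal is to compare and export prices at four decimals:

```r
x <- 1.5021

# Asking for more digits than the default shows the nearest stored double
print(x, digits = 17)

# Exact comparison of computed values can fail ...
0.1 + 0.2 == 0.3                     # FALSE
# ... so compare with a tolerance instead
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE

# Control the number of digits when writing the value out
as_text <- sprintf("%.4f", x)        # "1.5021"
rounded <- round(x, 4)
```

Formatting or rounding at output time keeps the data numeric for calculations while fixing what is written to a file.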



Sent from my iPhone

On Oct 11, 2011, at 4:46 AM, Jim Holtman jholt...@gmail.com wrote:

 FAQ 7.31
 
 Sent from my iPad
 
 On Oct 11, 2011, at 1:07, Mark Harrison harrisonma...@gmail.com wrote:
 
 I am having a problem with extra digits being added to my data which I think
 is a result of how I am converting my data.frame data to xts.
 
 I see the same issue in R v2.13.1 and RStudio version 0.94.106.
 
 I am loading historical foreign exchange data in via csv files or from a sql
 server database.  In both cases there are no extra digits and the original
 data looks like the following:
 
         Date   Open   High    Low  Close
 1 2001-01-03 1.5021 1.5094 1.4883 1.4898
 2 2001-01-04 1.4897 1.5037 1.4882 1.5020
 3 2001-01-05 1.5020 1.5074 1.4952 1.5016
 4 2001-01-08 1.5035 1.5104 1.4931 1.4964
 5 2001-01-09 1.4964 1.4978 1.4873 1.4887
 6 2001-01-10 1.4887 1.4943 1.4856 1.4866
 
 So for 2001-01-03 the Open value is 1.5021 with only 4 digits after the
 decimal place - i.e. .5021.
 
 I then proceed to do the following in R to convert the 'british pound' data
 above from data.frame to xts:
 
 require(quantmod)
 rownames(gbp) <- gbp$Date
 head(gbp)
 
              Open   High    Low  Close
 2001-01-03 1.5021 1.5094 1.4883 1.4898
 2001-01-04 1.4897 1.5037 1.4882 1.5020
 2001-01-05 1.5020 1.5074 1.4952 1.5016
 2001-01-08 1.5035 1.5104 1.4931 1.4964
 2001-01-09 1.4964 1.4978 1.4873 1.4887
 2001-01-10 1.4887 1.4943 1.4856 1.4866
 
 gbp <- as.xts(gbp[,2:5])
 class(gbp)
 
 [1] "xts" "zoo"
 
 The data at this point looks ok until you look closer or output the data to
 excel at which point you see the following for the 'Open' 2001-01-03:
 1.5020084473
 
 It is not just the above 'Open' or the first value but all the data points
 contain the extra digits which I think is the original date data and/or row
 numbers that are being tacked on.
 
 My problem is the extra digits being added or whatever I am doing wrong in R
 to cause the extra digits to be added.  I need 1.5021 to be 1.5021 and not
 1.5020084473.
 
 Thanks for the help.
 


[R] Background Colors

2011-10-11 Thread Gabriel Yospin
Hi R-Help -

If I make a plot:

numYears = 500
plot(x = c(1,numYears), y = c(200,300), xlab = "Time", ylab = "Vegetation
Class", xlim = c(100,600), ylim = c(200,300), type="n")

Is there a way to make different parts of the background for the plot
different colors?

For example, I'd like to have the background color col = (250,250,0,50) for
y = c(200,204), and col = (250,125,0,50) for y = c(210,212).

Any suggestions?

Thanks in advance for the help,

Gabe
-- 
Gabriel I. Yospin

Institute of Ecology and Evolution
Bridgham Lab
University of Oregon
Eugene, OR 97403-5289

Ph: 541 346 1549
Fax: 541 346 2364



Re: [R] SLOW split() function

2011-10-11 Thread ivo welch
thanks, josh.  in my posting example, I did not need anything except
coefficients.  (when this is the case, I usually do not even use
lm.fit, but I eliminate all missing obs first and then use solve
crossprod(y,cbind(1,x)) crossprod(cbind(1,x)).)  this is pretty fast.)

alas, I will need to figure how to get coef standard errors faster in
this case.  summary.lm() is really slow.
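
For the coefficient standard errors, one option is to stay with the normal equations and skip summary.lm() entirely. A minimal sketch on simulated data, assuming complete cases and a single regressor; the variable names are illustrative and this is not necessarily how the poster proceeded:

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

X    <- cbind(1, x)                        # design matrix with intercept
XtX  <- crossprod(X)                       # t(X) %*% X
beta <- solve(XtX, crossprod(X, y))        # OLS coefficients

res    <- y - X %*% beta                   # residuals
sigma2 <- sum(res^2) / (n - ncol(X))       # residual variance estimate
se     <- sqrt(diag(sigma2 * solve(XtX)))  # coefficient standard errors
```

This gives the same estimates and standard errors as summary(lm(y ~ x)) for well-conditioned data, without building the model frame or the full summary object.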

regards,

/iaw

Ivo Welch (ivo.we...@gmail.com)
http://www.ivo-welch.info/
J. Fred Weston Professor of Finance
Anderson School at UCLA, C519





On Mon, Oct 10, 2011 at 11:30 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 As another followup, given that you are doing numerous regression
 models and (I presume) working with finance/stock data that is
 strictly numeric (no need for special contrast coding, etc.), you can
 substantially reduce the time spent estimating the coefficients.  A
 simple way is to use lm.fit directly instead of lm.  For lm.fit, you
 pass the y and x (design) matrices directly.  This skips a good deal
 of overhead.  Here is one naive way, I imagine more speedups could be
 gained by incorporating the intercept (1 vector) into d instead of
 cbind()ing it.  The catch is that lm.fit requires matrices, not data
 tables, so what you gain may be lost in having to do an extra
 conversion.  In any case, here are the times on my system for the two
 options (note I used N = 1000 * 100 because I am presently on a
 glorified netbook).

 print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~
 x, data=d[.indx,])) })))
   user  system elapsed
  69.00    0.00   69.56

 print(system.time(all.2c <- lapply(si, function(.indx) { coef(lm.fit(y =
 d[.indx, y], x = cbind(1, d[.indx, x]))) })))
   user  system elapsed
  37.83    0.03   38.36

 the column names for the coefficients will not be the same as from lm,
 but the estimates should be identical.  While this is not recommended
 in typical usage, in an application like regressions on rolling time
 windows, etc. where you know the data are not changing, I think it
 makes sense to bypass the clever "determine your data and best methods
 to use" step, and go straight to passing the design matrix.  Since you do
 not need residuals, variances, etc. it may be possible to speed this
 up even more, perhaps bypassing dqrls altogether.

 Cheers,

 Josh

 On Mon, Oct 10, 2011 at 9:56 PM, ivo welch ivo.we...@gmail.com wrote:
 thank you, everyone.  this was very helpful to my specific task and
 understanding.  for the benefit of future googlers, I thought I would
 post some experiments and results here.

 ultimately, I need to do a by() on an irregular matrix, and I now know
 how to speed up by() on a single-core, and then again on a multi-core
 machine.

 library(data.table)
 N <- 1000*1000
 d <- data.table(data.frame( key= as.integer(runif(N, min=1,
 max=N/10)), x=rnorm(N), y=rnorm(N) ))  # irregular
 setkey(d, key); gc() ## sort and force a garbage collection


 cat("N=", N, ".  Size of d=", object.size(d)/1024/1024, "MB\n")

 cat("\nStandard by() Function:\n")
 print(system.time( all.1 <- by( d, d$key, function(d) coef(lm(y ~ x,
 data=d)))))


 cat("\n\nPreSplit Function [aka Jim H]\n\t(a) Splitting Operation:\n")
 print(system.time(si <- split(seq(nrow(d)), d$key)))
 cat("\n\t(b) Regressions:\n")
 print(system.time(all.2 <- lapply(si, function(.indx) {
 coef(lm(d$y[.indx] ~ d$x[.indx])) })))
 print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~
 x, data=d[.indx,])) })))

 cat("\n\nNaive Split Data Frame\n\t(a) Splitting Operation:\n")
 print(system.time(ds <- split(d, d$key)))
 cat("\n\t(b) Regressions:\n")
 print(system.time(all.3a <- lapply(ds, function(ds) { coef(lm(ds$y ~ ds$x))
 })))
 print(system.time(all.3b <- lapply(ds, function(ds) { coef(lm(y ~ x,
 data=ds)) })))

 the first and the last ways (all.1 and all.3) are naive ways of
 doing this, and take about 400-500 seconds on a Mac Air, core i5.
 Jim's suggestion (all.2) cuts this roughly into half by speeding up
 the split to take almost no time.
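
 The index-splitting idea can be tried at a small scale. The sketch below is an illustration only, assuming a plain data.frame, a much smaller N than in the timings above, and lm.fit for the per-group regressions:

```r
set.seed(42)
N <- 2000
d <- data.frame(key = sample(1:50, N, replace = TRUE),
                x = rnorm(N), y = rnorm(N))

# Split the row numbers, not the data frame: cheap even for many groups
si <- split(seq_len(nrow(d)), d$key)

# One small regression per group, passing the design matrix directly
coefs <- lapply(si, function(idx) {
  fit <- lm.fit(x = cbind(1, d$x[idx]), y = d$y[idx])
  unname(fit$coefficients)       # intercept, slope
})

head(do.call(rbind, coefs))
```

 Splitting an integer vector avoids the data.frame method of split(), which is where most of the time went in the naive version.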

 and now,

 library(multicore)
 print(system.time(all.4 <- mclapply(si, function(.indx) { coef(lm(y ~
 x, data=d[.indx,])) })))

 on my dual-core (quad-thread) i5, all four pseudo cores become busy,
 and the time roughly halves again from 230 seconds to 120 seconds.


 maybe the by() function should use Jim's approach, and multicore
 should provide mcby().  of course, knowing how to do this myself fast
 now by hand, this is not so important for me.  but it may help some
 other novices.

 thanks again everybody.

 regards,

 /iaw

 
 Ivo Welch (ivo.we...@gmail.com)




 On Mon, Oct 10, 2011 at 9:31 PM, William Dunlap wdun...@tibco.com wrote:
 The following avoids the overhead of data.frame methods
 (and assumes the data.frame doesn't include matrices
 or other data.frames) and relies on split(vector,factor)
 quickly splitting a vector into a list of vectors.
 For a 10^6 row by 10 column data.frame split in 10^5
 groups this took 14.1 seconds while split took 658.7 s.
 Both 

[R] apply for each value

2011-10-11 Thread Ben qant
Hello,

There has to be a more R'ish way to do this. I have two matrices, one has
the values I want, but I want to NA some of them. The other matrix has
binary values that tell me if I want to NA the values in the other matrix. I
produce a third matrix based on this. I've also tried apply() passing in
c(1,2) for rows and columns with no success yet.

Example (this works, but I'm looking for a better/faster solution):

a = matrix(1:6,2,3)
colnames(a) = c('a','b','c')
b = matrix(c(1,0,1,0,0,1),2,3)
colnames(b) = colnames(a)
c = matrix(0,nrow(a),ncol(a))
for(cl in 1:ncol(a)){
 for(rw in 1:nrow(a)){
c[rw,cl] = ifelse(b[rw,cl]==1,a[rw,cl],NA)
 }

}

> a
     a b c
[1,] 1 3 5
[2,] 2 4 6
> b
     a b c
[1,] 1 1 0
[2,] 0 0 1
> c
     [,1] [,2] [,3]
[1,]    1    3   NA
[2,]   NA   NA    6


Thanks!

Ben



Re: [R] apply for each value

2011-10-11 Thread Sarah Goslee
Hi,

On Tue, Oct 11, 2011 at 12:08 PM, Ben qant ccqu...@gmail.com wrote:
 Hello,

 There has to be a more R'ish way to do this. I have two matrices, one has
 the values I want, but I want to NA some of them. The other matrix has
 binary values that tell me if I want to NA the values in the other matrix. I
 produce a third matrix based on this. I've also tried apply() passing in
 c(1,2) for rows and columns with no success yet.

 Example (this works, but I'm looking for a better/faster solution):

 a = matrix(1:6,2,3)
 colnames(a) = c('a','b','c')
 b = matrix(c(1,0,1,0,0,1),2,3)
 colnames(b) = colnames(a)
 c = matrix(0,nrow(a),ncol(a))
 for(cl in 1:ncol(a)){
  for(rw in 1:nrow(a)){
    c[rw,cl] = ifelse(b[rw,cl]==1,a[rw,cl],NA)
  }

 }

You're making it far too complicated. No need for loops or apply() or
anything like that.

> c <- a
> c[b == 0] <- NA
> c
      a  b  c
[1,]  1  3 NA
[2,] NA NA  6

And thanks for the reproducible small example.
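
The reason this works is that b == 0 is itself a logical matrix of the same shape, and a logical matrix can be used directly as an index. A self-contained sketch of the same idea, using the name masked instead of c (an arbitrary choice, to avoid shadowing the c() function) and adding an ifelse() variant for comparison:

```r
a <- matrix(1:6, 2, 3, dimnames = list(NULL, c("a", "b", "c")))
b <- matrix(c(1, 0, 1, 0, 0, 1), 2, 3)

# Logical-matrix indexing: set every cell where b is 0 to NA
masked <- a
masked[b == 0] <- NA

# Vectorised alternative; ifelse() takes its shape from the test,
# so the column names of a are not carried over
alt <- ifelse(b == 1, a, NA)

masked
```

Both forms operate on whole matrices at once, so no loop over rows and columns is needed.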

Sarah

 a
     a b c
 [1,] 1 3 5
 [2,] 2 4 6
 b
     a b c
 [1,] 1 1 0
 [2,] 0 0 1
 c
     [,1] [,2] [,3]
 [1,]    1    3   NA
 [2,]   NA   NA    6


 Thanks!

 Ben




-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] question about string to boor?

2011-10-11 Thread song_gpqg
Thanks guys, that's a great help.

Nellie



[R] get(a[1]) : object 'a[1]' not found

2011-10-11 Thread Timothy Bates
In the help for get(), the following example is given:

a <- 1:4
assign("a[1]", 2)
a[1] == 2          # FALSE
get("a[1]") == 2   # TRUE

However, executing that last line for me gives

Error in get("a[1]") : object 'a[1]' not found



[R] New package announcement: R2STATS, a GUI for fitting GLM and GLMM

2011-10-11 Thread Yvonnick Noel

Dear R-users,

I wanted to inform you that a new package called R2STATS is available, 
as a graphical front-end for the glm() and glmer() functions.


The GUI is based on the RGtk2 and gWidgets packages by Michael Lawrence 
and John Verzani, and so requires that the GTK+ library be installed 
first on your system. This is done automagically when installing the 
RGtk2 package (or by the script mentioned below). It also uses the 
RGtk2Extras package by Tom Taverner, to provide editable grids for data frames.


This GUI is intended to provide an easy way to fit and compare GLM and 
GLMM models. The GLMM part is based on Douglas Bates' lme4 package and 
the glmer() function. Automatic plots are also drawn for every model, 
and you can switch from one plot to the other by just clicking on the 
model name. I found this feature quite useful when teaching: It helps 
students to get an immediate understanding of differences between models.


Note that this GUI is left (deliberately) simple and is not intended to 
provide a full-featured GUI (please consider using Rcmdr instead for a 
far more advanced GUI). But it tries to do well the one and only thing 
it was designed to do: Fitting and comparing models. Note that most 
standard statistical tests may well be presented as a simple comparison 
between GLMs and this is the way I go with my students here. This allows 
an integrated presentation for almost all common (and simple) situations 
in social sciences.


More information is available on my webpage : 
http://yvonnick.noel.free.fr/r2stats [in French for the moment, although 
the package is in English].


Installing the package is done from a temporary repository:

install.packages("R2STATS", repos="http://yvonnick.noel.free.fr/cran", dep=TRUE)

if you already have a recent version of GTK+ and RGtk2 installed, or by:

source("http://yvonnick.noel.free.fr/r2stats/installwin.R")

for an automatic script that downloads and installs everything. I will 
submit it to CRAN as soon as I have fixed some minor issues with R-devel 
(but the package works flawlessly with the current R-2.13.2).


Any comment welcome. Also, if you are willing to contribute a 
translation into your language, please let me know.


Best,

Yvonnick Noel, PhD.
University of Brittany
Rennes, France



Re: [R] get(a[1]) : object 'a[1]' not found

2011-10-11 Thread Sarah Goslee
Hi,

On Tue, Oct 11, 2011 at 12:31 PM, Timothy Bates
timothy.c.ba...@gmail.com wrote:
 In the help for get(), the following example is given:

 a <- 1:4
 assign("a[1]", 2)
 a[1] == 2          # FALSE
 get("a[1]") == 2   # TRUE

 However, executing that last line for me gives

 Error in get("a[1]") : object 'a[1]' not found

That's actually in the help for assign().

But anyway, help files are checked before distribution, so something is
likely odd about your session.

Is this in an empty session? OS, version, etc? sessionInfo() at the very least.
What does ls() look like?
Did you get any other warnings or errors?

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] plots of correlation matrices

2011-10-11 Thread gj
Hi,

I want to do a visualisation of a matrix plot made up of several plots of
correlation matrices (using corrplot()). My data is in csv format. Here's an
example:

id,category,attribute1,attribute2,attribute3,attribute4
661,SCHS,43.2,0,56.5,1
12202,SCHS,161.7,5.7,155,16
1182,SCHS,21.4,0,29,0
1356,SSS, 8.8182,0.1818,10.6667,0.6667
1864,SCHS,443.7273,9.9091,537,46
12360,SOA,6.6364,0,10,0
3382,SOA,7.1667,0,26,0.5
1033,SOA,63.9231,1.5385,91.5,11.5
14742,SSS,4.3846,0,8,0
12760,SSS,425.0714,1.7857,297.5,3.5

I can get rid of the id. But I need the 'category' as a way of
distinguishing the various correlation matrices.
I can do a plot of the correlation matrix using corrplot() function in the
corrplot package (ignoring the id and category). But what I need is a matrix
of the plots of each correlation matrix based on the category, ie I have
three categories in the data, hence I will need three plots of the
correlation matrix  in one diagram (because the correlation matrix only
makes sense if they are distinguished by category).

Any help?
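
One possible approach is to compute one correlation matrix per category and draw them side by side with par(mfrow = ...). The sketch below is untested against the full data and makes some assumptions: it uses only the ten sample rows shown above, and it falls back to base image() when the corrplot package is not installed.

```r
dat <- read.csv(text = "id,category,attribute1,attribute2,attribute3,attribute4
661,SCHS,43.2,0,56.5,1
12202,SCHS,161.7,5.7,155,16
1182,SCHS,21.4,0,29,0
1356,SSS,8.8182,0.1818,10.6667,0.6667
1864,SCHS,443.7273,9.9091,537,46
12360,SOA,6.6364,0,10,0
3382,SOA,7.1667,0,26,0.5
1033,SOA,63.9231,1.5385,91.5,11.5
14742,SSS,4.3846,0,8,0
12760,SSS,425.0714,1.7857,297.5,3.5")

vars <- c("attribute1", "attribute2", "attribute3", "attribute4")

# One correlation matrix per category (id is simply left out)
cors <- lapply(split(dat[vars], dat$category), cor)

# One panel per category in a single figure
op <- par(mfrow = c(1, length(cors)))
for (k in names(cors)) {
  if (requireNamespace("corrplot", quietly = TRUE)) {
    corrplot::corrplot(cors[[k]], title = k, mar = c(0, 0, 2, 0))
  } else {
    image(cors[[k]], zlim = c(-1, 1), main = k, axes = FALSE)
  }
}
par(op)
```

With only three or four rows per category, as in this sample, the correlations themselves are not meaningful; the point is only the layout.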

Regards
Gawesh



Re: [R] get(a[1]) : object 'a[1]' not found

2011-10-11 Thread Duncan Murdoch

On 11/10/2011 12:31 PM, Timothy Bates wrote:

In the help for get(), the following example is given:

a <- 1:4
assign("a[1]", 2)
a[1] == 2          # FALSE
get("a[1]") == 2   # TRUE

However, executing that last line for me gives

Error in get("a[1]") : object 'a[1]' not found


What did the second line say?  It's the line that created the `a[1]` object.
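
To make the point concrete: assign() with a character string creates a variable whose name is that string, so "a[1]" becomes a separate object and the vector a is left untouched. A short sketch, meant to be run in a fresh session:

```r
a <- 1:4
assign("a[1]", 2)     # creates a new variable literally named a[1]

a[1] == 2             # FALSE: the first element of a is still 1
get("a[1]") == 2      # TRUE: looks up the variable named a[1]

ls()                  # lists both "a" and "a[1]"
`a[1]`                # backticks also reach the oddly named variable
```

If the assign() line is skipped or fails, get("a[1]") reports that object 'a[1]' is not found, which matches the error in the original post.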

Duncan Murdoch



Re: [R] An issue regarding to gradient

2011-10-11 Thread David Winsemius


On Oct 11, 2011, at 6:02 AM, luke1022 wrote:


The following code will get me a curve plot:
cutoff <- seq(1,7,0.25)
Sensitivity <- 1 - pnorm(cutoff, 5, 0.8)
Specificity <- pnorm(cutoff, 3, 1.2)
plot(1-Specificity,Sensitivity,main = "ROC curve",type = "o")

How do I get a gradient of a particular point on that curve?


First you need to define what you mean by "gradient at a point" when  
the gradient is discontinuous at each point. Is this a numerical  
example where you want to take the mean of the slopes on either side  
(rather like the definition of the Dirac function at x=0) ...


in which case these are the slopes _between_, not _at_, the points:

diff(Sensitivity)/diff(1-Specificity)

or ... is this a homework problem and you are being asked to use the  
knowledge that those (Sensitivity, Specificity) points came from  
particular pnorm functions?
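
If the second reading is the intended one, the slope can be written in closed form. With x = 1 - pnorm(c, 3, 1.2) and y = 1 - pnorm(c, 5, 0.8), the chain rule gives dy/dx = dnorm(c, 5, 0.8) / dnorm(c, 3, 1.2). A sketch under that assumption, where roc_slope is a hypothetical helper name:

```r
cutoff <- seq(1, 7, 0.25)
Sensitivity <- 1 - pnorm(cutoff, 5, 0.8)
Specificity <- pnorm(cutoff, 3, 1.2)

# Slopes of the line segments between neighbouring plotted points
seg_slope <- diff(Sensitivity) / diff(1 - Specificity)

# Exact slope of the underlying smooth ROC curve at cutoff c:
# dy/dc = -dnorm(c, 5, 0.8) and dx/dc = -dnorm(c, 3, 1.2)
roc_slope <- function(c) dnorm(c, 5, 0.8) / dnorm(c, 3, 1.2)

roc_slope(4)   # gradient at the point plotted for cutoff = 4
```

The segment slopes approximate the exact values and approach them as the cutoff grid is made finer.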


--

David Winsemius, MD
West Hartford, CT



Re: [R] filtering rows

2011-10-11 Thread David Winsemius


On Oct 11, 2011, at 11:18 AM, Samir Benzerfa wrote:


Hi everyone,



I've got two data sets as below. My question now is: how can I use Dataset2
as a filter for Dataset1? My goal is just to keep the rows of Dataset1 where
the first column (Date) matches the Dates in Dataset2.



Perhaps:

merge(Dataset1, Dataset2)

Or:

Dataset1[ Dataset1$Date %in% Dataset2$Date , ]
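
Both suggestions can be checked against the example data from the question. A small sketch, with the two data frames rebuilt by hand from the posted tables:

```r
Dataset1 <- data.frame(Date = 1977:1981,
                       A = c(10, 14, 18, 22, 26),
                       B = c(11, 15, 19, 23, 27),
                       C = c(12, 16, 20, 24, 28),
                       D = c(13, 17, 21, 25, 29))
Dataset2 <- data.frame(Date = 1977:1979)

# Inner join on the shared column name (Date)
merged <- merge(Dataset1, Dataset2)

# Logical filter: keep rows whose Date appears in Dataset2
kept <- Dataset1[Dataset1$Date %in% Dataset2$Date, ]

kept
```

The %in% form keeps the original row order and row names, while merge() sorts the result by the key column.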

--
David.


I would appreciate any solutions to this issue.



Many thanks!

S.B.



Dataset1:

  Date  A  B  C  D
1 1977 10 11 12 13
2 1978 14 15 16 17
3 1979 18 19 20 21
4 1980 22 23 24 25
5 1981 26 27 28 29

Dataset2:

  Date
1 1977
2 1978
3 1979






David Winsemius, MD
West Hartford, CT



[R] Help to write to a file

2011-10-11 Thread Sergio René Araujo Enciso
Dear all:

I am having some problems using the function sink(). Basically I am doing
a loop over two files which contain unit-root variables. In the loop, I
extract the i-th element of both files to create an object called z. If z
meets some requirements, then I perform a unit root test (ADF test),
otherwise not. As this process is repeated several times, for each i I want
to write the summary of the ADF test to a common file. For that I use the
function sink(). My code runs fine, but nothing gets written to
the text file where my results are supposed to be saved. The code is below:

setwd("C:\\Users\\Sergio René\\Dropbox\\R")

library(urca)

P1 <- read.csv("2R_EQ_P_R1_500.csv")
P2 <- read.csv("2R_EQ_P_R2_500.csv")

d <- (1:1000)
sink("ADF_results_b_1.txt")

for (i in seq(d))
{
z.1 <- P1[i]*-1-P2[i]*-1
if (all(z.1>=0)) {r=1} else {if (all(z.1<=0)) {r=1} else {r=2}}
if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
if (r==1) {summary(ADF)}
}
sink()

Any suggestion of what I might be doing wrong?

best regards,

Sergio René



[R] restricted cubic spline within survfit.cph in the package rms

2011-10-11 Thread Stan Maydan
Hello,
 
does anyone have an example on how to use restricted cubic
splines function rcs within survfit.cph, if cph (Cox Proportional Hazard 
Regression) was done with restricted cubic
splines (which I made to work)?

Thank you.






[R] suggestions for ANOVA which includes the year as a factor

2011-10-11 Thread Marco Fontanelli
Dear R Foundation,

I am a post-doc researcher at the University of Pisa, Italy.
I apologize for my English, and I should tell you in advance that I
am very much a beginner with R.
I have used R for fitting dose-response curves (drc package) and for
ordinary ANOVA (one, two or three factors), including post-hoc
mean comparisons (I used the LSD test).
Now I have to process some simple data on tomato yield. I have just
three different treatments (weed control) and three different
years of experiment.

My questions are:

How can I include the factor year in the ANOVA?
Do you think a mixed model would be suitable?
If so, which texts, papers, references, manuals, etc. would you
suggest?
How can I compare means in a mixed model (for example with an LSD test)?
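
As a starting point for the first question, year can enter an ordinary ANOVA as a second factor. The sketch below uses simulated yields, not real data, and assumes a hypothetical layout of four replicate plots per treatment and year. It treats year as a fixed blocking factor; a mixed-model alternative would be lme4::lmer(yield ~ treatment + (1 | year)), although with only three years a random year effect rests on very few levels.

```r
set.seed(123)

# Hypothetical layout: 3 treatments x 3 years x 4 replicate plots
dat <- expand.grid(treatment = c("T1", "T2", "T3"),
                   year = c("2009", "2010", "2011"),
                   rep = 1:4)
dat$treatment <- factor(dat$treatment)
dat$year <- factor(dat$year)        # year must be a factor, not a number
dat$yield <- rnorm(nrow(dat), mean = 50, sd = 5)   # simulated, not real data

# Two-way ANOVA: treatment, year, and their interaction
fit <- aov(yield ~ treatment * year, data = dat)
summary(fit)

# Post-hoc comparison of treatment means (Tukey HSD is used here, not LSD)
tuk <- TukeyHSD(fit, "treatment")
tuk
```

Whether year is better treated as fixed or random depends on the design and on whether the three years are meant to represent a wider population of seasons.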

Thank you very much and sorry for bothering you.
Sincerely.
Marco

__

Marco Fontanelli

Sezione Meccanica Agraria e
Meccanizzazione Agricola
Dipartimento di Agronomia e Gestione
dell'Agro-Ecosistema
Facoltà di Agraria
Università di Pisa

tel: 050 2218922
cell: 338 8832323
mail: mfontane...@agr.unipi.it








Re: [R] Background Colors

2011-10-11 Thread Carlos Ortega
Hi,

Yes, one way to do that is by using function polygon().
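
A minimal sketch of that idea, assuming the four numbers in the question are red, green, blue and alpha on a 0-255 scale, and that each band should span the full width of the plot region (the bands data frame is only an illustrative way to hold the limits):

```r
numYears <- 500

# Band limits and colours from the question
bands <- data.frame(ylo = c(200, 210),
                    yhi = c(204, 212),
                    col = c(rgb(250, 250, 0, 50, maxColorValue = 255),
                            rgb(250, 125, 0, 50, maxColorValue = 255)),
                    stringsAsFactors = FALSE)

plot(x = c(1, numYears), y = c(200, 300), xlab = "Time",
     ylab = "Vegetation Class", xlim = c(100, 600), ylim = c(200, 300),
     type = "n")

usr <- par("usr")   # plot region limits: x1, x2, y1, y2
for (i in seq_len(nrow(bands))) {
  polygon(x = c(usr[1], usr[2], usr[2], usr[1]),
          y = c(bands$ylo[i], bands$ylo[i], bands$yhi[i], bands$yhi[i]),
          col = bands$col[i], border = NA)
}
```

Drawing the bands right after the empty plot and before any points or lines keeps them in the background.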

Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/11 Gabriel Yospin yosp...@gmail.com

 Hi R-Help -

 If I make a plot:

 numYears = 500
 plot(x = c(1,numYears), y = c(200,300), xlab = "Time", ylab = "Vegetation
 Class", xlim = c(100,600), ylim = c(200,300), type="n")

 Is there a way to make different parts of the background for the plot
 different colors?

 For example, I'd like to have the background color col = (250,250,0,50) for
 y = c(200,204), and col = (250,125,0,50) for y = c(210,212).

 Any suggestions?

 Thanks in advance for the help,

 Gabe
 --
 Gabriel I. Yospin

 Institute of Ecology and Evolution
 Bridgham Lab
 University of Oregon
 Eugene, OR 97403-5289

 Ph: 541 346 1549
 Fax: 541 346 2364



Re: [R] Help to write to a file

2011-10-11 Thread Sarah Goslee
Hi,

Inside a loop, you must explicitly wrap your summary() command and anything
else from which you expect output in a print() command.

Sarah

2011/10/11 Sergio René Araujo Enciso araujo.enc...@gmail.com:
 Dear all:

 I am having some problems to use the function sink(). Basically I am doing
 a loop over two files which contain unit-root variables. Then on a loop, I
 extract every i element of both files to create an object called z. If z
 meets some requirements, then I perform a unit root test (ADF test),
 otherwise not. As this process is repeated several times, for each i I want
 to get the summary of the ADF test on a common file. For that I use the
 function sink(). My code runs fine, but I do not get anything written on
 the text file where my results are supposed to be saved. The code is below

 setwd("C:\\Users\\Sergio René\\Dropbox\\R")

 library(urca)

 P1 <- read.csv("2R_EQ_P_R1_500.csv")
 P2 <- read.csv("2R_EQ_P_R2_500.csv")

 d <- (1:1000)
 sink("ADF_results_b_1.txt")

 for (i in seq(d))
 {
 z.1 <- P1[i]*-1-P2[i]*-1
 if (all(z.1>=0)) {r=1} else {if (all(z.1<=0)) {r=1} else {r=2}}
 if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
 if (r==1) {summary(ADF)}
 }
 sink()

 Any suggestion of what I might be doing wroong?

 best regards,

 Sergio René




-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Help to write to a file

2011-10-11 Thread R. Michael Weylandt michael.weyla...@gmail.com
Untested, does adding a print() around summary() get it done?

Michael

On Oct 11, 2011, at 1:03 PM, Sergio René Araujo Enciso 
araujo.enc...@gmail.com wrote:

 Dear all:
 
 I am having some problems to use the function sink(). Basically I am doing
 a loop over two files which contain unit-root variables. Then on a loop, I
 extract every i element of both files to create an object called z. If z
 meets some requirements, then I perform a unit root test (ADF test),
 otherwise not. As this process is repeated several times, for each i I want
 to get the summary of the ADF test on a common file. For that I use the
 function sink(). My code runs fine, but I do not get anything written on
 the text file where my results are supposed to be saved. The code is below
 
 setwd("C:\\Users\\Sergio René\\Dropbox\\R")
 
 library(urca)
 
 P1 <- read.csv("2R_EQ_P_R1_500.csv")
 P2 <- read.csv("2R_EQ_P_R2_500.csv")
 
 d <- (1:1000)
 sink("ADF_results_b_1.txt")
 
 for (i in seq(d))
 {
 z.1 <- P1[i]*-1-P2[i]*-1
 if (all(z.1>=0)) {r=1} else {if (all(z.1<=0)) {r=1} else {r=2}}
 if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
 if (r==1) {summary(ADF)}
 }
 sink()
 
 Any suggestion of what I might be doing wroong?
 
 best regards,
 
 Sergio René
 


Re: [R] Help to write to a file

2011-10-11 Thread David Winsemius


On Oct 11, 2011, at 1:03 PM, Sergio René Araujo Enciso wrote:


Dear all:

I am having some problems to use the function sink(). Basically I am doing
a loop over two files which contain unit-root variables. Then on a loop, I
extract every i element of both files to create an object called z. If z
meets some requirements, then I perform a unit root test (ADF test),
otherwise not. As this process is repeated several times, for each i I want
to get the summary of the ADF test on a common file. For that I use the
function sink(). My code runs fine, but I do not get anything written on
the text file where my results are supposed to be saved. The code is below


setwd("C:\\Users\\Sergio René\\Dropbox\\R")

library(urca)

P1 <- read.csv("2R_EQ_P_R1_500.csv")
P2 <- read.csv("2R_EQ_P_R2_500.csv")

d <- (1:1000)
sink("ADF_results_b_1.txt")

for (i in seq(d))
{
z.1 <- P1[i]*-1-P2[i]*-1
if (all(z.1>=0)) {r=1} else {if (all(z.1<=0)) {r=1} else {r=2}}
if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
if (r==1) {summary(ADF)}


You may need to print() that summary-object inside the for-function.

> summary(a)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    1.75    2.50    2.50    3.25    4.00
> sink("test.txt")
> for(i in 1) summary(a)
> sink()   # No test.txt file created
> sink("test2.txt")
> for(i in 1) print( summary(a) )
> sink()   # The expected file created

This relates to the FAQ about similar puzzling behavior with plotting  
lattice , grid or ggplot objects.



}
sink()

Any suggestion of what I might be doing wrong?

best regards,

Sergio René


David Winsemius, MD
West Hartford, CT



Re: [R] Help to write to a file

2011-10-11 Thread Duncan Murdoch

On 11/10/2011 1:03 PM, Sergio René Araujo Enciso wrote:

Dear all:

I am having some problems to use the function sink(). Basically I am doing
a loop over two files which contain unit-root variables. Then on a loop, I
extract every i element of both files to create an object called z. If z
meets some requirements, then I perform a unit root test (ADF test),
otherwise not. As this process is repeated several times, for each i I want
to get the summary of the ADF test on a common file. For that I use the
function sink(). My code runs fine, but I do not get anything written on
the text file where my results are supposed to be saved. The code is below

setwd("C:\\Users\\Sergio René\\Dropbox\\R")

library(urca)

P1 <- read.csv("2R_EQ_P_R1_500.csv")
P2 <- read.csv("2R_EQ_P_R2_500.csv")

d <- (1:1000)
sink("ADF_results_b_1.txt")

for (i in seq(d))
{
z.1 <- P1[i]*-1-P2[i]*-1
if (all(z.1=0)) {r=1} else {if (all(z.1=0)) {r=1} else {r=2}}
if (r==1) {ADF <- ur.df(ts(z.1), lags=1, type='drift')}
if (r==1) {summary(ADF)}
}
sink()

Any suggestion of what I might be doing wrong?


You aren't printing anything.  In a loop, you need to call print() 
explicitly; only the last value of an expression auto-prints.
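
A minimal sketch of the point above, assuming nothing from the original script: the temporary file names and the toy vectors passed to summary() are made up for illustration. The first loop writes nothing because the summary is never printed; the second writes one summary per iteration.

```r
# Auto-printing does not happen inside a for loop, so nothing reaches the
# sink() target unless print() is called explicitly.
silent_file  <- tempfile(fileext = ".txt")   # hypothetical output paths
printed_file <- tempfile(fileext = ".txt")

sink(silent_file)
for (i in 1:3) summary(c(i, i + 1, i + 2))   # value is computed, then dropped
sink()                                       # restore output to the console

sink(printed_file)
for (i in 1:3) {
  s <- summary(c(i, i + 1, i + 2))
  print(s)                                   # explicit print() writes to the file
}
sink()

silent_lines  <- readLines(silent_file)
printed_lines <- readLines(printed_file)
length(silent_lines)    # no lines written
length(printed_lines)   # several lines written
```

Calling sink() with no arguments after each block matters: if it is skipped, later console output keeps going to the file.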


Duncan Murdoch



Re: [R] get(a[1]) : object 'a[1]' not found

2011-10-11 Thread Timothy Bates
so… cleared out, and now it’s working: Must have been an obscure workspace 
conflict. Thanks for quick helpful replies

> a <- 1:4
> assign("a[1]", 2)
> a[1] == 2
[1] FALSE
> get("a[1]") == 2
[1] TRUE
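
For anyone else puzzled by this example, a short sketch of what seems to be going on (the variable names `before` and `looked_up` are mine, not from the help page): assign() with the string "a[1]" creates a separate object whose name happens to be `a[1]`; it does not touch the first element of `a`. If that assign() line has not been run, get("a[1]") has nothing to find.

```r
a <- 1:4
assign("a[1]", 2)              # creates a NEW object literally named "a[1]"
before    <- a[1] == 2         # FALSE: the vector a is untouched
looked_up <- get("a[1]") == 2  # TRUE: looks up the object named "a[1]"
exists("a[1]")                 # TRUE only once the assign() line has run

rm(list = "a[1]")              # remove the oddly named object
gone <- !exists("a[1]")        # TRUE: get("a[1]") would now fail with 'not found'

a[1] <- 2L                     # ordinary element assignment, usually what is wanted
```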

On 11 Oct 2011, at 5:45 PM, Duncan Murdoch wrote:
 On 11/10/2011 12:31 PM, Timothy Bates wrote:
 In the help for get(), the following example is given:
 a <- 1:4
 assign("a[1]", 2)
 a[1] == 2  #FALSE
 get("a[1]") == 2   #TRUE
 
 However, executing that last line for me gives
 
 Error in get("a[1]") : object 'a[1]' not found
 
 What did the second line say?  It's the line that created the `a[1]` object.
 
 Duncan Murdoch



Re: [R] Help to write to a file

2011-10-11 Thread Sergio René Araujo Enciso
OK, I see my mistake. I did as you suggested and it works. Thanks for the
answers, everyone.

Best,

Sergio René



[R] stop()

2011-10-11 Thread Doran, Harold
Suppose I have a function, such as the toy example below:

myFun <- function(x, max.iter = 5) {
   for(i in 1:10){
      result <- x + i
      iter <- i
      if(iter == max.iter) stop('Max reached')
   }
   result
}

I can of course do this:
myFun(10, max.iter = 11)

However, if I reach the maximum number of iterations before my algorithm has 
finished (in my real application there are EM steps for a mixed model), I 
actually want the function to return the value of result up to that point. 
Currently using stop(), I would get

 myFun(10, max.iter = 4)
Error in myFun(10, max.iter = 4) : Max reached

But, in this toy case the function should return the value of result up to 
iteration 4.

Not sure how I can adjust this.

Thanks,
Harold





Re: [R] stop()

2011-10-11 Thread Dimitris Rizopoulos

You could use return(), e.g.,

myFun <- function (x, max.iter = 5) {
    for (i in 1:10) {
        result <- x + i
        iter <- i
        if (iter == max.iter) {
            return(result)
        }
    }
    result
}

myFun(10, max.iter = 4)


I hope it helps.

Best,
Dimitris



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/



Re: [R] stop()

2011-10-11 Thread Doran, Harold
Thanks, Dimitris. Very helpful on something I *should* know by now. 



Re: [R] help to ... import the data from Excel

2011-10-11 Thread Jean V Adams
Sarah_R_edu wrote on 10/11/2011 04:57:08 AM:
 
 Hi every one 
 
 i have problem in R program to import the data from excel ,
 
 I have done the following:
 
 1. install.packages(xlsReadWrite)
 
 2. library(xlsReadWrite)
 
 3. z- 
read.xls(ReadXls,LTS,colNames=FALSE,sheet,type,form,rowNames=FALSE)
 
 and i got on the result:
 
 Error in read.xls(ReadXls, LTS, colNames = FALSE, rowNames = FALSE) :
 
  object 'LTS' not found
 
 also i tried to done   data(LTS, package = xlsReadWrite)
 
  and we got on : Warning message: In data(LTS, package = xlsReadWrite) 
:
 data set 'LTS' not found
 
 How i get on LTS in the list objects? 
 
 Note: LTS is name my data in Eexcl
 
 
 
 
 
 
 
 i used another way as following:
 
 mydata- read.table(C:\Users\user\Desktop\LTS.xls)
 
 but its not working how can i do it?
 
 */My regards/ *


Try this

z <- read.xls(file="C:\\Users\\user\\Desktop\\LTS.xls", colNames=FALSE,
rowNames=FALSE)
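
A small base-R sketch of why the path has to be written this way (the variable names are mine; the path is the one from the post). Inside an R string a backslash starts an escape sequence, so a Windows path needs either doubled backslashes or forward slashes. Note also that read.table() expects delimited text, so it is not expected to parse a binary .xls file even when the path is right.

```r
# Two equivalent spellings of the same Windows path
p_backslash <- "C:\\Users\\user\\Desktop\\LTS.xls"   # backslashes doubled
p_forward   <- "C:/Users/user/Desktop/LTS.xls"       # forward slashes also work

cat(p_backslash, "\n")   # shows single backslashes: the doubling is only escaping

# Converting one spelling into the other gives an identical string
same_path <- identical(gsub("\\", "/", p_backslash, fixed = TRUE), p_forward)

# Check that the file is really there before trying to read it
if (!file.exists(p_forward)) {
  message("File not found: ", p_forward)
}
```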


Jean


Re: [R] SLOW split() function

2011-10-11 Thread Joshua Wiley
I do not know if stripping down functions is generally recommended,
but it is not too difficult to do if you know that you can make
assumptions.  Here is an example (I also found a fast way to convert
the data table to a matrix, again if some assumptions can be made).
Using the stripped down function, you can get coefficients and
standard errors in less time than you can get just coefficients using
default lm.  It is hugely less flexible.

Cheers,

Josh

##
library(data.table)

## stripped down lm and summary.lm (for standard errors)
minimal.lm <- function(y, x) {
  dims <- dim(x)
  x <- unlist(x, FALSE, FALSE)
  dim(x) <- dims
  obj <- lm.fit(x = x, y = y)
  resvar <- sum(obj$residuals^2)/obj$df.residual
  p <- obj$rank
  R <- .Call("La_chol2inv", x = obj$qr$qr[1L:p, 1L:p, drop = FALSE],
             size = p, PACKAGE = "base")
  m <- min(dim(R))
  d <- c(R)[1L + 0L:(m - 1L) * (dim(R)[1L] + 1L)]
  se <- sqrt(d * resvar)
  cbind(coef = obj$coefficients, se)
}

N <- 1000*100
d <- data.table(data.frame( key= as.integer(runif(N, min=1,
max=N/10)), x=rnorm(N), y=rnorm(N) ))  # irregular
## add intercept column
d$int <- 1L
setkey(d, key); gc() ## sort and force a garbage collection

cat("N=", N, ".  Size of d=", object.size(d)/1024/1024, "MB\n")
print(system.time(si <- split(seq(nrow(d)), d$key)))

cat("\n\t(b) Regressions:\n")
## using lm
print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~
x, data=d[.indx,])) })))
## using minimal.lm---faster and gives standard errors
print(system.time(all.2c <- lapply(si, function(.indx) { minimal.lm(y
= d[.indx, y], x = d[.indx, list(int, x)]) })))

## Timings on my system
> print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~
+ x, data=d[.indx,])) })))
   user  system elapsed
  67.87    0.01   68.46
> print(system.time(all.2c <- lapply(si, function(.indx) { minimal.lm(y =
+ d[.indx, y], x = d[.indx, list(int, x)]) })))
   user  system elapsed
  47.72    0.00   48.00

##

On Tue, Oct 11, 2011 at 8:56 AM, ivo welch ivo.we...@gmail.com wrote:
 thanks, josh.  in my posting example, I did not need anything except
 coefficients.  (when this is the case, I usually do not even use
 lm.fit, but I eliminate all missing obs first and then use solve
 crossprod(y,cbind(1,x)) crossprod(cbind(1,x)).)  this is pretty fast.)

 alas, I will need to figure how to get coef standard errors faster in
 this case.  summary.lm() is really slow.

 regards,

 /iaw
 
 Ivo Welch (ivo.we...@gmail.com)
 http://www.ivo-welch.info/
 J. Fred Weston Professor of Finance
 Anderson School at UCLA, C519





 On Mon, Oct 10, 2011 at 11:30 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 As another followup, given that you are doing numerous regression
 models and (I presume) working with finance/stock data that is
 strictly numeric (no need for special contrast coding, etc.), you can
 substantially reduce the time spent estimating the coefficients.  A
 simple way is to use lm.fit directly instead of lm.  For lm.fit, you
 pass the y and x (design) matrices directly.  This skips a good deal
 of overhead.  Here is one naive way, I imagine more speedups could be
 gained by incorporating the intercept (1 vector) into d instead of
 cbind()ing it.  The catch it that lm.fit requires matrices, not data
 tables, so what you gain may be lost in having to do an extra
 conversion.  In any case, here are the times on my system for the two
 options (note I used N = 1000 * 100 because I am presently on a
 glorified netbook).

 print(system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~
 + x, data=d[.indx,])) })))
   user  system elapsed
  69.00    0.00   69.56

 print(system.time(all.2c <- lapply(si, function(.indx) { coef(lm.fit(y =
 d[.indx, y], x = cbind(1, d[.indx, x]))) })))
   user  system elapsed
  37.83    0.03   38.36

 the column names for the coeficients will not be the same as from lm,
 but the estimates should be identical.  While this is not recommended
 in typical usage, in an application like regressions on rolling time
 windows, etc. where you know the data are not changing, I think it
 makes sense to bypass the clever determine your data and best methods
 to use, and go straight to passing the design matrix.  Since you do
 not need residuals, variances, etc. it may be possible to speed this
 up even more, perhaps bypassing dqrls altogether.

 Cheers,

 Josh

 On Mon, Oct 10, 2011 at 9:56 PM, ivo welch ivo.we...@gmail.com wrote:
 thank you, everyone.  this was very helpful to my specific task and
 understanding.  for the benefit of future googlers, I thought I would
 post some experiments and results here.

 ultimately, I need to do a by() on an irregular matrix, and I now know
 how to speed up by() on a single-core, and then again on a multi-core
 machine.

 library(data.table)
 N - 1000*1000
 d - data.table(data.frame( key= as.integer(runif(N, min=1,
 max=N/10)), x=rnorm(N), y=rnorm(N) ))  # 

Re: [R] Background Colors

2011-10-11 Thread Jean V Adams
Carlos Ortega wrote on 10/11/2011 11:30:46 AM:
 
 Hi,
 
 Yes, one way to do that is by using function polygon().
 
 Regards,
 Carlos Ortega
 www.qualityexcellence.es
 
 2011/10/11 Gabriel Yospin yosp...@gmail.com
 
  Hi R-Help -
 
  If I make a plot:
 
  numYears = 500
  plot(x = c(1,numYears), y = c(200,300), xlab = "Time",
  ylab = "Vegetation Class", xlim = c(100,600), ylim = c(200,300), type="n")
 
  Is there a way to make different parts of the background for the plot
  different colors?
 
  For example, I'd like to have the background color col = 
(250,250,0,50) for
  y = c(200,204), and col = (250,125,0,50) for y = c(210,212).
 
  Any suggestions?
 
  Thanks in advance for the help,
 
  Gabe
  --
  Gabriel I. Yospin
 
  Institute of Ecology and Evolution
  Bridgham Lab
  University of Oregon
  Eugene, OR 97403-5289
 
  Ph: 541 346 1549
  Fax: 541 346 2364


For example:

plot(1, 1, xlab="Time", ylab="Vegetation Class",
xlim=c(100, 600), ylim=c(200, 300), type="n")
xrange <- par("usr")[1:2]
polygon(c(xrange, rev(xrange)), c(200, 200, 204, 204),
col=rgb(250, 250, 0, 50, maxColorValue=255), border=NA)
polygon(c(xrange, rev(xrange)), c(210, 210, 212, 212),
col=rgb(250, 125, 0, 50, maxColorValue=255), border=NA)
points(10*(20:30), 10*(20:30))
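
As a variation on the same idea, here is a sketch that uses rect() instead of polygon(), which saves building the corner vectors by hand. The names `usr`, `band1` and `band2` are mine, and pdf(NULL) is only there so the snippet can run without opening a graphics window; drop it for interactive use.

```r
pdf(NULL)  # null device: nothing is written to disk
plot(1, 1, xlab = "Time", ylab = "Vegetation Class",
     xlim = c(100, 600), ylim = c(200, 300), type = "n")

usr <- par("usr")   # c(x1, x2, y1, y2): full extent of the plot region

# Semi-transparent colours, alpha given on the same 0-255 scale
band1 <- rgb(250, 250, 0, 50, maxColorValue = 255)
band2 <- rgb(250, 125, 0, 50, maxColorValue = 255)

# rect(xleft, ybottom, xright, ytop): one call per horizontal band
rect(usr[1], 200, usr[2], 204, col = band1, border = NA)
rect(usr[1], 210, usr[2], 212, col = band2, border = NA)

points(10 * (20:30), 10 * (20:30))   # data drawn on top of the bands
box()                                # redraw the frame over the band edges
invisible(dev.off())
```

Drawing the bands before the data keeps the points visible on top of the shading.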

Jean


[R] singular gradient error in nls

2011-10-11 Thread Katie Tully
I am trying to fit a nonlinear regression to infiltration data in order to
determine saturated hydraulic conductivity and matric pressure.  The
original equation can be found in Bagarello et al. 2004 SSSAJ (green-ampt
equation for falling head including gravity).  I am also VERY new to R and
to nonlinear regressions. I have searched the posts, but am still unable to
determine why my data come up with the singular gradient error.

Here are the data:
time <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
#time in minutes
cumul <- c(2, 5, 7, 9.5, 11, 13, 14, 15, 16, 18.5, 21, 23, 24.5, 26.5, 28,
29.5, 31, 31.5, 32.5) #cumulative infiltration in cm per min
df <- data.frame(time, cumul)
df$cumul.m <- df$cumul/100/60 #convert to meters per second
df$time.s <- df$time*60 #convert to seconds
#relationship between soil moisture and the size of the ring infiltrometer
#(6 cm radius by 113.1 cm2 cross sectional area)
b2 <- 1-(0.196/(0.06/0.01131))
theta <- 0.196 #difference in residual soil water and field capacity

Here is the formula:
#Where a = K_fs and b=psi_f
nlsfit <-
nls(time.s~(theta/a*b2)*((cumul.m/theta)-(((0.16-b)/b2)*log(1+((cumul.m*b2)/(theta*(0.16-b)))))),
data = df, start=list (a=1, b=0.5), trace=TRUE)

-
I am likely over parameterizing, but I must admit, that I am not entirely
sure what that means.  Any help offered would be greatly appreciated.  I am
sorry if I sound naive, but I am an ecologist, not a hydrologist.

Kate



[R] How to calculate percentage variation in a zero-inflated negative binomial regression model

2011-10-11 Thread Yaw Boafo
I am a novice in R, using R 2.13.1 on Windows. I would like to calculate the
percentage of variation in a zero-inflated negative binomial regression
model that is explained by the two predictors in my model.  My response
variable was number of dung-piles per km and the predictor of excess zeros
was distance to major road (km).

Thanks in advance.

Boafo

 




[R] Creating the mean using algebra matrix

2011-10-11 Thread flokke
Dear all, 
I wanted to compute the mean using matrix algebra,
so I tried this:

> meanAnimals <- new3 %*% factorial

(This calculates the matrix product of new3 and factorial.)

But I get the following error message: 

Error in new3 %*% factorial : non-conformable arguments

These are my matrices: 


> new3
           [,1]   [,2]
 [1,]     1.350    8.1
 [2,]   465.000  423.0
 [3,]    36.330  119.5
 [4,]    27.660  115.0
 [5,]     1.040    5.5
 [6,] 11700.000   50.0
 [7,]  2547.000 4603.0
 [8,]   187.100  419.0
 [9,]   521.000  655.0
[10,]    10.000  115.0
[11,]     3.300   25.6
[12,]   529.000  680.0
[13,]   207.000  406.0
[14,]    62.000 1320.0
[15,]  6654.000 5712.0
[16,]  9400.000   70.0
[17,]     6.800  179.0
[18,]    35.000   56.0
[19,]     0.120    1.0
[20,]     0.023    0.4
[21,]     2.500   12.1
[22,]    55.500  175.0
[23,]   100.000  157.0
[24,]    52.160  440.0
[25,]     0.280    1.9
[26,] 87000.000  154.5
[27,]     0.122    3.0
[28,]   192.000  180.0
> factorial
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
[1,]    1    1    1    1    1    1    1    1    1     1     1     1     1     1     1     1     1
     [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28]
[1,]     1     1     1     1     1     1     1     1     1     1     1


Can anyone help me out of this?

Cheers, maria

--
View this message in context: 
http://r.789695.n4.nabble.com/Creating-the-mean-using-algebra-matrix-tp3895378p3895378.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] stop()

2011-10-11 Thread Nordlund, Dan (DSHS/RDA)
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Dimitris Rizopoulos
 Sent: Tuesday, October 11, 2011 10:43 AM
 To: Doran, Harold
 Cc: r-help@r-project.org
 Subject: Re: [R] stop()
 
 You could use return(), e.g.,
 
 myFun <- function (x, max.iter = 5) {
      for (i in 1:10) {
          result <- x + i
          iter <- i
          if (iter == max.iter) {
              return(result)
          }
      }
      result
 }
 
 myFun(10, max.iter = 4)
 
 
 I hope it helps.
 
 Best,
 Dimitris


Or, just use break :

myFun <- function (x, max.iter = 5) {
     for (i in 1:10) {
         result <- x + i
         iter <- i
         if (iter == max.iter) break
     }
     result
}
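
If the caller also needs to know that the loop stopped early, one possible variation (a sketch only; the `converged` flag, the warning text and the list return value are my additions, not part of the original function) is to combine break with warning() and return the state alongside the result:

```r
myFun <- function(x, max.iter = 5) {
  converged <- TRUE
  for (i in 1:10) {
    result <- x + i
    if (i == max.iter) {
      # signal the early exit without discarding the work done so far
      warning("Max iterations reached; returning current result")
      converged <- FALSE
      break
    }
  }
  list(result = result, iterations = i, converged = converged)
}

out  <- suppressWarnings(myFun(10, max.iter = 4))   # stops early at i = 4
full <- myFun(10, max.iter = 11)                    # runs all 10 iterations
out$result
full$converged
```

Unlike stop(), warning() lets the function finish and return a value, which matters for something like an EM fit where the last iterate is still useful.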


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204



 
 


Re: [R] SLOW split() function

2011-10-11 Thread Thomas Lumley
On Wed, Oct 12, 2011 at 4:56 AM, ivo welch ivo.we...@gmail.com wrote:
 thanks, josh.  in my posting example, I did not need anything except
 coefficients.  (when this is the case, I usually do not even use
 lm.fit, but I eliminate all missing obs first and then use solve
 crossprod(y,cbind(1,x)) crossprod(cbind(1,x)).)  this is pretty fast.)

solve(cbind(1,x), y) should be even faster, and more numerically stable,
 [and less likely to make certain people want to cast you into the
outer darkness, where there is SAS and gnashing of teeth]

 alas, I will need to figure how to get coef standard errors faster in
 this case.  summary.lm() is really slow.

The code from summary.lm that actually computes the standard errors is
fairly efficient; you could extract that.
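
A rough sketch of what extracting that computation could look like, using only exported functions (lm.fit(), qr.R(), chol2inv(), qr.solve()). The simulated data and all variable names are made up for illustration, and the sketch assumes a full-rank design matrix with no missing values. Since solve() itself requires a square system, qr.solve() is used here for the least-squares coefficients.

```r
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
X <- cbind(1, x)                     # design matrix with an intercept column

# Least-squares coefficients straight from a QR decomposition
beta_qr <- qr.solve(X, y)

# Coefficients and standard errors from lm.fit(), along the lines of summary.lm()
fit    <- lm.fit(x = X, y = y)
p      <- fit$rank
resvar <- sum(fit$residuals^2) / fit$df.residual          # residual variance
R      <- qr.R(fit$qr)[seq_len(p), seq_len(p), drop = FALSE]
se     <- sqrt(diag(chol2inv(R)) * resvar)                # sqrt(diag((X'X)^-1) * s^2)

# Reference values from the full lm() machinery
ref <- summary(lm(y ~ x))$coefficients
cbind(coef = fit$coefficients, se = se)
```

chol2inv() on the R factor gives the inverse of X'X without forming X'X explicitly, so no call into R's internal routines is needed.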

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] Creating the mean using algebra matrix

2011-10-11 Thread David Winsemius


On Oct 11, 2011, at 1:45 PM, flokke wrote:


Dear all,
I wanted to create the mean using a algebra matrix.
so I tried this one:


meanAnimals <- new3 %*% factorial


(Calculates the matrix multiplication of the new3 * factorial).

But I get the following error message:

Error in new3 %*% factorial : non-conformable arguments


You probably want to transpose `factorial`. I don't understand how the  
result would be particularly interesting, however.


--
David.



David Winsemius, MD
West Hartford, CT



Re: [R] Creating the mean using algebra matrix

2011-10-11 Thread Timothy Bates
To do matrix multiplication m %*% n, the number of columns of m must equal
the number of rows of n.
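
Applied to the matrices in the question, where new3 is 28 x 2 and factorial is 1 x 28, the conformable order is factorial %*% new3. A small sketch, using only the first five rows of new3 and a matrix I call `ones` in place of `factorial`:

```r
# First five rows of new3 from the post, enough to illustrate the shapes
new3 <- matrix(c(1.35, 465, 36.33, 27.66, 1.04,
                 8.1,  423, 119.5, 115,   5.5), ncol = 2)
ones <- matrix(1, nrow = 1, ncol = nrow(new3))   # plays the role of `factorial`

dim(new3)   # 5 rows, 2 columns
dim(ones)   # 1 row, 5 columns

# (5 x 2) %*% (1 x 5) is not conformable: 2 columns versus 1 row
bad <- tryCatch(new3 %*% ones, error = function(e) conditionMessage(e))

# (1 x 5) %*% (5 x 2) works and gives the column sums; divide by n for the means
means <- (ones %*% new3) / nrow(new3)
means
colMeans(new3)   # the built-in equivalent
```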




[R] plot methods for summary of rms objects

2011-10-11 Thread Rob James
The integration of plot methods for various outputs is a greatly
appreciated aspect of the rms package.


I particularly like to use:

plot(summary(model))

for my own purposes, but for publication or presentation I need to
modify details such as variable names or the number of significant digits
used in the figure annotations.


Is there a simple way to modify the plot inputs arising from summary, or 
is it necessary to hack the summary object?


Thanks



Re: [R] singular gradient error in nls

2011-10-11 Thread Bert Gunter
Katie:

I would say that this is not an R question, so I would suggest that either

a) You ask it on a statistics help website like stats.stackexchange.com  or
b) You consult with someone locally who knows about nonlinear regression
(possibly a statistician, but not necessarily so).
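
On what over-parameterizing means in practice, a minimal sketch may help. This is not the Green-Ampt model from the post: the model y ~ a*b*x and the simulated data are made up purely for illustration. When two parameters only ever enter the model as a product, the data cannot separate them, the gradient columns are collinear, and nls() stops with a singular gradient error.

```r
set.seed(1)
x <- 1:20
y <- 3 * x + rnorm(20, sd = 0.5)

# Over-parameterized: only the product a*b is identifiable from the data
bad <- tryCatch(
  nls(y ~ a * b * x, start = list(a = 1, b = 1)),
  error = function(e) conditionMessage(e)
)
bad   # an error message mentioning a singular gradient

# Reparameterized with a single slope k standing in for a*b
good <- nls(y ~ k * x, start = list(k = 1))
coef(good)
```

The same error can also come from poor starting values or badly scaled variables, so it is worth evaluating the right-hand side of the formula at the starting values and checking that it is finite and of the same order of magnitude as the response.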

-- Bert


On Tue, Oct 11, 2011 at 11:34 AM, Katie Tully katherinetu...@gmail.comwrote:

 I am trying to fit a nonlinear regression to infiltration data in order to
 determine saturated hydraulic conductivity and matric pressure.  The
 original equation can be found in Bagarello et al. 2004 SSSAJ (green-ampt
 equation for falling head including gravity).  I am also VERY new to R and
 to nonlinear regressions. I have searched the posts, but am still unable to
 determine why my data come up with the singular gradient error.

 Here are the data:
 time <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
           19)  # time in minutes
 cumul <- c(2, 5, 7, 9.5, 11, 13, 14, 15, 16, 18.5, 21, 23, 24.5, 26.5, 28,
            29.5, 31, 31.5, 32.5)  # cumulative infiltration in cm per min
 df <- data.frame(time, cumul)
 df$cumul.m <- df$cumul/100/60  # convert to meters per second
 df$time.s <- df$time*60  # convert to seconds
 # relationship between soil moisture and the size of the ring infiltrometer
 # (6 cm radius by 113.1 cm2 cross sectional area)
 b2 <- 1-(0.196/(0.06/0.01131))
 theta <- 0.196  # difference in residual soil water and field capacity

 Here is the formula:
 # Where a = K_fs and b = psi_f
 nlsfit <-
   nls(time.s~(theta/a*b2)*((cumul.m/theta)-(((0.16-b)/b2)*log(1+((cumul.m*b2)/(theta*(0.16-b)))))),
       data = df, start=list (a=1, b=0.5), trace=TRUE)

 -
 I am likely over parameterizing, but I must admit, that I am not entirely
 sure what that means.  Any help offered would be greatly appreciated.  I am
 sorry if I sound naive, but I am an ecologist, not a hydrologist.

 Kate



[R] high and lowest with names

2011-10-11 Thread Ben qant
Hello,

I'm looking to get the values, row names and column names of the largest and
smallest values in a matrix.

Example (except it does not include the names):

 x <- swiss$Education[1:25]
 dat = matrix(x,5,5)
 colnames(dat) = c('a','b','c','d','c')
 rownames(dat) = c('z','y','x','w','v')
 dat
   a  b  c  d  c
z 12  7  6  2 10
y  9  7 12  8  3
x  5  8  7 28 12
w  7  7 12 20  6
v 15 13  5  9  1

 #top 10
 sort(dat,partial=n-9:n)[(n-9):n]
 [1]  9 10 12 12 12 12 13 15 20 28
 # bottom 10
 sort(dat,partial=1:10)[1:10]
 [1] 1 2 3 5 5 6 6 7 7 7

...except I need the rownames and colnames to go along for the ride with the
values...because of this, I am guessing the return value will need to be a
list since all of the values have different row and col names (which is
fine).

Regards,

Ben



[R] replicate data.frame n times

2011-10-11 Thread Martin Batholdy
Hi,


is there a way to replicate a data.frame like you can replicate the entries of
a vector (with the rep function)?

I want to do this:

x <- data.frame(x, x)
(where x is a data.frame).


but n times.



And it should be as cpu / memory efficient as possible, since n is pretty big 
in my case.



thanks for any suggestions!


Re: [R] restricted cubic spline within survfit.cph in the package rms

2011-10-11 Thread Frank Harrell
It may be best to either write to the package maintainer (me, as you did) or
post to the group but not both.
Frank

Stan Maydan-2 wrote:
 
 Hello,
  
 does anyone have an example on how to use restricted cubic
 splines function rcs within survfit.cph, if cph (Cox Proportional Hazard
 Regression) was done with restricted cubic
 splines (which I made to work)?
 
 Thank you.
 
 
 

 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/restricted-cubic-spline-within-survfit-cph-in-the-package-rms-tp3895252p3895797.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] matrix multiplication

2011-10-11 Thread B77S
Your question was answered by Timothy in your previous thread:

http://r.789695.n4.nabble.com/Re-Creating-the-mean-using-algebra-matrix-td3895689.html



flokke wrote:
 
 Dear all, 
 Sorry to bother you with such a stupid question, but I just cannot find
 the solution to my problem.
 
 I'd like to use matrix multiplication for meanA and factorial3.
 I use the command meanA %*% factorial3.
 But all I get is: Error in factorial3 %*% A : non-conformable
 arguments
 
 I know that the number of columns of the first vector has to be the
 same as the number of rows of the
 second vector to be able to use matrix multiplication, but that is the
 case here. I also tried it with
 two columns for factorial3 and that didn't work either.
 
 Can someone help me out with this?
 
 these are my matrices:
 
 meanA
      [,1] [,2]
 [1,] 3.67 4.67
 
 factorial3
      [,1]
 [1,]    1
 
 Thank you so much!
 Cheers, maria
 


--
View this message in context: 
http://r.789695.n4.nabble.com/matrix-multiplication-tp3895833p3895860.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plots of correlation matrices

2011-10-11 Thread Carlos Ortega
Hi,

One way to do that is this  (avoiding the use of a for loop):


library(corrplot)

l.txt <- "id category attribute1 attribute2 attribute3 attribute4
661 SCHS 43.2 0 56.5 1
12202 SCHS 161.7 5.7 155 16
1182 SCHS 21.4 0 29 0
1356 SSS  8.8182 0.1818 10.6667 0.6667
1864 SCHS 443.7273 9.9091 537 46
12360 SOA 6.6364 0 10 0
3382 SOA 7.1667 0 26 0.5
1033 SOA 63.9231 1.5385 91.5 11.5
14742 SSS 4.3846 0 8 0
12760 SSS 425.0714 1.7857 297.5 3.5
"

dat.df <- read.table(textConnection(l.txt), header=TRUE, as.is = TRUE)
closeAllConnections()

# One correlation matrix per category, then one corrplot per matrix
dat.lt <- by(dat.df[,3:6], dat.df$category, cor)
lapply(dat.lt, corrplot)


Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/11 gj gaw...@gmail.com

 Hi,

 I want to do a visualisation of a matrix plot made up of several plots of
 correlation matrices (using corrplot()). My data is in csv format. Here's
 an
 example:

 id,category,attribute1,attribute2,attribute3,attribute4
 661,SCHS,43.2,0,56.5,1
 12202,SCHS,161.7,5.7,155,16
 1182,SCHS,21.4,0,29,0
 1356,SSS, 8.8182,0.1818,10.6667,0.6667
 1864,SCHS,443.7273,9.9091,537,46
 12360,SOA,6.6364,0,10,0
 3382,SOA,7.1667,0,26,0.5
 1033,SOA,63.9231,1.5385,91.5,11.5
 14742,SSS,4.3846,0,8,0
 12760,SSS,425.0714,1.7857,297.5,3.5

 I can get rid of the id. But I need the 'category' as a way of
 distinguishing the various correlation matrices.
 I can do a plot of the correlation matrix using corrplot() function in the
 corrplot package (ignoring the id and category). But what I need is a
 matrix
 of the plots of each correlation matrix based on the category, ie I have
 three categories in the data, hence I will need three plots of the
 correlation matrix  in one diagram (because the correlation matrix only
 makes sense if they are distinguished by category).

 Any help?

 Regards
 Gawesh



Re: [R] stop()

2011-10-11 Thread Greg Snow
Replace stop() with break to see if that does what you want.  (You may also
want to include cat() or warning() to indicate the early stopping.)
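
A minimal sketch of that suggestion, applied to the toy function from the question: break leaves the loop but lets the function carry on to its last line, so the most recent value of result is still returned. The wording of the warning is only illustrative.

```r
myFun <- function(x, max.iter = 5) {
  result <- NULL
  for (i in 1:10) {
    result <- x + i
    if (i == max.iter) {
      # Signal the early exit, then leave the loop instead of aborting
      warning("Max reached; returning result from iteration ", i)
      break
    }
  }
  result
}
```

With this version, myFun(10, max.iter = 4) returns 14 together with a warning, while myFun(10, max.iter = 11) runs all ten iterations and returns 20 silently.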

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Doran, Harold
 Sent: Tuesday, October 11, 2011 11:32 AM
 To: r-help@r-project.org
 Subject: [R] stop()
 
 Suppose I have a function, such as the toy example below:
 
 myFun <- function(x, max.iter = 5) {
   for (i in 1:10) {
     result <- x + i
     iter <- i
     if (iter == max.iter) stop('Max reached')
   }
   result
 }
 
 I can of course do this:
 myFun(10, max.iter = 11)
 
 However, if I reach the maximum number of iterations before my
 algorithm has finished (in my real application there are EM steps for
 a mixed model), I actually want the function to return the value of
 result up to that point. Currently using stop(), I would get
 
  myFun(10, max.iter = 4)
 Error in myFun(10, max.iter = 4) : Max reached
 
 But, in this toy case the function should return the value of result
 up to iteration 4.
 
 Not sure how I can adjust this.
 
 Thanks,
 Harold
 
 
 


Re: [R] replicate data.frame n times

2011-10-11 Thread Bert Gunter
Replicate the row indices?

x[rep(seq_len(nrow(x)), k), ]
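
A small sketch of this indexing idiom on a made-up two-row data frame (the contents of x and the value of k are illustrative only). One assumption worth stating: repeating row indices stacks the copies vertically, whereas the data.frame(x, x) in the question binds copies side by side; repeating column indices in the same way should give that layout.

```r
x <- data.frame(a = 1:2, b = c("u", "v"))
k <- 3

# k copies stacked vertically (row indices repeated)
rows <- x[rep(seq_len(nrow(x)), k), ]

# k copies side by side (column indices repeated), as with data.frame(x, x)
cols <- x[rep(seq_len(ncol(x)), k)]

dim(rows)  # 6 rows, 2 columns
dim(cols)  # 2 rows, 6 columns
```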

-- Bert

On Tue, Oct 11, 2011 at 12:55 PM, Martin Batholdy
batho...@googlemail.com wrote:

 Hi,


 is there a way to replicate a data.frame like you can replicate the entries
 of a vector (with the rep function)?

 I want to do this:

 x <- data.frame(x, x)
 (where x is a data.frame).


 but n times.



 And it should be as cpu / memory efficient as possible, since n is pretty
 big in my case.



 thanks for any suggestions!




-- 
Men by nature long to get on to the ultimate truths, and will often be
impatient with elementary studies or fight shy of them. If it were possible
to reach the ultimate truths without the elementary studies usually prefixed
to them, these would not be preparatory studies but superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] plots of correlation matrices

2011-10-11 Thread Dénes TÓTH



 Hi,

 One way to do that is this  (avoiding the use of a for loop):


 l.txt <- "id category attribute1 attribute2 attribute3 attribute4
 661 SCHS 43.2 0 56.5 1
 12202 SCHS 161.7 5.7 155 16
 1182 SCHS 21.4 0 29 0
 1356 SSS  8.8182 0.1818 10.6667 0.6667
 1864 SCHS 443.7273 9.9091 537 46
 12360 SOA 6.6364 0 10 0
 3382 SOA 7.1667 0 26 0.5
 1033 SOA 63.9231 1.5385 91.5 11.5
 14742 SSS 4.3846 0 8 0
 12760 SSS 425.0714 1.7857 297.5 3.5
 "

 dat.df <- read.table(textConnection(l.txt),  header=T, as.is = TRUE)
 closeAllConnections()

 dat.lt <- by(dat.df[,3:6], dat.df$category, cor)

I guess Gawesh is looking for ?layout or ?par:

par(mfrow=c(2,2))
lapply(dat.lt,corrplot)


 lapply(dat.lt,corrplot)


 Regards,
 Carlos Ortega
 www.qualityexcellence.es

 2011/10/11 gj gaw...@gmail.com

 Hi,

 I want to do a visualisation of a matrix plot made up of several plots
 of
 correlation matrices (using corrplot()). My data is in csv format.
 Here's
 an
 example:

 id,category,attribute1,attribute2,attribute3,attribute4
 661,SCHS,43.2,0,56.5,1
 12202,SCHS,161.7,5.7,155,16
 1182,SCHS,21.4,0,29,0
 1356,SSS, 8.8182,0.1818,10.6667,0.6667
 1864,SCHS,443.7273,9.9091,537,46
 12360,SOA,6.6364,0,10,0
 3382,SOA,7.1667,0,26,0.5
 1033,SOA,63.9231,1.5385,91.5,11.5
 14742,SSS,4.3846,0,8,0
 12760,SSS,425.0714,1.7857,297.5,3.5

 I can get rid of the id. But I need the 'category' as a way of
 distinguishing the various correlation matrices.
 I can do a plot of the correlation matrix using corrplot() function in
 the
 corrplot package (ignoring the id and category). But what I need is a
 matrix
 of the plots of each correlation matrix based on the category, ie I have
 three categories in the data, hence I will need three plots of the
 correlation matrix  in one diagram (because the correlation matrix only
 makes sense if they are distinguished by category).

 Any help?

 Regards
 Gawesh



Re: [R] Problem with twitteR package

2011-10-11 Thread steven mosher
Check the version of libcurl you have installed. If you have an older
version, some of the options may not be present.



On Sun, Oct 9, 2011 at 10:39 AM, Steven Oliver s1oli...@ucsd.edu wrote:
 Hey Guys,

 I just started fooling around with the twitteR package in order to get a 
 record of all tweets from a single public account. When I run userTimeline, I 
 get the default 20 most recent tweets just fine. However, when I specify an 
 arbitrary number of tweets (as described in the documentation from June 14th, 
 2011), I get the following warning:

 bjaTweets <- userTimeline("BeijingAir", n=50)
 Warning message:
 In mapCurlOptNames(names(.els), asNames = TRUE) :
  Unrecognized CURL options: n

 Does anyone familiar with the twitteR package know what is going on with 
 options? Alternatively, if there are any other simple means for getting this 
 sort of data?

 Steve


Re: [R] plot methods for summary of rms objects

2011-10-11 Thread David Winsemius


On Oct 11, 2011, at 3:20 PM, Rob James wrote:

The integration of plot methods for various outputs from rms
packages is a greatly appreciated aspect of the rms package.


I particularly like to use:

plot(summary(model))

for my own purposes, but... for publication/presentation I need to
modify details like variable names, or the number of significant
digits used in the figure annotations.


Is there a simple way to modify the plot inputs arising from  
summary, or is it necessary to hack the summary object?




If you type:

methods(summary)

... you should see why it might be very difficult to answer your  
question in its current state of vagueness. I just ran the example in  
help(summary.rms) and it appears that it used base graphics and that  
if such output is your target, you would need to either hack the code  
or hack the pdf file. Much of the graphical output from rms functions  
has been ported to lattice graphics, but apparently not the version  
for summary.rms objects.


If you have the data and can redo the analysis, read on. The
comparison levels used by summary.rms are set with datadist, and that
is probably what you should spend some time understanding. There
are other possibilities than just using a datadist(data_object) call.


?datadist

--

David Winsemius, MD
West Hartford, CT



Re: [R] controling text in facets (ggplot2)

2011-10-11 Thread Dennis Murphy
In the absence of a reproducible example, a general question induces a
general response. I'd suggest creating a small data frame that
contains the x and y coordinates, a third variable consisting of
expressions representing each fitted model and an indicator of the
group to which the expression is to be applied. Use this data frame as
the data argument of geom_text, and set x, y and label = (the variable
containing expressions) as the aesthetics of the geom.
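
In the absence of the original data, here is a hedged sketch of that recipe using the built-in mtcars data as a stand-in: one regression of mpg on wt per level of cyl, with the fitted equation written into the matching facet. The label positions (x = 5.4, y = 33), the object names and the label format are assumptions chosen for illustration, not taken from the question.

```r
library(ggplot2)

# One fitted line per facet: mpg ~ wt within each level of cyl
fits <- lapply(split(mtcars, mtcars$cyl),
               function(d) coef(lm(mpg ~ wt, data = d)))

# Small data frame: label position, plotmath expression and facet variable
labels <- data.frame(
  cyl = as.numeric(names(fits)),
  x   = 5.4,
  y   = 33,
  eq  = vapply(fits,
               function(b) sprintf("mpg == %.2f %+.2f * wt", b[1], b[2]),
               character(1))
)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # parse = TRUE makes geom_text interpret the strings as expressions
  geom_text(data = labels, aes(x = x, y = y, label = eq),
            parse = TRUE, hjust = 1) +
  facet_wrap(~ cyl)
print(p)
```

Because the labels data frame carries the faceting variable (cyl), each equation is drawn only in its own panel rather than repeated in all of them.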

If that doesn't work, provide a reproducible example and you'll
undoubtedly get a more accurate answer. You're also more likely to get
a higher response rate if you post on the ggplot2 group:
http://had.co.nz/ggplot2/  (see the Mailing List paragraph near the
top of the page for subscription information).

Dennis

On Tue, Oct 11, 2011 at 5:45 AM, Thomthom rime.tho...@gmail.com wrote:
 Hi R-helpers!

 Here is my problem:

 I have a graph with 3 different facets where there are 3 different
 regression lines. My goal is to show in each facet the equation
 that describes its own line.

 So far, I managed to add a line and the same equation to all my facets, but
 unfortunately that's not what I want.

 Is there a way to do that? Any suggestion would be gladly welcome!

 Thanks for your help!

 Thomas

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/controling-text-in-facets-ggplot2-tp3894148p3894148.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] high and lowest with names

2011-10-11 Thread Carlos Ortega
Hi,

With this code you can find row and col names for the largest value applied
to your example:

r.m.tmp <- apply(dat,1,max)
r.max <- names(r.m.tmp)[r.m.tmp==max(r.m.tmp)]

c.m.tmp <- apply(dat,2,max)
c.max <- names(c.m.tmp)[c.m.tmp==max(c.m.tmp)]

It is straightforward to get the same for the smallest and build a function to
calculate everything and return a list.


Regards,
Carlos Ortega
www.qualityexcellence.es

2011/10/11 Ben qant ccqu...@gmail.com

 Hello,

 I'm looking to get the values, row names and column names of the largest
 and
 smallest values in a matrix.

 Example (except it does not include the names):

  x <- swiss$Education[1:25]
  dat = matrix(x,5,5)
  colnames(dat) = c('a','b','c','d','c')
  rownames(dat) = c('z','y','x','w','v')
  dat
   a  b  c  d  c
 z 12  7  6  2 10
 y  9  7 12  8  3
 x  5  8  7 28 12
 w  7  7 12 20  6
 v 15 13  5  9  1

  #top 10
  sort(dat,partial=n-9:n)[(n-9):n]
  [1]  9 10 12 12 12 12 13 15 20 28
  # bottom 10
  sort(dat,partial=1:10)[1:10]
  [1] 1 2 3 5 5 6 6 7 7 7

 ...except I need the rownames and colnames to go along for the ride with
 the
 values...because of this, I am guessing the return value will need to be a
 list since all of the values have different row and col names (which is
 fine).

 Regards,

 Ben



Re: [R] high and lowest with names

2011-10-11 Thread Bert Gunter
But it's simpler and probably faster to use R's built-in capabilities.
?which ## (note the arr.ind argument!)

As an example:

test <- matrix(rnorm(24), nr = 4)
which(test==max(test), arr.ind=TRUE)
     row col
[1,]   2   6

So this gives the row and column indices of the max, from which row and
column names can easily be obtained from the dimnames attribute of the
matrix.
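
A short sketch of that last step, using the matrix printed in the original question (values typed in by hand, including its duplicated column name 'c'). The helper name where_is is made up for illustration.

```r
dat <- matrix(c(12, 9, 5, 7, 15,   7, 7, 8, 7, 13,   6, 12, 7, 12, 5,
                2, 8, 28, 20, 9,   10, 3, 12, 6, 1), 5, 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "c")))

# Value, row name and column name for every cell equal to `target`
where_is <- function(m, target) {
  idx <- which(m == target, arr.ind = TRUE)
  data.frame(value = m[idx],
             row   = rownames(m)[idx[, "row"]],
             col   = colnames(m)[idx[, "col"]])
}

hi <- where_is(dat, max(dat))  # 28, in row "x" and column "d"
lo <- where_is(dat, min(dat))  # 1, in row "v" and the second column named "c"
```

If the extreme value occurs more than once, the result simply has one row per occurrence.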

Note: This assumes that the object in question is a matrix, NOT a data
frame, for which it would be slightly more complicated.

-- Bert


On Tue, Oct 11, 2011 at 3:06 PM, Carlos Ortega c...@qualityexcellence.es wrote:

 Hi,

 With this code you can find row and col names for the largest value applied
 to your example:

 r.m.tmp <- apply(dat,1,max)
 r.max <- names(r.m.tmp)[r.m.tmp==max(r.m.tmp)]

 c.m.tmp <- apply(dat,2,max)
 c.max <- names(c.m.tmp)[c.m.tmp==max(c.m.tmp)]

 It is straightforward to get the same for the smallest and build a function to
 calculate everything and return a list.


 Regards,
 Carlos Ortega
 www.qualityexcellence.es

 2011/10/11 Ben qant ccqu...@gmail.com

  Hello,
 
  I'm looking to get the values, row names and column names of the largest
  and
  smallest values in a matrix.
 
  Example (except it does not include the names):
 
   x <- swiss$Education[1:25]
   dat = matrix(x,5,5)
   colnames(dat) = c('a','b','c','d','c')
   rownames(dat) = c('z','y','x','w','v')
   dat
a  b  c  d  c
  z 12  7  6  2 10
  y  9  7 12  8  3
  x  5  8  7 28 12
  w  7  7 12 20  6
  v 15 13  5  9  1
 
   #top 10
   sort(dat,partial=n-9:n)[(n-9):n]
   [1]  9 10 12 12 12 12 13 15 20 28
   # bottom 10
   sort(dat,partial=1:10)[1:10]
   [1] 1 2 3 5 5 6 6 7 7 7
 
  ...except I need the rownames and colnames to go along for the ride with
  the
  values...because of this, I am guessing the return value will need to be
 a
  list since all of the values have different row and col names (which is
  fine).
 
  Regards,
 
  Ben
 


Re: [R] Creating the mean using algebra matrix

2011-10-11 Thread Rolf Turner

On 12/10/11 08:31, Timothy Bates wrote:

To do matrix multiplication: m x n, the Rows and columns of  m must be equal to 
the columns and rows of n, respectively.


No.  The number of columns of m must equal the number of rows of n,
that's all.  The number of *rows* of m and the number of *columns* of n
can be anything you like.
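
A quick sketch of that rule with the two matrices from the earlier "matrix multiplication" post. The helper conformable() and the 2 x 1 matrix ones are hypothetical names introduced here for illustration.

```r
meanA      <- matrix(c(3.67, 4.67), nrow = 1)  # 1 x 2
factorial3 <- matrix(1, nrow = 1)              # 1 x 1

# a %*% b is defined only when ncol(a) equals nrow(b)
conformable <- function(a, b) ncol(a) == nrow(b)

conformable(meanA, factorial3)  # FALSE: 2 columns against 1 row
conformable(factorial3, meanA)  # TRUE:  1 column against 1 row

factorial3 %*% meanA            # works, giving a 1 x 2 matrix

# The mean via matrix algebra needs a 2 x 1 right-hand factor
ones <- matrix(1, nrow = ncol(meanA))          # 2 x 1
meanA %*% ones / ncol(meanA)                   # 1 x 1 matrix holding 4.17
```

Checking dim() of both operands before calling %*% is usually the fastest way to track down a "non-conformable arguments" error.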

cheers,

Rolf Turner



Re: [R] round() and negative digits

2011-10-11 Thread Rolf Turner

On 11/10/11 08:17, Michael Friendly wrote:

On 10/9/2011 6:18 AM, Prof Brian Ripley wrote:


Sometimes it is better not to document things than try to give precise
details which may get changed *and* there will be useRs who misread (and
maybe even file bug reports on their misreadings). The source is the
ultimate documentation.


I can't agree with this less.  The source does the computation. The
documentation says how to use it and what it should do.  Corner cases
can be trapped in code or mentioned in Notes.  But the source is
only useful if you can easily find it and then can understand what it is
doing, particularly for a .Primitive like round().
The source is only the documentation of last resort.


I agree.  It seems to me that saying that the source is the ultimate
documentation is rather like (in pure mathematics) saying that all maths
follows from the Zermelo-Fraenkel axioms plus the Axiom of Choice, so
those axioms are all that we need to tell anyone.

cheers,

Rolf Turner



Re: [R] round() and negative digits

2011-10-11 Thread Duncan Murdoch

On 11-10-11 7:14 PM, Rolf Turner wrote:

On 11/10/11 08:17, Michael Friendly wrote:

On 10/9/2011 6:18 AM, Prof Brian Ripley wrote:


Sometimes it is better not to document things than try to give precise
details which may get changed *and* there will be useRs who misread (and
maybe even file bug reports on their misreadings). The source is the
ultimate documentation.


I can't agree with this less.  The source does the computation. The
documentation says how to use it and what it should do.  Corner cases
can be trapped in code or mentioned in Notes.  But the source is
only useful if you can easily find it and then can understand what it is
doing, particularly for a .Primitive like round().
The source is only the documentation of last resort.


I agree.  It seems to me that saying that the source is the ultimate
documentation is rather like (in pure mathematics) saying that all maths
follows from the Zermelo-Fraenkel axioms plus the Axiom of Choice, so
those axioms are all that we need to tell anyone.


R is an open source project.  That means we expect people to look at the 
source, to answer some of their own questions, to suggest improvements, 
to point out errors.  If you don't look at it, you aren't holding up 
your side of the bargain.


Duncan Murdoch



Re: [R] high and lowest with names

2011-10-11 Thread Dénes TÓTH

which.max is even faster:

dims <- c(1000,1000)
tt <- array(rnorm(prod(dims)),dims)
# which
system.time(
replicate(100, which(tt==max(tt), arr.ind=TRUE))
)
# which.max (with arrayInd)
system.time(
replicate(100, arrayInd(which.max(tt), dims))
)
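
The same arrayInd() idea can be stretched from the single maximum to the top or bottom n values asked for in the original question. This is only a sketch: the function name top_n_with_names is made up, and the matrix is the one printed in the question, typed in by hand.

```r
dat <- matrix(c(12, 9, 5, 7, 15,   7, 7, 8, 7, 13,   6, 12, 7, 12, 5,
                2, 8, 28, 20, 9,   10, 3, 12, 6, 1), 5, 5,
              dimnames = list(c("z", "y", "x", "w", "v"),
                              c("a", "b", "c", "d", "c")))

top_n_with_names <- function(m, n = 10, decreasing = TRUE) {
  # Linear positions of the n largest (or smallest) cells
  ord <- order(m, decreasing = decreasing)[seq_len(n)]
  # Convert linear positions to (row, column) index pairs
  pos <- arrayInd(ord, dim(m))
  data.frame(value = m[ord],
             row   = rownames(m)[pos[, 1]],
             col   = colnames(m)[pos[, 2]])
}

top10    <- top_n_with_names(dat, 10)
bottom10 <- top_n_with_names(dat, 10, decreasing = FALSE)
```

The values in top10 and bottom10 agree with the sorted vectors shown in the question, now with the row and column names alongside.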

Best,
Denes

 But it's simpler and probably faster to use R's built-in capabilities.
 ?which ## (note the arr.ind argument!)

 As an example:

 test <- matrix(rnorm(24), nr = 4)
 which(test==max(test), arr.ind=TRUE)
      row col
 [1,]   2   6

 So this gives the row and column indices of the max, from which row and
 column names can easily be obtained from the dimnames attribute of the
 matrix.

 Note: This assumes that the object in question is a matrix, NOT a data
 frame, for which it would be slightly more complicated.

 -- Bert


 On Tue, Oct 11, 2011 at 3:06 PM, Carlos Ortega
 c...@qualityexcellence.es wrote:

 Hi,

 With this code you can find row and col names for the largest value
 applied
 to your example:

 r.m.tmp <- apply(dat,1,max)
 r.max <- names(r.m.tmp)[r.m.tmp==max(r.m.tmp)]

 c.m.tmp <- apply(dat,2,max)
 c.max <- names(c.m.tmp)[c.m.tmp==max(c.m.tmp)]

 It is straightforward to get the same for the smallest and build a function
 to
 calculate everything and return a list.


 Regards,
 Carlos Ortega
 www.qualityexcellence.es

 2011/10/11 Ben qant ccqu...@gmail.com

  Hello,
 
  I'm looking to get the values, row names and column names of the
 largest
  and
  smallest values in a matrix.
 
  Example (except it does not include the names):
 
   x <- swiss$Education[1:25]
   dat = matrix(x,5,5)
   colnames(dat) = c('a','b','c','d','c')
   rownames(dat) = c('z','y','x','w','v')
   dat
a  b  c  d  c
  z 12  7  6  2 10
  y  9  7 12  8  3
  x  5  8  7 28 12
  w  7  7 12 20  6
  v 15 13  5  9  1
 
   #top 10
   sort(dat,partial=n-9:n)[(n-9):n]
   [1]  9 10 12 12 12 12 13 15 20 28
   # bottom 10
   sort(dat,partial=1:10)[1:10]
   [1] 1 2 3 5 5 6 6 7 7 7
 
  ...except I need the rownames and colnames to go along for the ride
 with
  the
  values...because of this, I am guessing the return value will need to
 be
 a
  list since all of the values have different row and col names (which
 is
  fine).
 
  Regards,
 
  Ben
 


[R] Nonlinear regression aborting due to error

2011-10-11 Thread Dennis Fisher
Colleagues,

I am fitting an Emax model using nls.  The code is:
START   <- list(EMAX=INITEMAX, EFFECT=INITEFFECT, C50=INITC50)
CONTROL <- list(maxiter=1000, warnOnly=T)
## alternate version of formula:
#FORMULA <- as.formula(YVAR ~ EMAX - EFFECT * XVAR^GAMMA / (XVAR^GAMMA + C50^GAMMA))
FORMULA <- as.formula(YVAR ~ EMAX - EFFECT / (1 + (C50/XVAR)^GAMMA))
FIT     <- nls(FORMULA, start=START, control=CONTROL, trace=T)

If GAMMA equals 10-80, nls converges successfully and the fit tracks the fit 
from a smoother (Supersmoother).  However, if I attempt to estimate GAMMA using:
START   <- list(EMAX=INITEMAX, EFFECT=INITEFFECT, C50=INITC50, GAMMA=INITGAMMA)
GAMMA increases rapidly to  500 and nls terminates with:
Error in chol2inv(object$m$Rmat()) : 
  element (4, 4) is zero, so the inverse cannot be computed
In addition: Warning message:
In nls(FORMULA, start = START, control = CONTROL, trace = T) :
  singular gradient

I also tried fixing GAMMA to  1000 and I get a similar error message:
Error in chol2inv(object$m$Rmat()) : 
  element (2, 2) is zero, so the inverse cannot be computed
In addition: Warning message:
In nls(FORMULA, start = START, control = CONTROL, trace = T) :
  singular gradient

The data do not suggest a very large value for GAMMA so I am surprised that the 
estimate is increasing so rapidly.  I attempted to use the port algorithm with 
an upper bound on GAMMA but the upper bound is reached rapidly, suggesting that 
the data support a large  value for GAMMA.
 
A subset of the data (with added noise) is shown below.  A GAMMA value of 1280 
triggers the error with this subset

XVAR <- c(26, 31.3, 20.9, 24.8, 22.9, 4.79, 19.6, 18, 19.6, 9.69, 21.7, 
26.6, 27.8, 9.12, 10.5, 20.1, 16.7, 14.1, 10.2, 19.2, 24.7, 34.6, 
26.6, 25.1, 5.98, 13.4, 15.7, 9.59, 7.39, 21.5, 15.7, 12.4, 19.2, 17.8, 19.7, 
27.1, 25.6, 36.4, 22.9, 8.68, 27, 25.9, 33.3, 24.2, 
21.4, 31, 19.1, 18.7, 23.5, 19.4, 10.3, 12.8, 13.9, 18.5, 21, 15.2, 18.9, 9.12, 
16.9, 12.9, 29.5, 15.5, 7.34, 8.97, 8.04, 23.7, 
16.3, 37.6, 35.2, 13.7, 28.1, 29.5, 15.1, 26, 6.52)


YVAR <- c(-34.2, -84.2, -71.1, -91.9, -104.1, -23.2, -27.2, -13.4, -143.2,  
24.7, -72.1, -38, 25.2, -8, -34.1, -15.1, -112.6, -93.5, -130.9, 
-127.8, -118.7, -53.5, -29.8, 98, 0, -37.6, -99.4, 57.9, 0.2, -62.2, -27.3, 
8.3, -51.6, -111.6, -25.6, -51.7, -106.4, -85.1, 
-63.1, -60.8, -27.7, -20.7, 22.9, -49.4, -85.7, -90.9, -107,  -20.6, -36.3, 
-40.2, 39.8, -55, -54.5, -103.9, -53.1, -2.3, -72.3, 
-65.6, -57.8, -64.4, -129.1, 10.4, -9.9, -29.6, -40.8, 52, -94, 8.8, -98.8, 28, 
-16.3, -99.2, -48.5, -111.9, -15.4)

I suspect that I am making a conceptual error in the use of nls.  Any help 
would be appreciated.  If a different function to fit nonlinear regression 
would work better, please direct me.

Dennis

Dennis Fisher MD
P < (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
