Re: [R] Lattice Histogram with Normal Curve - Y axis as percentages

2014-05-06 Thread jimdare
Thanks for your help Duncan and Peter!  I ended up using a combination of
your suggestions.  I used Duncan's ratio of the y.limits of two plots with
differing types (percent and density) to provide me with a scale variable.
I then used this in Peter's dnorm_scaled function and called it via
panel.mathdensity.  See below for my amended code.

Regards,
Jim


x1 <- histogram(~ rdf[, j] | Year, nint = 20, data = rdf, main = i,
                strip = my.strip, xlab = j,
                type = "density", layout = c(2, 1))

x2 <- histogram(~ rdf[, j] | Year, nint = 20, data = rdf, main = i,
                strip = my.strip, xlab = j,
                type = "percent", layout = c(2, 1))

scale <- x2$y.limits / x1$y.limits

dnorm_scaled <- function(...) scale[1] * dnorm(...)

histogram(~ rdf[, j] | Year, nint = 20, data = rdf, main = i,
          strip = my.strip, xlab = j,
          type = "percent", layout = c(2, 1),
          panel = function(x, ...) {
            panel.histogram(x, ...)
            # Add na.rm = TRUE to mean() and sd()
            panel.mathdensity(dmath = dnorm_scaled, col = "black",
                              args = list(mean = mean(x, na.rm = TRUE),
                                          sd = sd(x, na.rm = TRUE)), ...)
          })



--
View this message in context: 
http://r.789695.n4.nabble.com/Lattice-Histogram-with-Normal-Curve-Y-axis-as-percentages-tp469p4690093.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using unbalanced-learning algorithms in the randomForest package.

2014-05-06 Thread Byron Dom


The following report by the authors of the randomForest package describes two 
different algorithm modifications for using random forests to learn classifiers 
for "unbalanced" learning problems in which one class is much less frequent 
than the other (in 2-class problems). These two variations are called "balanced 
RF" and "weighted RF."
http://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf


Would someone please answer these three questions?
(1) Is it possible to use the R randomForest package to learn random forests 
using either of these modified RF-learning algorithms? 
(2) If it is possible, how does one do it?
(3) Is there some detailed documentation for running these modified versions? 
I've read the R package manual but it's too sketchy. It seems to be primarily 
for users who are already familiar with the package and just need to look up 
some detail like the name of an argument.
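
A minimal sketch of the closest existing knobs in the package (sampsize/strata
for a "balanced RF"-style fit, classwt for a "weighted RF"-style fit), using
hypothetical data; whether these options actually reproduce the algorithms in
the report is exactly what questions (1)-(3) are asking:

library(randomForest)
set.seed(1)
n <- 1000
x <- data.frame(a = rnorm(n), b = rnorm(n))
y <- factor(ifelse(runif(n) < 0.05, "rare", "common"))  # unbalanced 2-class outcome

nmin <- min(table(y))  # size of the minority class

# "balanced RF" flavour: stratified bootstrap with equal draws from each class
rf_bal <- randomForest(x, y, strata = y, sampsize = c(nmin, nmin))

# "weighted RF" flavour: give the rare class a larger prior weight
# (weights follow the order of levels(y); the value 10 is arbitrary)
rf_wt <- randomForest(x, y, classwt = c(1, 10))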

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Thomas Lumley
On Wed, May 7, 2014 at 2:21 AM, David R Forrest  wrote:
> It sounds as if your underlying MySQL database is too slow for your purposes. 
>  Whatever you layer on top of it will be constrained by the underlying 
> database.  To speed up the process significantly, you may need to do work on 
> the database backend part of the process.


You might try MonetDB and its R interface -- it is fast for
aggregation operations, and either the current version or the upcoming
version has dplyr support.

-thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mix of plain and italic text in ggplot categorical x-axis

2014-05-06 Thread David Winsemius

On May 6, 2014, at 2:51 PM, Tom Walker wrote:

> Hi,
> 
> I need to generate bar charts where the x-axis is a factor that
> includes a mixture of species names (in italic) and control treatments
> (in plain text).
> 
> I would like this to be represented in the contents of the axis
> labels, meaning that I need the x-axis to include both italic and
> plain text. An example of my failed attempt is below (the bins in the
> label list containing expressions become blank).
> 
> d <- data.frame(Category = c("Sphagnum plant", "Calluna plant",
> "Eriophorum plant", "Control"),
>Response = c(1, 3, 5, 6))
> 
> d
> 
> mylabels <- list(expression(paste(italic("Sphagnum"), " plant")),
> expression(paste(italic("Calluna"), " plant")),
> expression(paste(italic("Eriophorum"), " plant")),
> "Control")
> 

Just changing that assignment to create an expression vector instead of a list 
works in R 3.1.0 Patched and ggplot2_0.9.3.1

mylabels <- c(expression(paste(italic("Sphagnum"), " plant")),
expression(paste(italic("Calluna"), " plant")),
expression(paste(italic("Eriophorum"), " plant")),
"Control")

> ggplot(d) +
>  aes(x = Category, y = Response) +
>  geom_bar() +
>  scale_x_discrete(labels = mylabels)
> 
> Any help would be much appreciated!

Thanks for providing a complete example.

> 
> Many thanks,
> 
> Tom
> 
> _

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mix of plain and italic text in ggplot categorical x-axis

2014-05-06 Thread Tom Walker
Hi,

I need to generate bar charts where the x-axis is a factor that
includes a mixture of species names (in italic) and control treatments
(in plain text).

I would like this to be represented in the contents of the axis
labels, meaning that I need the x-axis to include both italic and
plain text. An example of my failed attempt is below (the bins in the
label list containing expressions become blank).

d <- data.frame(Category = c("Sphagnum plant", "Calluna plant",
"Eriophorum plant", "Control"),
Response = c(1, 3, 5, 6))

d

mylabels <- list(expression(paste(italic("Sphagnum"), " plant")),
 expression(paste(italic("Calluna"), " plant")),
 expression(paste(italic("Eriophorum"), " plant")),
 "Control")

ggplot(d) +
  aes(x = Category, y = Response) +
  geom_bar() +
  scale_x_discrete(labels = mylabels)

Any help would be much appreciated!

Many thanks,

Tom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Making a package works on any R version

2014-05-06 Thread Ben Bolker
Ashis Deb  gmail.com> writes:

> 
> Hi all, I made a package in R-3.0.3 and it runs well. My issue is that it
> does not run in other versions of R, like R-3.0.2/3.0.1, where it shows an
> error like:
> 
> Error: This is R 3.0.2, package ‘xxx’ needs >= 3.0.3
> 
> Does anybody have a solution for how to make this package run on any
> version?
> 

  [snip]

  It sounds like there is a Depends: line in your DESCRIPTION file that
mandates R >= 3.0.3.  Try removing it?
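
For example, if the DESCRIPTION file currently contains a line like

  Depends: R (>= 3.0.3)

then relaxing it to, say,

  Depends: R (>= 3.0.0)

(or dropping the version clause entirely) removes the error -- a sketch only;
state the lowest R version the package has actually been tested against.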

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiply of two expressions

2014-05-06 Thread Ben Bolker
Niloofar.Javanrouh  yahoo.com> writes:

> 
> 
>  hello,
> I want to differentiate L with respect to b,
> when:
> 
> L = k*ln(k/(k+mu)) + sum(y) * ln(1 - k/(mu+k))
>#(negative binomial log-likelihood)
> and 
> ln(mu/(mu+k)) = a+bx   #link function
> 
> how can i do it in R?
> thank you.
> 

  R has a couple of functions for differentiation, D() and deriv(),
but this is probably a case where it would actually be simpler to
do it yourself by hand.  If you insist on doing it in R, you can
either use the chain rule (i.e. compute d(mu)/d(b)*d(L)/d(mu)) or
you can make a big ugly expression where you substitute the expression
for mu(b) into the expression for L.  In either case you'll have to
solve your link expression for mu.
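
A minimal sketch of the substitution route, assuming the link solves to
mu = k*exp(a + b*x)/(1 - exp(a + b*x)) and writing S for sum(y), which does
not depend on b:

L <- expression(k * log(k / (k + k * exp(a + b * x) / (1 - exp(a + b * x)))) +
                S * log(1 - k / (k * exp(a + b * x) / (1 - exp(a + b * x)) + k)))
dLdb <- D(L[[1]], "b")   # the promised big ugly expression
dLdb
# evaluate it by supplying numeric values for k, a, b, x and S via eval()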

  Wolfram Alpha might help too.

  good luck
Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Precedence and parentheses

2014-05-06 Thread Duncan Murdoch

On 06/05/2014 2:09 PM, Göran Broström wrote:

A thread on r-devel ("Historical NA question") went (finally) off-topic,
heading towards "Precedence". This triggered a question that I think is
better put on this list:

I have been more or less regularly writing programs since the
seventies (Fortran, later C), and I early got into the habit of using
parentheses almost everywhere, for two reasons. The first is the
obvious one, to avoid mistakes with precedences, but the second is almost as
important: readability.

Now, I think I have seen somewhere that unnecessary parentheses in R
functions may slow down execution time considerably. Is this really
true, and should I consequently get rid of my weakness for parentheses?
Or are there rules for when it matters and doesn't matter?


I think "considerably" is an exaggeration, but they are kept as part of 
the expression, and they do take a little bit of execution time.


The reason they are kept is to support the difference between

(x <- 1)

and

x <- 1

but they are kept even when they make no difference (other than to slow 
things down).
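
A quick way to see this (a minimal illustration):

f <- function(x) (x + 1)
body(f)   # prints (x + 1): the parentheses survive in the parsed code
`(`       # "(" is itself a function (a primitive), so each use is a real call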


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Precedence and parentheses

2014-05-06 Thread Ted Harding
On 06-May-2014 18:09:12 Göran Broström wrote:
> A thread on r-devel ("Historical NA question") went (finally) off-topic, 
> heading towards "Precedence". This triggered a question that I think is 
> better put on this list:
> 
> I have been more or less regularly writing programs since the
> seventies (Fortran, later C), and I early got into the habit of using
> parentheses almost everywhere, for two reasons. The first is the
> obvious one, to avoid mistakes with precedences, but the second is almost
> as important: readability.
> 
> Now, I think I have seen somewhere that unnecessary parentheses in R
> functions may slow down execution time considerably. Is this really
> true, and should I consequently get rid of my weakness for parentheses?
> Or are there rules for when it matters and doesn't matter?
> 
> Thanks, Göran

I have much sympathy with your motives for using parentheses!
(And I have a similar computing history).

My general encouragement would be: continue using them when you
feel that each usage brings a benefit.

Of course, you would avoid silliness like
  x <- (1*(2*(3*(4*(5)))))

and indeed, that does slow it down somewhat (it takes about twice
as long as a <- 1*2*3*4*5):

  system.time(for(i in (1:1)) 1*2*3*4*5)
#  user  system elapsed 
# 0.032   0.000   0.032 

  system.time(for(i in (1:1)) (1*(2*(3*(4*(5))))))
#  user  system elapsed 
# 0.056   0.000   0.054 

The main reason, I suppose, is that a "(" forces a new level
on a stack, which is not popped until the matching ")" arrives.

Interestingly, the above silliness takes a little longer when
the nesting is the other way round:

  system.time(for(i in (1:1)) 1*2*3*4*5)
#  user  system elapsed 
# 0.028   0.000   0.029 

  system.time(for(i in (1:1)) (((((1)*2)*3)*4)*5))
#  user  system elapsed 
#  0.052   0.000   0.081 

(though in fact the times are somewhat variable in both cases,
so I'm not sure of the value of the relationship).
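
For less noisy comparisons, a dedicated timing tool helps -- a minimal
sketch, assuming the microbenchmark package is installed:

  library(microbenchmark)
  microbenchmark(1*2*3*4*5,
                 (1*(2*(3*(4*(5))))),
                 times = 1000L)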

Best wishes,
Ted.

-
E-Mail: (Ted Harding) 
Date: 06-May-2014  Time: 19:41:13
This message was sent by XFMail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] conversion error from numeric to factor in raster: Error in 1:ncol(r) : argument of length 0, r command: as.factor()

2014-05-06 Thread David L Carlson
Does

values(r) <- as.factor(1:ncell(r))

do what you want?
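
If the goal is a categorical raster (with a class table) that predict() will
treat as a factor, the ratify() route in the raster package may also help --
a minimal sketch, with hypothetical class values and labels:

library(raster)
r <- raster(ncol = 5, nrow = 5)
values(r) <- sample(1:3, ncell(r), replace = TRUE)  # a few integer classes
r <- ratify(r)                 # declare the layer categorical
rat <- levels(r)[[1]]          # its raster attribute table (column ID)
rat$class <- c("low", "mid", "high")
levels(r) <- rat
is.factor(r)                   # TRUE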

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Stefan Schmidt
Sent: Tuesday, May 6, 2014 11:22 AM
To: r-help@r-project.org
Subject: [R] conversion error from numeric to factor in raster: Error in 
1:ncol(r) : argument of length 0, r command: as.factor()

Hello together,

I was wondering how I can solve the following conversion problem of a 
raster file: when I try to convert the values from the raster (r) from 
numeric into a factor via as.factor(r) always the error appears: "Error 
in 1:ncol(r) : argument of length 0".

r <- raster(ncol=5, nrow=5)
values(r) <- 1:ncell(r)
as.factor(r)

Urgently I have to figure out how to convert a numeric raster into a 
factor raster for a predict() calculation within the raster package.

Every hint is very welcome!

Best,

Stefan

-- 

Stefan Schmidt

Abteilung Landschaftsökologie/
Department Computational Landscape Ecology
Helmholtz-Zentrum für Umweltforschung GmbH – UFZ/
Helmholtz Centre for Environmental Research – UFZ
Permoserstraße 15 / 04318 / Leipzig
Phone: +49 341 235 - 1056
Fax: +49 341 235 - 1939
Email: stefan.schm...@ufz.de
WWW: http://www.ufz.de

Sitz der Gesellschaft: Leipzig
Registergericht: Amtsgericht Leipzig, Handelsregister Nr. B 4703
Vorsitzender des Aufsichtsrats: MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer: Prof. Dr. Georg Teutsch
Administrative Geschäftsführerin: Dr. Heike Graßmann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Precedence and parentheses

2014-05-06 Thread Göran Broström
A thread on r-devel ("Historical NA question") went (finally) off-topic, 
heading towards "Precedence". This triggered a question that I think is 
better put on this list:


I have been more or less regularly been writing programs since the 
seventies (Fortran, later C) and I early got the habit of using 
parentheses almost everywhere, for two reasons. The first is the 
obvious, to avoid mistakes with precedences, but the second is almost as 
important: Readability.


Now, I think I have seen somewhere that unnecessary parentheses in  R 
functions may slow down execution time considerably. Is this really 
true, ant should I consequently get rid of my faiblesse for parentheses?

Or are there rules for when it matters and doesn't matter?

Thanks, Göran

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] conversion error from numeric to factor in raster: Error in 1:ncol(r) : argument of length 0, r command: as.factor()

2014-05-06 Thread Stefan Schmidt

Hello together,

I was wondering how I can solve the following conversion problem of a 
raster file: when I try to convert the values from the raster (r) from 
numeric into a factor via as.factor(r) always the error appears: "Error 
in 1:ncol(r) : argument of length 0".


r <- raster(ncol=5, nrow=5)
values(r) <- 1:ncell(r)
as.factor(r)

Urgently I have to figure out how to convert a numeric raster into a 
factor raster for a predict() calculation within the raster package.


Every hint is very welcome!

Best,

Stefan

--

Stefan Schmidt

Abteilung Landschaftsökologie/
Department Computational Landscape Ecology
Helmholtz-Zentrum für Umweltforschung GmbH – UFZ/
Helmholtz Centre for Environmental Research – UFZ
Permoserstraße 15 / 04318 / Leipzig
Phone: +49 341 235 - 1056
Fax: +49 341 235 - 1939
Email: stefan.schm...@ufz.de
WWW: http://www.ufz.de

Sitz der Gesellschaft: Leipzig
Registergericht: Amtsgericht Leipzig, Handelsregister Nr. B 4703
Vorsitzender des Aufsichtsrats: MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer: Prof. Dr. Georg Teutsch
Administrative Geschäftsführerin: Dr. Heike Graßmann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Metafor: How to integrate effectsizes?

2014-05-06 Thread Michael Dewey

At 14:23 06/05/2014, Viechtbauer Wolfgang (STAT) wrote:
Without the sample size of a study (i.e., either the group sizes or the
total sample size), you cannot convert the p-value to a t-value or a
t-value to a d-value. And for studies where you have the d-value but no
sample size, you cannot compute the corresponding sampling variance. So,
without additional information, you cannot include these studies. Maybe
studies where a d-value is directly reported also report a CI for the
d-value? Then the sampling variance can be back-calculated (since a 95% CI
for d is typically computed with d +- 1.96 sqrt(vi), where vi is the
sampling variance).


Verena,
What Wolfgang says is true of course, but if you have _both_ the t-value and
the p-value you can back-calculate the number of degrees of freedom, and
then, if you are willing to assume equal arms, you have the sample sizes.

finddf <- function(t, pval) {
   helper <- function(df) {res <- pval - pt(t, df, lower.tail = FALSE); res}
   res <- uniroot(helper, interval = c(5, 1000))  # widen the upper limit if uniroot complains
   res
}

If you call finddf with the value of t and the _one-sided_ p-value (divide by
2 if two-sided), it should give you a return value which, if you look at the
element of the list called root, is its estimate of the degrees of freedom.
If you get errors from uniroot, the interval supplied in the call may need to
be widened.

I would suggest that when you have your final dataset it would be a really
good idea to do some model checks using plot.influence to see whether the
studies for which you have imputed values are fundamentally different for
some reason. This will also check your calculations as a bonus.
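
For example (hypothetical values and object names):

finddf(t = 2.31, pval = 0.012)$root   # estimated df from a t and a one-sided p

inf <- influence(res)                 # res: the fitted rma() model in metafor
plot(inf)                             # flags unusually influential studies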




Best,
Wolfgang

> -Original Message-
> From: Verena Weinbir [mailto:vwein...@gmail.com]
> Sent: Tuesday, May 06, 2014 15:09
> To: Michael Dewey
> Cc: Viechtbauer Wolfgang (STAT); r-help@r-project.org
> Subject: Re: [R] Metafor: How to integrate effectsizes?
>
> Thank you very much for your illustration, Wolfgang! It helped me a
> lot.  And also thank you for the package-hint, Michael!
>
> Now, I have re-checked the respective studies, and there still are a
> couple of studies left, only stating cohens d, and the respective t-value
> and p-value - sample and group sizes are not addressed (its data from an
> older meta-analysis). Is there a way to embed these studies in my sample?
> Wolfgangs illustration addresses only cases in which group sizes are
> stated, if I understand you correctly...
>
> Many thanks in advance,
>
> Verena
>
> On Sat, Apr 26, 2014 at 1:38 PM, Michael Dewey 
> wrote:
> At 20:34 25/04/2014, Viechtbauer Wolfgang (STAT) wrote:
> If you know the d-value and the corresponding group sizes for a study,
> then it's possible to add that study to the rest of the dataset. Also, if
> you only know the test statistic from an independent samples t-test (or
> only the p-value corresponding to that test), it's possible to back-
> compute what the standardized mean difference is.
>
> I added an illustration of this to the metafor package website:
>
> http://www.metafor-project.org/doku.php/tips:assembling_data_smd
>
> Verena might also like to look at the compute.es package available from
> CRAN to see whether any of the conversions programmed there do the job.
>
>
> Best,
> Wolfgang
>
> --
> Wolfgang Viechtbauer, Ph.D., Statistician
> Department of Psychiatry and Psychology
> School for Mental Health and Neuroscience
> Faculty of Health, Medicine, and Life Sciences
> Maastricht University, P.O. Box 616 (VIJV1)
> 6200 MD Maastricht, The Netherlands
> +31 (43) 388-4170Â | http://www.wvbauer.com
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org]
> > On Behalf Of Michael Dewey
> > Sent: Friday, April 25, 2014 16:23
> > To: Verena Weinbir
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Metafor: How to integrate effectsizes?
> >
> > At 12:33 25/04/2014, you wrote:
> > >Thank you very much for your reply and the book recommendation,
> Michael.
> > >
> > >Yes, I mean Cohen's d - sorry for the typo :-)
> > >
> > >Just to make this sure for me: There is no
> > >possibility to integrate stated Cohens' ds in an
> > >R-Metaanalysis (or a MA at all), if there is no
> > >further information traceable regarding SE or the like?
> >
> > If there is really no other information like
> > sample sizes, significance level, value of some
> > significance test then you would have to impute a
> > value from somewhere. That would seem a last resort.
> >
> > I have cc'ed this back to the list, please keep
> > it on the list so others may benefit and contribute.
> >
> >
> > >best regards,
> > >
> > >Verena
> > >
> > >
> > >On Fri, Apr 25, 2014 at 1:21 PM, Michael Dewey
> > ><i...@aghmed.fsnet.co.uk> wrote:
> > >At 13:15 24/04/2014, Verena Weinbir wrote:
> > >Hello!
> > >
> > >I am using the metafor package for my master's thesis as an R-newbie.
> > While
> > >calculating effec

Re: [R] Parsing XML file to data frame

2014-05-06 Thread David Winsemius

On May 5, 2014, at 11:42 AM, Timothy W. Cook wrote:

> I didn't find an attached XML file. Maybe the list removes attachments?

The list does not remove all attachments; it removes ones that are not among 
the listed acceptable formats, and XML is not among them. If it had been 
submitted as a MIME-text file it would have been accepted.

-- 
David.

> You might try posting to StackOverflow.com if this is the case.
> 
> 
> 
> 
> On Fri, May 2, 2014 at 2:17 PM, starcraz  wrote:
> 
>> Hi all - I am trying to parse out the attached XML file into a data frame.
>> The file is extracted from Hadoop File Services (HFS). I am new in using
>> the
>> XML package so need some help in parsing out the data. Below is some code
>> that I explore to get the attribute into a data frame. Any help is
>> appreciated.
>> 
>> library(XML)
>> temp <- xmlParseDoc("sample.xml")
>> temp.root <- xmlRoot(temp)
>> xmlName(temp.root)
>> xmlSize(temp.root) #21 child nodes
>> temp.root[[2]] #headers
>> temp.root[[2]][[2]] #extracts just the revision
>> temp.2 <- xmlToList(temp.root[[2]]) #extracts the info in temp.root[[2]]
>> into a list
>> temp.2
>> temp.2.df <- xmlToDataFrame(temp.root[[2]]) #data frame of the list
>> temp.2.df
>> xmlValue(temp.root[[2]]) #string the values of the node inside [[2]]
>> 
>> temp.revision <- xmlValue(temp.root[[2]][["Revision"]])
>> temp.revision
>> 
>> test <- xmlTreeParse("sample.xml")
>> test
>> 
>> 
>> 
>> 
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Parsing-XML-file-to-data-frame-tp4689883.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> 
> -- 
> 
> 
> Timothy Cook
> LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook
> MLHIM http://www.mlhim.org
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Bert Gunter
I believe this discussion should be taken offlist as it no longer
seems to be concerned with R.

-- Bert Gunter

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Tue, May 6, 2014 at 7:40 AM, Peter Crowther
 wrote:
> The dataset is not large by database standards.  Even in mySQL - not known
> for its speed at multi-row querying - the queries you describe should
> complete within a few seconds on even moderately recent hardware if your
> indexes are reasonable.
>
> What are your performance criteria for processing these queries, and how
> have you / your team optimised the relational database storage?
>
> Cheers,
>
> - Peter
> --
> Peter Crowther, Director, Melandra Limited
>
>
> On 6 May 2014 15:32, Dr Eberhard Lisse  wrote:
>
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> Exactly,
>>
>> which is why I am looking for something faster :-)-O
>>
>> el
>>
>> on 2014-05-06, 15:21 David R Forrest said the following:
>> > It sounds as if your underlying MySQL database is too slow for your
>> > purposes.  Whatever you layer on top of it will be constrained by
>> > the underlying database.  To speed up the process significantly,
>> > you may need to do work on the database backend part of the
>> > process.
>> >
>> > Dave
>> -BEGIN PGP SIGNATURE-
>> Version: GnuPG v1.4.12 (Darwin)
>> Comment: GPGTools - http://gpgtools.org
>> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>>
>> iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA
>> KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn
>> clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW
>> 6aytsFtPZTI=
>> =zepY
>> -END PGP SIGNATURE-
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Metafor: How to integrate effectsizes?

2014-05-06 Thread Verena Weinbir
Thank you very much for your illustration, Wolfgang! It helped me a lot.
And also thank you for the package-hint, Michael!

Now, I have re-checked the respective studies, and there are still a couple
of studies left that only state Cohen's d and the respective t-value and
p-value - sample and group sizes are not given (it's data from an older
meta-analysis). Is there a way to embed these studies in my sample?
Wolfgang's illustration addresses only cases in which group sizes are
stated, if I understand you correctly...

Many thanks in advance,

Verena


On Sat, Apr 26, 2014 at 1:38 PM, Michael Dewey wrote:

> At 20:34 25/04/2014, Viechtbauer Wolfgang (STAT) wrote:
>
>> If you know the d-value and the corresponding group sizes for a study,
>> then it's possible to add that study to the rest of the dataset. Also, if
>> you only know the test statistic from an independent samples t-test (or
>> only the p-value corresponding to that test), it's possible to back-compute
>> what the standardized mean difference is.
>>
>> I added an illustration of this to the metafor package website:
>>
>> http://www.metafor-project.org/doku.php/tips:assembling_data_smd
>>
>
> Verena might also like to look at the compute.es package available from
> CRAN to see whether any of the conversions programmed there do the job.
>
>
>
>  Best,
>> Wolfgang
>>
>> --
>> Wolfgang Viechtbauer, Ph.D., Statistician
>> Department of Psychiatry and Psychology
>> School for Mental Health and Neuroscience
>> Faculty of Health, Medicine, and Life Sciences
>> Maastricht University, P.O. Box 616 (VIJV1)
>> 6200 MD Maastricht, The Netherlands
>> +31 (43) 388-4170 | http://www.wvbauer.com
>>
>> > -Original Message-
>> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org
>> ]
>> > On Behalf Of Michael Dewey
>> > Sent: Friday, April 25, 2014 16:23
>> > To: Verena Weinbir
>> > Cc: r-help@r-project.org
>> > Subject: Re: [R] Metafor: How to integrate effectsizes?
>> >
>> > At 12:33 25/04/2014, you wrote:
>> > >Thank you very much for your reply and the book recommendation,
>> Michael.
>> > >
>> > >Yes, I mean Cohen's d - sorry for the typo :-)
>> > >
>> > >Just to make this sure for me: There is no
>> > >possibility to integrate stated Cohens' ds in an
>> > >R-Metaanalysis (or a MA at all), if there is no
>> > >further information traceable regarding SE or the like?
>> >
>> > If there is really no other information like
>> > sample sizes, significance level, value of some
>> > significance test then you would have to impute a
>> > value from somewhere. That would seem a last resort.
>> >
>> > I have cc'ed this back to the list, please keep
>> > it on the list so others may benefit and contribute.
>> >
>> >
>> > >best regards,
>> > >
>> > >Verena
>> > >
>> > >
>> > >On Fri, Apr 25, 2014 at 1:21 PM, Michael Dewey
>> > ><i...@aghmed.fsnet.co.uk> wrote:
>> > >At 13:15 24/04/2014, Verena Weinbir wrote:
>> > >Hello!
>> > >
>> > >I am using the metafor package for my master's thesis as an R-newbie.
>> > While
>> > >calculating effectsizes from my dataset (mean values and
>> > >standarddeviations) using "escalc" shouldn't be a problem (I hope ;-)),
>> > I
>> > >wonder how I could at this point integrate additional studies, which
>> > only
>> > >state conhens d (no information about mean value and sds available), to
> > >calculate an overall analysis. I would be very grateful for your
>> > support!
>> > >
>> > >
>> > >You mean Cohen's d I think.
>> > >
>> > >You will need some more information to enable
>> > >you to calculate its standard error. Have a look at Rosenthal's chapter
>> > in
>> > >@book{cooper94,
> > >  author = {Cooper, H and Hedges, L V},
> > >  title = {A handbook of research synthesis},
> > >  year = {1994},
> > >  publisher = {Russell Sage},
> > >  address = {New York},
> > >  keywords = {meta-analysis}
>> > >}
>> > >(There is an updated edition)
>> > >This gives you more information about converting
>> > >effect sizes and extracting them from unpromising beginnings.
>> > >
>> > >It often requires some ingenuity to get the
>> > >information you need so have a go and then get
>> > >back here with more details if you run into problems
>> > >
>> > >
>> > >Best regards,
>> > >
>> > >Verena
>>
>
> Michael Dewey
> i...@aghmed.fsnet.co.uk
> http://www.aghmed.fsnet.co.uk/home.html
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with products in R ?

2014-05-06 Thread Yuankuns Shi
In R, ordinary numbers (whether written as integers or not) are stored as 
double precision, which carries only about 15-16 significant decimal digits 
and represents integers exactly only up to 2^53. The product here is about 
6.2e18, beyond that range, so its trailing digits are rounded. To get the 
exact value you need arbitrary-precision (big-integer) arithmetic.
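
A minimal sketch of one way to get the exact product in R, assuming the gmp
package (from CRAN) is installed:

library(gmp)
as.bigz("168988580159") * as.bigz("36662978")  # exact big-integer product
168988580159 * 36662978                        # double precision: trailing digits rounded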

On Sunday, 4 May 2014 at 20:44:04 UTC+8, ARTENTOR Diego Tentor wrote:
>
> Trying an algorithm for products of large numbers, I encountered a 
> difference between the result of 168988580159 * 36662978 in my algorithm 
> and R's product. The Microsoft calculator confirms my number. 
>
> Thanks. 
>
>
>
>
> -- 
>
>
> *Gráfica ARTENTOR  * 
>
> de Diego L. Tentor 
> Echagüe 558 
> Tel.:0343 4310119 
> Paraná - Entre Ríos 
>
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] list.files accessing subdirectory as relative path?

2014-05-06 Thread Adel
Thanks for the reply Don and Frede,

Your suggestions works perfectly!

Best
Adel




--
View this message in context: 
http://r.789695.n4.nabble.com/list-files-accessing-subdirectory-as-relative-path-tp4689997p4690048.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sciplot: Increasing the width of bargraph and decreasing the space b/n groups

2014-05-06 Thread Roopa Subbaiaih
Thanks Jim. Roopa

---
Roopa Shree Subbaiaih
Post Doctoral Fellow
Department of Dermatology
School of Medicine
Case Western Reserve University
Cleveland, OH-44106
Tel:+1 216 368 0211


On Mon, May 5, 2014 at 6:02 PM, Jim Lemon  wrote:

> On 05/06/2014 05:02 AM, Roopa Subbaiaih wrote:
>
>> Hello,
>>
>> I am trying to plot bar graphs using sciplot. Is there a way to increase
>> the width of the bars and decrease the space between the groups? I am
>> pasting the script as well as attaching the graph.
>>
>> Bio6<- read.csv("Data/Plin1.csv",na.strings="",header=T)
>> attach(Bio6)
>> head(Bio6)
>> par(family="serif", font=11)
>> Bio6$Sps<- factor(Bio6$Sps, levels = c("FFA1", "FFA2","FFA3"))
>> Bio6$Gp<- factor(Bio6$Gp, levels = c("N-FFA1",
>> "FU1","FA1","N-FFA2","FU2","FA2","N-FFA3","FU3","FA3"))
>> bargraph.CI(Sps, O.D, group = Gp, data = Bio6,ylab = "Relative expression
>> levels", cex.lab = 1.5, y.leg = 6,cex.leg = 0.82,cex=1.5, axisnames=TRUE,
>> col = c("red","blue","grey"),space=c(0, 0.5), ylim=c(0,7),cex.names =
>> 1.0,density = c(30,30,30), legend = TRUE, main="PLIN1")
>> detach(Bio6)
>>
>>         O.D     Gp  Sps
>> 1  1.00     N-FFA1 FFA1
>> 2  2.996432 FU1    FFA1
>> 3  3.223413 FU1    FFA1
>> 4  3.524465 FU1    FFA1
>> 5  1.311971 FA1    FFA1
>> 6  6.755860 FA1    FFA1
>> 7  1.566000 FA1    FFA1
>> 8  1.00     N-FFA2 FFA2
>> 9  2.741612 FU2    FFA2
>> 10 2.800644 FU2    FFA2
>> 11 3.569509 FU2    FFA2
>> 12 4.141500 FA2    FFA2
>> 13 7.049476 FA2    FFA2
>> 14 4.694674 FA2    FFA2
>> 15 1.00     N-FFA3 FFA3
>> 16 4.163601 FU3    FFA3
>> 17 3.903986 FU3    FFA3
>> 18 4.73     FU3    FFA3
>> 19 0.00     FA3    FFA3
>> 20 0.00     FA3    FFA3
>> 21 0.00     FA3    FFA3
>>
>>  Hi Roopa,
> bargraph.CI does something with the "space" argument that I can't quite
> work out. I can get a reasonable plot like this:
>
> Bmeans <- matrix(by(Bio6$O.D, Bio6$Gp, FUN=mean), ncol=3)
> barpos<-barplot(Bmeans,beside=TRUE,
>  ylim=c(0,7),col=c("red","blue","grey"),space=c(0.1,1),main="PLIN1")
> legend(8.5,7.1,
>  c("N-FFA1","FU1","FA1","N-FFA2","FU2","FA2","N-FFA3","FU3","FA3"),
>  fill=c("red","blue","grey"),bty="n")
> library(plotrix)
> Bse<-matrix(by(Bio6$O.D,Bio6$Gp,FUN=std.error),ncol=3)
> dispersion(barpos,Bmeans,Bse,display.na=FALSE)
>
> Jim
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Peter Crowther
The dataset is not large by database standards.  Even in mySQL - not known
for its speed at multi-row querying - the queries you describe should
complete within a few seconds on even moderately recent hardware if your
indexes are reasonable.

What are your performance criteria for processing these queries, and how
have you / your team optimised the relational database storage?

Cheers,

- Peter
--
Peter Crowther, Director, Melandra Limited


On 6 May 2014 15:32, Dr Eberhard Lisse  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Exactly,
>
> which is why I am looking for something faster :-)-O
>
> el
>
> on 2014-05-06, 15:21 David R Forrest said the following:
> > It sounds as if your underlying MySQL database is too slow for your
> > purposes.  Whatever you layer on top of it will be constrained by
> > the underlying database.  To speed up the process significantly,
> > you may need to do work on the database backend part of the
> > process.
> >
> > Dave
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.12 (Darwin)
> Comment: GPGTools - http://gpgtools.org
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA
> KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn
> clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW
> 6aytsFtPZTI=
> =zepY
> -END PGP SIGNATURE-
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Exactly,

which is why I am looking for something faster :-)-O

el

on 2014-05-06, 15:21 David R Forrest said the following:
> It sounds as if your underlying MySQL database is too slow for your
> purposes.  Whatever you layer on top of it will be constrained by
> the underlying database.  To speed up the process significantly,
> you may need to do work on the database backend part of the
> process.
> 
> Dave
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA
KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn
clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW
6aytsFtPZTI=
=zepY
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread David R Forrest
It sounds as if your underlying MySQL database is too slow for your purposes.  
Whatever you layer on top of it will be constrained by the underlying database. 
 To speed up the process significantly, you may need to do work on the database 
backend part of the process.

Dave


On May 6, 2014, at 7:08 AM, Dr Eberhard Lisse  wrote:

> Thanks,
> 
> tried all of that, too slow.
> 
> el
> 
> on 2014-05-06, 12:00 Gabor Grothendieck said the following:
>> On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse  wrote:
>>> Jeff
>>> 
>>> It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
>>> dataframe it saves to 180MB. I work from the dataframe.
>>> 
>>> But, it's not only a size issue it's also a speed issue and hence
>>> I don't care what I am going to use, as long as it is fast.
>>> 
>>> sqldf is easy to understand for me but it takes ages.  If
>>> alternatives were roughly similar in speed I would remain with
>>> sqldf.
>>> 
>>> dplyr sounds faster, and promising, but the intrinsic stuff is
>>> way beyond me (elderly Gynaecologist) on the learning curve...
>> 
>> You can create indices in sqldf and that can speed up processing
>> substantially for certain operations.  See examples 4h and 4i on
>> the sqldf home page: http://sqldf.googlecode.com.  Also note that
>> sqldf supports not only the default SQLite backend but also MySQL,
>> h2 and postgresql.  See ?sqldf for info on using sqldf with MySQL
>> and the others.
>> 
> 
> -- 
> Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
> e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
> PO Box 8421 \ /
> Bachbrecht, Namibia ;/
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Dr. David Forrest
d...@vims.edu
804-684-7900w
757-968-5509h
804-413-7125c
#240 Andrews Hall
Virginia Institute of Marine Science
Route 1208, Greate Road
PO Box 1346
Gloucester Point, VA, 23062-1346












__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Metafor: How to integrate effectsizes?

2014-05-06 Thread Viechtbauer Wolfgang (STAT)
Without the sample size of a study (i.e., either the group sizes or the total 
sample size), you cannot convert the p-value to a t-value or a t-value to a 
d-value. And for studies where you have the d-value but no sample size, you 
cannot compute the corresponding sampling variance. So, without additional 
information, you cannot include these studies. Maybe studies where a d-value is 
directly reported also report a CI for the d-value? Then the sampling variance 
can be back-calculated (since a 95% CI for d is typically computed with d +- 
1.96 sqrt(vi), where vi is the sampling variance).
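
A minimal sketch of that back-calculation, with a hypothetical reported d and
95% CI:

d     <- 0.50                              # reported standardized mean difference
ci_lb <- 0.10; ci_ub <- 0.90               # reported 95% CI limits
vi    <- ((ci_ub - ci_lb) / (2 * 1.96))^2  # back-calculated sampling variance
# yi = d and vi can then be supplied directly to rma(yi, vi, ...) in metafor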

Best,
Wolfgang

> -Original Message-
> From: Verena Weinbir [mailto:vwein...@gmail.com]
> Sent: Tuesday, May 06, 2014 15:09
> To: Michael Dewey
> Cc: Viechtbauer Wolfgang (STAT); r-help@r-project.org
> Subject: Re: [R] Metafor: How to integrate effectsizes?
> 
> Thank you very much for your illustration, Wolfgang! It helped me a
> lot.  And also thank you for the package-hint, Michael!
> 
> Now, I have re-checked the respective studies, and there still are a
> couple of studies left, only stating cohens d, and the respective t-value
> and p-value - sample and group sizes are not addressed (its data from an
> older meta-analysis). Is there a way to embed these studies in my sample?
> Wolfgangs illustration addresses only cases in which group sizes are
> stated, if I understand you correctly...
> 
> Many thanks in advance,
> 
> Verena
> 
> On Sat, Apr 26, 2014 at 1:38 PM, Michael Dewey 
> wrote:
> At 20:34 25/04/2014, Viechtbauer Wolfgang (STAT) wrote:
> If you know the d-value and the corresponding group sizes for a study,
> then it's possible to add that study to the rest of the dataset. Also, if
> you only know the test statistic from an independent samples t-test (or
> only the p-value corresponding to that test), it's possible to back-
> compute what the standardized mean difference is.
> 
> I added an illustration of this to the metafor package website:
> 
> http://www.metafor-project.org/doku.php/tips:assembling_data_smd
> 
> Verena might also like to look at the compute.es package available from
> CRAN to see whether any of the conversions programmed there do the job.
> 
> 
> Best,
> Wolfgang
> 
> --
> Wolfgang Viechtbauer, Ph.D., Statistician
> Department of Psychiatry and Psychology
> School for Mental Health and Neuroscience
> Faculty of Health, Medicine, and Life Sciences
> Maastricht University, P.O. Box 616 (VIJV1)
> 6200 MD Maastricht, The Netherlands
> +31 (43) 388-4170 | http://www.wvbauer.com
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org]
> > On Behalf Of Michael Dewey
> > Sent: Friday, April 25, 2014 16:23
> > To: Verena Weinbir
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Metafor: How to integrate effectsizes?
> >
> > At 12:33 25/04/2014, you wrote:
> > >Thank you very much for your reply and the book recommendation,
> Michael.
> > >
> > >Yes, I mean Cohen's d - sorry for the typo :-)
> > >
> > >Just to make this sure for me: There is no
> > >possibility to integrate stated Cohens' ds in an
> > >R-Metaanalysis (or a MA at all), if there is no
> > >further information traceable regarding SE or the like?
> >
> > If there is really no other information like
> > sample sizes, significance level, value of some
> > significance test then you would have to impute a
> > value from somewhere. That would seem a last resort.
> >
> > I have cc'ed this back to the list, please keep
> > it on the list so others may benefit and contribute.
> >
> >
> > >best regards,
> > >
> > >Verena
> > >
> > >
> > >On Fri, Apr 25, 2014 at 1:21 PM, Michael Dewey
> > ><i...@aghmed.fsnet.co.uk> wrote:
> > >At 13:15 24/04/2014, Verena Weinbir wrote:
> > >Hello!
> > >
> > >I am using the metafor package for my master's thesis as an R-newbie.
> > While
> > >calculating effectsizes from my dataset (mean values and
> > >standarddeviations) using "escalc" shouldn't be a problem (I hope ;-
> )),
> > I
> > >wonder how I could at this point integrate additional studies, which
> > only
> > >state conhens d (no information about mean value and sds available),
> to
> > >calculate an overall analysis. I would be very grateful for your
> > support!
> > >
> > >
> > >You mean Cohen's d I think.
> > >
> > >You will need some more information to enable
> > >you to calculate its standard error. Have a look at Rosenthal's
> chapter
> > in
> > >@book{cooper94,
> > >  author = {Cooper, H and Hedges, L V},
> > >  title = {A handbook of research synthesis},
> > >  year = {1994},
> > >  publisher = {Russell Sage},
> > >  address = {New York},
> > >  keywords = {meta-analysis}
> > >}
> > >(There is an updated edition)
> > >This gives you more information about converting
> > >effect sizes and extracting them from unpromising beginnings.
> > >
> > >It often requires some ingenuity to get the
> > >information you need so have a go

Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
Thanks,

tried all of that, too slow.

el

on 2014-05-06, 12:00 Gabor Grothendieck said the following:
> On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse  wrote:
>> Jeff
>>
>> It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
>> dataframe it saves to 180MB. I work from the dataframe.
>>
>> But, it's not only a size issue it's also a speed issue and hence
>> I don't care what I am going to use, as long as it is fast.
>>
>> sqldf is easy to understand for me but it takes ages.  If
>> alternatives were roughly similar in speed I would remain with
>> sqldf.
>>
>> dplyr sounds faster, and promising, but the intrinsic stuff is
>> way beyond me (elderly Gynaecologist) on the learning curve...
> 
> You can create indices in sqldf and that can speed up processing
> substantially for certain operations.  See examples 4h and 4i on
> the sqldf home page: http://sqldf.googlecode.com.  Also note that
> sqldf supports not only the default SQLite backend but also MySQL,
> h2 and postgresql.  See ?sqldf for info on using sqldf with MySQL
> and the others.
> 

-- 
Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ /
Bachbrecht, Namibia ;/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Gabor Grothendieck
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse  wrote:
> Jeff
>
> It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
> dataframe it saves to 180MB. I work from the dataframe.
>
> But, it's not only a size issue it's also a speed issue and hence I
> don't care what I am going to use, as long as it is fast.
>
> sqldf is easy to understand for me but it takes ages.  If
> alternatives were roughly similar in speed I would remain with
> sqldf.
>
> dplyr sounds faster, and promising, but the intrinsic stuff is
> way beyond me (elderly Gynaecologist) on the learning curve...

You can create indices in sqldf and that can speed up processing
substantially for certain operations.   See examples 4h and 4i on the
sqldf home page: http://sqldf.googlecode.com. Also note that sqldf
supports not only the default SQLite backend but also MySQL, h2 and
postgresql.  See ?sqldf for info on using sqldf with MySQL and the
others.
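
A minimal sketch of the index idea, assuming a data frame DF with a
(hypothetical) column name; the second statement refers to main.DF so the
indexed copy already loaded into the database is reused:

library(sqldf)
res <- sqldf(c("create index idx_name on DF(name)",
               "select name, count(*) as n from main.DF group by name"))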

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Making a package works on any R version

2014-05-06 Thread Duncan Murdoch

On 06/05/2014, 1:19 AM, Ashis Deb wrote:

Hi all, I made a package in R-3.0.3 and it runs well. My issue is that it
does not run in other versions of R, like R-3.0.2/3.0.1, where it shows an
error like:


Error: This is R 3.0.2, package ‘xxx’ needs >= 3.0.3


The only reason you should get that message is because you coded in the 
DESCRIPTION file that your package depends on R (>= 3.0.3).  If you 
don't want that dependency, don't code it.


Duncan Murdoch




Does anybody have a solution for how to make this package run on any
version?

I know it's not a big problem; please forgive my ignorance if it sounds
silly.


Thanks ,

ASHIS

[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
David,

this is quite slow :-)-O

el

on 2014-05-06, 10:55 David McPearson said the following:
[...]
> It seems like you are trying to extract a (relatively) small data set from a
> much larger SQL database. Why not do the SQL stuff in the database and the
> analysis (stats, graphics, ...) in R? Maybe use a make-table query to grab the
> data of interest, and then import the whole table into R for the analysis?
> (Disclaimer: my ignorance of SQL is not far off total.)
> 
> HTH
> D.
[...]
-- 
Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ /
Bachbrecht, Namibia ;/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread David McPearson
On Tue, 6 May 2014 10:12:50 +0100 Dr Eberhard Lisse  wrote

> Jeff
> 
> It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
> dataframe it saves to 180MB. I work from the dataframe.
> 
> But, it's not only a size issue it's also a speed issue and hence I
> don't care what I am going to use, as long as it is fast.
> 
> sqldf is easy to understand for me but it takes ages.  If
> alternatives were roughly similar in speed I would remain with
> sqldf.
> 
> dplyr sounds faster, and promising, but the intrinsic stuff is
> way beyond me (elderly Gynaecologist) on the learning curve...
> 
> el
> 
> on 2014-05-06, 09:41 Jeff Newmiller said the following:
> > In what format is this "growing" data stored?  CSV? SQL? Log
> > textfile?  You say you don't want to use sqldf, but you haven't
> > said what you do want to use.
> 


It seems like you are trying to extract a (relatively) small data set from a
much larger SQL database. Why not do the SQL stuff in the database and the
analysis (stats, graphics, ...) in R? Maybe use a make-table query to grab the
data of interest, and then import the whole table into R for the analysis?
(Disclaimer: my ignorance of SQL is not far off total.)

HTH
D.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] changing x-axis in plot

2014-05-06 Thread Jim Lemon

On 05/06/2014 07:07 PM, Babak Bastan wrote:

Hi experts

I would like to change my x-axis. Like this: 10,...,2,...,1

I am using this code:

r<-c(1:10)
plot(r, axes=FALSE, frame.plot=TRUE,xlim=c(10,1))
axis(1,at=10/seq(1:10))
axis(2, at=axTicks(2), axTicks(2))

but my x-axis is still: 1,..., 2,...,10


Hi Babak,
I get x axis ticks at 10, 5, 3.3, ... with the above code, which is 
expected. I think you want something like:


plot(1:10,axes=FALSE,frame.plot=TRUE)
axis(1,at=1:10,labels=10/seq(1:10))

What you have done is interesting. By specifying xlim=c(10,1) you have 
reversed the order of whatever labels you pass to the "axis" function.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] changing x-axis in plot

2014-05-06 Thread David McPearson
On Tue, 6 May 2014 09:07:55 + Babak Bastan  wrote

> Hi experts
> 
> I would like to change my x-axis. Like this: 10,...,2,...,1
> 
> I am using this code:
> 
> r<-c(1:10)
> plot(r, axes=FALSE, frame.plot=TRUE,xlim=c(10,1))
> axis(1,at=10/seq(1:10))
> axis(2, at=axTicks(2), axTicks(2))
> 
> but my x-axis is still: 1,..., 2,...,10
> 
> What should I do?
> 
> [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

at an R prompt:
?axis
# read the help page - especially in relation to labels

..
plot(...)
axis(1,at=10/seq(1:10), labels = 10:1)
..





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Carlos Ortega
Hi,

Yes, dplyr syntax is quite close to SQL, although it is faster.
Another alternative you could consider is *data.table*, which has a syntax
very similar to the way you subset a data.frame and which, in terms of
performance, is (a bit) faster than sqldf.

You can get some idea of how to work with it here:

http://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes-in-r
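
Minimal sketches of the same GROUP BY-style summary with both packages,
assuming a data frame df with (hypothetical) columns name and value:

library(dplyr)
df %>% group_by(name) %>% summarise(n = n(), total = sum(value))

library(data.table)
DT <- as.data.table(df)   # or setDT(df) to convert by reference
setkey(DT, name)          # optional key, roughly analogous to an index
DT[, .(n = .N, total = sum(value)), by = name]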

Regards,
Carlos Ortega
www.qualityexcellence.es





2014-05-06 11:12 GMT+02:00 Dr Eberhard Lisse :

> Jeff
>
> It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
> dataframe it saves to 180MB. I work from the dataframe.
>
> But, it's not only a size issue it's also a speed issue and hence I
> don't care what I am going to use, as long as it is fast.
>
> sqldf is easy to understand for me but it takes ages.  If
> alternatives were roughly similar in speed I would remain with
> sqldf.
>
> dplyr sounds faster, and promising, but the intrinsic stuff is
> way beyond me (elderly Gynaecologist) on the learning curve...
>
> el
>
> on 2014-05-06, 09:41 Jeff Newmiller said the following:
> > In what format is this "growing" data stored?  CSV? SQL? Log
> > textfile?  You say you don't want to use sqldf, but you haven't
> > said what you do want to use.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
Jeff

It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
dataframe it saves to 180MB. I work from the dataframe.

But it's not only a size issue, it's also a speed issue, and hence I
don't care what I am going to use, as long as it is fast.

sqldf is easy to understand for me but it takes ages.  If
alternatives were roughly similar in speed I would remain with
sqldf.

dplyr sounds faster, and promising, but the intrinsic stuff is
way beyond me (elderly Gynaecologist) on the learning curve...

el

on 2014-05-06, 09:41 Jeff Newmiller said the following:
> In what format is this "growing" data stored?  CSV? SQL? Log
> textfile?  You say you don't want to use sqldf, but you haven't
> said what you do want to use.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] changing x-axis in plot

2014-05-06 Thread Babak Bastan
Hi experts

I would like to change my x-axis. Like this: 10,...,2,...,1

I am using this code:

r<-c(1:10)
plot(r, axes=FALSE, frame.plot=TRUE,xlim=c(10,1))
axis(1,at=10/seq(1:10))
axis(2, at=axTicks(2), axTicks(2))

but my x-axis is still: 1,..., 2,...,10

What should I do?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Jeff Newmiller
In what format is this "growing" data stored? CSV? SQL? Log textfile? You say 
you don't want to use sqldf, but you haven't said what you do want to use.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On May 6, 2014 1:16:12 AM PDT, Eberhard Lisse  wrote:
>Thank you.
>
>My requirements are that simple. One table, 11 fields, of which 3 are
>interesting, 30 Million records, growing daily by between 30.
>
>And, yes I have spent an enormous amount of time reading these things,
>but for someone not dealing with this professionally and/or on a daily
>basis, the documents don't help much.
>
>el
>
>
>on 2014-05-04, 05:26 Jeff Newmiller said the following:
>> ?table
>> ?aggregate
>> 
>> Also, packages plyr, data.table, and dplyr.  You might consider
>> reading [1], but if your interests are really as simple as your
>> examples then the table function should be sufficient.  That
>> function is discussed in the Introduction to R document that you
>> really should have read before posting here.
>> 
>> [1] http://www.jstatsoft.org/v40/i01/
>[...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Eberhard Lisse
Thank you.

My requirements are that simple. One table, 11 fields, of which 3 are
interesting, 30 Million records, growing daily by between 30.

And, yes I have spent an enormous amount of time reading these things,
but for someone not dealing with this professionally and/or on a daily
basis, the documents don't help much.

el


on 2014-05-04, 05:26 Jeff Newmiller said the following:
> ?table
> ?aggregate
> 
> Also, packages plyr, data.table, and dplyr.  You might consider
> reading [1], but if your interests are really as simple as your
> examples then the table function should be sufficient.  That
> function is discussed in the Introduction to R document that you
> really should have read before posting here.
> 
> [1] http://www.jstatsoft.org/v40/i01/
[...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lattice Histogram with Normal Curve - Y axis as percentages

2014-05-06 Thread peter dalgaard
This illustrates why you really do not want percents... (I never quite 
understand why people do want them - I can understand raw counts when used in 
teaching as a precursor to the concept of a density, but percentages are an odd 
in-between sort of thing.) 

Anyways, the scaling factor is the bin width (you figure out whether to 
multiply or divide, I get it wrong every second time), possibly multiplied by 
100%. A pragmatic way out would seem to be to switch out dnorm with dnorm_scaled 
<- function(...) scale*dnorm(...) in the call to panel.mathdensity.
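
For a "percent" histogram with equal-width bins, percent = density * bin 
width * 100, so a rough sketch of that scaling (just an illustration, 
assuming equal-width bins and a numeric vector x) would be:

library(lattice)
x <- rnorm(200)
nint <- 20
scale <- 100 * diff(range(x)) / nint     # 100 * (approximate) bin width
histogram(~ x, nint = nint, type = "percent",
          panel = function(x, ...) {
              panel.histogram(x, ...)
              panel.mathdensity(dmath = function(...) scale * dnorm(...),
                                col = "black",
                                args = list(mean = mean(x, na.rm = TRUE),
                                            sd = sd(x, na.rm = TRUE)))
          })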

-Peter D

On 05 May 2014, at 22:23 , jimdare  wrote:

> Hello,
> 
> This may seem like a simple problem, but it's frustrating me immensely.  I'm
> trying to overlay a normal curve (dnorm) on top of a histogram using the
> code below.  This works find when the type = "density", but the person for
> whom I'm making the plot wants the y axis in percent of total rather than
> density.  When I change type to "percent", I get the histogram scale I'm
> after, but the dnorm plot is greatly reduced.  How could I scale the density
> plot to the percent of total axis.  Alternatively, perhaps there is a way to
> add density to a secondary y axis?
> 
> Thanks in advance for your help.
> 
> Jimdare
> 
> 
> plot<-histogram(~rdf[,j]|Year,nint=20, data=rdf,main = i,strip =
> my.strip,xlab = j,  
>   type = "percent",layout=c(2,1),
>   panel=function(x, ...) {
>   panel.histogram(x, ...)
> 
>   panel.mathdensity(dmath=dnorm, col="black", 
>  # Add na.rm = TRUE to mean() and sd()
>args=list(mean=mean(x, na.rm = TRUE),
>sd=sd(x, na.rm = TRUE)), ...)
>}) 
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Lattice-Histogram-with-Normal-Curve-Y-axis-as-percentages-tp469.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Parsing XML file to data frame

2014-05-06 Thread starcraz
Tim - the file is a hyperlink at the beginning of the message called
'sample.xml' or here's the hyperlink
http://r.789695.n4.nabble.com/file/n4689883/sample.xml



--
View this message in context: 
http://r.789695.n4.nabble.com/Parsing-XML-file-to-data-frame-tp4689883p4690021.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with products in R ?

2014-05-06 Thread Martin Maechler
[.]

Sorry to prolong this thread,  but I'm a bit astonished.

'bc' was a really great tool when it was created (1975, at Bell
Labs, according to Wikipedia) and eventually made available open
source, and I was fond of it at the time.

On the other hand, we have had the GNU GMP and MPFR C libraries,
with state-of-the-art algorithms and in active development, and the
R packages 'gmp' (for a long time) and 'Rmpfr' for several years
now.

As we are R users rather than C programmers (who may be fond of
'bc' because of its C-like syntax), and of course as I'm involved
in the maintenance of both 'gmp' and 'Rmpfr', I wonder why you are
not using these, which contain considerably more functionality than
the bc interface package.
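
For instance, a small sketch, assuming both packages are installed (the
numbers are the ones discussed further down in this thread):

library(gmp)
as.bigz("168988580159") * as.bigz("36662978")   # exact: 6195624596620653502

library(Rmpfr)
# 128 bits of precision are ample to hold this product exactly as well
mpfr(168988580159, precBits = 128) * mpfr(36662978, precBits = 128)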

Martin


> On Mon, May 5, 2014 at 11:09 PM, Richard M. Heiberger  
wrote:
>> Gabor,
>> 
>> Can you confirm that the bc function is supposed to be current.
>> The bc package works with my Mac, but not with Windows.
>> I keep getting the message
>> 
>> Error in system(cmd, input = input, intern = TRUE) : -l
>> 
>> The FAQ on your https://code.google.com/p/r-bc/ page didn't get me past 
that
>> problem.  I tried both the download bc.zip from the page and also the 
cygwin bc.
>> 
>> A secondary issue is that placing your bc.exe into c:/Program
>> Files/R/R-3.1.0/library/bc/bcdir/bc.exe
>> gives the message
>>> one <- bc(1)
>> Error in system(cmd, input = input, intern = TRUE) :
>> 'C:/Program' not found
>> 
>> Working around that is possible with
>>> bc.cmd <- "C:/Progra~1/R/R-3.1.0/library/bc/bcdir/bc.exe -l"
>>> one <- bc(1, cmd=bc.cmd)
>> but the next line gives the same problem
>>> one
>> Error in system(cmd, input = input, intern = TRUE) :
>> 'C:/Program' not found
>> 
>> Thanks
>> Rich
>> 
>> 
>> On Sun, May 4, 2014 at 1:10 PM, Gabor Grothendieck
>>  wrote:
>>> Checking this with the bc R package (https://code.google.com/p/r-bc/),
>>> the Ryacas package (CRAN), the gmp package (CRAN) and the Windows 8.1
>>> calculator all four give the same result:
>>> 
 library(bc)
 bc("168988580159 * 36662978")
>>> [1] "6195624596620653502"
>>> 
 library(Ryacas)
 yacas("168988580159 * 36662978", retclass = "character")
>>> 6195624596620653502
>>> 
 library(gmp)
 as.bigz("168988580159") * as.bigz("36662978")
>>> Big Integer ('bigz') :
>>> [1] 6195624596620653502
>>> 
>>> 
>>> On Sun, May 4, 2014 at 12:50 PM, Ted Harding  
wrote:
 On 04-May-2014 14:13:27 Jorge I Velez wrote:
> Try
> 
> options(digits = 22)
> 168988580159 * 36662978
> # [1] 6195624596620653568
> 
> HTH,
> Jorge.-
 
 Err, not quite ... !
 I hitch my horses to my plough (with help from R):
 
 options(digits=22)
 168988580159*8 = 1351908641272 (copy down)
 168988580159*7 = 1182920061113  ( " " )
 168988580159*9 = 1520897221431  ( " " )
 168988580159*2 =  337977160318  ( " " )
 168988580159*6 = 1013931480954  ( " " )^3
 168988580159*3 =  506965740477  ( " " )
 
        1351908641272
       11829200611130
      152089722143100
      337977160318000
    10139314809540000
   101393148095400000
  1013931480954000000
  5069657404770000000
  ===================
  6195624596620653502
 [after adding up mentally]
 
 compared with Jorge's:
 6195624596620653568
 
 ("02" vs "68" in the final two digits).
 
 Alternatively, if using a unixoid system with 'bc' present,
 one can try interfacing R with 'bc'. 'bc' is a calculating
 engine which works to arbitrary precision.
 
 There certainly used to be a utility with which R could invoke 'bc',
 pass it a 'bc' command, and get the result
 returned as a string, but I can't seem to find it on CRAN now.
 In any case, the raw UNIX command line for this calculation
 with 'bc' (with result) is:
 
 $ bc -l
 [...]
 168988580159 * 36662978
 6195624596620653502
 quit
 
 which agrees with my horse-drawn working.
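
 From within R on a Unix-alike with bc on the PATH, one rough way to do
 the same (just a sketch, not the old CRAN utility mentioned above):
 
 out <- system2("bc", args = "-l",
                input = "168988580159 * 36662978",
                stdout = TRUE)
 out   # the exact product, returned as a character string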
 
 Best wishes to all,
 Ted.
 
> On Sun, May 4, 2014 at 10:44 PM, ARTENTOR Diego Tentor <
> diegotento...@gmail.com> wrote:
> 
> Trying an algorithm for products with large numbers, I encountered a
> difference between the result of 168988580159 * 36662978 in my
> algorithm and R's product. The Microsoft calculator confirms my number.
>> 
> Thanks.
> --
> *Gráfica ARTENTOR  *
>> 
> de Diego L. Tentor
> Ec