[R] help:parLapply error

2015-09-23 Thread chunjing....@chetuobang.com
  Dear all:
  I use R (version 3.2.2) parallel-processing code as follows:
cat("parall cal rtic:",nListLink, 
",params:",length(arimaPreParams),"start...","\n")
   cl <- makeCluster(getOption("cl.cores", 12));
   system.time({
   res <- parLapply(cl,1:nListLink,linkProcess)
 });
  stopCluster(cl);
The following error was reported:
Error in checkForRemoteErrors(val) :
  one node produced an error: root finding code failed
Calls: pre_channel ... clusterApply -> staticClusterApply -> checkForRemoteErrors
I would like to know what this error means and what causes it. I searched the old mail archives but did not find an answer.
Please help.
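
The error is raised on one of the worker nodes and only re-thrown on the master by checkForRemoteErrors(), so the traceback does not show which input failed; the message looks like one from a polynomial root-finding routine called somewhere inside linkProcess. A sketch of one way to locate the failing input (assuming linkProcess takes a single index, as in the call above) is to return errors instead of stopping:

clusterExport(cl, "linkProcess")     # make sure the worker function is visible on the nodes
safeLink <- function(i) {
  tryCatch(linkProcess(i),
           error = function(e) simpleError(conditionMessage(e)))
}
res    <- parLapply(cl, 1:nListLink, safeLink)
failed <- which(vapply(res, inherits, logical(1), what = "error"))
failed                               # indices for which linkProcess() failed
conditionMessage(res[[failed[1]]])   # the corresponding error message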
   
 Best regards,

chunjing@chetuobang.com

   
   





[R] Appropriate specification of random effects structure for EEG/ERP data: including Channels or not?

2015-09-23 Thread Paolo Canal
Dear r-help list,

I work with EEG/ERP data and this is the first time I am using LMM to 
analyze my data (using lme4).
The experimental design is a 2X2: one manipulated factor is agreement, 
the other is noun (agreement being within subjects and items, and noun 
being within subjects and between items).

The data matrix is 31 subjects * 160 items * 33 channels. In ERP 
research, the distribution of the EEG amplitude differences (in a time 
window of interest) is important: we care about knowing whether a 
negative difference is occurring at parietal or frontal electrodes. At 
the same time, information from a single channel is often too noisy, so 
channels are organized into topographic factors for evaluating differences 
in distribution. In the present case I have assigned each channel to one 
of three levels of each of two factors, i.e., Longitude (Anterior, Central, 
Parietal) and Medial (Left, Midline, Right): for instance, one channel 
is Anterior and Left. With traditional ANOVAs, channels from the same 
level of the topographic factors are averaged before variance is evaluated, 
which also has the benefit of reducing the noise picked up by the 
electrodes.

I am having trouble deciding on the random structure of my model. Very few 
examples of LMMs on ERP data exist (e.g., Newman, Tremblay, Nichols, 
Neville & Ullman, 2012) and little detail is provided about the 
treatment of channel. I feel it is a tricky term, but an important one for 
optimizing fit. Newman et al. say "data from each electrode within an ROI 
were treated as repeated measures of that ROI". In Newman et al., the 
ROIs are the 9 regions deriving from Longitude x Medial (Anterior-Left, 
Anterior-Midline, Anterior-Right, Central-Left, and so on), so in a 
way they treated each ROI separately and not according to the relevant 
dimensions of Longitude and Medial.

We used the following specifications in lmer:

[fixed effects specification: µV ~ Agreement * Noun * Longitude * Medial 
* (cov1 + cov2 + cov3 + cov4)] (the terms within brackets are a series 
of individual covariates, most of which are continuous variables)

[random effects specification: (1 + Agreement * Type of Noun | subject) + 
(1 + Agreement | item) + (1 | longitude:medial:channel)]

What I care the most about is the last term 
(1|longitude:medial:channel). I chose this specification because I 
thought that allowing each channel to have different intercepts in the 
random structure would affect the estimation of the topographic fixed 
effects (Longitude and Medial) in which channel is nested. Unfortunately 
a reviewer commented that since "channel is not included in the fixed 
effects I would probably leave that out".

But each channel is a repeated measure of the EEG amplitude inside the 
two topographic factors, and random terms do not have to appear in the fixed 
structure; otherwise we would also have to include subjects and items in the 
fixed-effects structure. So I feel that including channel as a 
random effect is correct, and having it nested in longitude:medial 
allows me to relax the assumption that the effect in the EEG always has the 
same longitude:medial distribution. But I might be wrong.

I thus tested differences in fit (ML) with anova() between the model with 
(1|longitude:medial:channel), the same model without that term, and a 
third model with the simpler (1|longitude:medial).

Fullmod vs Nochannel:

         Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
modnoch 119 969479 970653 -484621   969241
fullmod 120 968972 970156 -484366   968732 508.73      1  < 2.2e-16 ***

The difference in fit is remarkable (no variance components with estimates 
close to zero; no correlation parameters with values close to ±1).

Fullmod vs SimplerMod:

            Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
fullmod    120 968972 970156 -484366   968732
simplermod 120 969481 970665 -484621   969241     0      0          1

Here the number of parameters to estimate in fullmod and simplermod is 
the same, but the improvement in fit is substantial (-509 BIC). So I 
guess that although the chi-square is not significant, we do have a strong 
improvement in fit. As I understand it, a model with better fit will give 
more accurate estimates, and I would be inclined to keep the fullmod 
random structure.

But perhaps I am missing something or I am doing something wrong. Which 
is the correct random structure to use?

Feedback is very much appreciated. I often find answers in the list, 
and this is the first time I post a question.
Thanks,
Paolo











Re: [R] doubt with Odds ratio - URGENT HELP NEEDED

2015-09-23 Thread Michael Dewey

Dear Rosa

It would help if you posted the error messages where they occur so that 
we can see which of your commands caused which error. However see 
comment inline below.


On 22/09/2015 22:17, Rosa Oliveira wrote:

Dear all,


I’m trying to compute Odds ratio and OR confidence interval.

I’m really naive, sorry for that.


I attach my data and my code.

I’m having lots of errors:

1. Error in data.frame(tas1 = tas.data$tas_d2, tas2 = tas.data$tas_d3, tas3 = 
tas.data$tas_d4,  :
   arguments imply differing number of rows: 90, 0



At least one of tas_d2, tas_d3, tas_d4 does not exist

I suggest fixing that one and hoping the rest go away.
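
A quick way to see which of the expected columns are actually present after read.spss() (a sketch; the vector of names below is taken from your own code):

expected <- c("tas_d2", "tas_d3", "tas_d4", "tas_d5", "tas_d6", "age", "hosp", "ID")
setdiff(expected, names(tas.data))   # names your code uses that are not in the data
str(tas.data)                        # the actual column names and types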


2. Error in data.frame(tas = c(unlist(tas.data[, -8:-6])), time = rep(c(0:4),  :
   arguments imply differing number of rows: 630, 450, 0

3. Error: object 'tas.data.long' not found

4. Error in data.frame(media = c(mean.dead, mean.alive), standarderror = 
c(se.dead,  :
   arguments imply differing number of rows: 14, 10

5. Error in ggplot(summarytas, aes(x = c(c(1:5), c(1:5)), y = mean, colour = 
discharge)) :
   object 'summarytas' not found

6. Error in summary(glm(tas.data[, 6] ~ tas.data[, 4], family = binomial(link = 
probit))) :
   error in evaluating the argument 'object' in selecting a method for function 
'summary': Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

7. Error in wilcox.test.default(pred[obs == 1], pred[obs == 0], alternative = 
"great") :
   not enough (finite) 'x' observations
In addition: Warning message:
In is.finite(x) & apply(pred, 1, f) :
   longer object length is not a multiple of shorter object length


and of course I'm not getting the OR.

With all these errors, I have not been able to get the code to compute 
the OR and its confidence interval.


Can anyone help me, please? It's really urgent.

THE CODE:

the hospital outcome is discharge.

require(gdata)
library(foreign)
library(nlme)
library(lme4)
library(boot)
library(MASS)
library(Hmisc)
library(plotrix)
library(verification)
library(mvtnorm)
library(statmod)
library(epiR)

#
# Data preparation  
#
#

setwd("/Users/RO/Desktop")

casedata <-read.spss("tas_05112008.sav")
tas.data<-data.frame(casedata)

#Delete patients that were not discharged
tas.data <- tas.data[ tas.data$hosp!="si ",]
tas.data$resultado.hosp  <- ifelse(tas.data$hosp=="l", 0, 1)

tas.data$tas_d2 <- log(ifelse(tas.data$tas_d2==8 | tas.data$tas_d2==9, NA, tas.data$tas_d2))
tas.data$tas_d3 <- log(ifelse(tas.data$tas_d3==8 | tas.data$tas_d3==9, NA, tas.data$tas_d3))
tas.data$tas_d4 <- log(ifelse(tas.data$tas_d4==8 | tas.data$tas_d4==9, NA, tas.data$tas_d4))
tas.data$tas_d5 <- log(ifelse(tas.data$tas_d5==8 | tas.data$tas_d5==9, NA, tas.data$tas_d5))
tas.data$tas_d6 <- log(ifelse(tas.data$tas_d6==8 | tas.data$tas_d6==9, NA, tas.data$tas_d6))

tas.data$age <- ifelse(tas.data$age==8 | tas.data$age==9, NA, tas.data$age)

tas.data <- data.frame(tas1 = tas.data$tas_d2, tas2 = tas.data$tas_d3,
                       tas3 = tas.data$tas_d4, tas4 = tas.data$tas_d5,
                       tas5 = tas.data$tas_d6, age = tas.data$age,
                       discharge = tas.data$resultado.hosp, id.pat = tas.data$ID)

# tas.data$discharge <- factor(tas.data$discharge, levels=c(0,1), labels = c("dead", "alive"))

# select only cases that have more than 3 tas
tas.data <- tas.data[apply(tas.data[, -8:-6], 1, function(x) sum(!is.na(x))) > 2, ]

nsample <- n.obs <- dim(tas.data)[1]  # nr of patients with more than 2 tas measurements

tas.data.long <- data.frame(tas = c(unlist(tas.data[, -8:-6])), time = rep(c(0:4), each = n.obs),
                            age = rep(tas.data$age, 5), discharge = rep(tas.data$discharge, 5),
                            id = rep(c(1:n.obs), 5))
tas.data.long <- tas.data.long[order(tas.data.long$id), ]

age <- tas.data$age

##
# PLOT EMPIRICAL MEANS OF CRP FOR ALIVE & DEATH
##
mean.alive <- apply(tas.data[tas.data$discharge==0, -8:-6], 2, mean, na.rm=T)
mean.dead  <- apply(tas.data[tas.data$discharge==1, -8:-6], 2, mean, na.rm=T)
stderr     <- function(x) sqrt(var(x, na.rm=TRUE)/length(na.omit(x)))

[R] Error: pandoc version 1.12.3 or higher is required and was not found

2015-09-23 Thread Ryszard Czermiński
I am trying to use R Markdown, but a call to render() gives me an error:
Error: pandoc version 1.12.3 or higher is required and was not found.

As I understand it [http://yihui.name/knitr/demo/pandoc/], pandoc is a function
defined in knitr, which I have installed, and it does have a pandoc() function
defined.

This looks like some version-incompatibility issue, but I do not really know how
to resolve it.
Do I need to go back to an older R version to use it?

I would appreciate your help.

Best regards,
Ryszard

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.3 (Yosemite)
other attached packages: knitr_1.11
[...]

Ryszard Czerminski
508-358-6328
rysz...@czerminski.net
LinkedIn.com/in/Ryszard.Czerminski



Re: [R] Running R on a hosting o server

2015-09-23 Thread John McKown
On Tue, Sep 22, 2015 at 2:55 PM, bgnumis bgnum  wrote:

> Hi all,
>
> Hope I can explain:
>
> I want to "run" some function online (code written by me) so that this code
> "saves" an output file on my server and my HTML web page reads the file; R
> plots and saves it (at the moment I would manually run the R function, open FileZilla, and
> transfer the output png or jpg file).
>
> Is it possible to do this "automatically", telling R something like: every
> 15 minutes run this "pru.txt" file, save this plot.png, and execute
> FileZilla with these inputs to save the plot in this folder?
>

​In Linux (or any UNIX like system), you can schedule tasks to be run by
using "cron" (do a "man crontab" for some information, if you need to). So,
if you can make a "shell script" which does all of your work without user
input, then you can use "cron" to schedule it periodically, every "n"
minutes, daily, weekly on ???day, monthly, etc. I don't know Filezilla, so
I don't know what you are really doing with it.

Windows has a similar scheduler to run "bat" files or PowerShell
scripts. I don't _DO_ Windows!

Mac OSX - not a fanboy, try somebody else. 



>
> Hope you can understand me.
>
> In rmarkdown it is true that it outputs an html file, but my intention is to run my
> own function; the need is to run my function in R, then run and open
> FileZilla and deposit the file in the right place.
>
>
​If you are using Filezilla to copy a file, where is it being copied to? In
UNIX, I'd see if I could use just the plain "cp" command (for local, NFS,
or CIFS attached places) or either the "ftp" or "scp" command to copy to a
different server. In Windows the "copy" command could be used to copy the
file to a different folder locally or on a "share" (Windows "share" is,
more or less, the same as UNIX CIFS, if you're interested)​. Again, if this
needs to go to some other server, then a scripted "ftp" should work. One of
my co-workers does this.
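
Putting the two pieces together, a minimal sketch of such a script (every name, path and host below is a placeholder) could be saved as, say, pru.R and scheduled with a crontab entry like "*/15 * * * * Rscript /home/user/pru.R":

## pru.R -- regenerate the plot and copy it to the web server
## (file paths, user and host are placeholders)
png("/tmp/plot.png", width = 800, height = 600)
plot(rnorm(100), type = "l", main = format(Sys.time()))
dev.off()

## replaces the manual FileZilla step: scp the image to the folder the web page reads
system("scp /tmp/plot.png user@example.com:/var/www/html/plot.png")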

-- 

Schrodinger's backup: The condition of any backup is unknown until a
restore is attempted.

Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown


Re: [R] Unexpected/undocumented behavior of 'within': dropping variable names that start with '.'

2015-09-23 Thread Martin Maechler
> Hadley Wickham 
> on Sun, 20 Sep 2015 14:23:43 -0400 writes:

> The problem is that within.data.frame calls as.list.environment with
> the default value of all.names = FALSE. I doubt this is a deliberate
> feature, and is more likely to be a minor oversight.

Indeed;
Thank you, Hadley (and Brian)!

It is fixed now in R-devel and will be ported to R-patched,
probably tomorrow.

Martin
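
For the record, the underlying behaviour is easy to see directly with as.list() on an environment (a minimal sketch):

e <- new.env()
e$.id <- 1
e$a   <- 2
names(as.list(e))                    # only "a": names starting with "." are dropped by default
names(as.list(e, all.names = TRUE))  # both ".id" and "a"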

> Hadley

> On Sun, Sep 20, 2015 at 11:49 AM, Brian  wrote:
>> Dear List,
>> 
>> Somewhere I missed something, and now I'm really missing something!
>> 
>>> d.f <- data.frame(.id = c(TRUE, FALSE, TRUE), dummy = c(1, 2, 3), a =
>> c(1, 2, 3), b = c(1, 2, 3) + 1)
>> > within(d.f, {d = a + b})
>> dummy a b d
>> 1 1 1 2 3
>> 2 2 2 3 5
>> 3 3 3 4 7
>> > d.f <- data.frame(.id = c(TRUE, FALSE, TRUE), .dummy = c(1, 2, 3), a
>> = c(1, 2, 3), b = c(1, 2, 3) + 1)
>> > within(d.f, {d = a + b})
>> a b d
>> 1 1 2 3
>> 2 2 3 5
>> 3 3 4 7
>> 
>> Could somebody please explain to me why this does this? I think it could be
>> considered a feature (for lots of calculations within a data frame you
>> don't have to remove all the extra variables at the end).  I just wish it
>> was documented.
>> 
>> Cheers,
>> Brian
>> 
>> 
>> sessionInfo()
>> R version 3.1.0 (2014-04-10)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> 
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>> [9] LC_ADDRESS=C   LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] splines   grid  stats graphics  grDevices utils datasets
>> [8] methods   base
>> 
>> other attached packages:
>> [1] scales_0.2.4   plyr_1.8.3 reshape2_1.4
>> ccchDataProc_0.7
>> [5] ccchTools_0.6  xtable_1.7-4   tables_0.7.79  Hmisc_3.14-5
>> [9] Formula_1.1-2  survival_2.37-7ggplot2_1.0.1
>> IDPmisc_1.1.17
>> [13] lattice_0.20-29myRplots_1.1   myRtools_1.2   
meteoconv_0.1
>> [17] pixmap_0.4-11  RColorBrewer_1.0-5 maptools_0.8-30sp_1.1-1
>> [21] mapdata_2.2-3  mapproj_1.2-2  maps_2.3-9 
chron_2.3-45
>> [25] MASS_7.3-35
>> 
>> loaded via a namespace (and not attached):
>> [1] acepack_1.3-3.3 cluster_1.15.2  colorspace_1.2-4
>> [4] compiler_3.1.0  data.table_1.9.4digest_0.6.4
>> [7] foreign_0.8-61  gtable_0.1.2labeling_0.3
>> [10] latticeExtra_0.6-26 munsell_0.4.2   nnet_7.3-8
>> [13] proto_0.3-10Rcpp_0.12.0 rpart_4.1-8
>> [16] stringr_0.6.2   tools_3.1.0
>> > within
>> function (data, expr, ...)
>> UseMethod("within")
>> 
>> 
>> 



> -- 
> http://had.co.nz/




[R] Issues fitting Gompertz-Makeham model to mortality data

2015-09-23 Thread Marion Keast
Good evening,

I have a very basic knowledge of how to use R and am struggling with my
code to fit a Gompertz-Makeham model to data. The code and error message are
as follows:

> data <- hmd.mx("AUS","mkeas...@hotmail.com","1Mrkqazwsx", "country")
Warning message:
In hmd.mx("AUS", "mkeas...@hotmail.com", "1Mrkqazwsx", "country") :
  NAs introduced by coercion
> summary(data)
Mortality data for country
Series: female male total
Years: 1921 - 2011
Ages:  0 - 110
> m.qx <- data$rate$male[1:111,1:90]
> f.qx <-data$rate$female[1:111,1:90]
> age<-data$age[21:90]
>
> n=111
> m=90
> beta1=(1:m)*0
> beta2=(1:m)*0
> beta3=(1:m)*0
> qhat.mx=array(0,dim=c(n,m))
> qhat.fx=array(0,dim=c(n,m))
>
> #j is ages 0 - 110
> #i is years 1921 - 2011
>
> #MALES
> for (i in 1:m)
+ {
+ model1 <-fitGM(,m.qx[1:n,i])
+ m1 <- model1[1]
+ m2 <- model1[2]
+ m3 <- model1[3]
+ beta1[i]=m1
+ beta2[i]=m2
+ beta3[i]=m3
+ qhat.mx[,i]=GompertzMakeham(beta1[i],beta2[i],beta3[i],age)
+ }
Error in qhat.mx[, i] = GompertzMakeham(beta1[i], beta2[i], beta3[i],  :
  number of items to replace is not a multiple of replacement length
>
> MAPE.m=sum(abs(qhat.mx-m.qx)/m.qx)/(n*m)
> MaPE.m=sum((qhat.mx-m.qx)/m.qx)/(n*m)

Please advise on how to fix these errors.
Furthermore, if I would like to compute the MAPE for each individual age instead of
for the whole population, is there a loop I can use to achieve this?
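
(A quick check that may be relevant: the lengths do not match, which would explain the "number of items to replace is not a multiple of replacement length" message.)

length(age)    # 70, from data$age[21:90]
nrow(qhat.mx)  # 111, so qhat.mx[, i] cannot be filled by a length-70 vector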

Thank you very much for your time and assistance.

Marion



Re: [R] Error: pandoc version 1.12.3 or higher is required and was not found

2015-09-23 Thread Marc Schwartz

> On Sep 23, 2015, at 6:13 AM, Ryszard Czermiński  
> wrote:
> 
> I am trying to use R Markdown, but call to render() gives me an error:
> Error: pandoc version 1.12.3 or higher is required and was not found.
> 
> As I understand [http://yihui.name/knitr/demo/pandoc/] pandoc is a function
> defined in knitr, which I have installed and it has pandoc() function
> defined.
> 
> Looks like some version incompatibility issue, but I do not really know how
> to resolve it.
> Do I need to go to older R version to use it?
> 
> I would appreciate your help.
> 
> Best regards,
> Ryszard
> 
>> sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.10.3 (Yosemite)
> other attached packages: knitr_1.11
> [...]
> 
> Ryszard Czerminski
> 508-358-6328
> rysz...@czerminski.net
> LinkedIn.com/in/Ryszard.Czerminski


Did you actually install pandoc?

As per the page that you link to above:

"Please follow the instructions on the Pandoc website to install it."

The link in the above sentence is:

  http://johnmacfarlane.net/pandoc/
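
Once pandoc itself is installed, you can check what R sees; a quick sketch using the rmarkdown package:

rmarkdown::pandoc_available("1.12.3")   # TRUE if a new enough pandoc is on the PATH
rmarkdown::pandoc_version()             # which pandoc version (if any) was found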

Regards,

Marc Schwartz


Re: [R] retaining characters in a csv file

2015-09-23 Thread Therneau, Terry M., Ph.D.

Thanks to all for the comments; I hadn't intended to start a war.

My summary:
  1. Most important: I wasn't missing something obvious.  This is always my first 
suspicion when I submit something to R-help, and it's true more often than not.


  2. Obviously (at least it is now), the CSV standard does not specify that quotes should 
force a character result.  R is not "wrong".  As for using what Excel does as a litmus test, 
I consider that to be totally uninformative about standards: neither pro (like Duncan) nor 
anti (like Rolf), but simply irrelevant.  (Like many MS choices.)


  3. I'll have to code my own solution: either pre-scan the first few lines to create 
a colClasses vector (a sketch is below), or use read_csv from the readr package (if there are 
leading zeros it keeps the string as character, which may suffice for my needs), or something else.


  4. The source of the data is a "text/csv" field coming from an http POST request.  This 
is an internal service on an internal Mayo server and coded by our own IT department; this 
will not be the first case where I have found that their definition of "csv" is not quite 
standard.
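
A sketch of the colClasses route from point 3 (the column positions here are only illustrative):

## force the columns that must keep leading zeros to stay character;
## NA means "let read.csv decide" for that column
cc  <- c("numeric", "character", NA, NA, "character")
dat <- read.csv("input.csv", colClasses = cc, stringsAsFactors = FALSE)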


Terry T.




On 23/09/15 10:00, Therneau, Terry M., Ph.D. wrote:

I have a csv file from an automatic process (so this will happen
thousands of times), for which the first row is a vector of variable
names and the second row often starts something like this:

5724550,"000202075214",2005.02.17,2005.02.17,"F", .

Notice the second variable which is
   a character string (note the quotation marks)
   a sequence of numeric digits
   leading zeros are significant

The read.csv function insists on turning this into a numeric.  Is there
any simple set of options that
will turn this behavior off?  I'm looking for a way to tell it to "obey
the bloody quotes" -- I still want the first, third, etc columns to
become numeric.  There can be more than one variable like this, and not
always in the second position.

This happens deep inside the httr library; there is an easy way for me
to add more options to the read.csv call but it is not so easy to
replace it with something else.




Re: [R] Error: pandoc version 1.12.3 or higher is required and was not found

2015-09-23 Thread Duncan Murdoch
On 23/09/2015 7:13 AM, Ryszard Czermiński wrote:
> I am trying to use R Markdown, but call to render() gives me an error:
> Error: pandoc version 1.12.3 or higher is required and was not found.
> 
> As I understand [http://yihui.name/knitr/demo/pandoc/] pandoc is a function
> defined in knitr, which I have installed and it has pandoc() function
> defined.

The error is talking about the non-R software package called pandoc,
which the pandoc() function makes use of.  Pandoc the package does the
heavy lifting, converting the Markdown into HTML, for example.

Duncan Murdoch

> 
> Looks like some version incompatibility issue, but I do not really know how
> to resolve it.
> Do I need to go to older R version to use it?
> 
> I would appreciate your help.
> 
> Best regards,
> Ryszard
> 
>> sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.10.3 (Yosemite)
> other attached packages: knitr_1.11
> [...]
> 
> Ryszard Czerminski
> 508-358-6328
> rysz...@czerminski.net
> LinkedIn.com/in/Ryszard.Czerminski
> 


Re: [R] doubt with Odds ratio - URGENT HELP NEEDED

2015-09-23 Thread Michael Dewey

Dear Rosa

Can you remove all the code which is not relevant to calculating the 
odds ratio so we can see what is going on?
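
In the meantime, the usual recipe for an odds ratio and its confidence interval from a logistic glm is to exponentiate the coefficients and their confidence limits. A minimal sketch, assuming discharge is the 0/1 outcome and tas3 one of the predictors in tas.data (note the logit link: a probit fit does not give odds ratios directly):

fit <- glm(discharge ~ tas3, data = tas.data, family = binomial(link = "logit"))
exp(coef(fit))      # odds ratios
exp(confint(fit))   # profile-likelihood confidence intervals (uses MASS, which you already load)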


On 23/09/2015 16:06, Rosa Oliveira wrote:

Dear Michael,


I found some of the errors, but others I wasn’t able to.

And my huge huge problem concerns OR and OR confidence interval :(


*New Corrected code:*


casedata <-read.spss("tas_05112008.sav")
tas.data<-data.frame(casedata)

#Delete patients that were not discharged
tas.data <- tas.data[ tas.data$hosp!="si ",]
tas.data$resultado.hosp  <- ifelse(tas.data$hosp=="l", 0, 1)

tas.data$tas_d2 <- log(ifelse(tas.data$tas_d2==8 | tas.data$tas_d2==9, NA, tas.data$tas_d2))
tas.data$tas_d3 <- log(ifelse(tas.data$tas_d3==8 | tas.data$tas_d3==9, NA, tas.data$tas_d3))
tas.data$tas_d4 <- log(ifelse(tas.data$tas_d4==8 | tas.data$tas_d4==9, NA, tas.data$tas_d4))
tas.data$tas_d5 <- log(ifelse(tas.data$tas_d5==8 | tas.data$tas_d5==9, NA, tas.data$tas_d5))
tas.data$tas_d6 <- log(ifelse(tas.data$tas_d6==8 | tas.data$tas_d6==9, NA, tas.data$tas_d6))

tas.data$age <- ifelse(tas.data$age==8 | tas.data$age==9, NA, tas.data$age)

tas.data <- data.frame(tas1 = tas.data$tas_d2, tas2 = tas.data$tas_d3,
                       tas3 = tas.data$tas_d4, tas4 = tas.data$tas_d5,
                       tas5 = tas.data$tas_d6, age = tas.data$age,
                       discharge = tas.data$resultado.hosp, id.pat = tas.data$id)

# tas.data$discharge <- factor(tas.data$discharge, levels=c(0,1), labels = c("dead", "alive"))

# select only cases that have more than 3 tas
tas.data <- tas.data[apply(tas.data[, -8:-6], 1, function(x) sum(!is.na(x))) > 2, ]

nsample <- n.obs <- dim(tas.data)[1]  # nr of patients with more than 2 tas measurements

tas.data.long <- data.frame(tas = c(unlist(tas.data[, -8:-6])), time = rep(c(0:4), each = n.obs),
                            age = rep(tas.data$age, 5), discharge = rep(tas.data$discharge, 5),
                            id = rep(c(1:n.obs), 5))
tas.data.long <- tas.data.long[order(tas.data.long$id), ]

age <- tas.data$age

##
# PLOT EMPIRICAL MEANS OF CRP FOR ALIVE & DEATH
##
mean.alive <- apply(tas.data[tas.data$discharge==0, -8:-6], 2, mean, na.rm=T)
mean.dead  <- apply(tas.data[tas.data$discharge==1, -8:-6], 2, mean, na.rm=T)
stderr     <- function(x) sqrt(var(x, na.rm=TRUE)/length(na.omit(x)))
se.alive   <- apply(tas.data[tas.data$discharge==0, -8:-6], 2, stderr)
se.dead    <- apply(tas.data[tas.data$discharge==1, -8:-6], 2, stderr)
summarytas <- data.frame(media = c(mean.dead, mean.alive),
                         standarderror = c(se.dead, se.alive),
                         discharge = c(rep("dead", 5), rep("alive", 5)))

ggplot(summarytas, aes(x = c(c(1:5), c(1:5)), y = mean, colour = discharge)) +
  geom_errorbar(aes(ymin = mean - 2*standarderror, ymax = media + 2*standarderror), width = .1) +
  scale_color_manual(values = c("blue", "red")) +
  theme(legend.text = element_text(size = 20),
        axis.text = element_text(size = 16),
        axis.title = element_text(size = 20, face = "bold")) +
  scale_x_continuous(name = "Days") +
  scale_y_continuous(name = "log tas") +
  geom_line() +
  geom_point()

library(verification)
prev <- summary(glm(tas.data[, 7] ~ tas.data[, 4], family = binomial(link = probit)))
answer <- c(prev$coefficients[, 1:2])

roc.plot(tas.data[, 7], prev, show.thres = FALSE, legen = F)

modelo <- function(dataainit)
{
  # dataa <- tas.data
  dataa <- dataainit

  dataa$ident <- seq(1:90)
  tas.6days <- cbind(id = rep(dataa$id, 5),
                     tas = c(dataa$tas_d2, dataa$tas_d3, dataa$tas_d4, dataa$tas_d5, dataa$tas_d6),
                     time = rep(c(2:6) - 2, each = 90), out.come = rep(dataa$hosp, 5),
                     ident = rep(dataa$ident, 5))
  tas.6days <- data.frame(tas.6days[order(tas.6days[, 1]), ])
  tas.6days$tas[tas.6days$tas==8 | tas.6days$tas==9] <- NA

  # mixed model for the longitudinal tas
  lme.1 <- lme(tas ~ time + 1, random = ~ time + 1 | ident, data = tas.6days, na.action = na.exclude)

  # Random intercept and slopes
  pred.lme <- predict(lme.1)
  lme.intercept <- lme.1$coef$random$ident[, 1] + lme.1$coef$fixed[1]
  lme.slope     <- lme.1$coef$random$ident[, 2] + lme.1$coef$fixed[2]
  selector <- as.numeric(names(lme.intercept))  # to select not-NA rows; apply to the vector in the dataset

  test(dataa$intercept[resultado.hosp==1], dataa$intercept[resultado.hosp==0])
  print(summary(model.surv1))
  return(model.surv1$coef)
}


*CONSOLE RESULT: (errors in red)*

 > casedata <-read.spss("tas_05112008.sav")
Warning message:
In read.spss("tas_05112008.sav") :
  

[R] How to config the RStudio

2015-09-23 Thread Cleber Borges

Dear useRs,

1) How can I configure plotting to go to a windows device by default in 
RStudio?
2) In the layout panels, how can I configure one panel to start in "minimized 
mode"?
(I would like to see only the Environment panel opened, and only the 
headers of the other panels, when RStudio starts.)
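
For question 1, the only workaround I know is to open the window device explicitly before plotting (a sketch below), but I would like it to be the default:

windows()    # on Windows; x11() on Linux, quartz() on OS X
plot(1:10)   # subsequent plots go to that window until it is closed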


TIA!

 :-)  cleber




Re: [R] [FORGED] Re: Compare two normal to one normal

2015-09-23 Thread Mark Leeds
Hi Rolf: I have read a decent amount about the AIC, but that was a long,
long time ago. I too am no expert on it, and John should read some of the
AIC literature. John: there's one whole, supposedly great, text just on AIC,
but I don't have it. The link is below. Of course, it's
absurdly expensive, but it does get pretty good reviews.

http://www.amazon.com/Model-Selection-Multimodel-Inference-Information-Theoretic/dp/0387953647/ref=sr_1_1?ie=UTF8=1443026829=8-1=model+selection+aic

Note that if you end up using the AIC approach, you'll still need the log-likelihoods
of both models. I would calculate them yourself; constants like
1/sqrt(2*pi) don't need to be included, of course, since they are just
scaling factors.
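
A rough sketch of that comparison, using the mixtools package for the two-component fit (the package choice and the simulated data are only for illustration):

library(mixtools)
set.seed(1)
x <- c(rnorm(100, 0, 1), rnorm(100, 3, 1))

ll1  <- sum(dnorm(x, mean(x), sd(x), log = TRUE))  # single normal, 2 parameters (sd() is close enough to the MLE here)
aic1 <- -2 * ll1 + 2 * 2

fit2 <- normalmixEM(x, k = 2)                      # two-component mixture, 5 parameters
aic2 <- -2 * fit2$loglik + 2 * 5

c(one.normal = aic1, two.normals = aic2)           # smaller AIC is preferred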











On Wed, Sep 23, 2015 at 2:22 AM, Rolf Turner 
wrote:

> On 23/09/15 16:38, Mark Leeds wrote:
>
>> John: After I sent what I wrote, I read Rolf's intelligent response. I
>> didn't realize that
>> there are boundary issues so yes, he's correct and  my approach is EL
>> WRONGO. I feel very not good that I just sent that email being that it's
>> totally wrong. My apologies for noise
>> and thanks Rolf for the correct response.
>>
>> Oh,  thing that does still hold in my response is  the AIC approach unless
>> Rolf
>> tells us that it's not valid also. I don't see why it wouldn't be though
>> because you're
>> not doing a hypothesis test when you go the AIC route.
>>
>
> 
>
> I am no expert on this, but I would be uneasy applying AIC to such
> problems without having a very close look at the literature on the
> subject.  I'm pretty sure that there *are* regularity conditions that must
> be satisfied in order that AIC should give you a "valid" basis for
> comparison of models.
>
> AIC has most appeal, and is mostly used (in my understanding) in settings
> where there is a multiplicity of models, whereby the multiple comparisons
> problem causes hypothesis testing to lose its appeal. Correspondingly AIC
> has little appeal in a setting in which a single hypothesis test is
> applicable.
>
> I could be wrong about this; as I said, I am no expert.  Perhaps younger
> and wiser heads will chime in and correct me.
>
> cheers,
>
> Rolf
>
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>



Re: [R] [FORGED] Error from lme4: "Error: (p <- ncol(X)) == ncol(Y) is not TRUE"

2015-09-23 Thread Rory Wilson
In reply to Rolf Turner and Jean Adams, who have been helping me:

This does appear to be an issue with NA values in the non-factor variables. In 
the (non-reproducible) example below, we can see that removing the NAs solves 
the problem. However, from what I can see so far, there seems to be no rhyme or 
reason to why the issue is taking place. A slight modification of Rolf Turner's 
code (introducing some NAs) shows that, in general, NAs are not a problem for 
lmer (indeed, it just runs na.omit by default).

Examining which factors are affected by the removal of the NAs shows no 
discernible pattern: no factors disappeared, none became "1" or anything of 
this nature. I will be able to proceed just by performing the na.omit 
beforehand, but it is curious.

Thanks for your help everyone (especially Rolf Turner and Jean Adams)!

mod1 <- lmer(beta ~ expData + techVar$RIN + techVar$sample_storage_time + (1 | techVar$p_amplification))
# Error: (p <- ncol(X)) == ncol(Y) is not TRUE

mod1 <- lm(beta ~ expData + techVar$RIN + techVar$sample_storage_time + techVar$p_amplification)
# No error given.

# trial with eliminating NAs
elimSamps <- which(is.na(beta))
length(elimSamps)
# [1] 4

# eliminate the NAs from all vectors
betaVals  <- beta[-elimSamps]
expD      <- expData[-elimSamps]
techRIN   <- techVar$RIN[-elimSamps]
techTime  <- techVar$sample_storage_time[-elimSamps]
techPlate <- factor(techVar$p_amplification[-elimSamps])

mod1 <- lmer(betaVals ~ expD + techRIN + techTime + (1 | techPlate))
summary(mod1)
# Linear mixed model fit by REML ['lmerMod']
# Formula: betaVals ~ expD + techRIN + techTime + (1 | techPlate)
#
# REML criterion at convergence: -1701.7
#
# Scaled residuals:
#     Min      1Q  Median      3Q     Max
# -2.3582 -0.6996 -0.1085  0.6079  4.6743
#
# Random effects:
#  Groups    Name        Variance  Std.Dev.
#  techPlate (Intercept) 1.645e-05 0.004056
#  Residual              4.991e-03 0.070644
# Number of obs: 709, groups:  techPlate, 29
#
# Fixed effects:
#              Estimate Std. Error t value
# (Intercept) 2.159e-01  9.871e-02   2.188
# expD        2.330e-03  1.498e-02   0.156
# techRIN     5.096e-03  4.185e-03   1.218
# techTime    1.399e-06  1.565e-05   0.089
#
# Correlation of Fixed Effects:
#          (Intr) expD   tchRIN
# expD     -0.919
# techRIN  -0.272 -0.103
# techTime -0.206  0.063  0.021

summary(techPlate)
#  plate01  plate02  plate03  plate04  plate05  plate06  plate07  plate09
#        1       22       34       28       31       28       32       10
# plate09a  plate10  plate11  plate13  plate14  plate15  plate16  plate17
#       16       17       15        4       52       55       41       33
#  plate18  plate19  plate20  plate21  plate22  plate23  plate24  plate25
#       50       42       21       50       13       22       17        7
#  plate26  plate27  plate28  plate30  plate32
#       25       21        5       13        4

summary(techVar$p_amplification)
#  plate01  plate02  plate03  plate04  plate05  plate06  plate07  plate09
#        1       22       34       28       31       28       32       10
# plate09a  plate10  plate11  plate13  plate14  plate15  plate16  plate17
#       17       17       15        4       53       55       41       33
#  plate18  plate19  plate20  plate21  plate22  plate23  plate24  plate25
#       50       42       21       50       13       22       17        8
#  plate26  plate27  plate28  plate30  plate32
#       25       21        5       14        4

# Counter-example, where it functions fine
# Example from Rolf Turner
set.seed(42)
f <- factor(sample(1:29, 713, TRUE))
x <- seq(0, 1, length = 713)
y <- rnorm(713)
require(lme4)
x[sample(1:713, 4, replace = FALSE)] <- NA
fit <- lmer(y ~ x + (1 | f))
# No error message given
  

Re: [R] retaining characters in a csv file

2015-09-23 Thread John Kane




> -Original Message-
> From: r.tur...@auckland.ac.nz
> Sent: Wed, 23 Sep 2015 13:26:58 +1200
> To: pda...@gmail.com
   ..
> I would say that this phenomenon ("Excel does it") is *overwhelming*
> evidence that it is bad practice!!! :-)

Fortune?





[R] reproducing Graphpad IC50 in R with drc package

2015-09-23 Thread Jon Arsenault
I've been banging my head against the wall a bit with this and would be
ecstatic if someone could help.

Initial data:

 Response Dose
1  285.17  0.125
2  377.65  0.250
3  438.99  0.500
4  338.46  1.000
5  227.87  2.000
6  165.96  0.010
7  302.92  0.125
8  418.50  0.250
9  464.69  0.500
10 301.36  1.000
11 213.12  2.000
12 160.34  0.010
13 306.18  0.125
14 435.37  0.250
15 451.34  0.500
16 319.50  1.000
17 219.83  2.000
18 172.52  0.010
19 306.56  0.125
20 439.01  0.250
21 469.74  0.500
22 318.05  1.000
23 223.09  2.000

The GraphPad template that this data would normally be placed in then
takes log(Dose, base = 10) and normalizes the Response.
I've confirmed that I was able to recreate that here:

   NormResponse  LogDose
1        41.011 -0.90309
2        72.909 -0.60206
3        94.067 -0.30103
4        59.392  0.0
5        21.246  0.30103
6        -0.108 -2.0
7        47.133 -0.90309
8        87.000 -0.60206
9       102.932 -0.30103
10       46.595  0.0
11       16.159  0.30103
12       -2.047 -2.0
13       48.258 -0.90309
14       92.819 -0.60206
15       98.327 -0.30103
16       52.852  0.0
17       18.473  0.30103
18        2.155 -2.0
19       48.389 -0.90309
20       94.074 -0.60206
21      104.674 -0.30103
22       52.352  0.0
23       19.598  0.30103




Now, GraphPad uses 'log(inhibitor) vs. normalized response -- Variable
slope', which it states to be a four-parameter sigmoidal, but it gives me very
different results.


GraphPad gives an IC50 of 0.1560 while drm's LL.4 gives 0.1149, and drm won't
even take the transformed data; it just throws an error.
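
For reference, the closest I think drc can get to GraphPad's normalized model is something like the sketch below (assuming a data frame norm with columns NormResponse and Dose, i.e. the untransformed doses). As I understand it, GraphPad's normalized model fixes the bottom at 0 and the top at 100, so those asymptotes are fixed here too:

library(drc)
fit <- drm(NormResponse ~ Dose, data = norm,
           fct = LL.4(fixed = c(NA, 0, 100, NA),
                      names = c("Slope", "Lower", "Upper", "IC50")))
summary(fit)
ED(fit, 50)   # IC50 on the original dose scale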



[R] Progress bar in random forest model

2015-09-23 Thread Mohammad Tanvir Ahamed via R-help
Hi,

I am using a randomForest model in R.

For a large number of trees my program takes a long time to complete.

In the randomForest() function I can use do.trace=TRUE to see the real-time
progress. Sample output on the R console is as follows:

ntree  OOB  1  2  3  4  5  6  7  8  9 
100:   2.31%  7.14%  2.08%  0.00%  2.25% 10.81%  0.90%  0.00%  0.00%  1.72% 
200:   1.95%  7.14%  2.08%  0.00%  2.25%  8.11%  0.00%  0.00%  0.00%  1.72% 
300:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
400:   1.95%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  3.45% 
500:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
600:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
700:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
800:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
900:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
1000:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72%

The first row (100: 2.31% ...) appears first; after a second or so the 2nd row appears,
and so on.
Now I would like to modify this output:

when the 1st row appears, I would like to grab only the "100" from the whole line and
show only that on the R console, instead of the whole line. The same should
happen for the rest of the rows.

[I tried sink(), but it will not work here, as sink() writes the complete output to the
output file.]

[I searched for the do.trace option in the randomForest function, but I got lost, as it
seems the output comes from C code; I'm not sure, and I could not figure it out.]


In general, I would like to capture the real-time output on the R console as it appears.
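
A possible workaround (a sketch; dat and the factor response y are placeholders for my data) would be to grow the forest in chunks with randomForest::combine() and print a simple counter instead of the do.trace line:

library(randomForest)
chunk  <- 100
total  <- 1000
forest <- NULL
for (ntree.done in seq(chunk, total, by = chunk)) {
  part   <- randomForest(y ~ ., data = dat, ntree = chunk)
  forest <- if (is.null(forest)) part else randomForest::combine(forest, part)
  cat(ntree.done, "\n")   # print only the tree count
  # note: combine() keeps the trees but does not recompute the combined OOB error
}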

I will be very grateful if anyone can help me with this.

Thank you

 
Tanvir Ahamed 
Göteborg, Sweden  |  mashra...@yahoo.com


[R] Sampling the Distance Matrix

2015-09-23 Thread Lorenzo Isella

Dear All,
Suppose you have a distance matrix stored like a dist object, for
instance

x<-rnorm(20)
y<-rnorm(20)

mm<-as.matrix(cbind(x,y))

dst<-(dist(mm))

Now, my problem is the following: I would like to get the rows of mm
corresponding to points whose pairwise distances are all larger than, let's say,
0.9.
In other words, if I were to compute the distance matrix on those
selected rows of mm, then apart from the diagonal I would get all entries
larger than 0.9.
Any idea about how I can efficiently code that?
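
A simple greedy sketch of the kind of thing I have in mind (continuing the example above); something more principled or efficient is exactly what I am after:

# keep a point only if it is farther than 0.9 from every point kept so far
# (this gives *a* valid subset, not necessarily the largest possible one)
d    <- as.matrix(dst)
keep <- integer(0)
for (i in seq_len(nrow(mm))) {
  if (length(keep) == 0 || all(d[i, keep] > 0.9)) keep <- c(keep, i)
}
mm[keep, , drop = FALSE]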
Regards

Lorenzo



Re: [R] doubt with Odds ratio - URGENT HELP NEEDED

2015-09-23 Thread Rosa Oliveira
Dear Michael,


I found some of the errors, but others I wasn’t able to.

And my huge huge problem concerns OR and OR confidence interval :(


New Corrected code:


casedata <-read.spss("tas_05112008.sav")
tas.data<-data.frame(casedata)

#Delete patients that were not discharged
tas.data <- tas.data[ tas.data$hosp!="si ",]
tas.data$resultado.hosp  <- ifelse(tas.data$hosp=="l", 0, 1)

tas.data$tas_d2 <- log(ifelse(tas.data$tas_d2==8 | tas.data$tas_d2==9, NA, tas.data$tas_d2))
tas.data$tas_d3 <- log(ifelse(tas.data$tas_d3==8 | tas.data$tas_d3==9, NA, tas.data$tas_d3))
tas.data$tas_d4 <- log(ifelse(tas.data$tas_d4==8 | tas.data$tas_d4==9, NA, tas.data$tas_d4))
tas.data$tas_d5 <- log(ifelse(tas.data$tas_d5==8 | tas.data$tas_d5==9, NA, tas.data$tas_d5))
tas.data$tas_d6 <- log(ifelse(tas.data$tas_d6==8 | tas.data$tas_d6==9, NA, tas.data$tas_d6))

tas.data$age <- ifelse(tas.data$age==8 | tas.data$age==9, NA, tas.data$age)

tas.data <- data.frame(tas1 = tas.data$tas_d2, tas2 = tas.data$tas_d3,
                       tas3 = tas.data$tas_d4, tas4 = tas.data$tas_d5,
                       tas5 = tas.data$tas_d6, age = tas.data$age,
                       discharge = tas.data$resultado.hosp, id.pat = tas.data$id)

# tas.data$discharge <- factor(tas.data$discharge, levels=c(0,1), labels = c("dead", "alive"))

# select only cases that have more than 3 tas
tas.data <- tas.data[apply(tas.data[, -8:-6], 1, function(x) sum(!is.na(x))) > 2, ]

nsample <- n.obs <- dim(tas.data)[1]  # nr of patients with more than 2 tas measurements

tas.data.long <- data.frame(tas = c(unlist(tas.data[, -8:-6])), time = rep(c(0:4), each = n.obs),
                            age = rep(tas.data$age, 5), discharge = rep(tas.data$discharge, 5),
                            id = rep(c(1:n.obs), 5))
tas.data.long <- tas.data.long[order(tas.data.long$id), ]

age <- tas.data$age

##
# PLOT EMPIRICAL MEANS OF CRP FOR ALIVE & DEATH
##
mean.alive <- apply(tas.data[tas.data$discharge==0, -8:-6], 2, mean, na.rm=T)
mean.dead  <- apply(tas.data[tas.data$discharge==1, -8:-6], 2, mean, na.rm=T)
stderr     <- function(x) sqrt(var(x, na.rm=TRUE)/length(na.omit(x)))
se.alive   <- apply(tas.data[tas.data$discharge==0, -8:-6], 2, stderr)
se.dead    <- apply(tas.data[tas.data$discharge==1, -8:-6], 2, stderr)
summarytas <- data.frame(media = c(mean.dead, mean.alive),
                         standarderror = c(se.dead, se.alive),
                         discharge = c(rep("dead", 5), rep("alive", 5)))

ggplot(summarytas, aes(x = c(c(1:5), c(1:5)), y = mean, colour = discharge)) +
  geom_errorbar(aes(ymin = mean - 2*standarderror, ymax = media + 2*standarderror), width = .1) +
  scale_color_manual(values = c("blue", "red")) +
  theme(legend.text = element_text(size = 20),
        axis.text = element_text(size = 16),
        axis.title = element_text(size = 20, face = "bold")) +
  scale_x_continuous(name = "Days") +
  scale_y_continuous(name = "log tas") +
  geom_line() +
  geom_point()

library(verification)
prev <- summary(glm(tas.data[, 7] ~ tas.data[, 4], family = binomial(link = probit)))
answer <- c(prev$coefficients[, 1:2])

roc.plot(tas.data[, 7], prev, show.thres = FALSE, legen = F)

modelo <- function(dataainit)
{
  # dataa <- tas.data
  dataa <- dataainit

  dataa$ident <- seq(1:90)
  tas.6days <- cbind(id = rep(dataa$id, 5),
                     tas = c(dataa$tas_d2, dataa$tas_d3, dataa$tas_d4, dataa$tas_d5, dataa$tas_d6),
                     time = rep(c(2:6) - 2, each = 90), out.come = rep(dataa$hosp, 5),
                     ident = rep(dataa$ident, 5))
  tas.6days <- data.frame(tas.6days[order(tas.6days[, 1]), ])
  tas.6days$tas[tas.6days$tas==8 | tas.6days$tas==9] <- NA

  # mixed model for the longitudinal tas
  lme.1 <- lme(tas ~ time + 1, random = ~ time + 1 | ident, data = tas.6days, na.action = na.exclude)

  # Random intercept and slopes
  pred.lme <- predict(lme.1)
  lme.intercept <- lme.1$coef$random$ident[, 1] + lme.1$coef$fixed[1]
  lme.slope     <- lme.1$coef$random$ident[, 2] + lme.1$coef$fixed[2]
  selector <- as.numeric(names(lme.intercept))  # to select not-NA rows; apply to the vector in the dataset

 

[R] Running R on a hosting o server

2015-09-23 Thread Marco
> Hi all,
> 
> Hope I can explain:
> 
> I want to "run" some function online (code written by me) so that this code
> "saves" an output file on my server and my HTML web page reads the file; R
> plots and saves it (at the moment I would manually run the R function, open FileZilla, and
> transfer the output png or jpg file).
> 
> Is it possible to do this "automatically", telling R something like: every
> 15 minutes run this "pru.txt" file, save this plot.png, and execute
> FileZilla with these inputs to save the plot in this folder?
> 
> Hope you can understand me.
> 
> In rmarkdown it is true that it outputs an html file, but my intention is to run my
> own function; the need is to run my function in R, then run and open
> FileZilla and deposit the file in the right place.

You should use cron (man cron):
cron (8) - daemon to execute scheduled commands (Vixie Cron)

With it you can schedule your task.

-- 
Marco Arthur @ (M)arco Creatives



[R] get lm by file .cvs

2015-09-23 Thread raoman ramirez
Well, I have info <- read.csv(file, header = TRUE)
and I want to try model <- lm(info[1] ~ info[2]).
This is for the backward-selection method, but I don't know how.
Thanks for the help.
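
What I think I am looking for is something like this (a sketch, assuming the first column of info is the response and the other columns are the candidate predictors):

info  <- read.csv(file, header = TRUE)
model <- lm(info[[1]] ~ info[[2]])    # double brackets give plain vectors, not one-column data frames
# backward selection starting from the full model (first column on all the others):
full  <- lm(as.formula(paste(names(info)[1], "~ .")), data = info)
back  <- step(full, direction = "backward")
summary(back)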


Re: [R] retaining characters in a csv file

2015-09-23 Thread Daniel Nordlund

On 9/23/2015 5:57 AM, Therneau, Terry M., Ph.D. wrote:

Thanks for all for the comments, I hadn't intended to start a war.

My summary:
  1. Most important: I wasn't missing something obvious.  This is 
always my first suspicion when I submit something to R-help, and it's 
true more often than not.


  2. Obviously (at least it is now), the CSV standard does not specify 
that quotes should force a character result.  R is not "wrong".  Wrt 
to using what Excel does as litmus test, I consider that to be totally 
uninformative about standards: neither pro (like Duncan) or anti (like 
Rolf), but simply irrelevant.  (Like many MS choices.)


  3. I'll have to code in my own solution, either pre-scan the first 
few lines to create a colClasses, or use read_csv from the readr 
library (if there are leading zeros it keeps the string as character, 
which may suffice for my needs), or something else.


  4. The source of the data is a "text/csv" field coming from an http 
POST request.  This is an internal service on an internal Mayo server 
and coded by our own IT department; this will not be the first case 
where I have found that their definition of "csv" is not quite standard.


Terry T.




On 23/09/15 10:00, Therneau, Terry M., Ph.D. wrote:

I have a csv file from an automatic process (so this will happen
thousands of times), for which the first row is a vector of variable
names and the second row often starts something like this:

5724550,"000202075214",2005.02.17,2005.02.17,"F", .

Notice the second variable which is
   a character string (note the quotation marks)
   a sequence of numeric digits
   leading zeros are significant

The read.csv function insists on turning this into a numeric. Is there
any simple set of options that
will turn this behavior off?  I'm looking for a way to tell it to "obey
the bloody quotes" -- I still want the first, third, etc columns to
become numeric.  There can be more than one variable like this, and not
always in the second position.

This happens deep inside the httr library; there is an easy way for me
to add more options to the read.csv call but it is not so easy to
replace it with something else.





A fairly simple workaround is to add two lines of code to the process, 
and then add the colClasses parameter as you suggested in item 2 above.


want <- read.csv('yourfile', quote='', stringsAsFactors= FALSE, nrows=1)
classes <- sapply(want, class)
want <- read.csv('yourfile', stringsAsFactors= FALSE, colClasses=classes)

I don't know if you want your final file to convert strings to factors, 
so you can modify as needed.  In addition, if your files aren't as 
regular as I inferred, you can increase the number of rows to read in 
the first line to ensure getting the classes right.



Hope this is helpful,

Dan

--
Daniel Nordlund
Bothell, WA  USA



Re: [R] [FORGED] Error from lme4: "Error: (p <- ncol(X)) == ncol(Y) is not TRUE"

2015-09-23 Thread Rory Wilson
Hi Rolf,

Yes, a reproducible example would be good - but it appears to be trouble with 
something within the data, and I cannot upload all the data here. Jean indicated 
the issue could be with NAs in the data, and I am inclined to agree. I will look 
more into that angle and see if something becomes apparent.

Thanks for your help!
Rory
  From: Rolf Turner 
 To: Rory Wilson ; "r-help@r-project.org" 
 
 Sent: Tuesday, September 22, 2015 11:26 PM
 Subject: Re: [FORGED] [R] Error from lme4: "Error: (p <- ncol(X)) == ncol(Y) 
is not TRUE"
   



On 23/09/15 01:18, Rory Wilson wrote:

> Hello all, I am trying to run a random intercept model using lme4.
> The random effect is a factor of 29 possibilities, making a model
> with one random effect (one level). It is just a linear model. There
> are 713 observations. However, when trying to run the model I receive
> the error "Error: (p <- ncol(X)) == ncol(Y) is not TRUE", a search
> for which reveals somewhat surprisingly little. Has anyone seen this
> before? Note that if I simply change the random effect into a fixed
> effect and use lm, the model works perfectly.Thank you!

[Caveat:  I really find the syntax of lmer() incomprehensible, so my 
example below could be a load of dingos' kidneys.]

I think a reproducible example (as specified by the posting guide) is 
needed here.  When I do:

set.seed(42)
f <- factor(sample(1:29,713,TRUE))
x <- seq(0,1,length=713)
y <- rnorm(713)
require(lme4)
fit <- lmer(y ~ x + (1|f))

I get a reasonable (???) looking result and no error messages.

cheers,

Rolf Turner


  

Re: [R] [FORGED] Re: Compare two normal to one normal

2015-09-23 Thread Rolf Turner

On 23/09/15 16:38, Mark Leeds wrote:

John: After I sent what I wrote, I read Rolf's intelligent response. I
didn't realize that
there are boundary issues so yes, he's correct and  my approach is EL
WRONGO. I feel very not good that I just sent that email being that it's
totally wrong. My apologies for noise
and thanks Rolf for the correct response.

Oh,  thing that does still hold in my response is  the AIC approach unless
Rolf
tells us that it's not valid also. I don't see why it wouldn't be though
because you're
not doing a hypothesis test when you go the AIC route.




I am no expert on this, but I would be uneasy applying AIC to such 
problems without having a very close look at the literature on the 
subject.  I'm pretty sure that there *are* regularity conditions that 
must be satisfied in order that AIC should give you a "valid" basis for 
comparison of models.


AIC has most appeal, and is mostly used (in my understanding) in 
settings where there is a multiplicity of models, whereby the multiple 
comparisons problem causes hypothesis testing to lose its appeal. 
Correspondingly AIC has little appeal in a setting in which a single 
hypothesis test is applicable.


I could be wrong about this; as I said, I am no expert.  Perhaps younger 
and wiser heads will chime in and correct me.


cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R-es] Leer datos desde un puerto udp y/o desde un puerto serie

2015-09-23 Thread miguel.angel.rodriguez.muinos
Hello Antonio.

Let's see if this can point you in the right direction:
http://tolstoy.newcastle.edu.au/R/help/05/09/12210.html

Basically it does a scan on the port in question, with something like this:
 scan(file="/dev/ttyS0", n=1, what="character")

Regards,
Miguel.



On 23/09/2015 at 8:20, Antonio Punzon Merino wrote:
>
>
> Hello,
> I am quite new to all this communications stuff.
> I receive GPS, depth, etc. data over a UDP port (I am also interested in how it is done
> over a serial port). I would like to know how I can open the port and read the
> information as a stream.
> I have tried different options (scan, file, ...) but I have not found the way.
> Many thanks
>
> Antonio
>









Re: [R-es] Leer datos desde un puerto udp y/o desde un puerto serie

2015-09-23 Thread Jorge Tornero - Listas
Hello, Antonio:

Those blessed boats! I personally have run into quite a few difficulties 
with NMEA (and other feeds) over UDP; it is a bit of a pain.

From what I have seen of R, the problem may be that the available options 
for connecting to sockets (socketConnection, make.socket) do not, 
at least as far as I have been able to tell, offer the possibility of 
listening on UDP (although surely someone has done it...).

Nevertheless, there is a possible solution; at least it works with a GPS 
simulator (under Linux):

1) Install socat (on Ubuntu/Debian: apt-get install socat).

2) Now redirect the UDP port to a file on your machine, truncating each 
time it is written so that only the last sentence received remains 
(say the UDP port is 1 and the file is /home/antares/GPS.txt):

socat UDP-LISTEN:1,reuseaddr,fork open:/home/antares/GPS.txt,create,trunc

In other words, socat listens on the port of interest; when it receives 
data it creates the file GPS.txt if it does not exist (truncating it 
otherwise) and writes what it receives over UDP into it.

3) Now, from R, we do a little something to keep reading from that 
file, which R is well prepared for with scan(), for example:

z   <- 0
nl  <- ""
nl2 <- ""

while (z < 5) {
  # read the (single) sentence currently stored in the file
  nl <- scan('/home/antares/GPS.txt', what = 'character', quiet = TRUE, nlines = 1)

  if (length(nl) > 0) {
    # act only when a new sentence has arrived
    if (!identical(nl, nl2)) {
      nl2 <- nl
      print(paste('NMEA - ', nl))
      z <- z + 1
    }
  }
}


As an example, and with gpsfeed+ as the simulator, it works; putting it
into practice and making it genuinely useful is another matter...
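
Once a sentence has been read, splitting it into fields is straightforward; a
small sketch with a made-up $GPGGA sentence (the field positions below are
only illustrative):

sentence <- "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
fields   <- strsplit(sentence, ",")[[1]]
fields[3]   # raw latitude  (ddmm.mmm), still to be converted to decimal degrees
fields[5]   # raw longitude (dddmm.mmm)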

Regards,

Jorge




On 23/09/15 at 08:20, Antonio Punzon Merino wrote:
>
>
> Hi,
> I am fairly new to this communications stuff.
> I receive GPS, depth, etc. data over a UDP port (I am also interested in how
> to do it over a serial port). I would like to know how I can open the port
> and read the information as a stream.
> I have tried different options (scan, file, ...) but have not found a way.
> Many thanks
>
> Antonio
>
>   [[alternative HTML version deleted]]
>
>
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es




Re: [R] Loading data chartseries

2015-09-23 Thread Joshua Ulrich
On Tue, Sep 22, 2015 at 3:10 PM, bgnumis bgnum  wrote:
> Hi all,
>
> I want to plot the data in file.txt, which has this format:
>
> 01/01/2000;970,1877
> 02/01/2000;970,2224
> 03/01/2000;969,0336
> 04/01/2000;958,3023
> 05/01/2000;952,8527
>
> I'm trying to plot it with quantmod using this code, but it is not working:
>
>
> X<-read.table("file.txt", col.names=c("Date","LAST"), sep=";",dec=",")
>
>
>
>
> chartSeries(
> X,theme="white",
>   TA = c(addBBands(200,2))
>
>
> )
>
> But it gives this error:
>
> Error in try.xts(x, error = "chartSeries requires an xtsible object") :
>   chartSeries requires an xtsible object
>
>
> How can I run chartSeries with my own data?
>
Your data need to be in an xts object, or something that can be
converted to an xts object via as.xts().  I recommend the former.

Lines <-
"01/01/2000;970,1877
02/01/2000;970,2224
03/01/2000;969,0336
04/01/2000;958,3023
05/01/2000;952,8527"

library(quantmod)
X <- as.xts(read.zoo(text=Lines, sep=";", dec=",", format="%m/%d/%Y"))
chartSeries(X, theme="white", TA="addBBands(200,2)")
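
If the dates are really day/month/year (they look like consecutive days in
January), the only change needed, as far as I can tell, is the format string:

X <- as.xts(read.zoo(text = Lines, sep = ";", dec = ",", format = "%d/%m/%Y"))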

> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com


Re: [R] Sampling the Distance Matrix

2015-09-23 Thread David L Carlson
I think the OP wanted rows where all values were greater than .9.
If so, this works:

> set.seed(42)
> dst <- dist(cbind(rnorm(20), rnorm(20)))
> dst2 <- as.matrix(dst)
> diag(dst2) <- NA
> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
> idx
13 18 19 
13 18 19 
> dst2[idx, idx]
 13   18   19
13   NA 2.272407 3.606054
18 2.272407   NA 1.578150
19 3.606054 1.578150   NA
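
An equivalent way to write the selection (a small sketch on the dst2 built
above): a row qualifies exactly when its smallest off-diagonal distance
exceeds 0.9.

idx2 <- which(apply(dst2, 1, min, na.rm = TRUE) > 0.9)
identical(idx, idx2)   # TRUE; the two conditions are logically the same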

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
Sent: Wednesday, September 23, 2015 3:23 PM
To: Lorenzo Isella
Cc: r-help@r-project.org
Subject: Re: [R] Sampling the Distance Matrix

> mm <- cbind(1/(1:5), sqrt(1:5))
> d <- dist(mm)
> d
  1 2 3 4
2 0.6492864
3 0.9901226 0.3588848
4 1.250 0.6369033 0.2806086
5 1.4723668 0.8748970 0.5213550 0.2413050
> which(as.matrix(d)>0.9, arr.ind=TRUE)
  row col
3   3   1
4   4   1
5   5   1
1   1   3
1   1   4
1   1   5
I.e., the distances between mm's rows 3 & 1, 4 & 1, and 5 & 1 are more than 0.9.

The as.matrix(d) is needed because dist() returns the lower triangle of
the distance matrix as an object of class "dist", and as.matrix.dist
converts that into a full matrix.

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Sep 23, 2015 at 12:15 PM, Lorenzo Isella
 wrote:
> Dear All,
> Suppose you have a distance matrix stored like a dist object, for
> instance
>
> x<-rnorm(20)
> y<-rnorm(20)
>
> mm<-as.matrix(cbind(x,y))
>
> dst<-(dist(mm))
>
> Now, my problem is the following: I would like to get the rows of mm
> corresponding to points whose distance is always larger than, let's say,
> 0.9.
> In other words, if I were to compute the distance matrix on those
> selected rows of mm, apart from the diagonal, I would get all entries
> larger than 0.9.
> Any idea about how I can efficiently code that?
> Regards
>
> Lorenzo
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] get lm by file .cvs

2015-09-23 Thread David Winsemius

On Sep 23, 2015, at 10:09 AM, raoman ramirez wrote:

> Well, I have info <- read.csv(file, header=T)
> and I want to try model <- lm(info[1] ~ info[2]).
> It is for the backward selection method, but I don't know how.

Can you explain why you want to apply backward selection to a model with a 
single predictor vector?
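
For reference, a minimal sketch of the usual way to fit such a model from a
CSV; the file name and the column names are hypothetical. Note that
lm(info[1] ~ info[2]) fails because info[1] is a one-column data frame; use
column names in a formula (or info[[1]], info[[2]]) instead.

info <- read.csv("data.csv", header = TRUE)   # hypothetical file
fit  <- lm(y ~ ., data = info)                # all other columns as predictors
step(fit, direction = "backward")             # stepwise backward selection by AIC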

-- 
David Winsemius
Alameda, CA, USA



Re: [R] [FORGED] Error from lme4: "Error: (p <- ncol(X)) == ncol(Y) is not TRUE"

2015-09-23 Thread Rolf Turner

On 24/09/15 01:57, Rory Wilson wrote:

In reply to Rolf Turner and Jean Adams who have been helping me:

This does appear to be an issue with NA values in the non-factor
variables. In the (non-reproducible) example below, we can see that
removing the NAs solves the problem. However, from what I can see at
this point, there seems to be no rhyme or reason to why the issue
is taking place. A slight modification to Rolf Turner's code
(introducing some NAs) shows that, in general, NAs are not a problem for
lmer (indeed, it applies na.omit by default).

Examining which factors are affected by the removal of the NAs shows no
discernible pattern - no factors disappeared, none became "1" or
anything of this nature.
I will be able to proceed just by performing the na.omit beforehand, but
it is curious.

Thanks for your help everyone (especially Rolf Turner and Jean Adams)!

mod1<-lmer(beta~expData+techVar$RIN+techVar$sample_storage_time+(1|techVar$p_amplification))
#Error: (p <- ncol(X)) == ncol(Y) is not TRUE


First a pedantic quibble.  The foregoing call to lmer() would be better 
rendered as:


mod1 <- lmer(beta ~ expData + RIN + sample_storage_time
+ (1|p_amplification), data=techVar)

I.e. Use the "data" argument (!!!) and put *spaces* in your code!

Second, can you not extract a relatively small subset of your data set 
which demonstrates the problem and make that cut-down data set available?
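
For instance, something along these lines (object names taken from the
thread, the rest hypothetical) produces a snippet that others can paste
straight into R:

small <- droplevels(head(techVar, 20))   # a handful of rows, unused factor levels dropped
dput(small)                              # paste the output into your next message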




cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] Error: pandoc version 1.12.3 or higher is required and was not found

2015-09-23 Thread David Winsemius

On Sep 23, 2015, at 4:13 AM, Ryszard Czermiński wrote:

> I am trying to use R Markdown, but call to render() gives me an error:
> Error: pandoc version 1.12.3 or higher is required and was not found.
> 
> As I understand [http://yihui.name/knitr/demo/pandoc/] pandoc is a function
> defined in knitr, which I have installed and it has pandoc() function
> defined.

Pandoc (to be distinguished from `pandoc()`) is an external-to-R software 
hammer capable of creating items of special beauty when supplied with dross 
text material.  `knitr` will be relying upon its availability when it casts its 
magical spells. `knitr` on CRAN is currently at version 1.11.

The incantations needed to acquire your very own pandoc hammer are in the link 
on that page entitled "Pandoc website".
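
Two quick checks from within R, assuming the rmarkdown package is installed
(these are rmarkdown/base functions, not something from the original post):

rmarkdown::pandoc_available("1.12.3")   # must be TRUE before render() will succeed
Sys.which("pandoc")                     # where, if anywhere, a pandoc binary sits on the PATH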

> 
> Looks like some version incompatibility issue, but I do not really know how
> to resolve it.
> Do I need to go to older R version to use it?
> 
> I would appreciate your help.
> 
> Best regards,
> Ryszard
> 
>> sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.10.3 (Yosemite)
> other attached packages: knitr_1.11
> [...]
> 
> Ryszard Czerminski
> 508-358-6328
> rysz...@czerminski.net
> LinkedIn.com/in/Ryszard.Czerminski
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA


Re: [R] Sampling the Distance Matrix

2015-09-23 Thread William Dunlap
> mm <- cbind(1/(1:5), sqrt(1:5))
> d <- dist(mm)
> d
  1 2 3 4
2 0.6492864
3 0.9901226 0.3588848
4 1.250 0.6369033 0.2806086
5 1.4723668 0.8748970 0.5213550 0.2413050
> which(as.matrix(d)>0.9, arr.ind=TRUE)
  row col
3   3   1
4   4   1
5   5   1
1   1   3
1   1   4
1   1   5
I.e., the distances between mm's rows 3 & 1, 4 & 1, and 5 & 1 are more than 0.9.

The as.matrix(d) is needed because dist() returns the lower triangle of
the distance matrix as an object of class "dist", and as.matrix.dist
converts that into a full matrix.

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Sep 23, 2015 at 12:15 PM, Lorenzo Isella
 wrote:
> Dear All,
> Suppose you have a distance matrix stored like a dist object, for
> instance
>
> x<-rnorm(20)
> y<-rnorm(20)
>
> mm<-as.matrix(cbind(x,y))
>
> dst<-(dist(mm))
>
> Now, my problem is the following: I would like to get the rows of mm
> corresponding to points whose distance is always larger than, let's say,
> 0.9.
> In other words, if I were to compute the distance matrix on those
> selected rows of mm, apart from the diagonal, I would get all entries
> larger than 0.9.
> Any idea about how I can efficiently code that?
> Regards
>
> Lorenzo
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] Error of matrix and dimnames when using SVM in R

2015-09-23 Thread uc la via R-help
Hi all!
I am having problems with using SVM in R

I have a data frame `trainData` which contains 198 rows and looks like:

    Matchup          Win HomeID AwayID A_TWPCT A_WST6 A_SEED B_TWPCT B_WST6 B_SEED
    2010_1115_1457     1   1115   1457   0.531      5     16   0.567      4     16
    2010_1124_1358     1   1124   1358   0.774      5      3   0.75       5     14
    ...

The testData is similar. 

In order to use SVM, I have to change the response variable Win to a factor. I 
tried the below:

    trainDataSVM <- trainData
    trainDataSVM$Win <- factor(trainDataSVM$Win)
    svmfit <- svm(Win ~ A_WST6 + A_SEED + B_WST6 + B_SEED, data = trainDataSVM,
                  kernel = "linear", cost = 10, scale = FALSE, probability = TRUE)
    testDataSVM <- testData
    testDataSVM$Win <- factor(testDataSVM$Win)
    predictions_SVM <- predict(bestmod, testDataSVM, type = "response", probability = TRUE)

However, I get the message 

    Error in matrix(ret$prob, nrow = nrow(newdata), byrow = TRUE, dimnames = 
list(rowns,  : 
  length of 'dimnames' [2] not equal to array extent

If I re-run the code without converting trainDataSVM$Win and testDataSVM$Win 
to factors, then printing predictions_SVM gives `named numeric(0)`.

How do I fix this?
Thanks!
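
Two things are worth checking: the model fitted above is svmfit while
predict() is called on bestmod, and rows with NA in the predictors of the new
data are another common trigger of that dimnames error. A self-contained
sketch of the intended pattern with e1071, on simulated data (all names
hypothetical):

library(e1071)
set.seed(1)
train <- data.frame(Win    = factor(rbinom(100, 1, 0.5)),
                    A_SEED = sample(1:16, 100, TRUE),
                    B_SEED = sample(1:16, 100, TRUE))
test  <- train[1:10, ]

fit <- svm(Win ~ A_SEED + B_SEED, data = train,
           kernel = "linear", cost = 10, scale = FALSE, probability = TRUE)

p <- predict(fit, newdata = na.omit(test), probability = TRUE)
head(attr(p, "probabilities"))   # per-class probabilities for the new rows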


Re: [R] Compare two normal to one normal

2015-09-23 Thread Charles C. Berry

On Tue, 22 Sep 2015, John Sorkin wrote:


Charles,


I am not sure that the answer to my question (given a dataset, how can one 
compare the fit of a model that fits the data to a mixture of two 
normal distributions with the fit of a model that uses a single normal 
distribution?) can be based on the glm model you suggest.


Well you *did* ask how to calculate the log-likelihood of a fitted normal 
density, didn't you? That is what I responded to. You can check that 
result longhand as sum( dnorm( y, y.mean, y.std , log=TRUE ) ) and get the 
same result (as long as you used ML estimates of the mean and standard 
deviation).
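
A small sketch of that check on simulated data (note the ML variance divides
by n rather than n - 1):

set.seed(1)
y      <- rnorm(50)
y.mean <- mean(y)
y.std  <- sqrt(mean((y - y.mean)^2))          # ML estimate, divisor n
sum(dnorm(y, y.mean, y.std, log = TRUE))      # longhand log-likelihood
logLik(glm(y ~ 1))                            # same value from the intercept-only fit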





I have used normalmixEM to fit the data to a mixture of two normal 
curves. The model estimates four (perhaps five) parameters: mu1, sd1^2, 
mu2, sd2^2 (and perhaps lambda, the mixing proportion; the mixing 
proportion may not need to be estimated, as it may be determined once one 
specifies mu1, sd1^2, mu2, and sd2^2). Your model fits the data to a 
model that contains only the mean, and estimates 2 parameters, mu0 and 
sd0^2.  I am not sure that your model and mine can be considered to be 
nested. If I am correct, I can't compare the log-likelihood values from 
the two models. I may be wrong. If I am, I should be able to perform a 
log-likelihood test with 2 (or 3, I am not sure which) DFs. Are you 
suggesting the models are nested? If so, should I use 3 or 2 DFs?


As Rolf points out there is a literature on such tests (and Googling 'test 
finite mixture' covers much of it).


Do you really want a test? If you merely want to pick a winner from two 
candidate models there are other procedures. k-fold crossvalidation 
of the loglikelihood ratio statistic seems like an easy, natural approach.
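
A hedged sketch of what that could look like, assuming the mixtools package
for the two-component fit; the data are simulated and the fold count is
arbitrary:

library(mixtools)
set.seed(1)
y    <- c(rnorm(150, 0, 1), rnorm(150, 3, 1))
k    <- 5
fold <- sample(rep(1:k, length.out = length(y)))
ll1  <- ll2 <- numeric(k)

for (i in 1:k) {
  tr <- y[fold != i]; te <- y[fold == i]
  ## single normal, ML estimates from the training fold
  ll1[i] <- sum(dnorm(te, mean(tr), sqrt(mean((tr - mean(tr))^2)), log = TRUE))
  ## two-component mixture fitted on the training fold
  fit  <- normalmixEM(tr, k = 2, maxit = 2000)
  dens <- fit$lambda[1] * dnorm(te, fit$mu[1], fit$sigma[1]) +
          fit$lambda[2] * dnorm(te, fit$mu[2], fit$sigma[2])
  ll2[i] <- sum(log(dens))
}

c(single = sum(ll1), mixture = sum(ll2))   # larger held-out log-likelihood wins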


HTH,

Chuck



Re: [R] reproducing Graphpad IC50 in R with drc package

2015-09-23 Thread David Winsemius

On Sep 23, 2015, at 9:21 AM, Jon Arsenault wrote:

> I've been banging my head against the wall a bit with this and would be
> ecstatic if someone could help.
> 
> Initial data:
> 
> Response Dose
> 1  285.17  0.125
> 2  377.65  0.250
> 3  438.99  0.500
> 4  338.46  1.000
> 5  227.87  2.000
> 6  165.96  0.010
> 7  302.92  0.125
> 8  418.50  0.250
> 9  464.69  0.500
> 10 301.36  1.000
> 11 213.12  2.000
> 12 160.34  0.010
> 13 306.18  0.125
> 14 435.37  0.250
> 15 451.34  0.500
> 16 319.50  1.000
> 17 219.83  2.000
> 18 172.52  0.010
> 19 306.56  0.125
> 20 439.01  0.250
> 21 469.74  0.500
> 22 318.05  1.000
> 23 223.09  2.000
> 
> The GraphPad template that this normally would be placed in then
> transforms the Dose by taking log(Dose, base=10) and normalizes the
> Response. I've confirmed that I was able to recreate that here:
> 
>    NormResponse  LogDose
> 1        41.011 -0.90309
> 2        72.909 -0.60206
> 3        94.067 -0.30103
> 4        59.392  0.0
> 5        21.246  0.30103
> 6        -0.108 -2.0
> 7        47.133 -0.90309
> 8        87.000 -0.60206
> 9       102.932 -0.30103
> 10       46.595  0.0
> 11       16.159  0.30103
> 12       -2.047 -2.0
> 13       48.258 -0.90309
> 14       92.819 -0.60206
> 15       98.327 -0.30103
> 16       52.852  0.0
> 17       18.473  0.30103
> 18        2.155 -2.0
> 19       48.389 -0.90309
> 20       94.074 -0.60206
> 21      104.674 -0.30103
> 22       52.352  0.0
> 23       19.598  0.30103
> 
> 
> 
> 
> Now GraphPad used 'log(inhibitor) vs. normalized response -- Variable
> slope', which it states to be a 4-parameter sigmoidal, but it gives me very
> different results.

"It" gives you very different results, but what is "it", what does it "give" 
and what are you comparing it to? You should start by identifying the package 
you are using, and then present the code. (We cannot read you mind.)
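
One likely source of the discrepancy, offered as a guess rather than a
diagnosis: GraphPad's 'normalized response' models constrain the bottom and
top asymptotes to 0 and 100, which drc's LL.4() does not do unless told to. A
hedged sketch, assuming the data frame is called dat and has the Response,
Dose and NormResponse columns shown above:

library(drc)
## unconstrained 4-parameter log-logistic on the raw responses
fit_raw  <- drm(Response ~ Dose, data = dat, fct = LL.4())
## closer to GraphPad's normalized-response model: asymptotes fixed at 0 and 100
fit_norm <- drm(NormResponse ~ Dose, data = dat,
                fct = LL.4(fixed = c(NA, 0, 100, NA),
                           names = c("Slope", "Lower", "Upper", "ED50")))
ED(fit_norm, 50)   # IC50 on the original Dose scale (not log10)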

> GraphPad gives an IC50 of 0.1560 and drm's LL.4 gives 0.1149, and it won't
> even take the transformed data, it just throws an error.
> 
>   [[alternative HTML version deleted]]

But we can see that you are not reading the Posting Guide.

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



Re: [R] Error: pandoc version 1.12.3 or higher is required and was not found

2015-09-23 Thread Yihui Xie
I guess the confusion here is the relationship between knitr::pandoc()
and rmarkdown::render(). The error message you saw was from
rmarkdown::render(), which requires Pandoc 1.12.3. The easiest way to
use rmarkdown (I mean the R package rmarkdown) is to use
RStudio, and you don't even need to install Pandoc separately. There
are a number of possible reasons for the failure you saw: 1) You
didn't install Pandoc; 2) You installed but didn't put it on PATH; 3)
You installed a lower version of Pandoc. I don't mean you should not
figure out the exact reason, but it just saves so much time not having
to take care of such technical details by yourself.

knitr::pandoc() is almost a completely different story. If you are
familiar with Pandoc command-line arguments, please feel free to use
it. If you don't want to waste time on remembering those arguments, go
for rmarkdown::render() instead, which has a much better interface to
Pandoc than knitr::pandoc(). As the author of knitr::pandoc(), I can
tell you this function was about two afternoons' work, and rmarkdown
has been under active development for almost two years now. Hopefully
that makes it clear enough for you to choose between knitr::pandoc()
and rmarkdown::render() :-)
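
For completeness, the rmarkdown route is a one-liner once a recent Pandoc is
on the PATH (the file name is a placeholder):

library(rmarkdown)
render("report.Rmd")   # output format is taken from the file's YAML header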

Regards,
Yihui
--
Yihui Xie 
Web: http://yihui.name


On Wed, Sep 23, 2015 at 6:13 AM, Ryszard Czermiński
 wrote:
> I am trying to use R Markdown, but call to render() gives me an error:
> Error: pandoc version 1.12.3 or higher is required and was not found.
>
> As I understand [http://yihui.name/knitr/demo/pandoc/] pandoc is a function
> defined in knitr, which I have installed and it has pandoc() function
> defined.
>
> Looks like some version incompatibility issue, but I do not really know how
> to resolve it.
> Do I need to go to older R version to use it?
>
> I would appreciate your help.
>
> Best regards,
> Ryszard
>
>> sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.10.3 (Yosemite)
> other attached packages: knitr_1.11
> [...]
>
> Ryszard Czerminski
> 508-358-6328
> rysz...@czerminski.net
> LinkedIn.com/in/Ryszard.Czerminski


[R-es] Selecting and dropping terms with drop1

2015-09-23 Thread raoman ramirez
I am writing a function for the backward selection (backward elimination)
method, where I receive a .csv file. My question is how to use drop1 to
choose which model comes next. I know this is somewhat tangled, but I hope
it is more or less understandable and that you can help me a little.
info <- read.csv("nombreArchivo.csv", header=T)
numObservaciones = length(info[,1])
numVariables = length(info[1,])
k = numVariables - 1
f = qf(.95, 1, numObservaciones - numVariables)
modelo = lm(y ~ ., data=info)
fs = c()
while (k > 0) {
   drop1(modelo, test="F")
   if(f
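
A hedged sketch of one way drop1() can drive such a loop, using p-values
rather than a fixed F cut-off (hypothetical data frame info whose response
column is y, with a 5% threshold):

info   <- read.csv("nombreArchivo.csv", header = TRUE)
modelo <- lm(y ~ ., data = info)

repeat {
  dr  <- drop1(modelo, test = "F")
  p   <- dr[["Pr(>F)"]][-1]                      # skip the "<none>" row
  trm <- rownames(dr)[-1]
  if (length(p) == 0 || max(p, na.rm = TRUE) <= 0.05) break
  worst  <- trm[which.max(p)]                    # weakest remaining term
  modelo <- update(modelo, as.formula(paste(". ~ . -", worst)))
}
summary(modelo)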