[R] How does R determine best value of alpha in forecasting

2014-01-30 Thread neel shah
Hi

I need to implement Holt-Winters exponential smoothing in my project. I
was going through the implementation of Holt-Winters exponential smoothing in
the forecast package. I would appreciate it if someone could explain how R
finds the best values of alpha and beta for a given dataset. It would be great
if someone could also post the research paper on which the DSHW
implementation is based.

Thank You
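
For context, the help page of stats::HoltWinters states that unknown parameters are determined by minimizing the squared prediction error. The sketch below illustrates that idea for the simplest case only (level-only smoothing, one parameter). The helper name ses_sse and the use of the built-in Nile series are assumptions made for illustration; this is not the code used by the forecast package, and the reference for DSHW should be taken from that package's own help page.

```r
# One-step-ahead squared error of simple exponential smoothing for a given alpha.
ses_sse <- function(alpha, x) {
  level <- x[1]                 # start the level at the first observation
  sse <- 0
  for (t in 2:length(x)) {
    err <- x[t] - level         # one-step forecast error
    sse <- sse + err^2
    level <- level + alpha * err
  }
  sse
}

x <- as.numeric(Nile)           # built-in annual series, used only as an example

# "Best" alpha = the value in [0, 1] that minimizes the squared error.
opt <- optimize(ses_sse, interval = c(0, 1), x = x)
alpha_hat <- opt$minimum

# Compare with the built-in level-only fit (no trend, no seasonality).
fit <- HoltWinters(Nile, beta = FALSE, gamma = FALSE)
print(c(by_hand = alpha_hat, HoltWinters = unname(fit$alpha)))
```

With a trend or seasonal component the same criterion is minimized over two or three parameters at once, which needs a multivariate optimizer rather than optimize().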

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Change points in R

2014-01-30 Thread Benjamin Ward (ENV)
Hi R helpers,

I have a set of data best shown in the graph below.

Each coloured line represents a statistic calculated across pairs of DNA
sequences. For each coloured line, I would like to identify breakpoints,
that is, the chunks where the values are high. For example, in the light
blue line there is a large high segment just after x=2e+05. From googling
how to find such points, I've read about change-point analysis, which is
used with time series data, and I wondered whether it or a variant of it in
R might be of use here. The data are a series of % values (double), all from
a single measurement: for each line, a 'scanner' passed over two sequences
and recorded the % value at each step. Can change-point analysis help me
here and, if so, what package or method will allow me to do this while
making as few assumptions about my data as possible?

Thanks in advance,

Ben W.
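
To make the idea of a change point concrete, here is a minimal base-R sketch that finds a single shift in the mean by trying every split position and keeping the one with the smallest within-segment sum of squares. The function name best_split and the synthetic series are assumptions for illustration; packages such as changepoint or strucchange handle multiple change points and would be the more realistic route for data like this.

```r
# Find the single split that best separates a series into two constant-mean segments.
best_split <- function(x) {
  n <- length(x)
  sse <- vapply(seq_len(n - 1), function(k) {
    left  <- x[1:k]
    right <- x[(k + 1):n]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  }, numeric(1))
  which.min(sse)                # index of the last point of the first segment
}

# Synthetic example: level 10 for 100 points, then level 30, plus a small wiggle.
x <- c(rep(10, 100), rep(30, 100)) + sin(seq_len(200))
k_hat <- best_split(x)
print(k_hat)
```

A "high chunk" in the middle of a line corresponds to two change points (up, then down), so in practice the search would be applied recursively or replaced by a multiple-change-point method.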



[R] error in assetsFit function of fPortfolio package

2014-01-30 Thread Mohammad Nikzad
Dear Sir/Madam

I get an error when I run the assetsFit function of the fPortfolio package.

You just need to load the fPortfolio package and run this command to get
the same error that I get. The only fitting method that works is "norm";
the "snorm" and "st" methods give an error.

library(fPortfolio)
fit <- assetsFit(LPP2005.RET[, 1:3], method = "st")

Error in .mvstFit(x = x, fixed.df = fixed.df, trace = trace, ...) :
could not find function "mst.mle"

Thank you in advance for your help.

Best,

Arsa Nikzad
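
A generic way to investigate a "could not find function" error is to ask R where, if anywhere, a function of that name is visible. The sketch below uses only base R; the helper name where_is is an assumption for illustration, and it does not by itself explain why mst.mle is missing.

```r
# Report every place (attached package or loaded namespace) that defines `fname`.
where_is <- function(fname) {
  hits <- utils::getAnywhere(fname)
  hits$where                    # character(0) if nothing is found
}

print(where_is("lm"))           # a function that certainly exists
print(where_is("mst.mle"))      # empty unless a loaded package provides it
```

If the second call comes back empty after library(fPortfolio), the function is not provided by any package loaded in the session, which would point to a mismatch between package versions rather than to a problem with the data.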



Re: [R] reversed variables in stats::reshape()

2014-01-30 Thread Doug Morrison
Thanks Arun! I see now that the description of the "varying" argument for
reshape includes the following:

"This is canonically a list of vectors of variable names"

Originally, I saw in the Details section:

"Notice that the order of variables in varying is like x.1,y.1,x.2,y.2."

Best,
Doug
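
To illustrate the "list of vectors" form of the varying argument, here is a reduced version of the example with two time points. The object names wide and long are assumptions for illustration. Each list element holds the wide columns for one long variable, in the same order as v.names.

```r
wide <- data.frame(
  check.names = FALSE,
  id = 1,
  "2 min R" = 1L, "2 min L" = 2L,
  "4 min R" = 3L, "4 min L" = 4L)

long <- reshape(
  data = wide,
  direction = "long",
  idvar = "id",
  # one vector of column names per long variable, matching v.names in order
  varying = list(c("2 min R", "4 min R"),
                 c("2 min L", "4 min L")),
  v.names = c("R", "L"),
  timevar = "Time",
  times = c(2, 4))

print(long)
```

Written this way, the mapping from wide columns to "R" and "L" is explicit and does not depend on how reshape() splits a flat vector of names.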



On Fri, Jan 24, 2014 at 1:03 PM, Doug Morrison  wrote:

> Dear R-Help readers,
>
> I am writing to ask about some behavior by stats::reshape() that surprised
> me. In the example below, I expected the values of variables "R" and "L" in
> data.frame "test" to be the reverse of what they are, i.e. I expected that
> test$R = seq(1, 29, by = 2) and test$L = seq(2, 30, by = 2).
>
> data1 = data.frame(
>   check.names = FALSE,
>   Participant = 1,
>   Treatment = "A",
>   "2 min R" = 1L, "2 min L" = 2L, "4 min R" = 3L, "4 min L" = 4L,
>   "6 min R" = 5L, "6 min L" = 6L, "8 min R" = 7L, "8 min L" = 8L,
>   "10 min R" = 9L, "10 min L" = 10L, "12 min R" = 11L, "12 min L" = 12L,
>   "14 min R" = 13L, "14 min L" = 14L, "16 min R" = 15L, "16 min L" = 16L,
>   "18 min R" = 17L, "18 min L" = 18L, "20 min R" = 19L, "20 min L" = 20L,
>   "22 min R" = 21L, "22 min L" = 22L, "24 min R" = 23L, "24 min L" = 24L,
>   "26 min R" = 25L, "26 min L" = 26L, "28 min R" = 27L, "28 min L" = 28L,
>   "30 min R" = 29L, "30 min L" = 30L)
>
> varying1 = colnames(data1)[3:32]
>
> test = reshape(
> data = data1,
> direction = "long",
> idvar = c("Participant","Treatment"),
> v.names = (c("R","L")),
> timevar = "Time",
> times = seq(2, 30, by = 2),
> varying = varying1)
>
> test
>
> [ end code]
>
> I looked into the definition of reshape, and found the following line:
>
> varying <- split(varying, rep(v.names, ntimes))
>
> The following edit seems to produce the behavior I expected:
>
> varying <- split(varying, rep(v.names, ntimes))[v.names]
>
> However, I strongly suspect I am making a mistake; I'd be grateful if
> someone would help me find it.
>
> Here is the output of  sessionInfo():
>
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] graphics  datasets  stats utils grDevices methods   base
>
> other attached packages:
> [1] rj_1.1.3-1
>
> loaded via a namespace (and not attached):
> [1] rj.gd_1.1.3-1 tools_3.0.2
>
> Thanks,
> Doug
>



[R] Generalized Ordered Logit in R

2014-01-30 Thread B. Longhurst

Dear community,

I am replicating a paper (The Effect of Structures and Power on State 
Bargaining Strategies, http://tinyurl.com/oyk289c) for a class project. 
The author used “gologit” in STATA, and I need the equivalent function 
in R. She used a “Generalized Ordered Logit” (see below model 
description).


Can you run a “generalized ordered logit” with glm? I could not find the 
specification in the help file for the function. Or is there another R 
package for it?


The data can be accessed here: http://tinyurl.com/oydwl4d (McKibben_AJPS 
Bargaining Strategies data.tab)


The results should look like this: 
http://s17.postimg.org/3zldxs133/Brooke.png


Thank-you so much for your help. It is greatly appreciated.

Kind Regards,

Brooke


Model Description from the original paper:

To test the effects of the bargaining structure against more standard
explanations of variation in state bargaining behavior, I estimate two
generalized ordered logit models with clustered standard errors. Model 1
includes lnGDP, focusing on states' capabilities as the central measure
of their power, and Model 2 includes the alternative, Voting power
index. While standard ordered logit models might appear to be the
appropriate statistical technique, the data violate the assumption of
proportional odds that underpins these models. In other words, all
independent variables do not exert the same effect across all categories
of the dependent variable. The generalized ordered logit model relaxes
this assumption, allowing the effects of the independent variables to
vary across the different categories of cooperative bargaining
behavior. Standard errors were also clustered by negotiation phase to
deal with the fact that each state’s choice of bargaining behavior
within a given bargaining interaction is dependent on the choices of
other states. (These models were estimated using the gologit2 command in
Stata)
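
On the question of whether glm can do this: glm has no generalized ordered logit family, but the idea of letting coefficients differ across categories can be sketched with a series of binary logits, one per cumulative split P(y > j). This is only an approximation under stated assumptions: the splits are fitted separately rather than jointly as gologit2 does, the data are simulated, and clustered standard errors are not handled. All object names are hypothetical.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
latent <- 0.8 * x + rlogis(n)
# Three ordered outcome categories built from the latent variable.
y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)
dat <- data.frame(y = y, x = x)

# One binary logit per cumulative split: P(y > "low"), P(y > "mid").
fits <- lapply(seq_len(nlevels(dat$y) - 1), function(j) {
  d <- dat
  d$above <- as.integer(as.integer(d$y) > j)
  glm(above ~ x, family = binomial, data = d)
})

coefs <- t(sapply(fits, coef))
rownames(coefs) <- paste0("P(y > ", levels(dat$y)[-nlevels(dat$y)], ")")
print(coefs)
```

If the slope on x differs clearly between the rows, the proportional odds assumption is doubtful for that variable. For a joint fit, the VGAM package's vglm() with family cumulative(parallel = FALSE) is one option worth checking.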




Re: [R] print.matrix

2014-01-30 Thread David Winsemius

On Jan 30, 2014, at 6:46 AM, Göran Broström wrote:

> In the documentation of 'prmatrix' (base) I read, under Details:
> 
> ‘prmatrix’ is an earlier form of ‘print.matrix’
> 
> but 'print' doesn't seem to have a 'matrix' method. And in the 'Examples' 
> section:
> 
> chm <- matrix(...
> chm # uses print.matrix()
> 
> Is this a bug in the documentation?
> 
> R-3.0.2 on ubuntu 13.10.

While waiting for someone who actually knows the answer I will tangentially 
note that there is a write.matrix in MASS that gives the same console output as 
prmatrix.

MASS::write.matrix(m6 <- diag(6) )
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1

Unlike prmatrix, write.matrix does not return its argument.
-- 

David Winsemius
Alameda, CA, USA



Re: [R] Controlling font size on code chunk outputs using Knitr

2014-01-30 Thread Jeff Johnson
Thanks Yihui and Jeff.

I've retrieved the default CSS file and made a tweak to it (changing a
header 1 size just to test it) and saved it to the same local directory as
my .Rmd file using the name 'mymarkdown.css' for testing.

I've added:
options(rstudio.markdownToHTML =
  function(inputFile, outputFile) {
    require(markdown)
    markdownToHTML(inputFile, outputFile, stylesheet='mymarkdown.css')
  }
)

to the top of my testfile.Rmd file so that my file now looks like:

options(rstudio.markdownToHTML =
  function(inputFile, outputFile) {
    require(markdown)
    markdownToHTML(inputFile, outputFile, stylesheet='mymarkdown.css')
  }
)

Title


This is an R Markdown document. Markdown is a simple formatting syntax for
authoring web pages (click the **Help** toolbar button for more details on
using R Markdown).

When you click the **Knit HTML** button a web page will be generated that
includes both content as well as the output of any embedded R code chunks
within the document. You can embed an R code chunk like this:

```{r}
summary(cars)
```

But when I knit it, it just writes the "options" chunk at the top of my
document. Am I supposed to add something else to get the .rmd file to
reference the css?

I'm quite new to programming and R (as if you couldn't tell!), so not sure
what additional steps I need to add.

Thanks much.
Jeff
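
A note on where such code runs: text typed at the top of an .Rmd file outside a code chunk is treated as document text, so the options() call above is printed rather than executed. My understanding is that the option has to be set in the R session that does the knitting (for example from .Rprofile), not in the document body. The sketch below shows that arrangement under stated assumptions: the file location (a temporary directory) and the 11px font size are placeholders, and the CSS rule targets the pre and code elements that hold chunk source and output.

```r
# Write a small stylesheet; the path and the font size are illustrative only.
css_path <- file.path(tempdir(), "mymarkdown.css")
writeLines(c(
  "/* smaller font for code chunks and their output */",
  "pre, code {",
  "  font-size: 11px;",
  "}"
), css_path)

# Set the option in the R session (e.g. in .Rprofile), not in the .Rmd body.
# The function is only called later, when RStudio renders the document.
options(rstudio.markdownToHTML = function(inputFile, outputFile) {
  require(markdown)
  markdownToHTML(inputFile, outputFile, stylesheet = css_path)
})
```

In real use the stylesheet would be a copy of the default CSS with the font-size rule changed, so that the rest of the page styling is preserved.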



On Thu, Jan 30, 2014 at 1:48 PM, Yihui Xie  wrote:

> Exactly. Please see RStudio documentation:
>
> https://support.rstudio.com/hc/en-us/articles/200552186-Customizing-Markdown-Rendering
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Thu, Jan 30, 2014 at 10:57 AM, Jeff Newmiller
>  wrote:
> > This sounds like a classic "you need to write a custom CSS file"
> problem... Which is off-topic here, so is homework for you.
> >
> > On January 30, 2014 8:34:32 AM PST, Jeff Johnson 
> wrote:
> >>Hi Yihui,
> >>
> >>The package I have installed is "knitr". To generate the HTML, I run
> >>Knit
> >>HTML from within R Studio version .98.490 (there's an icon to initiate
> >>it.
> >>
> 
> >>
> >>You can load that dataset, then:
> >>Print the column names
> >>```{r, echo=showcode, comment=commentchar}
> >>colnames(mydf)
> >>```
> >>The resulting font is a couple of points larger than I'd like. I'd like
> >>to
> >>be able to control this either globally or at the code chunk level.
> >>
> >>Thanks for your help with this!
>



-- 
Jeff



Re: [R] Controlling font size on code chunk outputs using Knitr

2014-01-30 Thread Yihui Xie
Exactly. Please see RStudio documentation:
https://support.rstudio.com/hc/en-us/articles/200552186-Customizing-Markdown-Rendering

Regards,
Yihui
--
Yihui Xie 
Web: http://yihui.name


On Thu, Jan 30, 2014 at 10:57 AM, Jeff Newmiller
 wrote:
> This sounds like a classic "you need to write a custom CSS file" problem... 
> Which is off-topic here, so is homework for you.
>
> On January 30, 2014 8:34:32 AM PST, Jeff Johnson  
> wrote:
>>Hi Yihui,
>>
>>The package I have installed is "knitr". To generate the HTML, I run
>>Knit
>>HTML from within R Studio version .98.490 (there's an icon to initiate
>>it.
>>

>>
>>You can load that dataset, then:
>>Print the column names
>>```{r, echo=showcode, comment=commentchar}
>>colnames(mydf)
>>```
>>The resulting font is a couple of points larger than I'd like. I'd like
>>to
>>be able to control this either globally or at the code chunk level.
>>
>>Thanks for your help with this!



[R] glmmADMB error

2014-01-30 Thread Sebastián Daza
I ran an example using glmmADMB, and I got this error:

library(glmmADMB)
data(bacteria, package = "MASS")
bacteria$present <- as.numeric(bacteria$y) - 1
(bfit <- glmmadmb(present ~ trt + I(week > 2), random = ~ 1 | ID,
                  family = "binomial", data = bacteria))

Error in paste0(symbol1, paste0(paste0(var, collapse = symbol2))) :
  argument "symbol1" is missing, with no default

Any ideas?

R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] multcomp_1.3-1    TH.data_1.0-3     survival_2.37-7   mvtnorm_0.9-9997
[5] data.table_1.8.10 glmmADMB_0.7.7    R2admb_0.7.10     MASS_7.3-29
[9] foreign_0.8-59

loaded via a namespace (and not attached):
[1] grid_3.0.2      lattice_0.20-24 Matrix_1.1-1.1  nlme_3.1-113
[5] sandwich_2.3-0  tools_3.0.2     zoo_1.7-10

-- 
Sebastián Daza



[R] Universal regression program - multiple regressions on set of data

2014-01-30 Thread nooldor
Hi,

Here is a description of my problem:

I have a data frame that contains 110 variables (columns) with 595
observations each.
Some of the variables will be my dependent variable Y; the others will be
the independent variables Q, X and Z.
I need to estimate robust regression models: Y~X+Z+Q

I want to create 4 subsets from this data frame
(the first column contains dates and should be skipped):

Y <- columns 2:31 [30 variables/columns] (let's call them y1, y2, y3
... y30; each of them has its own specific, unordered name, and here I just
call them y1, y2 and so on. I think of them as vectors, e.g. y1 is the
vector of the variable called, in this example, "y1", with 595
observations; some of them can be NA)

X <- columns 32:50 [19 variables/columns] (let's call them x1, x2, x3 ...
x19; each of them has its own specific, unordered name, and here I just
call them x1, x2 and so on. I think of them as vectors, e.g. x1 is the
vector of the variable called, in this example, "x1", with 154 observations;
some of them can be NA)

Z <- columns 51:80 [30 variables/columns] (let's call them z1, z2, z3 ...
z30, analogously to X)

Q <- columns 81:110 [30 variables/columns] (let's call them q1, q2, q3 ...
q30, analogously to X)

y1 corresponds to z1 and q1 (and so on) in the regressions below:


The goal:
I want to write code that will generate 30 x 19 robust regressions (package
MASS: rlm), like this:

y1~x1+z1+q1
y1~x2+z1+q1
...
y1~x19+z1+q1


y2~x1+z2+q2
y2~x2+z2+q2
...
y2~x19+z2+q2


y30~x1+z30+q30
y30~x2+z30+q30
...
y30~x19+z30+q30

(as previously described, y2 means the second vector in the Y subset, x19
means the 19th vector in the X subset, z2 the 2nd vector in the Z subset,
and so on)

So the first vector of the Y subset should be regressed on the first vector
of the Z subset and the first vector of the Q subset, but with a "changing"
vector from the X subset, and so on for all 30 vectors in the Y subset.

- While running each of those rlm regressions, the program should extract
the residuals and apply ArchTest() (package: FinTS)
[ArchTest(resid, lags = 5, demean = FALSE)]. If the p-value of this test is
lower than 0.05, it should estimate a GARCH(1,1) model as described here:
http://stats.stackexchange.com/questions/45482/how-to-estimate-garch-in-r-exogenous-variables-in-mean-equation
The program should then check the same equation again with ArchTest; if the
p-value is again lower than 0.05, it should apply a GARCH(1,2) model, and so
on (GARCH(1,3), GARCH(1,4), ...) until the p-value from ArchTest is greater
than 0.05. Once the p-value is finally greater than 0.05, the program should
move on to the next equation and repeat the procedure.
In the end I would like to have one data frame as the result, containing the
coefficients of all 30 x 19 regressions (there will be 30 x 19 x 4
coefficients) and their p-values.

I was thinking about solving it like this: creating 4 lists with the names
in each subset and using lapply, but I do not yet have the R skills to do it
myself, especially the GARCH part, so I am asking for your help.

Best regards and thanks in advance!
T.S.
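
The regression grid described above can be sketched as follows. This is a reduced illustration under stated assumptions: the data are simulated, the sizes are 3 dependent and 2 X variables instead of 30 and 19, the names y1, x1, z1, q1 are placeholders, and lm is used as the default fitter so the sketch runs with base R alone. Passing fitter = MASS::rlm gives the robust version. The ArchTest and GARCH steps are not covered.

```r
set.seed(1)
n <- 60; n_y <- 3; n_x <- 2

# Simulated stand-in for the real data frame (hypothetical names).
make_block <- function(prefix, k) {
  m <- as.data.frame(matrix(rnorm(n * k), n, k))
  names(m) <- paste0(prefix, seq_len(k))
  m
}
dat <- cbind(make_block("y", n_y), make_block("x", n_x),
             make_block("z", n_y), make_block("q", n_y))

# Every (y index, x index) combination: y_i ~ x_j + z_i + q_i.
grid <- expand.grid(x = seq_len(n_x), y = seq_len(n_y))

fit_one <- function(i, j, fitter = lm) {
  f <- reformulate(c(paste0("x", j), paste0("z", i), paste0("q", i)),
                   response = paste0("y", i))
  fit <- fitter(f, data = dat)
  data.frame(model = paste(deparse(f), collapse = ""),
             term = names(coef(fit)),
             estimate = unname(coef(fit)),
             row.names = NULL)
}

results <- do.call(rbind, Map(fit_one, grid$y, grid$x))
print(head(results, 8))
```

With the real data, where columns have arbitrary names, the same loop can be driven by names(df)[2:31], names(df)[32:50] and so on instead of the pasted placeholder names.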



Re: [R] mlogit: message "invalid 'row.names' length" after subsetting data

2014-01-30 Thread craux
Thanks a lot Bill!
This works fine!
Useful to know regarding subset.
Charles





--
View this message in context: 
http://r.789695.n4.nabble.com/mlogit-message-invalid-row-names-length-after-subsetting-data-tp4684465p4684470.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] mlogit: message "invalid 'row.names' length" after subsetting data

2014-01-30 Thread William Dunlap
> dd1 <- subset(dd, dd$Effect=='none')

Try using
   dd1 <- dd[dd$Effect=='none', ]
instead of that call to subset().  subset() strips
the non-data.frame attributes from its output.

It may be easier to use the subset argument to mlogit,
as in mlogit(..., subset = Effect=="none").

Bill Dunlap
TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of craux
> Sent: Thursday, January 30, 2014 10:14 AM
> To: r-help@r-project.org
> Subject: [R] mlogit: message "invalid 'row.names' length" after subsetting 
> data
> 
> Hi!
> I have been working a while with mlogit, estimating successfully a series of
> models.
> This time I have a series of discrete choice experiments with different
> effects to test (var Effect). I use the code below: when I estimate the
> first model with "dd" it's work fine. Then I perform subset dd into dd1,
> filtering the category of Effect "none", and I get the expected subset of
> experiment (1800 rows = choices). When I estimate the model with dd1 I get
> the error message "invalid 'row.names' length".
> What is wrong?
> ---
> library(mlogit)
> dd <- mlogit.data(Dataset, varying = 17:24, choice = "choice", id.var =
> "ID", shape = "wide", sep = "_")
> summary(m1 <- mlogit(choice ~ price + duration, dd))
> dd1 <- subset(dd, dd$Effect=='none')
> summary(m1 <- mlogit(choice ~ price + duration, dd1))
> --
> 
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/mlogit-message-invalid-
> row-names-length-after-subsetting-data-tp4684465.html
> Sent from the R help mailing list archive at Nabble.com.
> 



[R] mlogit: message "invalid 'row.names' length" after subsetting data

2014-01-30 Thread craux
Hi!
I have been working a while with mlogit, estimating successfully a series of
models. 
This time I have a series of discrete choice experiments with different
effects to test (var Effect). I use the code below: when I estimate the
first model with "dd" it's work fine. Then I perform subset dd into dd1,
filtering the category of Effect "none", and I get the expected subset of
experiment (1800 rows = choices). When I estimate the model with dd1 I get
the error message "invalid 'row.names' length".
What is wrong?
---
library(mlogit)
dd <- mlogit.data(Dataset, varying = 17:24, choice = "choice", id.var =
"ID", shape = "wide", sep = "_")
summary(m1 <- mlogit(choice ~ price + duration, dd))
dd1 <- subset(dd, dd$Effect=='none')
summary(m1 <- mlogit(choice ~ price + duration, dd1))
--




--
View this message in context: 
http://r.789695.n4.nabble.com/mlogit-message-invalid-row-names-length-after-subsetting-data-tp4684465.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Coeficiente Gini

2014-01-30 Thread Achim Zeileis

On Thu, 30 Jan 2014, Gastón Acosta c. wrote:

I have read quite a lot about the R program. I am not a programmer. I am
interested in obtaining an R program that computes the confidence intervals
and the estimation error of the Gini coefficient, since the calculation of
the coefficient and its plot, as well as the Lorenz curve, can be done by
other computing means. If you can help me with this I will be
grateful. Gastón Acosta C.


Please note that this is an English-speaking mailing list.

The Gini coefficient and Lorenz curve are available in the "ineq" package 
for example. The package does not provide standard errors for the Gini, 
though.
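
One common way to obtain a standard error and confidence interval without a closed-form formula is the nonparametric bootstrap. The sketch below is written in base R under stated assumptions: the incomes are simulated, the function name gini is my own, the Gini formula is the usual uncorrected sample version, and the interval is a simple percentile interval.

```r
# Sample Gini coefficient (uncorrected) from sorted values.
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

set.seed(123)
income <- rlnorm(200, meanlog = 10, sdlog = 0.8)   # simulated incomes

# Resample the data with replacement and recompute the coefficient each time.
boot_gini <- replicate(2000, gini(sample(income, replace = TRUE)))

g_hat  <- gini(income)
se_hat <- sd(boot_gini)                            # bootstrap standard error
ci     <- quantile(boot_gini, c(0.025, 0.975))     # 95% percentile interval

print(c(gini = g_hat, se = se_hat))
print(ci)
```

For survey data with weights or a complex sampling design the resampling scheme would need to reflect that design, which this sketch does not attempt.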


hth,
Z


[R] Coeficiente Gini

2014-01-30 Thread Gastón Acosta c .
I have read quite a lot about the R program. I am not a programmer. I am
interested in obtaining an R program that computes the confidence intervals
and the estimation error of the Gini coefficient, since the calculation of
the coefficient and its plot, as well as the Lorenz curve, can be done by
other computing means. If you can help me with this I will be grateful.
Gastón Acosta C.


[R] [R-pkgs] SPRINT 1.0.5 release (parallelised R functions for complex analysis and large data sets)

2014-01-30 Thread SPRINT Project

Dear All

We have recently released SPRINT v1.0.5. SPRINT provides parallelised  
versions of existing R functions (e.g. statistical, machine learning,  
utility) that often exceed computational limits (speed or memory) with  
large or complex data sets.



SPRINT 1.0.5 now:

-> runs on Mac OSX multi-cores (i.e. you may not need to obtain access  
to compute clusters)


-> provides a parallelised version of the Hamming distance (based on  
original version contained in stringdistmatrix() by M van der Loo, and  
parallelised differently from the option provided therein)


-> compliant with MPI3 (and the latest version of the mpich package)


Download and instructions available at www.r-sprint.org.

SPRINT 1.0.5 is not available on CRAN as we have not yet completed  
testing against the latest major release of R (3.0.x). However, we  
will release a small incremental update of SPRINT as version 1.0.6  
within the next few weeks that will be available via CRAN and will  
work with R3.



Notes:
- With the availability of SPRINT for Mac users, we hope to address  
computational issues that are too large or too slow for these users,  
but still small enough to not require a much larger number of nodes  
and cores (i.e. compute clusters or supercomputers like HECToR or  
ARCHER).
- We have parallelised the Hamming distance as from collaborators and  
course participants we know this to have potential in use for  
Next-Gen-Sequencing, e.g. measuring pairwise distances between  
nucleotide sequences. In principle, though, Hamming is a distance  
useful for string or binary data vectors.




SPRINT Project

spr...@ed.ac.uk
www.r-sprint.org

The University of Edinburgh

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Parsing Complex Text in Single Cell

2014-01-30 Thread arun
Another way would be:
library(qdap)
library(stringr)
x <- scan(what="character",)

 x1 <- c(x,x) 

x1 <- paste(x1,collapse=" ")
 x2 <- gsub('"',"",bracketXtract(x1,"curly"))
 res2 <- 
as.data.frame(str_trim(do.call(rbind,genXtract(paste0(x2,","),":",","))),stringsAsFactors=FALSE)
res2[,1:3] <- lapply(res2[,1:3],as.numeric)
colnames(res2) <- str_trim(genXtract(paste0(",",x2),",",":")[[1]])
row.names(res2) <- 1:nrow(res2)

head(res2,3)
#   trial corr resp_dur  stim        cond
# 1     1    1      799 ←←←←←   congruent
# 2     2    1        0 xx→xx        nogo
# 3     3    0       NA ←←→←← incongruent

A.K.




On Thursday, January 30, 2014 6:37 AM, Rui Barradas  
wrote:
Hello,

Maybe something like the following.


x <- scan(what = "character", text = '
{"trial":1,"corr":1,"resp_dur":799,"stim":"←←←
←←","cond
":"congruent"},{"trial":2,"corr":1,"resp_dur":0,"stim":"xx→
xx","cond":"
nogo"},{"trial":3,"corr":0,"resp_dur":null,"stim":"←←
→←←
","cond":"incongruent"},{"trial":4,"corr":1,"resp_dur":528,"stim":"
→→→→→
","cond":"congruent"},{"trial":5,"corr":0,"resp_dur":null,"
stim":"□□→
□□","cond":"neutral"},{"trial":6,"corr":0,"resp_dur
":574,"stim":"→→←→→
","cond":"incongruent"},{"trial":7,"corr":1,"
resp_dur":541,"stim":"□□→
□□","cond":"neutral"},{"trial":8,"corr
":1,"resp_dur":500,"stim":"□□←
□□","cond":"neutral"},{"trial":9,"
corr":1,"resp_dur":0,"stim":"xx→
xx","cond":"nogo"},{"trial":10,"corr":0,"
resp_dur":637,"stim":"←←→←←
","cond":"incongruent"}]')


x <- paste(x, collapse = ' ')
x <- gsub('"', '', x)
x <- gsub('\\]', '', x)

y <- unlist(strsplit(x, "\\{"))
y <- sub("\\}", "", y)
y <- y[y != ""]
y <- strsplit(y, ",")

fun <- function(x){
    y <- strsplit(x, ":")
    z <- lapply(y, '[[', 2)
    z[1:3] <- lapply(z[1:3], as.numeric)
    z <- as.data.frame(t(unlist(z)))
    z
}

res <- do.call(rbind, lapply(y, fun))
names(res) <- lapply(strsplit(y[[1]], ":"), '[[', 1)
res


Note that the two warnings are ok, they are due to the two values 'null' 
in your data, that are coerced to NA.
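
Since the cell content is JSON (an array of records, although the excerpt above lacks its opening bracket), a JSON parser is another option. The sketch below assumes the jsonlite package is installed and uses a short, well-formed sample with ASCII stand-ins for the arrow symbols; the object names json_txt and trials are hypothetical.

```r
# Small well-formed sample; "<" and ">" stand in for the arrow glyphs.
json_txt <- '[
  {"trial":1,"corr":1,"resp_dur":799,"stim":"<<<<<","cond":"congruent"},
  {"trial":2,"corr":1,"resp_dur":0,"stim":"xx>xx","cond":"nogo"},
  {"trial":3,"corr":0,"resp_dur":null,"stim":"<<><<","cond":"incongruent"}
]'

if (requireNamespace("jsonlite", quietly = TRUE)) {
  # An array of flat records is simplified to a data frame; null becomes NA.
  trials <- jsonlite::fromJSON(json_txt)
  str(trials)
} else {
  message("jsonlite is not installed; install.packages('jsonlite') to try this")
}
```

Numeric fields stay numeric and no manual splitting on braces, commas and colons is needed, which also avoids trouble if a stimulus string ever contains one of those characters.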

Hope this helps,

Rui Barradas

Em 29-01-2014 22:14, Patzelt, Edward escreveu:
> R Experts -
>
> We have a complex problem whereby Qualtrics exported our data into a single
> cell as seen below.
>
> We attempted to parse it using scan() without much success. Hoping to get a
> little nudge here. I've posted the full data set here:
> https://www.dropbox.com/s/e246uiui6jrux6c/CoopandSelfControl_N90_1.24.14_GNGData.csv
>
> {"trial":1,"corr":1,"resp_dur":799,"stim":"←←←←
> ←","cond
> ":"congruent"},{"trial":2,"corr":1,"resp_dur":0,"stim":"xx→xx","cond":"
> nogo"},{"trial":3,"corr":0,"resp_dur":null,"stim":"←←→
> ←←
> ","cond":"incongruent"},{"trial":4,"corr":1,"resp_dur":528,"stim":"
> →→→→→
> ","cond":"congruent"},{"trial":5,"corr":0,"resp_dur":null,"
> stim":"□□→
> □□","cond":"neutral"},{"trial":6,"corr":0,"resp_dur
> ":574,"stim":"→→←→→
> ","cond":"incongruent"},{"trial":7,"corr":1,"
> resp_dur":541,"stim":"□□→
> □□","cond":"neutral"},{"trial":8,"corr
> ":1,"resp_dur":500,"stim":"□□←
> □□","cond":"neutral"},{"trial":9,"
> corr":1,"resp_dur":0,"stim":"xx→
> xx","cond":"nogo"},{"trial":10,"corr":0,"
> resp_dur":637,"stim":"←←→←←
> ","cond":"incongruent"}]

>
>
> Cheers,
>
>
> Edward
>
>
>
>





Re: [R] Inconsistent results between first run of Rprof and next runs of Rprof

2014-01-30 Thread Nathan Uyttendaele
Hello again,

It turns out that I was able to narrow the problem down to this:

Prior to the profiling, I have several packages and a lot of functions to
load, including the C code for the rankCR function in R.

If I load the libraries first and the C functions next

library(Hmisc)
library(copula)
library(igraph)
library(ape)
library(phytools)
library(pcaPP)
if(OS=="Darwin") {
dyn.load("NAC_package/JKendall.so")
dyn.load("NAC_package/Npsi.so")
dyn.load("NAC_package/rankC.so")
} else {
dyn.load("NAC_package/JKendall.dll")
dyn.load("NAC_package/Npsi.dll")
dyn.load("NAC_package/rankC.dll")
}


No problem occurs at all.

But if I load the libraries after the C functions prior to the profiling,
like this


if(OS=="Darwin") {
dyn.load("NAC_package/JKendall.so")
dyn.load("NAC_package/Npsi.so")
dyn.load("NAC_package/rankC.so")
} else {
dyn.load("NAC_package/JKendall.dll")
dyn.load("NAC_package/Npsi.dll")
dyn.load("NAC_package/rankC.dll")
}
library(Hmisc)
library(copula)
library(igraph)
library(ape)
library(phytools)
library(pcaPP)

the problem with rankC occurs.

If I try to reload the C functions and the libraries in the same R session,
the problem disappears permanently. The reason is that the C functions are
loaded once more, but the already loaded libraries are not, and thus the
last thing loaded is the C functions, exactly as in the case described
previously.



Any thoughts on why I should load the libraries first? I already tried to
recompile the C functions with different names to be sure they are not
somehow erased by the C functions in the libraries, but it didn't help.
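
One thing that may be worth checking here is how the symbol is looked up. When .C() is called with a character name and no PACKAGE argument, R has to search the loaded DLLs for that name, so the result can depend on what else is loaded; the PACKAGE argument confines the search to one DLL. The sketch below only inspects the session with base R functions. The commented call at the end is hypothetical and assumes the DLL created by dyn.load("NAC_package/rankC.so") is registered under the name "rankC".

```r
# Names of all DLLs currently loaded in this session (base R ones included).
loaded <- names(getLoadedDLLs())
print(loaded)

# Is a symbol called "rankC" resolvable in any loaded DLL right now?
print(is.loaded("rankC"))

# Hypothetical change inside rankCR(): restrict the lookup to one DLL.
# dens <- .C("rankC", as.integer(sample.size), as.double(vvv), as.double(a),
#            as.double(b), as.double(c),
#            result = as.double(rep(-1, sample.size * 3)),
#            ties = as.double(rep(-1, sample.size * 3)),
#            PACKAGE = "rankC")
```

Comparing the output of getLoadedDLLs() for the two loading orders would show whether the order of the DLL list differs between the slow and the fast case.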


On Thu, Jan 30, 2014 at 8:46 AM, Nathan Uyttendaele wrote:

> Hello Jim, thanks for the reply!
>
> What I'm trying to do:
>
> I have a small function that makes use of many other functions, such as
> the one I called "estimate.NAC.structure.of()".
> I'm trying to make everything run faster. Thus I'm using Rprof to perform
> a line-by-line profiling to help me decide which lines of code I should
> replace. Mostly it's this estimate.NAC.structure.of function that takes
> the time. I could copy and paste this function, but it has more than 1000
> lines of code...
>
> The line #33 and what's around is
>
> rankCR=function(a, b, c) {
> vvv=c(-1, -1, -1)
> dens=.C("rankC", as.integer(sample.size), as.double(vvv), as.double(a),
> as.double(b), as.double(c),
> result=as.double(rep(-1, sample.size*3)), ties=as.double(rep(-1,
> sample.size*3)))  # line 33
> return(list(matrix(dens[["result"]], ncol=3, nrow=sample.size,
> byrow=TRUE), as.numeric(dens[["ties"]])))
> }
>
> This rankCR function is then used in a loop:
>
> pre.r <- rankCR(pseudo.samples[[a]], pseudo.samples[[b]],
> pseudo.samples[[c]])
>
> where pseudo.samples[[whatever]] is just a vector.
>
>
> And so the problem is that this rankCR function eats up more than 50% of
> the execution time according to the first profiling (more precisely, line
> 33 does), yet it does not even show up in the top 30 in the second
> profiling... so it is no longer clear what I need to improve to make the
> code run faster.
>
>
>
> On Wed, Jan 29, 2014 at 10:02 PM, jim holtman  wrote:
>
>> You should at least post the script so that we see what line 33 is.
>> For example, was it an input statement so that on the second time you
>> ran the data was cached in memory?  Did you remove all the objects and
>> do a gc() to clean up memory before trying again (maybe there was some
>> data hanging around that helped out).  It is not unknown to get
>> different results if you are having to access, especially, external
>> data.
>>
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
>>
>> On Wed, Jan 29, 2014 at 8:29 AM, Nathan Uyttendaele 
>> wrote:
>> > Hello,
>> >
>> > when I run this code in a brand new R session
>> >
>> > --
>> > ### loading of libraries and other functions
>> >
>> > Rprof("profiling.out")
>> > start.time=proc.time()[3]
>> > for(i in 1:50) {
>> > main.function()
>> > }
>> > end.time=proc.time()[3]
>> > Rprof()
>> > ---
>> >
>> > I get as result of the profiling
>> >
>> > estimateNACstructureof.R#33  56.690
>> > estimateNACstructureof.R#93  5.220
>> > estimateNACstructureof.R#87  4.970
>> > estimateNACstructureof.R#78  1.940
>> > oldFriedman.R#39  1.260
>> > estimateNACstructureof.R#81  1.210
>> > estimateNACstructureof.R#88  1.200
>> > estimateNACstructureof.R#657  0.960
>> > todecompose.R#34  0.690
>> > estimateNACstructureof.R#658  0.660
>> > estimateNACstructureof.R#264  0.650
>> > estimateNACstructureof.R#923  0.590
>> > aresameNACstructures.R#4  0.560
>> >
>> > estimateNACstructureof.

Re: [R] Controlling font size on code chunk outputs using Knitr

2014-01-30 Thread Jeff Newmiller
This sounds like a classic "you need to write a custom CSS file" problem... 
Which is off-topic here, so is homework for you.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.


Re: [R] Controlling font size on code chunk outputs using Knitr

2014-01-30 Thread Jeff Johnson
Hi Yihui,

The package I have installed is "knitr". To generate the HTML, I run Knit
HTML from within R Studio version .98.490 (there's an icon to initiate it).

Here's a simple example:
showcode <- FALSE
commentchar <- NA

You can load this data as 'mydf'...
dput(mydf)
structure(list(PERSONPROFILE_POS = "DV", PARTY_ID = 95252415L,
PERSON_FIRST_NAME = "Julie", PERSON_LAST_NAME = "herlastname",
PERSON_MIDDLE_NAME = NA_character_,
PARTY_NUMBER = 49229698L, ACCOUNT_NUMBER = 104205066L, ABILITEC_LINK =
25455695,
ADDRESS1 = "332 SE SOME RD", ADDRESS2 = NA_character_,
ADDRESS3 = NA_character_, ADDRESS4 = NA_character_, CITY = "SOMECITY",
COUNTY = "SOMECOUNTY", STATE = "OR", PROVINCE = NA_character_,
POSTAL_CODE = "97111-", COUNTRY = "US", PRIMARY_PER_TYPE = "N",
SELLTOADDR_LOS = "DV", LOCATION_ID = 6222438L, SELLTOADDR_SOS = "DV",
PARTY_SITE_ID = 7292226L, PRIMARYPHONE_CPOS = "DV",
CONTACT_POINT_ID_PCP = 62243903L,
CONTACT_POINT_PURPOSE_PCP = "PERSONAL", PHONE_LINE_TYPE = "GEN",
PRIMARY_FLAG_PCP = "Y", PHONE_COUNTRY_CODE = NA_integer_,
PHONE_AREA_CODE = "244", PHONE_NUMBER = "244", EMAIL_CPOS = "DV",
CONTACT_POINT_ID_ECP = 6202L, CONTACT_POINT_PURPOSE_ECP =
NA_character_,
PRIMARY_FLAG_ECP = "Y", EMAIL_ADDRESS = "someem...@yahoo.com",
BB_PARTY_ID = NA, VALID_COUNTRY = TRUE, VALID_USSTATE = TRUE,
POSTAL_PATTERN = "9-", VALID_USPP = TRUE, FULL_PHONE =
"244244",
FULL_PHONE_PATTERN = "NA99", FNAME_PATTERN = "A",
FNAME_LENGTH = 5L, FNAME_TOKEN_COUNT = 1L, LNAME_LENGTH = 4L,
LNAME_PATTERN = "", MNAME_LENGTH = 2L, MNAME_PATTERN =
NA_character_,
MNAME_TOKEN_COUNT = 1L, LNAME_TOKEN_COUNT = 1L, EMAIL_LENGTH = 19L,
VALID_EMAIL = TRUE), .Names = c("PERSONPROFILE_POS", "PARTY_ID",
"PERSON_FIRST_NAME", "PERSON_LAST_NAME", "PERSON_MIDDLE_NAME",
"PARTY_NUMBER", "ACCOUNT_NUMBER", "ABILITEC_LINK", "ADDRESS1",
"ADDRESS2", "ADDRESS3", "ADDRESS4", "CITY", "COUNTY", "STATE",
"PROVINCE", "POSTAL_CODE", "COUNTRY", "PRIMARY_PER_TYPE", "SELLTOADDR_LOS",
"LOCATION_ID", "SELLTOADDR_SOS", "PARTY_SITE_ID", "PRIMARYPHONE_CPOS",
"CONTACT_POINT_ID_PCP", "CONTACT_POINT_PURPOSE_PCP", "PHONE_LINE_TYPE",
"PRIMARY_FLAG_PCP", "PHONE_COUNTRY_CODE", "PHONE_AREA_CODE",
"PHONE_NUMBER", "EMAIL_CPOS", "CONTACT_POINT_ID_ECP",
"CONTACT_POINT_PURPOSE_ECP",
"PRIMARY_FLAG_ECP", "EMAIL_ADDRESS", "BB_PARTY_ID", "VALID_COUNTRY",
"VALID_USSTATE", "POSTAL_PATTERN", "VALID_USPP", "FULL_PHONE",
"FULL_PHONE_PATTERN", "FNAME_PATTERN", "FNAME_LENGTH", "FNAME_TOKEN_COUNT",
"LNAME_LENGTH", "LNAME_PATTERN", "MNAME_LENGTH", "MNAME_PATTERN",
"MNAME_TOKEN_COUNT", "LNAME_TOKEN_COUNT", "EMAIL_LENGTH", "VALID_EMAIL"
), row.names = 1L, class = "data.frame")

You can load that dataset, then:
Print the column names
```{r, echo=showcode, comment=commentchar}
colnames(mydf)
```
The resulting font is a couple of points larger than I'd like. I'd like to
be able to control this either globally or at the code chunk level.

Thanks for your help with this!


On Wed, Jan 29, 2014 at 5:57 PM, Yihui Xie  wrote:

> Please provide a minimal example -- are you using R Markdown or R
> HTML? Both can produce HTML output:
> http://yihui.name/knitr/demo/minimal/
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Wed, Jan 29, 2014 at 10:49 AM, Jeff Johnson 
> wrote:
> > Hi there,
> > I'm currently using knitr to generate an html file, however the output of
> > my code is in a font size that's larger than I desire. I've been looking
> > through various options for controlling the font size of the code results,
> > such as the knitr manual, opts_chunk, and latex.
> >
> > The actual code itself is not being outputted as desired (I set echo=FALSE
> > intentionally). However, I wish to make the results of executing the code a
> > couple of font sizes smaller. I'll likely wish to have all code output
> > chunks be smaller, so a global setting is fine, though I would also
> > appreciate understanding how to control it at the chunk level as well.
> >
> > Does anyone have a recommendation on how to do this? Lots of discussion on
> > Google, but I don't see any tangible results. I'm still pretty new to R
> > however.
> >
> > Thanks in advance.
> > --
> > Jeff
> >
>



-- 
Jeff

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] print.matrix

2014-01-30 Thread Göran Broström

In the documentation of 'prmatrix' (base) I read, under Details:

‘prmatrix’ is an earlier form of ‘print.matrix’

but 'print' doesn't seem to have a 'matrix' method. And in the 
'Examples' section:


chm <- matrix(...
chm # uses print.matrix()

Is this a bug in the documentation?
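
A quick way to check this, as a minimal sketch in base R. The contents of chm are made up here, since the documentation example is elided above; only the fact that it is a matrix matters.

```r
# Look for an S3 method print.matrix; optional = TRUE gives NULL, not an error
pm <- getS3method("print", "matrix", optional = TRUE)
print(is.null(pm))

chm <- matrix(1:6, nrow = 2)   # made-up contents, only the class matters here
out_auto    <- capture.output(chm)                  # what auto-printing shows
out_default <- capture.output(print.default(chm))   # explicit default method
print(identical(out_auto, out_default))
```

If no print.matrix method is found and both outputs agree, the matrix is being printed by print.default rather than by a dedicated matrix method.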

R-3.0.2 on ubuntu 13.10.

Göran Broström



[R] package mgcv - predict with bam: Error in X[ind, ] : subscript out of bounds

2014-01-30 Thread Katharina May
Dear R-Community,

I'm trying to apply the mgcv package to fill gaps in sensor data from
different sites (9 sites, 2 sensors per site) and to do the filling
site by site.
Based on 
http://r.789695.n4.nabble.com/mgcv-gamm-predict-to-reflect-random-s-effects-td3622738.html
my model looks like this:
 xylemRohWeekXnn.fit.bam  <- bam(sensor1 ~ sensor2 + s(site, bs="re")
+ s(site, NthSampling, bs="re") ,  data=xylemRohWeekXnn2011,
na.action=na.omit)

However, when I try to use predict, I get an error:
gapData <- xylemRohWeekXnn2011[is.na(xylemRohWeekXnn2011[,2]) &
!is.na(xylemRohWeekXnn2011[,11]),c(2:3,6:7, 11)]
xylemRohWeekXnnSite.fit <-
predict.gam(xylemRohWeekXnn.fit.bam,gapData, type="response", se=F)
Error in X[ind, ] : subscript out of bounds
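
One thing that could be checked before calling predict is whether the prediction data contains every predictor named in the model formula and whether its factor levels were seen during fitting. The sketch below uses small made-up data frames (train and gap are hypothetical stand-ins, not the real xylemRohWeekXnn2011 data), and it is only a diagnostic under the assumption that a mismatch of this kind is involved; it may not be the cause of the error above.

```r
# Model formula as posted; all.vars() only parses it, so mgcv is not needed here
f <- sensor1 ~ sensor2 + s(site, bs = "re") + s(site, NthSampling, bs = "re")
needed <- setdiff(all.vars(f), "sensor1")   # predictors the model refers to

# Hypothetical stand-ins for the fitting data and the prediction data
train <- data.frame(sensor2 = c(1.2, 3.4, 2.2),
                    site = factor(c("A", "B", "C")),
                    NthSampling = 1:3)
gap   <- data.frame(sensor2 = c(2.0, 2.5),
                    site = factor(c("A", "D")))

# Predictors absent from the prediction data
missing_vars <- setdiff(needed, names(gap))
print(missing_vars)

# Factor levels in the prediction data that the fitting data never saw
new_levels <- setdiff(levels(gap$site), levels(train$site))
print(new_levels)
```

With the real data, gap would be gapData and train would be xylemRohWeekXnn2011; empty results from both checks would rule out this particular explanation.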

I was hoping that someone might be able to provide a quick hint as to whether
there is an obvious problem or mistake in my model declaration or approach.
I have attached the sessionInfo() output below, and the xylemRohWeekXnn2011
dump can be downloaded here:
https://webdisk.ads.mwn.de/Handlers/AnonymousDownload.ashx?folder=1a7cbaa4&path=xylemRohWeekXnn2011.txt
I would appreciate any help and hints!

Thank you very much, Katharina

-
sessionInfo()
-
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets
methods   base

other attached packages:
 [1] mgcv_1.7-27     plyr_1.8        ggplot2_0.9.3.1 lattice_0.20-24
gdata_2.13.2    nlme_3.1-113
 [7] zoo_1.7-10      xlsx_0.5.5      xlsxjars_0.5.0  rJava_0.9-6

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.4
grid_3.0.2         gtable_0.1.2
 [6] gtools_3.2.1       labeling_0.2       MASS_7.3-29
Matrix_1.1-2       munsell_0.4.2
[11] proto_0.3-10       RColorBrewer_1.0-5 reshape2_1.2.2
scales_0.2.3       stringr_0.6.2
[16] tools_3.0.2



Re: [R] Parsing Complex Text in Single Cell

2014-01-30 Thread Rui Barradas

Hello,

Maybe something like the following.


x <- scan(what = "character", text = '
{"trial":1,"corr":1,"resp_dur":799,"stim":"←←
←←←","cond
":"congruent"},{"trial":2,"corr":1,"resp_dur":0,"stim":"xx
→xx","cond":"
nogo"},{"trial":3,"corr":0,"resp_dur":null,"stim":"←
←→←←

","cond":"incongruent"},{"trial":4,"corr":1,"resp_dur":528,"stim":"
→→→→
→","cond":"congruent"},{"trial":5,"corr":0,"resp_dur":null,"
stim":"□□
→□□","cond":"neutral"},{"trial":6,"corr":0,"resp_dur
":574,"stim":"→→←→
→","cond":"incongruent"},{"trial":7,"corr":1,"
resp_dur":541,"stim":"□□
→□□","cond":"neutral"},{"trial":8,"corr
":1,"resp_dur":500,"stim":"□□
←□□","cond":"neutral"},{"trial":9,"
corr":1,"resp_dur":0,"stim":"xx
→xx","cond":"nogo"},{"trial":10,"corr":0,"
resp_dur":637,"stim":"←←→←
←","cond":"incongruent"}]')



x <- paste(x, collapse = ' ')
x <- gsub('"', '', x)
x <- gsub('\\]', '', x)

y <- unlist(strsplit(x, "\\{"))
y <- sub("\\}", "", y)
y <- y[y != ""]
y <- strsplit(y, ",")

fun <- function(x){
y <- strsplit(x, ":")
z <- lapply(y, '[[', 2)
z[1:3] <- lapply(z[1:3], as.numeric)
z <- as.data.frame(t(unlist(z)))
z
}

res <- do.call(rbind, lapply(y, fun))
names(res) <- lapply(strsplit(y[[1]], ":"), '[[', 1)
res


Note that the two warnings are ok, they are due to the two values 'null' 
in your data, that are coerced to NA.
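
A small illustration of where those warnings come from, with made-up values in the same shape as the resp_dur field (the vector vals is hypothetical):

```r
vals <- c("799", "0", "null")   # resp_dur values as character strings
# "null" cannot be parsed as a number, so as.numeric() returns NA for it
# and emits a coercion warning, which is silenced here
num <- suppressWarnings(as.numeric(vals))
print(num)
print(is.na(num))
```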


Hope this helps,

Rui Barradas

On 29-01-2014 22:14, Patzelt, Edward wrote:

R Experts -

We have a complex problem whereby Qualtrics exported our data into a single
cell as seen below.

We attempted to parse it using scan() without much success. Hoping to get a
little nudge here. I've posted the full data set here:
https://www.dropbox.com/s/e246uiui6jrux6c/CoopandSelfControl_N90_1.24.14_GNGData.csv

{"trial":1,"corr":1,"resp_dur":799,"stim":"←←←←
←","cond
":"congruent"},{"trial":2,"corr":1,"resp_dur":0,"stim":"xx→xx","cond":"
nogo"},{"trial":3,"corr":0,"resp_dur":null,"stim":"←←
→←←
","cond":"incongruent"},{"trial":4,"corr":1,"resp_dur":528,"stim":"
→→→→
→","cond":"congruent"},{"trial":5,"corr":0,"resp_dur":null,"
stim":"□□
→□□","cond":"neutral"},{"trial":6,"corr":0,"resp_dur
":574,"stim":"→→←→
→","cond":"incongruent"},{"trial":7,"corr":1,"
resp_dur":541,"stim":"□□
→□□","cond":"neutral"},{"trial":8,"corr
":1,"resp_dur":500,"stim":"□□
←□□","cond":"neutral"},{"trial":9,"
corr":1,"resp_dur":0,"stim":"xx→xx","cond":"nogo"},{"trial":10,"corr":0,"
resp_dur":637,"stim":"←←→←
←","cond":"incongruent"}]


Cheers,


Edward





[R] Regsubsets n

2014-01-30 Thread wild_manul
Hello

I am trying to run a model where the number of observations is less than
the number of predictors. When I've tried to run the regsubsets on a dummy
dataset of random normally distributed numbers it gives me an error
whenever I set n to be less than p. For instance, when number of
observations=150, number of predictors=151.

My code is as follows:

a <- regsubsets(X, Y, nbest=1, nvmax=5, really.big=TRUE, method="exhaustive", intercept=FALSE)

X is a matrix with 150 rows and 151 columns. Y is a vector of length 150.
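
For reference, dummy data of this shape can be generated as in the sketch below. The seed and the names n and p are arbitrary choices for illustration, and the regsubsets call itself is left out so that only base R is needed.

```r
set.seed(1)                      # arbitrary seed so the example is repeatable
n <- 150                         # number of observations
p <- 151                         # number of predictors
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
Y <- rnorm(n)
print(dim(X))
print(length(Y))
# With p > n the columns of X cannot all be linearly independent,
# so the rank of X is at most n
print(qr(X)$rank)
```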

It gives the following error:

Reordering variables and trying again:
Error in if (any(index[force.out] == -1)) stop("Can't force the same
variable in and out") :
  missing value where TRUE/FALSE needed

I am new to R and would appreciate your help. Is it possible to run all
subsets selection search when n < p?

http://r.789695.n4.nabble.com/Regsubsets-n-p-tp4684438.html