Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
On Fri, Sep 13, 2024 at 1:10 PM Duncan Murdoch wrote:

> On 2024-09-13 8:53 a.m., Jonathan Dushoff wrote:
> >> Message: 4
> >> Date: Thu, 12 Sep 2024 11:21:02 -0400
> >> From: Duncan Murdoch
> >> That's not the correct formula, is it? I think the result should be
> >> x * Conj(y) / Mod(y)^2 .
> > Correct, sorry. And thanks.
> >> So that would involve * and / , not just real arithmetic.
> > Not an expert, but I don't see it. Conj and Mod seem to be numerically
> > straightforward real-like operations. We do those, and then multiply
> > one complex number by one real quotient.
> Are you sure? We aren't dealing with real numbers and complex numbers
> here, we're dealing with those sets extended with infinities and other
> weird things.

Definitely not sure, just thought I would suggest it as a possibility.

> So for example if y is some kind of infinite complex number, then 1/y
> should come out to zero, and if x is finite, the final result of x/y
> should be zero.
> But if we evaluate x/y as (x / Mod(y)^2) * Conj(y), won't we get a NaN
> from zero times infinity?

Yes, and it's not trivial to work around, so probably not worth it.

Thanks,

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
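Duncan's zero-times-infinity concern can be made concrete with a small sketch (my own illustration, not from the thread; the outputs in the infinite case depend on how a given build of R handles complex infinities, so none are asserted there):

```r
# the real-arithmetic route discussed above: compute x / y as
# x * Conj(y) / Mod(y)^2
div_real <- function(x, y) x * Conj(y) / Mod(y)^2

# ordinary finite values: agrees with R's own complex division
div_real(1 + 2i, 3 - 4i)   # -0.2+0.4i, same as (1 + 2i) / (3 - 4i)

# an infinite divisor: mathematically x / y should be 0 for finite x,
# but Mod(y)^2 is Inf, and x * Conj(y) mixes 0 * Inf terms, inviting NaNs
y_inf <- complex(real = Inf, imaginary = 0)
div_real(1 + 0i, y_inf)
```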
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
> Message: 4
> Date: Thu, 12 Sep 2024 11:21:02 -0400
> From: Duncan Murdoch
> That's not the correct formula, is it? I think the result should be
> x * Conj(y) / Mod(y)^2 .

Correct, sorry. And thanks.

> So that would involve * and / , not just real arithmetic.

Not an expert, but I don't see it. Conj and Mod seem to be numerically
straightforward real-like operations. We do those, and then multiply one
complex number by one real quotient.
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
> In this case, I do think we should look into the consequences of indeed
> distinguishing
> *
> and
> /
> from their respective current {1. coerce to complex, 2. use complex arith}
> arithmetic.

I'm wondering whether – if this indeed gets opened up – it might also make
sense to calculate x / y using real arithmetic (as x*y / |y|²)

Jonathan
[R] R
Hi, I found your email on a website. Can I ask some questions about R,
please?

Many thanks,
Jonathan
[R] Best settings for RStudio video recording?
Folks: I was wondering if you all would suggest some helpful RStudio
configurations that make recording a session via e.g. Zoom the most useful
for students doing remote learning. Thoughts?

--j

--
Jonathan A. Greenberg, PhD
Randall Endowed Professor and Associate Professor of Remote Sensing
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Natural Resources & Environmental Science
University of Nevada, Reno
1664 N Virginia St MS/0186, Reno, NV 89557
Phone: 415-763-5476
https://www.gearslab.org/
[R] Checking for a proper "stop" statement...
Folks: Consider the following two use cases:

goodfunction <- function() { stop("Something went wrong...") }
# vs.
badfunction <- function() { notgood() }

Is there a way for me to test if the functions make use of a stop()
statement WITHOUT modifying the stop() output (assume I can't mod the
function containing the stop() statement itself)? For "goodfunction" the
answer is TRUE, for "badfunction" the answer is FALSE. Both return an
error, but only one does it "safely". I thought the answer might lie in a
tryCatch statement but I'm having a hard time figuring out how to do this
test.

--j

--
Jonathan A. Greenberg, PhD
Randall Endowed Professor and Associate Professor of Remote Sensing
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Natural Resources & Environmental Science
University of Nevada, Reno
1664 N Virginia St MS/0186, Reno, NV 89557
Phone: 415-763-5476
http://www.unr.edu/nres
Gchat: jgrn...@gmail.com, Skype: jgrn3007
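One possible approach (my own sketch, not an answer from the list): rather than catching the error at run time, statically walk the function body and report whether stop() is called anywhere in it. This only detects literal stop() calls, not conditions signalled by functions the body calls.

```r
# walk the unevaluated body of f and return TRUE if any call is to stop()
uses_stop <- function(f) {
  found <- FALSE
  walk <- function(e) {
    if (is.call(e)) {
      if (identical(e[[1]], as.name("stop"))) found <<- TRUE
      for (a in as.list(e)[-1]) walk(a)   # recurse into the arguments
    }
  }
  walk(body(f))
  found
}

goodfunction <- function() { stop("Something went wrong...") }
badfunction  <- function() { notgood() }

uses_stop(goodfunction)  # TRUE
uses_stop(badfunction)   # FALSE
```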
[R] [R-pkgs] New package: rDotNet
I’ve published a package on CRAN called ‘rDotNet’. rDotNet allows R to
access .NET libraries. From R one can:

* create .NET objects
* call member functions
* call class functions (i.e. static members)
* access and set properties
* access indexing members

The package will run with either mono on OS X / Linux or the Microsoft
.NET VM on Windows.

Find the source and description of the package at:
https://github.com/tr8dr/.Net-Bridge/blob/master/src/R/rDotNet/

And the CRAN link:
https://cran.r-project.org/web/packages/rDotNet/index.html

The package is stable and has been in use for some years, but is only now
packaged up for public use on CRAN. Feel free to contact me with questions
or suggestions on GitHub or by email.

Regards
--
Jonathan Shore

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Odd results from rpart classification tree
Thanks Terry! I managed to figure that out shortly after posting (as is
the way!). Adding an additional covariate that splits below one of the x
branches but not the other, and pushes the class proportion over 0.5,
means the x split is retained.

However, I now have another conundrum, this time with rpart in anova mode...

library(rpart)
test_split <- function(offset) {
  y <- c(rep(0,10), rep(0.5,2)) + offset
  x <- c(rep(0,10), rep(1,2))
  if (is.null(rpart(y ~ x, minsplit=1, cp=0, xval=0)$splits)) 0 else 1
}
sum(replicate(1000, test_split(0)))    # 1000, i.e. always splits
sum(replicate(1000, test_split(0.5)))  # 2-12, i.e. splits only sometimes...

Adding a constant to y and getting different trees is a bit strange,
particularly stochastically. Will see if I can track down a copy of the
CART book.

Jonathan

From: Therneau, Terry M., Ph.D. [thern...@mayo.edu]
Sent: 16 May 2017 00:43
To: r-help@r-project.org; Marshall, Jonathan
Subject: Re: Odd results from rpart classification tree

You are mixing up two of the steps in rpart: 1. how to find the best
candidate split, and 2. evaluation of that split. With the "class" method
we use the information or Gini criterion for step 1. The code finds a
worthwhile candidate split at 0.5 using exactly the calculations you
outline. For step 2 the criterion is the "decision theory" loss. In your
data the estimated rate is 0 for the left node and 15/45 = .333 for the
right node. As a decision rule both predict y=0 (since both are < 1/2).
The split predicts 0 on the left and 0 on the right, so does nothing.

The CART book (Breiman, Friedman, Olshen and Stone), on which rpart is
based, highlights the difference between odds-regression (for which the
final prediction is a percent, and error is Gini) and classification. For
the former, treat y as continuous.

Terry T.

On 05/15/2017 05:00 AM, r-help-requ...@r-project.org wrote:
> The following code produces a tree with only a root. However, clearly
> the tree with a split at x=0.5 is better. rpart doesn't seem to want to
> produce it.
>
> Running the following produces a tree with only root.
>
> y <- c(rep(0,65), rep(1,15), rep(0,20))
> x <- c(rep(0,70), rep(1,30))
> f <- rpart(y ~ x, method='class', minsplit=1, cp=0.0001,
>            parms=list(split='gini'))
>
> Computing the improvement for a split at x=0.5 manually:
>
> n <- length(y)
> obs_L <- y[x<.5]
> obs_R <- y[x>.5]
> n_L <- sum(x<.5)
> n_R <- sum(x>.5)
> gini <- function(p) {sum(p*(1-p))}
> impurity_root <- gini(prop.table(table(y)))
> impurity_L <- gini(prop.table(table(obs_L)))
> impurity_R <- gini(prop.table(table(obs_R)))
> impurity <- impurity_root * n - (n_L*impurity_L + n_R*impurity_R) # 2.880952
>
> Thus, an improvement of 2.88 should result in a split. It does not.
>
> Why?
>
> Jonathan
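Terry's two-step distinction can be seen directly with the data from the original post (a small sketch of my own): with method="class" the candidate split survives step 1 but is discarded at step 2 because both children would predict class 0, while treating y as continuous, as he suggests, keeps the split.

```r
library(rpart)

y <- c(rep(0, 65), rep(1, 15), rep(0, 20))
x <- c(rep(0, 70), rep(1, 30))

# "class" method: the decision-theory evaluation discards the split,
# since both children predict class 0
f_class <- rpart(y ~ x, method = "class", minsplit = 1, cp = 0.0001,
                 parms = list(split = "gini"))
is.null(f_class$splits)   # TRUE: root-only tree

# treat y as continuous: the split at x = 0.5 is retained
f_anova <- rpart(y ~ x, method = "anova", minsplit = 1, cp = 0.0001)
is.null(f_anova$splits)   # FALSE
```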
[R] Odd results from rpart classification tree
The following code produces a tree with only a root. However, clearly the
tree with a split at x=0.5 is better. rpart doesn't seem to want to
produce it.

Running the following produces a tree with only root.

y <- c(rep(0,65), rep(1,15), rep(0,20))
x <- c(rep(0,70), rep(1,30))
f <- rpart(y ~ x, method='class', minsplit=1, cp=0.0001,
           parms=list(split='gini'))

Computing the improvement for a split at x=0.5 manually:

n <- length(y)
obs_L <- y[x<.5]
obs_R <- y[x>.5]
n_L <- sum(x<.5)
n_R <- sum(x>.5)
gini <- function(p) {sum(p*(1-p))}
impurity_root <- gini(prop.table(table(y)))
impurity_L <- gini(prop.table(table(obs_L)))
impurity_R <- gini(prop.table(table(obs_R)))
impurity <- impurity_root * n - (n_L*impurity_L + n_R*impurity_R) # 2.880952

Thus, an improvement of 2.88 should result in a split. It does not.

Why?

Jonathan
[R] [R-pkgs] New package: ggghost 0.1.0 - Capture the spirit of your ggplot2 calls
Greetings, R users!

I am pleased to announce the release of my first CRAN package: ggghost.

https://cran.r-project.org/web/packages/ggghost
https://github.com/jonocarroll/ggghost

Features:

- Minimal user-space overhead for implementation; p %g<% ggplot(dat, aes(x,y))
- ggplot2 components added to the plot object (p <- p + geom_point()) are
  stored in a list within p, and evaluation delayed
- The incoming data is captured and retained for reproducibility
- The list of calls can be added to (+), subtracted from (-, via regex),
  and subset
- The list of calls can be inspected (via summary)
- The data and calls can be recovered from the object p even if removed
  from the workspace.

Provides a solution to a question posed here:
https://twitter.com/JennyBryan/status/755417584359632896

Whether the pun name or the R code came first is a secret that dies with
me. I welcome any feedback or suggestions you may have.

Kind regards,

- Jonathan Carroll.
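Based only on the calls named in the feature list above, a minimal usage sketch might look like this (the exact printing/recovery behaviour is the package's, not shown here):

```r
library(ggplot2)
library(ggghost)

dat <- data.frame(x = 1:10, y = (1:10)^2)

# capture the ggplot call instead of evaluating it immediately
p %g<% ggplot(dat, aes(x, y))
p <- p + geom_point()   # the component is stored, evaluation delayed

summary(p)              # inspect the list of stored calls
p                       # printing finally evaluates the accumulated calls
```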
[R] 2x2x2 rm ANOVA, varying results
Hello,

I ran a 2x2x2 repeated measures ANOVA which turned out fine:

                    Df Sum Sq Mean Sq F value  Pr(>F)
Attend               1 0.5540 0.55402  7.0374 0.01079 *
PercGrp              1 0.0058 0.00580  0.0737 0.78719
Pres                 1 0.1794 0.17944  2.2794 0.13766
Attend:PercGrp       1 0.0017 0.00172  0.0218 0.88324
Attend:Pres          1 0.0189 0.01894  0.2406 0.62598
PercGrp:Pres         1 0.0534 0.05344  0.6789 0.41405
Attend:PercGrp:Pres  1 0.0046 0.00464  0.0590 0.80912
Residuals           48 3.7788 0.07872

However, when I run the main effects alone (from the same dataset), I get
a different set of results than what was originally shown, e.g.

> anova(lm(main ~ Attend + PercGrp + Pres))
Analysis of Variance Table

Response: main
          Df Sum Sq Mean Sq F value   Pr(>F)
Attend     1 0.5540 0.55402  7.4682 0.008561 **
PercGrp    1 0.0058 0.00580  0.0782 0.780849
Pres       1 0.1794 0.17944  2.4189 0.125942
Residuals 52 3.8575 0.07418

I also get different results when I run the interactions alone too.
Curious to know why this is.

Thanks,
Jon
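The df pattern above (48 vs 52 residual df) can be reproduced with made-up balanced data (a sketch only; it uses a plain between-subjects lm rather than the poster's repeated-measures fit, just to show the mechanics): dropping the interaction terms pools their sums of squares into the residual, which changes the residual mean square that every F ratio is divided by.

```r
# hypothetical balanced 2x2x2 data with 56 observations
set.seed(1)
d <- data.frame(A = gl(2, 28), B = gl(2, 14, 56), C = gl(2, 7, 56),
                y = rnorm(56))

full <- anova(lm(y ~ A * B * C, data = d))  # with all interactions
main <- anova(lm(y ~ A + B + C, data = d))  # main effects only

# identical sequential (Type I) sums of squares for the main effects...
all.equal(full["A", "Sum Sq"], main["A", "Sum Sq"])   # TRUE

# ...but different residual df (48 vs 52) and residual mean square,
# hence different F values and p-values for the same effects
c(full["Residuals", "Df"], main["Residuals", "Df"])
```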
[R] Problems with pooling Multiply Imputed datasets of a multilevel logistic model, using MICE
I am having problems with the MICE package in R, particularly with
pooling the imputed data sets. I am running a multilevel binomial logistic
regression, with Level 1 - topic (participant response to 10 questions on
different topics, e.g. T_Darkness, T_Day) nested within Level 2 -
individuals. The model is created using R2MLwiN; the formula is

fit1 <- runMLwiN(c(probit(T_Darkness, cons), probit(T_Day, cons),
                   probit(T_Light, cons), probit(T_Night, cons),
                   probit(T_Rain, cons), probit(T_Rainbows, cons),
                   probit(T_Snow, cons), probit(T_Storms, cons),
                   probit(T_Waterfalls, cons), probit(T_Waves, cons)) ~ 1,
                 D = c("Mixed", "Binomial", "Binomial", "Binomial",
                       "Binomial", "Binomial", "Binomial", "Binomial",
                       "Binomial", "Binomial", "Binomial"),
                 estoptions = list(EstM = 0), data = data)

Unfortunately, there is missing data in all of the Level 1 (topic)
responses. I have been using the mice package (CRAN) to multiply impute
the missing values. I can fit the model to the imputed datasets, using

fitMI <- with(MI.Data, runMLwiN(c(probit(T_Darkness, cons),
                  probit(T_Day, cons), probit(T_Light, cons),
                  probit(T_Night, cons), probit(T_Rain, cons),
                  probit(T_Rainbows, cons), probit(T_Snow, cons),
                  probit(T_Storms, cons), probit(T_Waterfalls, cons),
                  probit(T_Waves, cons)) ~ 1,
                D = c("Mixed", "Binomial", "Binomial", "Binomial",
                      "Binomial", "Binomial", "Binomial", "Binomial",
                      "Binomial", "Binomial", "Binomial"),
                estoptions = list(EstM = 0), data = data))

However, when I come to pool the analyses with

pool(fitMI)

it fails, with the error:

Error in pool(with(tempData, runMLwiN(c(probit(T_Darkness, cons),
probit(T_Day, :  Object has no coef() method.

I am not sure why it is saying there is no coefficient, as the analyses
of the individual MI datasets provide both fixed parts (coefficients) and
random parts (covariances). Any help with what is going wrong would be
much appreciated.
I should warn you that this is my first foray into using R and multilevel
modelling. Also I know there is an MLwiN package ([REALCOM][2]) that can
do this, but I don't have the background to use the MLwiN software
outside of R.

Thanks,
Johnny

R reproducible example

Libraries used:

library(R2MLwiN)
library(mice)

Subset of data:

T_Darkness <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0, NA, 0, 0, 0, NA, 1, 0, NA,
  NA, 1, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, NA, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, NA, 1, 0)
T_Day <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0,
  NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, NA, 0, 0,
  0, 0, NA, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, NA, NA, 0)
T_Light <- c(0, 0, NA, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1,
  0, 0, 0, 1, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 1, NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0)
T_Night <- c(0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
  0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, NA, 0, NA, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0)
T_Rain <- c(1, 0, 0, 1, 1, 0, 0, NA, 0, 1, 0, 0, 1, 0, 0, 0, 0, NA, 0, 0,
  1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, NA, 0, 0, 0, 0,
  1, 0, 0, 0, NA, 1, NA, 0, 0, 0, 0, 1, NA, 1, 0, 0, 0, 0, 1, NA, 0, 0)
T_Rainbows <- c(1, 1, 1, 1, 0, 1, 0, 1, 0, 1, NA, 1, 1, 0, 0, 1, 0, NA, 0,
  1, 0, NA, 0, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0, NA, 1, 1, 1, 0, 0, 1, 1, 0,
  0, 0, 0, 0, 1, 0, 1, 1, 1, 1, NA, 1, 0, 1, NA, 0, 0, 1, 0, 1, 1, 1, 0, 1)
T_Snow <- c(0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, NA, 0, 0, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 1, 1, 0, 0, 0, NA, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
  0, 0, 0, 0, NA, 0, 0, 1, NA, 1, 0, 1, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0)
T_Storms <- c(0, 0, 0, 1, 1, 1, 0, 1, 0, 1, NA, 0, 0, 0, 0, 1, 0, NA, 0,
  0, 1, 0, 0, NA, 1, 1, NA, 0, 0, NA, 0, 1, 0, NA, 1, 0, 1, 0, 0, 0, 0, 0,
  0, 1, 0, 0, 1, 0, 0, 0, 1, 0, NA, 1, 0, NA, 0, 0, 0, 1, 1, 0, 1, NA, NA, 1)
T_Waterfalls <- c(0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0,
  0, 0, 1, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
  NA, 0, 0, 0, 0, 0, NA, 0, 1, 0, NA, 1, 0, 1, 0, 0, 0, NA, 0, 0, 0, NA, NA, 0)
T_Waves <- c(0, 1, 0, 1, 1, 0, 1, NA, 0, 0, NA, 0, 0, 0, NA, 1, 0, 0, 0,
  0, 1, 0, NA, 0, NA, 0, 0, NA, 0, 0, 0, 0, 0, 0, NA, 1, 0, 0, 0, 1, 0, 0,
  NA, 0, 1, 0, 0, 0, 0, 0, 1, 1, NA, 1, 1, NA, 0, 0, 0, NA, 0, 0, 0, NA, 0, 0)

data <- data.frame(T_Darkness, T_Day, T_Light, T_Night, T_Rain,
                   T_Rainbows, T_Snow, T_Storms, T_Waterfalls, T_Waves)
data$cons <- 1

Data imputed using mice with:

MI.Data <- mice(data, m = 5, maxit = 50, meth = 'pmm', seed = 500)
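mice's pool() needs the fitted objects to supply coef() and vcov() methods, which the runMLwiN results apparently lack. One workaround (a sketch only; how you extract each fit's estimates and squared standard errors from the runMLwiN objects is left as an assumption to verify) is to combine each parameter by hand with Rubin's rules:

```r
# Rubin's rules for combining one parameter across m imputed-data fits.
# `est` is the vector of m point estimates; `within` the m squared SEs.
pool_rubin <- function(est, within) {
  m     <- length(est)
  qbar  <- mean(est)                 # pooled point estimate
  ubar  <- mean(within)              # average within-imputation variance
  b     <- var(est)                  # between-imputation variance
  total <- ubar + (1 + 1/m) * b      # total variance
  c(estimate = qbar, se = sqrt(total))
}

# toy usage with five hypothetical imputations' estimates and squared SEs
est    <- c(0.51, 0.48, 0.55, 0.50, 0.47)
within <- c(0.04, 0.05, 0.04, 0.05, 0.04)
pool_rubin(est, within)
```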
[R] R Licensing Question
Hello,

I have found a list of all software licenses supported by CRAN at the
following site:

https://svn.r-project.org/R/trunk/share/licenses/license.db

There is also the list of commonly used licenses here:

https://cran.r-project.org/web/licenses/

I have tried to read through some of these licenses, but I am not a
lawyer and some of the legal jargon is difficult to get through. I have a
simple question: are there any packages available on CRAN whose license
requires that every use of the package (e.g. in an analysis) be made open
source as well? I have never heard of this being the case, and it does
not appear to be true for any of the most commonly used licenses, but
from what I understand it would be possible for someone to create a
license with this requirement.

I apologize for the mass email if this is not the best forum for this
question, but I could not find an answer elsewhere.

Thank you,
Jonathan

__
Jonathan Gellar
Statistician
Mathematica Policy Research
1100 First Street NE, 12th Floor
Washington, DC 20002
Re: [R] Replace NaN with value from the same row
Ok, I will do, thanks for your help.

J

> Subject: RE: [R] Replace NaN with value from the same row
> From: jdnew...@dcn.davis.ca.us
> Date: Sun, 18 Oct 2015 12:55:14 -0700
> To: jonathanrear...@outlook.com
> CC: r-help@r-project.org
>
> You should (re-)read the intro document that comes with R, "An
> Introduction to R". Pay particular attention to sections 2., 2.7, and 5.2.
>
> The "idx" variable that I defined is a vector in the current environment
> (in your case apparently a local function environment). It is not a
> column in your data frame. You should look at it using the str function.
> (You might need to print the result of str, or use the debug capability
> of R to single-step through your function and then use str. Read the
> help at ?debug.)
>
> The df[ idx, "offset" ] notation uses the logical indexing and string
> indexing concepts in section 2.7 to select a subset of the rows and one
> column of the data frame.
> ---
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---
> Sent from my phone. Please excuse my brevity.
>
> On October 18, 2015 12:24:42 PM PDT, Jonathan Reardon wrote:
> > Hi, sorry to be a pain. Would you be kind enough to briefly explain
> > what the lines are doing? From what I can gather,
> > 'idx <- is.na( df$mean )' is making a new column called 'idx', finds
> > the NaN values and inserts the boolean TRUE in the respective cell.
> > df[ idx, "mean" ] <- df[ idx, "offset" ]  << I am unsure what this
> > is doing exactly.
> > Jon
> >
> >> Subject: RE: [R] Replace NaN with value from the same row
> >> From: jdnew...@dcn.davis.ca.us
> >> Date: Sun, 18 Oct 2015 12:09:02 -0700
> >> To: jonathanrear...@outlook.com
> >>
> >> The Posting Guide mentioned at the bottom of every email in the list
> >> tells you that such an option is in your email software, which I know
> >> nothing about. Most software lets you choose the format as part of
> >> composing each email, but some software will let you set a default
> >> format to use for each email address (so all your emails to e.g.
> >> r-help@r-project.org will be plain text).
> >>
> >> On October 18, 2015 11:29:51 AM PDT, Jonathan Reardon wrote:
> >> > How do I send an email in plain text format and not HTML?
> >> > I tried:
> >> > idx <- is.na( df$mean )
> >> > df[ idx, "mean" ] <- df[ idx, "offset" ]
> >> > I got the error message:
> >> > In is.na(df$mean) : is.na() applied to non-(list or vector) of type
> >> > 'NULL'
> >> > Jon
> >> >
> >> >> Subject: Re: [R] Replace NaN with value from the same row
> >> >> From: jdnew...@dcn.davis.ca.us
> >> >> Date: Sun, 18 Oct 2015 11:06:44 -0700
> >> >> To: jonathanrear...@outlook.com; r-help@r-project.org
> >> >>
> >> >> Next time send your email using plain text format rather than HTML
> >> >> so we see what you saw.
> >> >>
> >> >> Try
> >> >>
> >> >> idx <- is.na( df$mean )
> >> >> df[ idx, "mean" ] <- df[ idx, "offset" ]
> >> >>
> >> >> BTW there is a commonly-used function called df, so you might
> >> >> improve clarity by using DF for your temporary data frame name.
[R] Replace NaN from 1 column with a value from the same row
Hi everyone,

Ignore my previous post; I realised that the rows and columns I typed into
the email were unreadable, sincere apologies for this.

A simple question, but I cannot figure this out. I have a data frame with
4 columns (onset, offset, outcome, mean):

df <- data.frame(onset   = c(72071, 142598, 293729),
                 offset  = c(72503, 143030, 294161),
                 outcome = c(1, 1, 1),
                 mean    = c(7244615, NaN, 294080))

For each 'NaN' in the mean column, I want to replace that NaN with the
'offset' value in the same row. I tried:

df$mean <- replace(df$mean, is.na(df$mean), df$offset)

but I get the error message: 'number of items to replace is not a
multiple of replacement length'. I'm assuming this is because it is
trying to insert the whole 'offset' column into my one NaN cell. Is this
a correct interpretation of the error message? Can anyone tell me how to
replace any mean-row NaNs with the offset value from that very same row?
I don't want to use any pasting etc. as this needs to be used as part of
a function working over a larger data set than the one shown here.

Cheers,
Jonathan
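The logical-indexing answer given later in this thread can be sketched against exactly this data frame:

```r
df <- data.frame(onset   = c(72071, 142598, 293729),
                 offset  = c(72503, 143030, 294161),
                 outcome = c(1, 1, 1),
                 mean    = c(7244615, NaN, 294080))

idx <- is.na(df$mean)            # TRUE where mean is NaN (is.na() is TRUE for NaN too)
df$mean[idx] <- df$offset[idx]   # copy offset into mean on just those rows
df$mean                          # 7244615 143030 294080
```

Because `idx` is a logical vector recycled over the rows, both sides of the assignment pick out the same rows, so each NaN is replaced by the offset from its own row.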
[R] Replace NaN with value from the same row
Hi everyone,

A simple question, but I cannot figure this out. I have a data frame with
4 columns (onset, offset, outcome, mean):

    onset  offset outcome    mean
8   72071   72503       1 7244615
15 142598  143030       1     NaN
30 293729  294161       1  294080

For each 'NaN' in the mean column, I want to replace that NaN with the
'offset' value in the same row. Intended outcome:

    onset  offset outcome    mean
8   72071   72503       1 7244615
15 142598  143030       1  143030
30 293729  294161       1  294080

I have tried:

df$mean <- replace(df$mean, is.na(df$mean), df$offset)

but I get the error message: 'number of items to replace is not a
multiple of replacement length'. I'm assuming this is because it is
trying to insert the whole 'offset' column into my one NaN cell. Is this
a correct interpretation of the error message? Can anyone tell me how to
replace any mean-row NaNs with the offset value from that very same row?
I don't want to use any pasting etc. as this needs to be used as part of
a function working over a larger dataset than the one shown here.

Cheers,
Jonathan
Re: [R] kknn::predict and kknn$fitted.values
In thinking about this 'problem' last night, I found the 'solution'. Any
NN algorithm needs to keep track of all the data it is given, both X and
Y data, otherwise how could it find and report the nearest neighbour!
When predicting (i.e. predict.kknn) it will find the closest match
(nearest neighbour), which, for a point from the original dataset /is
that point/!

In contrast, the kknn$fitted.values are derived from some cross-validation
approach; likely either finding the nearest point with non-zero distance,
or building a model without that point and seeing where it falls.
Otherwise, it wouldn't be possible to report the accuracy of the model
using only a single dataset.

I will retest the algorithm using a split training/test dataset to better
understand how predict.kknn selects a model from the suite generated by
train.kknn (my original question). I assume it chooses
kknn$best.parameters, but want to verify this.

Hopefully that clarifies the issue. I post here in case future users have
a similar question. Thanks to any who took the time to think about this!

Jonathan
[R] kknn::predict and kknn$fitted.values
I am noticing that there is a difference between the fitted.values
returned by train.kknn, and the values returned using predict with the
same model and dataset. For example:

> data(glass)
> tmp <- train.kknn(Type ~ ., glass, kmax=1, kernel="rectangular", distance=1)
> tmp$fitted.values
[[1]]
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1
 [62] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 2 1 2 2 5 2 2 2 6 2 2 2 2 2 2 2 2 2 2 2 2
[123] 2 2 2 2 3 2 2 2 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 2 3 7 5 5 5 5 5 5 5 5 5 5 2 5 6 6 6 6 6 6 6
[184] 2 6 7 7 2 6 7 7 7 7 7 7 7 7 7 7 7 7 5 7 7 7 7 7 7 7 7 7 7 7 7
attr(,"kernel")
[1] rectangular
attr(,"k")
[1] 1
Levels: 1 2 3 5 6 7

> predict(tmp, glass)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [62] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[123] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6
[184] 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
Levels: 1 2 3 5 6 7

When I check the confusion matrices for these I see that fitted.values is
showing some confusion, that is, as if it were a true fit, whereas
predict is returning the exact answers.

> table(tmp$fitted.values[[1]], glass$Type)

       1  2  3  5  6  7
    1 69  4  0  0  0  0
    2  1 67  2  1  1  1
    3  0  1 15  0  0  0
    5  0  3  0 11  0  1
    6  0  1  0  0  8  1
    7  0  0  0  1  0 26

> table(predict(tmp, glass), glass$Type)

       1  2  3  5  6  7
    1 70  0  0  0  0  0
    2  0 76  0  0  0  0
    3  0  0 17  0  0  0
    5  0  0  0 13  0  0
    6  0  0  0  0  9  0
    7  0  0  0  0  0 29

Can anyone clarify what fitted.values and predict actually do? I would
have expected they would give the same output. Thanks...
Jonathan
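The split-sample retest proposed in the reply above can be sketched like this (my own sketch; it assumes the glass data shipped with kknn, and the split fraction and seed are arbitrary):

```r
library(kknn)

data(glass)   # the dataset used in the question above
set.seed(42)
idx   <- sample(nrow(glass), floor(0.7 * nrow(glass)))
train <- glass[idx, ]
test  <- glass[-idx, ]

fit <- train.kknn(Type ~ ., train, kmax = 1, kernel = "rectangular",
                  distance = 1)

# accuracy on data the model has never seen: an honest estimate, unlike
# predicting back onto the training set (where 1-NN is trivially perfect)
mean(predict(fit, test) == test$Type)
```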
[R] R command to open a file "browser" on Windows and Mac?
Folks: Is there an easy function to open a Finder window (on Mac) or a
Windows Explorer window (on Windows) given an input folder? A lot of
times I want to be able to see my working directory via a file browser.
Is there a good R hack to do this?

--j
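One possible cross-platform sketch (my own, assuming the standard `open` and `xdg-open` shell tools exist on macOS and Linux respectively):

```r
# open `path` in the platform's file browser; defaults to the working dir
open_dir <- function(path = getwd()) {
  sysname <- Sys.info()[["sysname"]]
  if (sysname == "Windows") {
    shell.exec(path)                    # Windows Explorer
  } else if (sysname == "Darwin") {
    system2("open", shQuote(path))      # macOS Finder
  } else {
    system2("xdg-open", shQuote(path))  # most Linux desktop environments
  }
  invisible(path)
}
```

Usage: `open_dir()` pops up the current working directory; `open_dir("~/Documents")` opens a specific folder.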
Re: [R] Correlation question
Of course! Thank you, I knew I was missing something painfully obvious. Its seems, then, that this line 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2) is finding something other than the traditional correlation. I found this in a lecture introducing correlation, but , now, I'm not sure what it is. It does do a better job of showing that the fitted.wrong variable is not a good prediction of the distance. On Feb 21, 2015, at 4:36 PM, Kehl Dániel wrote: > Hi, > > try > > cor(fitted.right,fitted.wrong) > > should give 1 as both are a linear function of speed! Hence > cor(cars$dist,fitted.right)^2 and cor(x=cars$dist,y=fitted.wrong)^2 must be > the same. > > HTH > d > > Feladó: R-help [r-help-boun...@r-project.org] ; meghatalmazó: Jonathan > Thayn [jth...@ilstu.edu] > Küldve: 2015. február 21. 22:42 > To: r-help@r-project.org > Tárgy: [R] Correlation question > > I recently compared two different approaches to calculating the correlation > of two variables, and I cannot explain the different results: > > data(cars) > model <- lm(dist~speed,data=cars) > coef(model) > fitted.right <- model$fitted > fitted.wrong <- -17+5*cars$speed > > > When using the OLS fitted values, the lines below all return the same R2 > value: > > 1-sum((cars$dist-fitted.right)^2)/sum((cars$dist-mean(cars$dist))^2) > cor(cars$dist,fitted.right)^2 > (sum((cars$dist-mean(cars$dist))*(fitted.right-mean(fitted.right)))/(49*sd(cars$dist)*sd(fitted.right)))^2 > > > However, when I use my estimated parameters to find the fitted values, > "fitted.wrong", the first equation returns a much lower R2 value, which I > would expect since the fit is worse, but the other lines return the same R2 > that I get when using the OLS fitted values. 
> > 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2) > cor(x=cars$dist,y=fitted.wrong)^2 > (sum((cars$dist-mean(cars$dist))*(fitted.wrong-mean(fitted.wrong)))/(49*sd(cars$dist)*sd(fitted.wrong)))^2 > > > I'm sure I'm missing something simple, but can someone explain the difference > between these two methods of finding R2? Thanks. > > Jon
[R] Correlation question
I recently compared two different approaches to calculating the correlation of two variables, and I cannot explain the different results: data(cars) model <- lm(dist~speed,data=cars) coef(model) fitted.right <- model$fitted fitted.wrong <- -17+5*cars$speed When using the OLS fitted values, the lines below all return the same R2 value: 1-sum((cars$dist-fitted.right)^2)/sum((cars$dist-mean(cars$dist))^2) cor(cars$dist,fitted.right)^2 (sum((cars$dist-mean(cars$dist))*(fitted.right-mean(fitted.right)))/(49*sd(cars$dist)*sd(fitted.right)))^2 However, when I use my estimated parameters to find the fitted values, "fitted.wrong", the first equation returns a much lower R2 value, which I would expect since the fit is worse, but the other lines return the same R2 that I get when using the OLS fitted values. 1-sum((cars$dist-fitted.wrong)^2)/sum((cars$dist-mean(cars$dist))^2) cor(x=cars$dist,y=fitted.wrong)^2 (sum((cars$dist-mean(cars$dist))*(fitted.wrong-mean(fitted.wrong)))/(49*sd(cars$dist)*sd(fitted.wrong)))^2 I'm sure I'm missing something simple, but can someone explain the difference between these two methods of finding R2? Thanks. Jon
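A compact demonstration of the point made in the replies: `cor()` is invariant under affine transformations of either argument, so any linear function of `speed` gives the same squared correlation, whereas the 1 - SSE/SST formula equals `cor()^2` only for the least-squares fit because it also penalizes miscalibration:

```r
data(cars)
model <- lm(dist ~ speed, data = cars)
fitted.right <- fitted(model)
fitted.wrong <- -17 + 5 * cars$speed  # a different linear function of speed

# cor() is location/scale invariant, so both fitted series agree:
r2.cor.right <- cor(cars$dist, fitted.right)^2
r2.cor.wrong <- cor(cars$dist, fitted.wrong)^2

# 1 - SSE/SST matches cor^2 only for the OLS fit; the miscalibrated
# predictions have a larger SSE, so this measure drops:
sst <- sum((cars$dist - mean(cars$dist))^2)
r2.sse.right <- 1 - sum((cars$dist - fitted.right)^2) / sst
r2.sse.wrong <- 1 - sum((cars$dist - fitted.wrong)^2) / sst
```

In other words, `cor()^2` measures how well the predictions *could* fit after an optimal linear recalibration; 1 - SSE/SST measures how well they fit as-is.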
[R] prediction intervals for robust regression
I have created robust regression models using least trimmed squares and MM-regression (using the R package robustbase). I am now looking to create prediction intervals for the predicted results. While I have seen some discussion in the literature about confidence intervals on the estimates for robust regression, I haven't had much success in finding out how to create prediction intervals for the results. I was wondering if anyone would be able to provide some direction on how to create these prediction intervals in the robust regression setting. Thanks, Jonathan Burns Sr. Statistician General Dynamics Information Technology Medicare & Medicaid Solutions One West Pennsylvania Avenue Baltimore, MD 21204 (410)-842-1594 jonathan.bur...@gdit.com www.gdit.com
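One common approach (by no means the only one) is a residual-resampling bootstrap. The sketch below uses `MASS::rlm()` as a stand-in for `robustbase::lmrob()`, since MASS ships with R; the same recipe applies to any robust fitter. The choice of `cars`, the prediction point, and the number of replicates are all illustrative assumptions.

```r
# Residual-resampling bootstrap prediction interval for a robust fit.
# MASS::rlm() stands in for robustbase::lmrob(); data and settings are
# illustrative only.
library(MASS)
set.seed(42)
fit   <- rlm(dist ~ speed, data = cars, method = "MM")
newpt <- data.frame(speed = 21)

res <- residuals(fit)
B   <- 200
boot_pred <- replicate(B, {
  # refit on a perturbed response (fitted values + resampled residuals)
  dd <- data.frame(speed = cars$speed,
                   dist  = fitted(fit) + sample(res, replace = TRUE))
  f_star <- rlm(dist ~ speed, data = dd, method = "MM")
  # predicted mean from the refit plus a fresh resampled residual
  # approximates the distribution of a NEW observation at newpt:
  predict(f_star, newpt) + sample(res, 1)
})
pi95 <- quantile(boot_pred, c(0.025, 0.975))
```

This captures both parameter uncertainty (via the refits) and residual scatter (via the added residual), but it inherits the usual caveats of residual bootstraps under heteroscedasticity.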
Re: [R] Using PCA to filter a series
This is exactly what I was looking for. Thank you. Jonathan Thayn On Oct 3, 2014, at 10:32 AM, David L Carlson wrote: > You can reconstruct the data from the first component. Here's an example > using singular value decomposition on the original data matrix: > >> d <- cbind(d1, d2, d3, d4) >> d.svd <- svd(d) >> new <- d.svd$u[,1] * d.svd$d[1] > > new is basically your cp1. If we multiply it by each of the loadings, we can > create reconstructed values based on the first component: > >> dnew <- sapply(d.svd$v[,1], function(x) new * x) >> round(head(dnew), 1) > [,1] [,2] [,3] [,4] > [1,] 119.3 134.1 135.7 134.6 > [2,] 104.2 117.2 118.6 117.6 > [3,] 109.7 123.3 124.8 123.8 > [4,] 109.3 122.9 124.3 123.3 > [5,] 105.8 119.0 120.4 119.4 > [6,] 111.5 125.4 126.9 125.8 >> head(d) > d1 d2 d3 d4 > [1,] 113 138 138 134 > [2,] 108 115 120 115 > [3,] 105 127 129 120 > [4,] 103 127 129 120 > [5,] 109 119 120 117 > [6,] 115 126 126 123 > >> diag(cor(d, dnew)) > [1] 0.9233742 0.9921703 0.9890085 0.9910287 > > Since you want a single variable to stand for all four, you could scale new > to the mean: > >> newd <- new*mean(d.svd$v[,1]) >> head(newd) > [1] 130.9300 114.3972 120.3884 119.9340 116.1588 122.3983 > > ----- > David L Carlson > Department of Anthropology > Texas A&M University > College Station, TX 77840-4352 > > > > -Original Message- > From: Jonathan Thayn [mailto:jth...@ilstu.edu] > Sent: Thursday, October 2, 2014 11:11 PM > To: David L Carlson > Cc: r-help@r-project.org > Subject: Re: [R] Using PCA to filter a series > > I suppose I could calculate the eigenvectors directly and not worry about > centering the time-series, since they essentially the same range to begin > with: > > vec <- eigen(cor(cbind(d1,d2,d3,d4)))$vector > cp <- cbind(d1,d2,d3,d4)%*%vec > cp1 <- cp[,1] > > I guess there is no way to reconstruct the original input data using just the > first component, though, is there? 
Not the original data in its entirety, just > one time-series that is representative of the general pattern. Possibly > something like the following, but with just the first component: > > o <- cp%*%solve(vec) > > Thanks for your help. It's been a long time since I've played with PCA. > > Jonathan Thayn > > > > > On Oct 2, 2014, at 4:59 PM, David L Carlson wrote: > >> I think you want to convert your principal component to the same scale as >> d1, d2, d3, and d4. But the "original space" is a 4-dimensional space in >> which d1, d2, d3, and d4 are the axes, each with its own mean and standard >> deviation. Here are a couple of possibilities >> >> # plot original values for comparison >>> matplot(cbind(d1, d2, d3, d4), pch=20, col=2:5) >> # standardize the pc scores to the grand mean and sd >>> new1 <- scale(pca$scores[,1])*sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, >>> d4)) >>> lines(new1) >> # Use least squares regression to predict the row means for the original >> four variables >>> new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4))~pca$scores[,1])) >>> lines(new2, col="red") >> >> ----- >> David L Carlson >> Department of Anthropology >> Texas A&M University >> College Station, TX 77840-4352 >> >> >> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf Of Don McKenzie >> Sent: Thursday, October 2, 2014 4:39 PM >> To: Jonathan Thayn >> Cc: r-help@r-project.org >> Subject: Re: [R] Using PCA to filter a series >> >> >> On Oct 2, 2014, at 2:29 PM, Jonathan Thayn wrote: >> >>> Hi Don. I would like to "de-rotate" the first component back to its >>> original state so that it aligns with the original time-series. My goal is >>> to create a "cleaned", or a "model" time-series from which noise has been >>> removed. >> >> Please cc the list with replies. It's considered courtesy plus you'll get >> more help that way than just from me. >> >> Your goal sounds almost metaphorical, at least to me. 
Your first axis >> "aligns" with the original time series already in that it captures the >> dominant variation >> across all four. Beyond that, there are many approaches to signal/noise >> relations within time-series analysis. I am not a good source of help on >> these, and you probably need a statistical consult
Re: [R] Using PCA to filter a series
I suppose I could calculate the eigenvectors directly and not worry about centering the time-series, since they essentially cover the same range to begin with: vec <- eigen(cor(cbind(d1,d2,d3,d4)))$vector cp <- cbind(d1,d2,d3,d4)%*%vec cp1 <- cp[,1] I guess there is no way to reconstruct the original input data using just the first component, though, is there? Not the original data in its entirety, just one time-series that is representative of the general pattern. Possibly something like the following, but with just the first component: o <- cp%*%solve(vec) Thanks for your help. It's been a long time since I've played with PCA. Jonathan Thayn On Oct 2, 2014, at 4:59 PM, David L Carlson wrote: > I think you want to convert your principal component to the same scale as d1, > d2, d3, and d4. But the "original space" is a 4-dimensional space in which > d1, d2, d3, and d4 are the axes, each with its own mean and standard > deviation. Here are a couple of possibilities > > # plot original values for comparison >> matplot(cbind(d1, d2, d3, d4), pch=20, col=2:5) > # standardize the pc scores to the grand mean and sd >> new1 <- scale(pca$scores[,1])*sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, d4)) >> lines(new1) > # Use least squares regression to predict the row means for the original four > variables >> new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4))~pca$scores[,1])) >> lines(new2, col="red") > > - > David L Carlson > Department of Anthropology > Texas A&M University > College Station, TX 77840-4352 > > > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Don McKenzie > Sent: Thursday, October 2, 2014 4:39 PM > To: Jonathan Thayn > Cc: r-help@r-project.org > Subject: Re: [R] Using PCA to filter a series > > > On Oct 2, 2014, at 2:29 PM, Jonathan Thayn wrote: > >> Hi Don. I would like to "de-rotate" the first component back to its original >> state so that it aligns with the original time-series. 
My goal is to create >> a "cleaned", or a "model" time-series from which noise has been removed. > > Please cc the list with replies. It's considered courtesy plus you'll get > more help that way than just from me. > > Your goal sounds almost metaphorical, at least to me. Your first axis > "aligns" with the original time series already in that it captures the > dominant variation > across all four. Beyond that, there are many approaches to signal/noise > relations within time-series analysis. I am not a good source of help on > these, and you probably need a statistical consult (locally?), which is not > the function of this list. > >> >> >> Jonathan Thayn >> >> >> >> On Oct 2, 2014, at 2:33 PM, Don McKenzie wrote: >>> >>> On Oct 2, 2014, at 12:18 PM, Jonathan Thayn wrote: >>> >>>> I have four time-series of similar data. I would like to combine these >>>> into a single, clean time-series. I could simply find the mean of each >>>> time period, but I think that using principal components analysis should >>>> extract the most salient pattern and ignore some of the noise. 
I can >>>> compute components using princomp >>>> >>>> >>>> d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, >>>> 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, >>>> 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, >>>> 114, 114, 107, 108, 108, 131, 131, 117, 113) >>>> d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, >>>> 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, >>>> 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, >>>> 147, 147, 124, 124, 120, 128, 141, 141, 138) >>>> d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, >>>> 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, >>>> 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, >>>> 156, 156, 125, 125, 115, 139, 140, 140, 138) >>>> d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, >>>> 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, >>>> 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, >>>> 149, 149, 124, 124, 119, 131, 135, 135, 134) >>>> >>>> >>>>
[R] Using PCA to filter a series
I have four time-series of similar data. I would like to combine these into a single, clean time-series. I could simply find the mean of each time period, but I think that using principal components analysis should extract the most salient pattern and ignore some of the noise. I can compute components using princomp d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113) d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138) d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138) d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134) pca <- princomp(cbind(d1,d2,d3,d4)) plot(pca$scores[,1]) This seems to have created the clean pattern I want, but I would like to project the first component back into the original axes. Is there a simple way to do that? Jonathan B. Thayn
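The projection asked for here can be written directly from what `princomp()` returns: `scores %*% t(loadings)` rotates the components back into the original axes, and adding `pca$center` undoes the centering. The sketch below uses simulated series standing in for d1..d4 so it runs standalone; the rank-1 version is the "denoised" reconstruction.

```r
# Reconstructing data from princomp() components, back in the original axes.
# Simulated series stand in for the d1..d4 vectors from the post.
set.seed(1)
signal <- 120 + 10 * sin(seq(0, 2 * pi, length.out = 52))  # shared pattern
d <- sapply(1:4, function(i) signal + rnorm(52, sd = 3))   # four noisy copies

pca     <- princomp(d)
centers <- matrix(pca$center, nrow(d), ncol(d), byrow = TRUE)

# Sanity check: all components together reproduce the data exactly.
full <- pca$scores %*% t(unclass(pca$loadings)) + centers

# Keeping only component 1 gives a rank-1, denoised version of each series:
rank1 <- outer(pca$scores[, 1], unclass(pca$loadings)[, 1]) + centers
```

Note that the arbitrary sign of the first eigenvector cancels in the outer product, so `rank1` is well defined either way.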
Re: [R] Building R for better performance
All, I've attached the actual benchmark that TACC and I used. I've also attached a paper I wrote covering this in a little more detail. The paper specifies the hardware configuration I used. Let me know if you have any other questions. Regards, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On Behalf Of Henrik Bengtsson Sent: Thursday, September 11, 2014 9:18 AM To: Anspach, Jonathan P Cc: arnaud gaboury; r-help@r-project.org Subject: Re: [R] Building R for better performance You'll find R-benchmark-25.R, which I assume is the same and the proper pointer to use, at http://r.research.att.com/benchmarks/ Henrik I'm out of the office today, but will resend it tomorrow. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 3:49 AM, "arnaud gaboury" <arnaud.gabo...@gmail.com> wrote: >>> I got the benchmark script, which I've attached, from Texas Advanced >>> Computing Center. Here are my results (elapsed times, in secs): > > > Where can we get the benchmark script?
Re: [R] Building R for better performance
Yes, that's the original. Then TACC increased the matrix sizes for their tests. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 9:18 AM, "Henrik Bengtsson" <h...@biostat.ucsf.edu> wrote: You'll find R-benchmark-25.R, which I assume is the same and the proper pointer to use, at http://r.research.att.com/benchmarks/ Henrik I'm out of the office today, but will resend it tomorrow. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 3:49 AM, "arnaud gaboury" <arnaud.gabo...@gmail.com> wrote: >>> I got the benchmark script, which I've attached, from Texas Advanced >>> Computing Center. Here are my results (elapsed times, in secs): > > > Where can we get the benchmark script?
Re: [R] Building R for better performance
I'm out of the office today, but will resend it tomorrow. Jonathan Anspach Intel Corp. Sent from my mobile phone. On Sep 11, 2014, at 3:49 AM, "arnaud gaboury" wrote: >>> I got the benchmark script, which I've attached, from Texas Advanced >>> Computing Center. Here are my results (elapsed times, in secs): > > > Where can we get the benchmark script?
[R] table over a matrix dimension...
R-helpers: I'm trying to determine the frequency of characters for a matrix applied to a single dimension, and generate a matrix as an output. I've come up with a solution, but it appears inelegant -- I was wondering if there is an easier way to accomplish this task: # Create a matrix of "factors" (characters): random_characters=matrix(sample(letters[1:4],1000,replace=TRUE),100,10) # Applying the table() function doesn't work properly, because not all rows # have ALL of the factors, so I get a list output: apply(random_characters,1,table) # Hacked solution: unique_values = letters[1:4] countsmatrix <- t(apply(random_characters,1,function(x,unique_values) { counts=vector(length=length(unique_values)) for(i in seq(unique_values)) { counts[i] = sum(x==unique_values[i]) } return(counts) }, unique_values=unique_values )) # Gets me the output I want but requires two nested loops (apply and for() ), so # not efficient for very large datasets. ### Is there a more elegant solution to this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
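One standard fix for the "ragged table()" problem described above (a sketch; it assumes the level set is known up front, as it is here): coerce each row to a factor with an explicit level set, so `table()` always returns a count, possibly zero, for every level and `apply()` can simplify to a matrix.

```r
set.seed(7)
random_characters <- matrix(sample(letters[1:4], 1000, replace = TRUE), 100, 10)

# factor(x, levels = ...) forces table() to report every level, so each
# row yields a length-4 named vector and apply() binds them into a matrix:
countsmatrix <- t(apply(random_characters, 1,
                        function(x) table(factor(x, levels = letters[1:4]))))
```

Each row of `countsmatrix` sums to the row length (10 here), with columns named a through d.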
[R] StatET and R 3.1.0
R-helpers: I posted a message to the statet listserv, but I thought I'd ask here as well since it is one of the major R developer environments-- has anyone gotten the StatET plugin for Eclipse working with R 3.1.0 yet? Any tricks? I did manage to get rj updated to 2.0 via: install.packages(c("rj", "rj.gd"), repos="http://download.walware.de/rj-2.0", type="source") But the plugin is throwing an error (using the last maintenance update at http://download.walware.de/eclipse-4.3/testing): Launching the R Console was cancelled, because it seems starting the R engine failed. Please make sure that R package 'rj' (1.1 or compatible) is installed and that the R library paths are set correctly for the R environment configuration 'R'. --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] Ignore escape characters in a string...
Thanks all, I'll try some of these suggestions out but it seems like a raw string ability could come in helpful -- there aren't any packages out there that have this capability? --j On Tue, Apr 8, 2014 at 1:23 PM, Jeff Newmiller wrote: > What is wrong with > > winpath <- readLines("clipboard") > > ? > > If you want to show that as a literal in your code, then don't bother > assigning it to a variable, but let it echo to output and copy THAT and put > it in your source code. > > There is also file.choose()... > > --- > Jeff Newmiller > Research Engineer (Solar/Batteries/Software/Embedded Controllers) > --- > Sent from my phone. Please excuse my brevity. > > On April 8, 2014 8:00:03 AM PDT, Jonathan Greenberg wrote: >>R-helpers: >> >>One of the minor irritations I have is copying paths from Windows >>explorer, which look like: >> >>C:\Program Files\R\R-3.0.3 >> >>and using them in a setwd() statement, since the "\" is, of course, >>interpreted as an escape character. I have to, at present, manually >>add in the double slashes or reverse them. >> >>So, I'd like to write a quick function that takes this path: >> >>winpath <- "C:\Program Files\R\R-3.0.3" >> >>and converts it to a ready-to-go R path -- is there a way to have R >>IGNORE escape characters in a character vector? >> >>Alternatively, is there some trick to using a copy/paste from Windows >>explorer I'm not aware of? >> >>--j > -- Jonathan A. 
Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Ignore escape characters in a string...
R-helpers: One of the minor irritations I have is copying paths from Windows explorer, which look like: C:\Program Files\R\R-3.0.3 and using them in a setwd() statement, since the "\" is, of course, interpreted as an escape character. I have to, at present, manually add in the double slashes or reverse them. So, I'd like to write a quick function that takes this path: winpath <- "C:\Program Files\R\R-3.0.3" and converts it to a ready-to-go R path -- is there a way to have R IGNORE escape characters in a character vector? Alternatively, is there some trick to using a copy/paste from Windows explorer I'm not aware of? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
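For the record, R itself later addressed the "raw string ability" wished for in this thread: R 4.0.0 (well after this 2014 exchange) added raw string literals, and `chartr()` converts backslashes to the forward slashes R accepts on every platform. A sketch:

```r
# Since R 4.0.0, raw string literals pass backslashes through verbatim,
# so a path pasted from Windows Explorer needs no escaping:
winpath <- r"(C:\Program Files\R\R-3.0.3)"

# Convert to forward slashes, which R accepts on all platforms:
posixpath <- chartr("\\", "/", winpath)

# Interactively, scan() also reads a pasted line with no escape
# processing (uncomment and paste the path at the prompt):
# winpath <- scan(what = "character", n = 1, sep = "\n", quiet = TRUE)
```

On R older than 4.0.0 the raw-string line is a parse error, so the `scan()` route (or `readLines("clipboard")` on Windows, as suggested in the reply) was the workaround at the time.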
[R] numeric to factor via lookup table
R-helpers: Hopefully this is an easy one. Given a lookup table: mylevels <- data.frame(ID=1:10,code=letters[1:10]) And a set of values (note these do not completely cover the mylevels range): values <- c(1,2,5,5,10) How do I convert values to a factor object, using the mylevels to define the correct levels (ID matches the values), and code is the label? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
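This is exactly what the `levels` and `labels` arguments of `factor()` are for: `levels` declares the valid IDs and `labels` renames them, and all ten levels are retained even though `values` only uses a few of them.

```r
mylevels <- data.frame(ID = 1:10, code = letters[1:10])
values   <- c(1, 2, 5, 5, 10)

# levels= maps each value to its ID; labels= relabels with the codes.
# Unused IDs stay in the level set, as the poster wants.
f <- factor(values, levels = mylevels$ID, labels = mylevels$code)
```

Values outside the `levels` set would become `NA`, which is usually the desired behavior for a lookup.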
[R] Determine breaks based on a break type...
R-helpers: I was wondering, given a vector of data, if there is a way to calculate the break points based on the breaks= parameter from histogram, but skipping all the other calculations (all I want is the breakpoints, not the frequencies). I can, of course, simply run the histogram and extract the breaks component: mybreaks <- hist(runif(100))$breaks But is there a faster way to do this, if this is all I want? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
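Two incremental improvements are possible here. `hist(..., plot = FALSE)` skips the drawing but still tabulates counts; to skip the counting too, call `pretty()` with the same arguments `hist()` uses internally for its default "Sturges" breaks (this mirrors the code path in `hist.default`):

```r
set.seed(123)
x <- runif(100)

# plot = FALSE avoids drawing but still counts:
b.hist <- hist(x, plot = FALSE)$breaks

# Calling pretty() directly computes only the break points; these are
# the same arguments hist.default passes for its default breaks:
b.direct <- pretty(range(x), n = nclass.Sturges(x), min.n = 1)
```

For other break rules, `nclass.scott()` and `nclass.FD()` play the same role as `nclass.Sturges()`.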
Re: [R] Overriding predict based on newdata...
David: Thanks! Is it generally frowned upon (if I'm incorporating this into a package) to "override" a generic function like "predict", even if I plan on making it a pass-through function (same parameters, and if the data type doesn't match my "weird" data type, it will simply pass the parameters through to the generic S3 "predict")? --j On Mon, Mar 17, 2014 at 4:08 AM, David Winsemius wrote: > S3 classes only dispatch on the basis of the first parameter class. That was > one of the reasons for the development of S4-classed objects. You say you > have the expectation that the object is of a class that has an ordinary > `predict` method, presumably S3 in character, so you probably need to write a > function that will mask the existing method. You would rewrite the existing > test for the existence of 'newdata' and the definition of the new > function would persist through the rest of the session and could be > source()-ed in further sessions. > > -- > David. > > > On Mar 16, 2014, at 2:09 PM, Jonathan Greenberg wrote: > >> R-helpers: >> >> I'm having some trouble with this one -- I figure because I'm a bit of >> a noob with S3 classes... Here's my challenge: I want to write a >> custom predict statement that is triggered based on the presence and >> class of a *newdata* parameter (not the "object" parameter). The >> reason is I am trying to write a custom function based on an oddly >> formatted dataset that has been assigned an R class. If the predict >> function "detects" it (class(newdata) == "myweirdformat") it does a >> conversion of the newdata to what most predict statements expect (e.g. >> a dataframe) and then passes the converted dataset along to the >> generic predict statement. If newdata is missing or is not of the odd >> class it should just pass everything along to the generic predict as >> usual. >> >> What would be the best way to approach this problem? 
Since (my >> understanding) is that predict is dispatched based on the object >> parameter, this is causing me confusion -- my object should still >> remain the model, I'm just allowing a new data type to be fed into the >> predict model(s). >> >> Cheers! >> >> --j >> >> -- >> Jonathan A. Greenberg, PhD >> Assistant Professor >> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory >> Department of Geography and Geographic Information Science >> University of Illinois at Urbana-Champaign >> 259 Computing Applications Building, MC-150 >> 605 East Springfield Avenue >> Champaign, IL 61820-6371 >> Phone: 217-300-1924 >> http://www.geog.illinois.edu/~jgrn/ >> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 > > David Winsemius > Alameda, CA, USA > -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Overriding predict based on newdata...
R-helpers: I'm having some trouble with this one -- I figure because I'm a bit of a noob with S3 classes... Here's my challenge: I want to write a custom predict statement that is triggered based on the presence and class of a *newdata* parameter (not the "object" parameter). The reason is I am trying to write a custom function based on an oddly formatted dataset that has been assigned an R class. If the predict function "detects" it (class(newdata) == "myweirdformat") it does a conversion of the newdata to what most predict statements expect (e.g. a dataframe) and then passes the converted dataset along to the generic predict statement. If newdata is missing or is not of the odd class it should just pass everything along to the generic predict as usual. What would be the best way to approach this problem? Since (my understanding) is that predict is dispatched based on the object parameter, this is causing me confusion -- my object should still remain the model, I'm just allowing a new data type to be fed into the predict model(s). Cheers! --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
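A sketch of the masking-wrapper idea discussed in the replies: convert a recognized `newdata` up front, then fall through to the normal S3 dispatch on the model object. The wrapper name `predict_any`, the class name `myweirdformat`, and the converter are all illustrative; a package author would more likely keep the conversion in an `as.data.frame` method for the odd class rather than mask `predict` itself.

```r
# Wrapper that intercepts a recognized newdata class, converts it, and
# delegates to the ordinary predict() dispatch on `object`.
predict_any <- function(object, newdata, ...) {
  if (!missing(newdata) && inherits(newdata, "myweirdformat")) {
    newdata <- as.data.frame(unclass(newdata))  # hypothetical conversion
  }
  if (missing(newdata)) predict(object, ...)
  else predict(object, newdata = newdata, ...)
}

# Toy check with an lm() model and a fake "myweirdformat" object:
m  <- lm(dist ~ speed, data = cars)
nd <- structure(list(speed = c(5, 10)), class = "myweirdformat")
p  <- predict_any(m, nd)
```

Because the wrapper ends by calling the generic `predict()`, every model class that already has a predict method keeps working unchanged.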
Re: [R] Building R for better performance
CE, Sorry for the delay. I haven't installed any additional packages, so I don't know the answer to your question. Let me look into it and get back to you. Regards, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 -Original Message- From: ce [mailto:zadi...@excite.com] Sent: Wednesday, March 05, 2014 8:54 PM To: r-help@r-project.org; Anspach, Jonathan P Subject: Re: [R] Building R for better performance Hi Jonathan, I think most people would be interested in such a tool, because the main complaint about R is its slowness for some operations and big data. Even though the Intel software is paid, I could install it for free since I am not selling any software and work for a non-profit. I compiled it successfully on my openSUSE. My question is: after make install, do I need to give special options to install.packages, or will they be compiled with icc automatically? Regards CE -Original Message- From: "Anspach, Jonathan P" [jonathan.p.ansp...@intel.com] Date: 03/05/2014 12:28 AM To: "r-help@r-project.org" Subject: [R] Building R for better performance Greetings, I'm a software engineer with Intel. Recently I've been investigating R performance on Intel Xeon and Xeon Phi processors and RH Linux. I've also compared the performance of R built with the Intel compilers and Intel Math Kernel Library to a "default" build (no config options) that uses the GNU compilers. To my dismay, I've found that the GNU build always runs on a single CPU core, even during matrix operations. The Intel build runs matrix operations on multiple cores, so it is much faster on those operations. Running the benchmark-2.5 on a 24 core Xeon system, the Intel build is 13x faster than the GNU build (21 seconds vs 275 seconds). Unfortunately, this advantage is not documented anywhere that I can see. Building with the Intel tools is very easy. Assuming the tools are installed in /opt/intel/composerxe, the process is simply (in bash shell): $ . 
/opt/intel/composerxe/bin/compilervars.sh intel64 $ ./configure --with-blas="-L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" --with-lapack CC=icc CFLAGS=-O2 CXX=icpc CXXFLAGS=-O2 F77=ifort FFLAGS=-O2 FC=ifort FCFLAGS=-O2 $ make $ make check My questions are: 1) Do most system admins and/or R installers know about this performance difference, and use the Intel tools to build R? 2) Can we add information on the advantage of building with the Intel tools, and how to do it, to the installation instructions and FAQ? I can post my data if anyone is interested. Thanks, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
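Since the performance gap above comes almost entirely from the BLAS used for matrix operations, a quick sanity check from within R is to time a single dense cross-product: on a build linked against a multithreaded BLAS (such as MKL) this should be dramatically faster than on a default build with the reference BLAS. A minimal sketch:

```r
# Time one BLAS-heavy operation to compare builds; matrix size is arbitrary.
set.seed(42)
n <- 500
a <- matrix(rnorm(n * n), n, n)
t_cross <- system.time(b <- crossprod(a))["elapsed"]  # b = t(a) %*% a via BLAS
cat("crossprod of a", n, "x", n, "matrix took", t_cross, "seconds\n")
```

On recent R versions, `sessionInfo()` also reports which BLAS/LAPACK libraries the running build is actually using.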
Re: [R] Building R for better performance
Simon, Thanks for the information and links. First of all, did you ever resolve your problem? If not, did you file an issue in Intel Premier Support? That's the best way to bring it to our attention. If you don't want to do that I can try to get a compiler or MKL support engineer to look at your Intel Developer Zone discussion. I have no experience with OS X, so I wouldn't be much help. I got the benchmark script, which I've attached, from Texas Advanced Computing Center. Here are my results (elapsed times, in secs):

                                                       gcc build (default)  icc/MKL build
Creation, transp., deformation of a 5000x5000 matrix          3.25               2.95
5000x5000 normal distributed random matrix ^1000              5.13               1.52
Sorting of 14,000,000 random values                           1.61               1.64
5600x5600 cross-product matrix (b = a' * a)                  97.44               0.56
Linear regr. over a 4000x4000 matrix (c = a \ b')            46.06               0.49
FFT over 4,800,000 random values                              0.65               0.61
Eigenvalues of a 1200x1200 random matrix                      5.55               1.37
Determinant of a 5000x5000 random matrix                     34.18               0.55
Cholesky decomposition of a 6000x6000 matrix                 37.07               0.47
Inverse of a 3200x3200 random matrix                         29.49               0.57
3,500,000 Fibonacci numbers calculation (vector calc)         1.31               0.38
Creation of a 6000x6000 Hilbert matrix (matrix calc)          0.77               0.99
Grand common divisors of 400,000 pairs (recursion)            0.63               0.56
Creation of a 1000x1000 Toeplitz matrix (loops)               2.24               2.34
Escoufier's method on a 90x90 matrix (mixed)                  9.55               6.02
Total                                                       274.93              21.01

Regards, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 -Original Message- From: Simon Zehnder [mailto:szehn...@uni-bonn.de] Sent: Wednesday, March 05, 2014 3:55 AM To: Anspach, Jonathan P Cc: r-help@r-project.org Subject: Re: [R] Building R for better performance Jonathan, I myself tried something like this - comparing gcc, clang and intel on a Mac. 
From my experiences in HPC on the university cluster (where we also use the Xeon Phi, Landeshochleistungscluster University RWTH Aachen), the Intel compiler has better code optimization in regard to vectorisation, etc. (clang is up to now suffering from a not yet implemented OpenMP library). Here is a revolutionanalytics article about this topic: http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html As I usually use the Rcpp package for C++ extensions this could give me further performance. Though, I already failed when trying to compile R with the Intel compiler and linking against the MKL (see my topic in the Intel developer zone: http://software.intel.com/en-us/comment/1767418 and my threads on the R-User list: https://stat.ethz.ch/pipermail/r-sig-mac/2013-November/010472.html). So, to your questions: 1) I think that most admins do not even use the Intel compiler to compile R - this seems to me rare. There are some people I know they do and I think they could be aware of it - but these are only a few. As R is growing in usage and I do know from regional user meetings that very large companies start using it in their BI units - this should be of interest. 2) I would really welcome this step because compilation with intel (especially on a Mac) and linking to the MKL seems to be delicate. I am interested in the data - so if it is possible send it via the list or directly to my account. Fur
[R] Building R for better performance
Greetings, I'm a software engineer with Intel. Recently I've been investigating R performance on Intel Xeon and Xeon Phi processors and RH Linux. I've also compared the performance of R built with the Intel compilers and Intel Math Kernel Library to a "default" build (no config options) that uses the GNU compilers. To my dismay, I've found that the GNU build always runs on a single CPU core, even during matrix operations. The Intel build runs matrix operations on multiple cores, so it is much faster on those operations. Running the benchmark-2.5 on a 24 core Xeon system, the Intel build is 13x faster than the GNU build (21 seconds vs 275 seconds). Unfortunately, this advantage is not documented anywhere that I can see. Building with the Intel tools is very easy. Assuming the tools are installed in /opt/intel/composerxe, the process is simply (in bash shell): $ . /opt/intel/composerxe/bin/compilervars.sh intel64 $ ./configure --with-blas="-L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" --with-lapack CC=icc CFLAGS=-O2 CXX=icpc CXXFLAGS=-O2 F77=ifort FFLAGS=-O2 FC=ifort FCFLAGS=-O2 $ make $ make check My questions are: 1) Do most system admins and/or R installers know about this performance difference, and use the Intel tools to build R? 2) Can we add information on the advantage of building with the Intel tools, and how to do it, to the installation instructions and FAQ? I can post my data if anyone is interested. Thanks, Jonathan Anspach Sr. Software Engineer Intel Corp. jonathan.p.ansp...@intel.com 713-751-9460 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] install packages from R-forge SVN
R-helpers: I was curious if anyone developed a package/approach to installing packages directly from the R-forge SVN subsystem (rather than waiting for it to build)? I can, of course, SVN it command line but I was hoping for an install.packages("svn://") sort of approach. Cheers! --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
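R-Forge does expose its nightly builds through an ordinary package repository, so there is no need for an `install.packages("svn://...")` mechanism; you can point the normal machinery at the R-Forge repository URL. A minimal sketch (the package name is a placeholder, and the call is wrapped in a helper purely for illustration):

```r
# Install a package straight from the R-Forge repository rather than CRAN.
# "somePackage" is a hypothetical placeholder name.
install_rforge <- function(pkg) {
  install.packages(pkg, repos = "http://R-Forge.R-project.org")
}

# The contrib URL the call would resolve to, for source packages:
src_url <- contrib.url("http://R-Forge.R-project.org", type = "source")
```

Note this installs the most recent *built* package, so it can still lag the SVN head slightly if a build has not yet run.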
[R] Checking for and adding "..." arguments to a function...
R-helpers: I'm guessing this is an easy one for some of you, but I'm a bit stumped. Given some arbitrary function (doesn't matter what it does): myfunction <- function(a,b,c) { return(a+b+c) } I want to test this function for the presence of the ellipses ("...") and, if they are missing, create a new function that has them: myfunction <- function(a,b,c,...) { return(a+b+c) } So, 1) how do I test for whether a function has an ellipses argument and, 2) how do I "append" the ellipses to the argument list if they do exist? Note that the test/modification should be done without invoking the function, e.g. I'm not asking how to test for this WITHIN the function, I'm asking how to test "myfunction" directly as an R object. Thanks! --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
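Both steps asked about above can be done by manipulating the function's formal argument list directly, without ever invoking the function: `formals()` answers question 1, and assigning an extended `formals()` (using `alist(... = )` for the valueless dots argument) answers question 2. A minimal sketch:

```r
# 1) Test whether a function object has a ... argument.
has_dots <- function(f) "..." %in% names(formals(f))

# 2) Return a copy of the function with ... appended if it was missing.
#    alist(... = ) creates the argument with no default value.
add_dots <- function(f) {
  if (!has_dots(f)) formals(f) <- c(formals(f), alist(... = ))
  f
}

myfunction <- function(a, b, c) a + b + c
myfunction2 <- add_dots(myfunction)   # now function(a, b, c, ...)
```

Because `add_dots()` works on the function as an R object, the body is untouched; the new `...` simply absorbs extra arguments the body never uses.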
Re: [R] Official way to set/retrieve options in packages?
Wanted to re-start this thread a bit, since I'm still not exactly sure the best approach to my problem -- basically, the parameters I'm try to make persistent are installation locations of a particular command line program that is not installed along with an R package I'm working on (GDAL, for those of you who are interested in the specifics). The function tries to dummy-proof this process by doing a (mostly) brute-force search of the user's drive for the program location the first time it executes, and then stores this information (the path to a given executable) in an option for use with other functions. This search process can take some time, so I'd prefer to have this option set in a semi-permanent way (so it persists between sessions). Now, Brian Ripley suggested modifying the .Rprofile, but Bert Guntner suggested this might not be a welcome behavior. Given that, on an operating system level, there are often per-program directories for preferences, would it follow that it might make sense to store package-options in some standardized location? If so, where might this be? Would it make sense to drop then in the package directory? Is this a discussion that should move over to r-developers? --j On Sat, Jun 1, 2013 at 4:57 PM, Prof Brian Ripley wrote: > On 01/06/2013 22:44, Anthony Damico wrote: >> hope this helps.. :) >> >> # define an object `x` >> x <- list( "any value here" , 10 ) >> >> # set `myoption` to that object >> options( "myoption" = x ) >> >> # retrieve it later (perhaps within a function elsewhere in the package) >> ( y <- getOption( myoption ) ) >> >> >> it's nice to name your options `mypackage.myoption` so users know what >> package the option is associated with in case they type `options()` >> >> >> here's the `.onLoad` function in the R survey package. 
notice how the >> options are only set *if* they don't already exist-- > > But a nicer convention is that used by most packages in R itself: if the > option is not set, the function using it assumes a suitable default. > That would make sense for all the FALSE defaults below. > > Note though that this is not 'persistent': users have to set options in > their startup files (see ?Startup). There is no official location to > store package configurations. Users generally dislike software saving > settings in their own file space so it seems very much preferable to use > the standard R mechanisms (.Rprofile etc). > >> >>> survey:::.onLoad >> >> function (...) >> { >> if (is.null(getOption("survey.lonely.psu"))) >> options(survey.lonely.psu = "fail") >> if (is.null(getOption("survey.ultimate.cluster"))) >> options(survey.ultimate.cluster = FALSE) >> if (is.null(getOption("survey.want.obsolete"))) >> options(survey.want.obsolete = FALSE) >> if (is.null(getOption("survey.adjust.domain.lonely"))) >> options(survey.adjust.domain.lonely = FALSE) >> if (is.null(getOption("survey.drop.replicates"))) >> options(survey.drop.replicates = TRUE) >> if (is.null(getOption("survey.multicore"))) >> options(survey.multicore = FALSE) >> if (is.null(getOption("survey.replicates.mse"))) >> options(survey.replicates.mse = FALSE) >> } >> >> >> >> >> >> On Sat, Jun 1, 2013 at 4:01 PM, Jonathan Greenberg wrote: >> >>> R-helpers: >>> >>> Say I'm developing a package that has a set of user-definable options that >>> I would like to be persistent across R-invocations (they are saved >>> someplace). Of course, I can create a little text file to be written/read, >>> but I was wondering if there is an "officially sanctioned" way to do this? >>> I see there is an options() and getOptions() function, but I'm unclear how >>> I would use this in my own package to create/save new options for my >>> particular package. Cheers! >>> >>> --j >>> >>> -- >>> Jonathan A. 
Greenberg, PhD >>> Assistant Professor >>> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory >>> Department of Geography and Geographic Information Science >>> University of Illinois at Urbana-Champaign >>> 607 South Mathews Avenue, MC 150 >>> Urbana, IL 61801 >>> Phone: 217-300-1924 >>> http://www.geog.illinois.edu/~jgrn/ >>> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 >>> >>>
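As a postscript to the thread above: modern R (>= 4.0) did later add a sanctioned per-package location via `tools::R_user_dir()`, which addresses exactly the "where do I persist a slow-to-compute setting" problem. A sketch of caching a discovered executable path between sessions; the package name `"mypkg"` and file name are placeholders:

```r
# Persist a single setting (e.g. a discovered GDAL path) between sessions
# using the per-package config directory R itself designates.
save_pkg_option <- function(value, pkg = "mypkg") {
  dir <- tools::R_user_dir(pkg, which = "config")
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  saveRDS(value, file.path(dir, "gdal_path.rds"))
}

load_pkg_option <- function(pkg = "mypkg") {
  f <- file.path(tools::R_user_dir(pkg, which = "config"), "gdal_path.rds")
  if (file.exists(f)) readRDS(f) else NULL   # NULL triggers a fresh search
}
```

This keeps the expensive drive scan to the first run only, without touching the user's `.Rprofile`.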
[R] Multimodal multidimensional optimization
Hello all I've been performing a series of multidimensional optimizations (3 variables) using the optim() function. Recently, I noticed that the solution is rarely unimodal. Is there a package or function that handles multimodal multidimensional optimizations? I really appreciate any suggestions, I'm quite a bit beyond my expertise here. Thanks. Jonathan B. Thayn, Ph.D. Ridgely Fellow of Geography Department of Geography Geology Illinois State University Felmley Hall of Science, Rm 200A Normal, IL 61790 jth...@ilstu.edu my.ilstu.edu/~jthayn [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
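A common first answer to the question above is a multi-start strategy: run `optim()` from many random starting points and keep the best (or all distinct) local optima. A minimal sketch on a deliberately bimodal 2-D test function (the function and start ranges are illustrative only):

```r
# Multi-start optim(): run from many random starts, keep the best result.
set.seed(1)
f <- function(p) (p[1]^2 - 1)^2 + p[2]^2   # two minima, at (+1, 0) and (-1, 0)

starts <- matrix(runif(40, -2, 2), ncol = 2)          # 20 random 2-D starts
fits   <- apply(starts, 1, function(s) optim(s, f))   # Nelder-Mead by default
best   <- fits[[which.min(sapply(fits, `[[`, "value"))]]
```

Inspecting the full `fits` list (rather than just `best`) reveals how many distinct modes the starts found, which is often the real question with multimodal problems.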
Re: [R] Error: C stack usage is too close to the limit when using list.files()
Thanks all -- ok, so the symbolic link issue is a distinct possibility, but fundamentally doesn't solve the issue since most users will have symbolic links on their machines SOMEPLACE, so a full drive scan will run into these issues -- is list.files calling find, or is it using a different algorithm? This seems like a shortcoming in the list.files algorithm -- is there a better solution (short of a System call, which I'm still not sure will work on Macs without Xcode -- a colleague of mine did NOT have Xcode, and reported not being able to run find from the command line) -- perhaps a different package? --j On Fri, Sep 27, 2013 at 3:08 PM, William Dunlap wrote: > Toss a couple of extra files in there and you will see the output grow > exponentially. > > % touch dir/IMPORTANT_1 dir/subdir/IMPORTANT_2 > > and in R those two new files cause 82 more strings to appear in list.file's > output: > >> nchar(list.files("dir", recursive=TRUE)) > [1] 11 18 33 40 55 62 77 84 99 106 121 128 143 150 165 172 187 194 > 209 > [20] 216 231 238 253 260 275 282 297 304 319 326 341 348 363 370 385 392 407 > 414 > [39] 429 436 451 458 473 480 495 502 517 524 539 546 561 568 583 590 605 612 > 627 > [58] 634 649 656 671 678 693 700 715 722 737 744 759 766 781 788 803 810 825 > 832 > [77] 847 854 869 876 891 898 901 > > 'find', by default, does not following symbolic links. > > % find dir > dir > dir/subdir > dir/subdir/IMPORTANT_2 > dir/subdir/linkToUpperDir > dir/IMPORTANT_1 > > The -L option makes it follow them, but it won't follow loops: > > % find -L dir > dir > dir/subdir > dir/subdir/IMPORTANT_2 > find: File system loop detected; `dir/subdir/linkToUpperDir' is part of the > same file system loop as `dir'. 
> dir/IMPORTANT_1 > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > >> -----Original Message- >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf >> Of William Dunlap >> Sent: Friday, September 27, 2013 12:56 PM >> To: Jonathan Greenberg; r-help >> Subject: Re: [R] Error: C stack usage is too close to the limit when using >> list.files() >> >> Do you have some symbolic links that make loops in your file system? >> list.files() has problems with such loops and find does not. E.g., on a >> Linux box: >> >> % cd /tmp >> % mkdir dir dir/subdir >> % cd dir/subdir >> % ln -s ../../dir linkToUpperDir >> % cd /tmp >> % R --quiet >> > list.files("dir", recursive=TRUE, full=TRUE) >> [1] >> "dir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToU >> pperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkT >> oUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/li >> nkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdi >> r/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/su >> bdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir >> /subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpper >> Dir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUp >> perDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkTo >> UpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir/subdir/lin >> kToUpperDir/subdir/linkToUpperDir/subdir/linkToUpperDir" >> > system("find dir") >> dir >> dir/subdir >> dir/subdir/linkToUpperDir >> >> Bill Dunlap >> Spotfire, TIBCO Software >> wdunlap tibco.com >> >> >> > -Original Message- >> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >> > On Behalf >> > Of Jonathan Greenberg >> > Sent: 
Friday, September 27, 2013 12:13 PM >> > To: r-help >> > Subject: [R] Error: C stack usage is too close to the limit when using >> > list.files() >> > >> > R-helpers: >> > >> > I'm running a file search on my entire drive (Mac OS X) using: >> > >> > files_found <- >> > list.files(dir="/",pattern=somepattern,recursive=TRUE,full.names=TRUE) >> > where somepattern is a search pattern (which I have confirmed via a >> > unix "find / -name somepattern" only returns ~ 3 results). >> > >> > I keep getting an error: >> > >> > Error: C stack usage is too close
Re: [R] Error: C stack usage is too close to the limit when using list.files()
Ben: I'd like to avoid using that (previous version of my code solved it in that way) -- I would like cross-platform compatibility and I am pretty sure, along with Windows, vanilla Macs don't come with "find" either unless XCode has been installed. Is the list.files() code itself recursive when using recursive=TRUE (so it has one recursion per bottom-folder)? --j P.S. I recognized that in my initial post I indicated using "dir" as the parameter -- it should have been "path" (the error occurred through the correct usage of list.files(path="/",...) That'll teach me not to copy/paste from my code... On Fri, Sep 27, 2013 at 2:36 PM, Ben Bolker wrote: > Jonathan Greenberg illinois.edu> writes: > >> >> R-helpers: >> >> I'm running a file search on my entire drive (Mac OS X) using: >> >> files_found <- > list.files(dir="/",pattern=somepattern,recursive=TRUE,full.names=TRUE) >> where somepattern is a search pattern (which I have confirmed via a >> unix "find / -name somepattern" only returns ~ 3 results). >> >> I keep getting an error: >> >> Error: C stack usage is too close to the limit >> >> when running this command. Any ideas on 1) how to fix this or 2) if >> there is an alternative to using list.files() to accomplish this >> search without resorting to an external package? > > I assuming that using > > system("find / -name somepattern") > > (possibly with intern=TRUE) isn't allowed? (I don't know what you're > trying to do, but if you don't need it to work on Windows-without-cygwin, > this should work across most Unix variants (although a "-print" might > be required) > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Jonathan A. 
Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error: C stack usage is too close to the limit when using list.files()
R-helpers: I'm running a file search on my entire drive (Mac OS X) using: files_found <- list.files(dir="/",pattern=somepattern,recursive=TRUE,full.names=TRUE) where somepattern is a search pattern (which I have confirmed via a unix "find / -name somepattern" only returns ~ 3 results). I keep getting an error: Error: C stack usage is too close to the limit when running this command. Any ideas on 1) how to fix this or 2) if there is an alternative to using list.files() to accomplish this search without resorting to an external package? Cheers! --jonathan -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 259 Computing Applications Building, MC-150 605 East Springfield Avenue Champaign, IL 61820-6371 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
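As the rest of this thread establishes, the crash comes from symlink loops that `list.files(recursive = TRUE)` follows forever, while `find` (without `-L`) does not. One pure-R workaround, sketched below and not a drop-in replacement for `list.files()`, is a recursive walker that tracks the normalized path of every directory already visited, so a symlink cycle terminates instead of recursing until the C stack is exhausted:

```r
# Loop-safe recursive file listing: remember normalized directories already
# seen on the current path so symlink cycles terminate.
list_files_safe <- function(path, pattern = NULL, seen = character()) {
  real <- normalizePath(path, mustWork = FALSE)  # resolves symlinks
  if (real %in% seen) return(character())        # cycle detected: stop here
  seen <- c(seen, real)

  entries <- list.files(path, full.names = TRUE)
  isdir   <- file.info(entries)$isdir %in% TRUE  # %in% TRUE treats NA as FALSE
  dirs    <- entries[isdir]
  files   <- entries[!isdir]
  if (!is.null(pattern)) files <- files[grepl(pattern, basename(files))]

  c(files, unlist(lapply(dirs, list_files_safe, pattern = pattern, seen = seen)))
}
```

This trades speed for safety (one `normalizePath()` per directory), which is usually acceptable for a whole-drive search that would otherwise crash.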
Re: [R] Confusing behaviour in data.table: unexpectedly changing variable
Thanks for your help, and sorry for mis-posting. JD On Wed, Sep 25, 2013 at 3:18 AM, Matthew Dowle wrote: > Very sorry to hear this bit you. If you need a copy of names before > changing them by reference : > oldnames <- copy(names(DT)) > This will be documented and it's on the bug list to do so. copy is needed in > other circumstances too, see ?copy. > More details here : > http://stackoverflow.com/questions/18662715/colnames-being-dropped-in-data-table-in-r > http://stackoverflow.com/questions/15913417/why-does-data-table-update-namesdt-by-reference-even-if-i-assign-to-another-v __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confusing behaviour in data.table: unexpectedly changing variable
I got bitten badly when a variable I created for the purpose of recording an old set of names changed when I didn't think I was going near it. I'm not sure if this is a desired behaviour, or documented, or warned about. I read the data.table intro and the FAQ, and also ?setnames. Ben Bolker created a minimal reproducible example: library(data.table) DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9) names(DT) ## [1] "x" "y" "v" oldnames <- names(DT) print(oldnames) ## [1] "x" "y" "v" setnames(DT, LETTERS[1:3]) print(oldnames) ## [1] "A" "B" "C" -- McMaster University Department of Biology http://lalashan.mcmaster.ca/theobio/DushoffLab/index.php/Main_Page https://twitter.com/jd_mathbio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R-3.0.1 g77 errors
Sadly, I am limited to the Solaris 10 system. I wish that I could use Linux, the world uses it. -- Jonathan M. Prigot Partners Healthcare Systems On Wed, 2013-09-18 at 17:20 +0200, Simon Zehnder wrote: > On my systems Linux Scientific and Mac OS X I use as well for the F77 the > gfortran compiler and this works. You could give it a trial. > > Best > > Simon > > On Sep 18, 2013, at 3:14 PM, "Prigot, Jonathan" wrote: > > > I am trying to build R-3.0.1 on our SPARC Solaris 10 system, but it > > fails part way through with g77 errors. Has anyone run into this? Any > > suggestions? For what it's worth, R-2.15.1 is the last one to build > > error free for us. > > === > > Jon Prigot > > > > R is now configured for sparc-sun-solaris2.10 > > > > Source directory: . > > Installation directory:/usr/local > > > > C compiler:gcc -std=gnu99 -g -O2 > > Fortran 77 compiler: g77 -g -O2 > > > > C++ compiler: g++ -g -O2 > > Fortran 90/95 compiler:gfortran > > Obj-C compiler: > > > > Interfaces supported: X11, tcltk > > External libraries:readline, ICU > > Additional capabilities: PNG, JPEG, TIFF, NLS > > Options enabled: shared BLAS, R profiling > > > > Recommended packages: yes > > > > make > > ... > > > > g77 -fPIC -g -O2 -ffloat-store -c dlamch.f -o dlamch.o > > dlamch.f: In function `dlamch': > > dlamch.f:89: warning: > > INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, > > ^ > > Reference to unimplemented intrinsic `DIGITS' at (^) (assumed EXTERNAL) > > dlamch.f:89: > > INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, > > ^ > > Invalid declaration of or reference to symbol `digits' at (^) [initially > > seen at (^)] > > dlamch.f:89: warning: > > INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, > > ^ > > Reference to unimplemented intrinsic `EPSILON' at (^) (assumed EXTERNAL) > > > > -- > > Jonathan M. 
Prigot > > Partners Healthcare Systems > > > > > > > > The information in this e-mail is intended only for th...{{dropped:18}} > > The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R-3.0.1 g77 errors
I am trying to build R-3.0.1 on our SPARC Solaris 10 system, but it fails part way through with g77 errors. Has anyone run into this? Any suggestions? For what it's worth, R-2.15.1 is the last one to build error free for us. === Jon Prigot R is now configured for sparc-sun-solaris2.10 Source directory: . Installation directory:/usr/local C compiler:gcc -std=gnu99 -g -O2 Fortran 77 compiler: g77 -g -O2 C++ compiler: g++ -g -O2 Fortran 90/95 compiler:gfortran Obj-C compiler: Interfaces supported: X11, tcltk External libraries:readline, ICU Additional capabilities: PNG, JPEG, TIFF, NLS Options enabled: shared BLAS, R profiling Recommended packages: yes make ... g77 -fPIC -g -O2 -ffloat-store -c dlamch.f -o dlamch.o dlamch.f: In function `dlamch': dlamch.f:89: warning: INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, ^ Reference to unimplemented intrinsic `DIGITS' at (^) (assumed EXTERNAL) dlamch.f:89: INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, ^ Invalid declaration of or reference to symbol `digits' at (^) [initially seen at (^)] dlamch.f:89: warning: INTRINSIC DIGITS, EPSILON, HUGE, MAXEXPONENT, ^ Reference to unimplemented intrinsic `EPSILON' at (^) (assumed EXTERNAL) -- Jonathan M. Prigot Partners Healthcare Systems The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] library() and install.packages() no longer working ("Access is denied" error)
In the last week, SOMETHING on my system must have changed, because this is what happens when trying to library() or install.packages() on R 3.0.1 x64 on a Windows 2008 R2 server:

> library("raster")
Error in normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="D:/Users/[UID]/Documents/R/win-library/3.0": Access is denied

> install.packages("raster")
Installing package into ‘D:/Users/[UID]/Documents/R/win-library/3.0’
(as ‘lib’ is unspecified)
trying URL 'http://ftp.osuosl.org/pub/cran/bin/windows/contrib/3.0/raster_2.1-49.zip'
Content type 'application/zip' length 2363295 bytes (2.3 Mb)
opened URL
downloaded 2.3 Mb

Error in normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="D:\Users\[UID]\Documents\R\win-library\3.0": Access is denied
In addition: Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="D:/Users/[UID]/Documents/R/win-library/3.0": Access is denied

The permissions on that directory APPEAR to be correct (I can add files/folders, rename them, delete them), but alas R continues to give me these errors. Both the users and the sysadmin claim nothing was changed, but clearly something did. As a heads up, I did try removing PATHTO/win-library/3.0 and re-ran install.packages("raster"), at which point R asked me "Would you like to use a personal directory instead?" I clicked yes. It then asked "Would you like to create a personal library 'D:/Users/[UID]/Documents/R/win-library/3.0' to install packages into?" I clicked yes. The mirror browser shows up and I select a mirror. A 3.0 directory is created, but I got the same error, and when examining the (new) 3.0 directory, nothing is created inside of it. Any ideas what this could be caused by?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
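Not an answer from the thread, just a small diagnostic sketch for separating R's view of a library directory from the filesystem's. The directory used below is a temporary stand-in for the real win-library path.

```r
# Sketch: verify that the R process itself (not just Explorer) can write to
# a library directory before install.packages() tries to.
check_lib_writable <- function(lib) {
  if (!file.exists(lib)) {
    return(sprintf("'%s' does not exist", lib))
  }
  # file.access(): 0 = permission granted, -1 = denied (can be unreliable
  # with Windows ACLs, which is why we also probe with a real file below)
  if (file.access(lib, mode = 2) != 0) {
    return(sprintf("'%s' is not writable by this R process", lib))
  }
  probe <- file.path(lib, ".r-write-test")
  ok <- file.create(probe)
  if (ok) file.remove(probe)
  if (ok) "writable" else "file creation failed"
}

lib <- file.path(tempdir(), "test-lib")
dir.create(lib, showWarnings = FALSE)
check_lib_writable(lib)  # "writable" when the process has real write access
```

If the probe file fails even though Explorer can create files there, the problem is with the R process's token (e.g. a mapped drive not visible to an elevated session), not with the directory's ACLs.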
[R] ifelse question (I'm not sure why this is working)...
R-helpers:

One of my intrepid students came up with a solution to a problem where they need to write a function that takes a vector x and a "scalar" d, and returns the indices of the vector x where x %% d is equal to 0 (x is evenly divisible by d). I thought I had a good handle on the potential solutions, but one of my students sent me a function that WORKS, but for the life of me I can't figure out WHY. Here is the solution:

remainderFunction <- function(x, d) {
    ifelse(x %% d == 0, yes = return(which(x %% d == 0)), no = return(NULL))
}
remainderFunction(x = c(23:47), d = 3)

I've never seen an ifelse statement used that way, and I was fully expecting it to NOT work, or to place the output of which(x %% d == 0) in each location where the statement x %% d == 0 was true. Any ideas on deconstructing this?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
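For what it's worth, here is one reading of the mechanism (a sketch, not an authoritative deconstruction): the yes= argument is a promise evaluated in the calling function's frame, so the embedded return() exits remainderFunction itself the moment ifelse() first needs that argument.

```r
# Sketch: the yes= promise is evaluated in f's frame, so the first time
# ifelse() touches it, return() exits the *function* -- ifelse() itself
# never finishes and its per-element recycling never happens.
f <- function(x, d) {
  ifelse(x %% d == 0, yes = return(which(x %% d == 0)), no = return(NULL))
}

# The same behavior made explicit, without ifelse():
g <- function(x, d) {
  hits <- which(x %% d == 0)
  if (length(hits) > 0) return(hits)
  NULL
}

x <- 23:47
identical(f(x, 3), g(x, 3))  # TRUE: both return the indices of multiples of 3
```

If no element passes the test, ifelse() instead evaluates no=, so return(NULL) fires, which is why the function also "works" in the empty case.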
[R] Vectorized version of colMeans/rowMeans for higher dimension arrays?
For matrices, colMeans/rowMeans are quick, vectorized functions. But say I have a higher dimensional array:

moo <- array(runif(400*9*3), dim = c(400, 9, 3))

And I want to get the mean along the 2nd dimension. I can, of course, use apply:

moo1 <- apply(moo, c(1, 3), mean)

But this is not a vectorized operation (so it doesn't execute as quickly). How would one vectorize this operation (if possible)? Is there an array equivalent of colMeans/rowMeans?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
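One possibility (not from the original post): colMeans() itself handles arrays via its dims argument, so combined with aperm() the dimension-2 mean can be computed without apply().

```r
# Sketch: colMeans(x, dims = 1) averages over the FIRST dimension of an
# array. Permuting the target dimension into first position with aperm()
# therefore gives a vectorized equivalent of apply(moo, c(1, 3), mean).
moo <- array(runif(400 * 9 * 3), dim = c(400, 9, 3))

mean_along_dim2 <- function(a) {
  colMeans(aperm(a, c(2, 1, 3)), dims = 1)  # result has dim c(400, 3)
}

moo1 <- apply(moo, c(1, 3), mean)  # slow reference version
moo2 <- mean_along_dim2(moo)
all.equal(moo1, moo2)              # TRUE, up to floating-point error
```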
Re: [R] Parallel version of Map(rather, mapply)
Hi Saptarshi:

There are quite a few parallel mapply's out there -- my recommendation is to use the foreach package, since it allows you to be flexible in the parallel backend, and you don't have to write two statements (a sequential and a parallel statement) -- if a parallel backend is registered, it will use that; otherwise it'll execute in sequential mode.

--j

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of Saptarshi Guha [saptarshi.g...@gmail.com]
Sent: Wednesday, August 28, 2013 1:24 PM
To: R-help@r-project.org
Subject: [R] Parallel version of Map(rather, mapply)

Hello,

I find Map to be a nice interface to mapply. However, Map calls mapply, which in turn calls .Internal(mapply(...)). Is there a parallel version of mapply (like mclapply) or do I need to write this myself?

Regards
Saptarshi
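For reference, the parallel package (bundled with R since 2.14.0) provides mcmapply(), a multicore analogue of mapply(); it is fork-based, so it only actually parallelizes on Unix-alikes. A minimal sketch:

```r
# Sketch: parallel::mcmapply() is an mc.cores-aware drop-in for mapply().
library(parallel)

slow_add <- function(a, b) {
  Sys.sleep(0.01)  # stand-in for real per-element work
  a + b
}

res <- mcmapply(slow_add, 1:10, 101:110, mc.cores = 2)
res  # 102 104 106 ... 120, same values as mapply(slow_add, 1:10, 101:110)
```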
Re: [R] Determining the maximum memory usage of a function
Jim:

Thanks, but I'm looking for something that can be used somewhat automatically -- the function in question would be user-provided and passed to my "chunking" algorithm, so in this case it would be the end-user (not me) who would have to embed these -- would

Rprof(memory.profiling=TRUE)
# my function
Rprof(NULL)

... and then taking the max of the "tseries" output be a reasonable approach? If so, which of the three outputs (vsize.small, vsize.large, nodes) would be best compared against the available memory?

Cheers!

--j

On Thu, Jun 20, 2013 at 10:07 AM, jim holtman wrote:
> What I would do is to use "memory.size()" to get the amount of memory being
> used. Do a call at the beginning of the function to determine the base, and
> then at other points in the code to see what the difference from the base is
> and keep track of the maximum difference. I am not sure if just getting the
> memory usage at the end would be sufficient since there may be some garbage
> collection in between, or you might be creating some large objects and then
> deleting/reusing them. So keep track after large chunks of code to see what
> is happening.
>
> On Thu, Jun 20, 2013 at 10:45 AM, Jonathan Greenberg wrote:
>>
>> Folks:
>>
>> I apologize for the cross-posting between r-help and r-sig-hpc, but I
>> figured this question was relevant to both lists. I'm writing a
>> function to be applied to an input dataset that will be broken up into
>> chunks for memory management reasons and for parallel execution. I am
>> trying to determine, for a given function, what the *maximum* memory
>> usage during its execution is (which may not be the beginning or the
>> end of the function, but somewhere in the middle), so I can "plan" for
>> the chunk size (e.g. have a table of chunk size vs. max memory usage).
>>
>> Is there a trick for determining this?
>>
>> --j
>>
>> --
>> Jonathan A. Greenberg, PhD
>> Assistant Professor
>> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
>> Department of Geography and Geographic Information Science
>> University of Illinois at Urbana-Champaign
>> 607 South Mathews Avenue, MC 150
>> Urbana, IL 61801
>> Phone: 217-300-1924
>> http://www.geog.illinois.edu/~jgrn/
>> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
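A rough alternative to the Rprof() route, sketched here under the assumption that only R-heap allocations matter (C-level or external memory is invisible to it): gc(reset = TRUE) zeroes the "max used" watermarks before the user-supplied function runs, and a closing gc() reads the peaks back.

```r
# Sketch: wrap an arbitrary user-supplied function and report the peak
# R-heap footprint seen while it ran, via gc()'s "max used" watermarks.
peak_mem_mb <- function(f, ...) {
  invisible(gc(reset = TRUE))   # reset the Ncells/Vcells "max used" columns
  result <- f(...)
  g <- gc()                     # last column is "max used" in Mb
  list(result = result, peak_mb = sum(g[, ncol(g)]))
}

big_alloc <- function(n) sum(runif(n))  # transiently allocates ~8*n bytes
out <- peak_mem_mb(big_alloc, 1e6)
out$peak_mb  # peak R-heap footprint (Mb) observed while big_alloc ran
```

Note the peak includes whatever the session already held before the call; for chunk-size planning you would subtract a baseline measured the same way on a no-op function.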
[R] Determining the maximum memory usage of a function
Folks:

I apologize for the cross-posting between r-help and r-sig-hpc, but I figured this question was relevant to both lists. I'm writing a function to be applied to an input dataset that will be broken up into chunks for memory management reasons and for parallel execution. I am trying to determine, for a given function, what the *maximum* memory usage during its execution is (which may not be the beginning or the end of the function, but somewhere in the middle), so I can "plan" for the chunk size (e.g. have a table of chunk size vs. max memory usage).

Is there a trick for determining this?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] Instant search for R documentation
Hi Spencer,

Thanks for the pointers.

> > I just wanted to share with you that we made a website over the weekend
> > that allows "instant search" of the R documentation on CRAN, see:
> > www.Rdocumentation.org. It's a first version, so any
> > feedback/comments/criticism most welcome.
>
> Interesting. Are you aware of the following:
>
> * help.start()

Of course, but I think checking R documentation online instead of with the built-in R help function could provide some extra benefits. First, you are capable of searching through the latest version of all R packages, even those that are not installed on your device. This makes it not only a help tool, but also a tool for discovery (the fact that you can see search results while typing in the search box increases the "discovery element" further). Second, I added the discussion system Disqus. For every function and package, Disqus allows users to ask questions, add extra examples to the documentation, etc. This could become an added value (conditional, of course, on people actually using it :-)

> * The R Wiki (the fourth item under "Documentation" on the
> left at "r-project.org"). Might you want to consider merging your
> "rdocumentation.org" with this?

That would be an interesting option, but I guess that's not up to me to decide ;-).

> - NOTE: The R Wiki unfortunately has not gotten the
> attention and development I believe it deserves. I'm not sure why this
> is. The standard Wikipedia gets many contributors. One difference I
> noticed is that to edit this, one needs to login. That's not true for
> Wikimedia projects. Beyond that, the standard Mediawiki markup
> language can intimidate some people. Fortunately, difficulties in using
> the Mediawiki software will soon be reduced. One of the primary
> priorities of the software development team at the Wikimedia Foundation
> is modifying the Mediawiki software to include a beta version of a
> visual (WYSIWYG) editor. I saw a demo of a beta version of this a month
> ago. It's already available for limited use, but I don't think it's
> quite ready yet. I think it might be wise to check for it later this year.
>
> * The "sos" package with its vignette for searching CRAN
> packages and getting the result sorted to place first the package with
> the most matches.

I was unaware of the sos package, looks very nice, thank you for sharing!

> Hope this helps.
> Spencer

Best regards,

Jonathan
[R] Instant search for R documentation
Hi, I just wanted to share with you that we made a website over the weekend that allows "instant search" of the R documentation on CRAN, see: www.Rdocumentation.org. It's a first version, so any feedback/comments/criticism most welcome. Best regards, Jonathan
Re: [R] Official way to set/retrieve options in packages?
What would be an example of setting, saving, and re-loading an option to a user's .Rprofile -- and would this be a no-no in a CRAN package?

--j

On Sat, Jun 1, 2013 at 4:57 PM, Prof Brian Ripley wrote:
> On 01/06/2013 22:44, Anthony Damico wrote:
> > hope this helps.. :)
> >
> > # define an object `x`
> > x <- list( "any value here" , 10 )
> >
> > # set `myoption` to that object
> > options( "myoption" = x )
> >
> > # retrieve it later (perhaps within a function elsewhere in the package)
> > ( y <- getOption( "myoption" ) )
> >
> > it's nice to name your options `mypackage.myoption` so users know what
> > package the option is associated with in case they type `options()`
> >
> > here's the `.onLoad` function in the R survey package. notice how the
> > options are only set *if* they don't already exist--
>
> But a nicer convention is that used by most packages in R itself: if the
> option is not set, the function using it assumes a suitable default.
> That would make sense for all the FALSE defaults below.
>
> Note though that this is not 'persistent': users have to set options in
> their startup files (see ?Startup). There is no official location to
> store package configurations. Users generally dislike software saving
> settings in their own file space, so it seems very much preferable to use
> the standard R mechanisms (.Rprofile etc).
>
> > > survey:::.onLoad
> >
> > function (...)
> > {
> >     if (is.null(getOption("survey.lonely.psu")))
> >         options(survey.lonely.psu = "fail")
> >     if (is.null(getOption("survey.ultimate.cluster")))
> >         options(survey.ultimate.cluster = FALSE)
> >     if (is.null(getOption("survey.want.obsolete")))
> >         options(survey.want.obsolete = FALSE)
> >     if (is.null(getOption("survey.adjust.domain.lonely")))
> >         options(survey.adjust.domain.lonely = FALSE)
> >     if (is.null(getOption("survey.drop.replicates")))
> >         options(survey.drop.replicates = TRUE)
> >     if (is.null(getOption("survey.multicore")))
> >         options(survey.multicore = FALSE)
> >     if (is.null(getOption("survey.replicates.mse")))
> >         options(survey.replicates.mse = FALSE)
> > }
> >
> > On Sat, Jun 1, 2013 at 4:01 PM, Jonathan Greenberg wrote:
> >
> >> R-helpers:
> >>
> >> Say I'm developing a package that has a set of user-definable options that
> >> I would like to be persistent across R-invocations (they are saved
> >> someplace). Of course, I can create a little text file to be written/read,
> >> but I was wondering if there is an "officially sanctioned" way to do this?
> >> I see there is an options() and getOption() function, but I'm unclear how
> >> I would use this in my own package to create/save new options for my
> >> particular package. Cheers!
> >>
> >> --j
> >>
> >> --
> >> Jonathan A. Greenberg, PhD
> >> Assistant Professor
> >> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
> >> Department of Geography and Geographic Information Science
> >> University of Illinois at Urbana-Champaign
> >> 607 South Mathews Avenue, MC 150
> >> Urbana, IL 61801
> >> Phone: 217-300-1924
> >> http://www.geog.illinois.edu/~jgrn/
> >> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of
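A sketch of the convention Prof. Ripley describes: no .onLoad needed, because every consumer of the option supplies its own default. The option name mypackage.verbose is made up for illustration.

```r
# Sketch: read an option with a default; nothing is ever written to the
# user's file space, and users opt in via options() in their .Rprofile.
verbose_enabled <- function() {
  # getOption() returns its second argument when the option is unset
  isTRUE(getOption("mypackage.verbose", default = FALSE))
}

verbose_enabled()                  # FALSE: option unset, default applies
options(mypackage.verbose = TRUE)  # a user opts in, e.g. in .Rprofile
verbose_enabled()                  # TRUE
```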
[R] Official way to set/retrieve options in packages?
R-helpers:

Say I'm developing a package that has a set of user-definable options that I would like to be persistent across R-invocations (they are saved someplace). Of course, I can create a little text file to be written/read, but I was wondering if there is an "officially sanctioned" way to do this? I see there is an options() and getOption() function, but I'm unclear how I would use this in my own package to create/save new options for my particular package. Cheers!

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] RCurl: using ls instead of NLST
R-helpers:

I'm trying to retrieve the contents of a directory from an ftp site (ideally, the file/folder names as a character vector):

ftp://e4ftl01.cr.usgs.gov/MOTA/MCD12C1.005/  # (MODIS data)

Where I get the following error via RCurl:

require("RCurl")
url <- "ftp://e4ftl01.cr.usgs.gov/MOTA/MCD12C1.005/"
filenames = getURL(url, ftp.use.epsv=FALSE, ftplistonly=TRUE)
> Error in function (type, msg, asError = TRUE) : RETR response: 550

Through some sleuthing, it turns out the ftp site does not support NLST (which RCurl is using), but will use "ls" to list the directory contents -- is there any way to use "ls" remotely on this site?

Thanks!

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] points overlay axis
Apologies to John - I should have thought to give an example. However, xpd is what I was looking for. Thanks for the help!

On 14 May 2013 14:55, David Carlson wrote:
> Let's try again after restraining Outlook's desire to use html.
>
> set.seed(42)
> dat <- matrix(c(runif(48), 0, 0), 25, 2, byrow=TRUE)
>
> # Complete plot symbol on axes, but axis on top
> plot(dat, xaxs="i", yaxs="i", pch=16, col="red", xpd=TRUE)
>
> # Complete plot symbol on axes with symbol on top
> plot(dat, xaxs="i", yaxs="i", type="n")
> points(dat, xaxs="i", yaxs="i", pch=16, col="red", xpd=TRUE)
>
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of John Kane
> Sent: Tuesday, May 14, 2013 7:47 AM
> To: Jonathan Phillips; r-help@r-project.org
> Subject: Re: [R] points overlay axis
>
> Probably, but since we don't know what you are doing, it is very hard to give
> any advice.
>
> Please read this for a start
> https://github.com/hadley/devtools/wiki/Reproducibility and give us a clear
> statement of the problem
>
> Thanks
>
> John Kane
> Kingston ON Canada
>
> > -----Original Message-----
> > From: 994p...@gmail.com
> > Sent: Tue, 14 May 2013 13:34:35 +0100
> > To: r-help@r-project.org
> > Subject: [R] points overlay axis
> >
> > Hi,
> > I'm trying to do quite a simple task, but I'm stuck.
> >
> > I've set xaxs = 'i' as I want the origin to be (0,0), but
> > unfortunately I have points that are sat on the axis. R draws the
> > axis over the points, which hides the points somewhat and looks unsightly.
> > Is there any way of getting a point to be drawn over the axis?
> >
> > Thanks,
> > Jon Phillips
[R] R help: Batch read files based on names in a list
I am currently reading in a series of files, applying the same functions to them one at a time, and then merging the resulting data frames, e.g.:

MyRows <- c("RowA", "RowB", "RowC")

File1_DF <- read.delim("DirectoryToFiles\\File1_Folder\\File1.txt",
    stringsAsFactors=FALSE, check.names=FALSE)
File1_DF <- as.data.frame(t(File1_DF[MyRows,]))
File1_DF <- as.data.frame(t(File1_DF))
mergeDF <- merge(mergeDF, File1_DF, by.x = "Row.names", by.y = "row.names")

File2_DF <- read.delim("DirectoryToFiles\\File2_Folder\\File2.txt",
    stringsAsFactors=FALSE, check.names=FALSE)
File2_DF <- as.data.frame(t(File2_DF[MyRows,]))
File2_DF <- as.data.frame(t(File2_DF))
mergeDF <- merge(mergeDF, File2_DF, by.x = "Row.names", by.y = "row.names")

...etc

I want to know if I can use a list of the filenames c("File1", "File2", "File3") etc. and apply a function to do this in a more automated fashion? This would involve using the list value in the directory path to read in the file, i.e.:

MyFilesValue_DF <- read.delim("DirectoryToFolders\\MyFilesValue_Folder\\MyFilesValue.txt",
    stringsAsFactors=FALSE, check.names=FALSE)

Any help appreciated
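One hedged sketch of the automated version (the directory layout, file names, and row names below are invented for the demo): wrap the read-and-reshape steps in a helper keyed by the file name, lapply() over the names, and fold the pieces together with Reduce(merge, ...).

```r
# Sketch: build each path from the file name, read and reshape in one
# helper, then merge everything on an explicit key column.
read_one <- function(name, base_dir, rows) {
  path <- file.path(base_dir, paste0(name, "_Folder"), paste0(name, ".txt"))
  df <- read.delim(path, stringsAsFactors = FALSE, check.names = FALSE)
  out <- as.data.frame(t(df[rows, ]))          # same transpose step as before
  out <- cbind(Sample = rownames(out), out)    # explicit key for merging
  names(out)[-1] <- paste(name, names(out)[-1], sep = "_")  # keep names unique
  rownames(out) <- NULL
  out
}

# --- demo data: two files laid out the way the question describes ---
base <- tempdir()
for (nm in c("File1", "File2")) {
  dir.create(file.path(base, paste0(nm, "_Folder")), showWarnings = FALSE)
  d <- data.frame(S1 = 1:3, S2 = 4:6, row.names = c("RowA", "RowB", "RowC"))
  write.table(d, file.path(base, paste0(nm, "_Folder"), paste0(nm, ".txt")),
              sep = "\t", quote = FALSE)
}

files <- c("File1", "File2")
dfs <- lapply(files, read_one, base_dir = base, rows = c("RowA", "RowB"))
merged <- Reduce(function(a, b) merge(a, b, by = "Sample"), dfs)
```

Adding a third file is then just one more name in `files`; no code changes.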
[R] points overlay axis
Hi, I'm trying to do quite a simple task, but I'm stuck. I've set xaxs = 'i' as I want the origin to be (0,0), but unfortunately I have points that are sat on the axis. R draws the axis over the points, which hides the points somewhat and looks unsightly. Is there any way of getting a point to be drawn over the axis? Thanks, Jon Phillips
[R] Stepwise regression for multivariate case in R?
Hi!

I am trying to make a stepwise regression in the multivariate case, using Wilks' Lambda test. I've tried this:

> greedy.wilks(cbind(Y1,Y2) ~ . , data=my.data)

But it only returns:

Error in model.frame.default(formula = X[, j] ~ grouping, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'grouping')

What can be wrong here? I have checked, and all variables in my.data are of the same length.

//Jonathan
Re: [R] Singular design matrix in rq
Roger:

Doh! Just realized I had that error in the code -- raw_data is the same as mydata, so it should be:

mydata <- read.csv("singular.csv")
plot(mydata$predictor, mydata$response)
# A big cloud of points, nothing too weird

summary(mydata)
# No NAs:
#        X             response          predictor
#  Min.   :    1   Min.   :    0.0   Min.   : 0.000
#  1st Qu.:12726   1st Qu.:  851.2   1st Qu.: 0.000
#  Median :25452   Median : 2737.0   Median : 0.000
#  Mean   :25452   Mean   : 3478.0   Mean   : 5.532
#  3rd Qu.:38178   3rd Qu.: 5111.6   3rd Qu.: 5.652
#  Max.   :50903   Max.   :26677.8   Max.   :69.342

fit_spl <- rq(response ~ bs(predictor, df=15), tau=1, data=mydata)
# Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

--j

On Fri, Apr 19, 2013 at 8:15 AM, Koenker, Roger W wrote:
> Jonathan,
>
> This is not what we call a reproducible example... what is raw_data? Does
> it have something to do with mydata? What is i?
>
> Roger
>
> url: www.econ.uiuc.edu/~roger    Roger Koenker
> email: rkoen...@uiuc.edu         Department of Economics
> vox: 217-333-4558                University of Illinois
> fax: 217-244-6678                Urbana, IL 61801
>
> On Apr 16, 2013, at 2:58 PM, Greenberg, Jonathan wrote:
> > Quantreggers:
> >
> > I'm trying to run rq() on a dataset I posted at:
> > https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
> > (it's a 1500kb csv file named "singular.csv") and am getting the following error:
> >
> > mydata <- read.csv("singular.csv")
> > fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
> > Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> >
> > Any ideas what might be causing this or, more importantly, suggestions
> > for how to solve this? I'm just trying to fit a smoothed hull to the top
> > of the data cloud (hence the large df).
> >
> > Thanks!
> >
> > --jonathan
> >
> > --
> > Jonathan A. Greenberg, PhD
> > Assistant Professor
> > Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
> > Department of Geography and Geographic Information Science
> > University of Illinois at Urbana-Champaign
> > 607 South Mathews Avenue, MC 150
> > Urbana, IL 61801
> > Phone: 217-300-1924
> > http://www.geog.illinois.edu/~jgrn/
> > AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Singular design matrix in rq
Quantreggers:

I'm trying to run rq() on a dataset I posted at:
https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
(it's a 1500kb csv file named "singular.csv") and am getting the following error:

mydata <- read.csv("singular.csv")
fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
> Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

Any ideas what might be causing this or, more importantly, suggestions for how to solve this? I'm just trying to fit a smoothed hull to the top of the data cloud (hence the large df).

Thanks!

--jonathan

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] Check if a character vector can be coerced to numeric?
Yep, type.convert was exactly what I was looking for (with as.is=TRUE). Thanks!

On Thu, Mar 21, 2013 at 1:31 PM, Prof Brian Ripley wrote:
> On 21/03/2013 18:20, Jonathan Greenberg wrote:
>> Given an arbitrary set of character vectors:
>>
>> myvect1 <- c("abc","3","4")
>> myvect2 <- c("2","3","4")
>>
>> I would like to develop a function that will convert any vectors that can
>> be PROPERLY converted to a numeric (myvect2) into a numeric, but leaves
>> character vectors which cannot be converted (myvect1) alone. Is there any
>> simple way to do this (e.g. some function that tests if a vector is
>> coercible to a numeric before doing so)?
>>
>> --j
>
> ?type.convert
>
> It does depend what you mean by 'properly'. Can
> "123.456789012344567890123455" be converted 'properly'? [See the NEWS for
> R-devel.]
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK, Fax: +44 1865 272595

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
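To make the accepted answer concrete, a sketch of the as.is = TRUE pattern. One wrinkle worth noting: a vector of whole numbers comes back as integer, not double.

```r
# Sketch: type.convert() parses the vector if every element is
# numeric-looking, and otherwise returns it as character (as.is = TRUE
# prevents the fallback-to-factor behavior).
maybe_numeric <- function(v) type.convert(v, as.is = TRUE)

maybe_numeric(c("2", "3", "4"))    # integer: 2 3 4
maybe_numeric(c("2.5", "3"))       # double: 2.5 3.0
maybe_numeric(c("abc", "3", "4"))  # left alone as character
```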
[R] Check if a character vector can be coerced to numeric?
Given an arbitrary set of character vectors:

myvect1 <- c("abc","3","4")
myvect2 <- c("2","3","4")

I would like to develop a function that will convert any vectors that can be PROPERLY converted to a numeric (myvect2) into a numeric, but leaves character vectors which cannot be converted (myvect1) alone. Is there any simple way to do this (e.g. some function that tests if a vector is coercible to a numeric before doing so)?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
[R] Issue with matrices within nested for-loops
Greetings! I am trying to compare simulated environmental conditions from a model against a recruitment time series for a species of crab by first dropping 5 data points, and then using the remainder to attempt to simulate the missing data as a measure of best fit and using the following code: all.mat<-as.matrix(comb,ncol=ncol(comb),nrow=nrow(comb)) obs<-as.matrix(R2,24,1) mod<-all.mat results<-numeric(ncol(mod)) for(i in mod) { x<-mod[,i] resid <- matrix(NA, 1000, 5) for(k in 1:1000) { sub<-sample(1:24,19) fit<-lm(obs~x,subset=sub) cf<-coef(fit) p <- cf[1] + cf[2] * x[-sub] resid[k,] <- obs[-sub] - p } results[i] <- mean(resid^2) } where* R2* is a 24x1 matrix with recruitment data, *comb* was a cbind() object combining two matrices and *all.mat* is the final 565x24 matrix of modeled environmental scenarios. When the script is run the first 99 scenarios are processed properly and I get readable output. At scenario 100 however, I get this message: *Error in na.omit.data.frame(list(obs = c(0.414153096303487, 1.39649463342491, : subscript out of bounds* Which I understand to mean that the bounds of the indicated vector/matrix have been violated. I am however at a loss as to how to resolve this. Any advice would be appreciated Cheers! JR -- Jonathan Richar Doctoral candidate UAF SFOS Fisheries Division 17101 Pt. Lena Loop Rd. University of Alaska Fairbanks Juneau, AK 99801 Phone: (907) 796-5459 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
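One likely culprit in the code above is `for(i in mod)`, which iterates over the values stored in the matrix rather than over column indices, so `mod[,i]` fails as soon as a stored value used as a subscript exceeds the number of columns. A hedged sketch of the intended loop structure (assuming one scenario per column, as the original indexing implies):

```r
results <- numeric(ncol(mod))
for (i in seq_len(ncol(mod))) {  # loop over column indices 1..ncol(mod)
  x <- mod[, i]
  # ... the resampling and lm() fitting from the original code goes here,
  # filling resid[k, ] for k in 1:1000 ...
  results[i] <- mean(resid^2)
}
```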
[R] Mixed models with missing data
Hi, I am creating a mixed model based on an experiment where each subject has 2 repeats. In some instances there is data for only one of a given subject's repeats; for most there is data for both. Can I still justify having subject as a random effect? Thanks, Jonathan
[R] mixed effects regression with non-independent data
Hi, I will be performing mixed effects regression on subjects' total scores from 2-player games (prisoner's dilemma). I am aware that including both players' scores from a game will cause problems due to non-independence. Is there a way to deal with this apart from randomly picking one subject from each game for the analysis (and so losing half the data)? Is there a way to introduce this into the model instead, perhaps as a random effect? Thanks, Jonathan
[R] Inserting rows of interpolated data
Dear help list - I have light data with 5-min time-stamps. I would like to insert four 1-min time-stamps between each row and interpolate the light data on each new row. To do this I have come up with the following code: lightdata <- read.table("Test_light_data.csv", header = TRUE, sep = ",") # read data file into object "lightdata" library(chron) mins <- data.frame(times(1:1439/1440)) # generate a dataframe of 24 hours of 1-min timestamps Nth.delete <- function(dataframe, n)dataframe[-(seq(n, to=nrow(dataframe), by=n)),] # function for deleting nth row empty <- data.frame("1/9/13", Nth.delete(mins, 5), "NA") # delete all 5-min timestamps in a new dataframe colnames(empty) <- c("date", "time", "light") # add correct column name to empty timestamp dataframe newdata <- rbind(lightdata, empty) I get the following error message: Warning message: In `[<-.factor`(`*tmp*`, ri, value = c(0.000694, 0.00139, : invalid factor level, NAs generated Digging into this a little, I can see that the two time columns are doing what I need and APPEAR to be similar in format: > head(lightdata) datetime light 1 1/9/13 0:00:00 -0.00040925 2 1/9/13 0:05:00 -0.00023386 3 1/9/13 0:10:00 -0.00032155 4 1/9/13 0:15:00 -0.00017539 5 1/9/13 0:20:00 -0.00029232 6 1/9/13 0:25:00 -0.00038002 > head(empty) date time light 1 1/9/13 00:01:00NA 2 1/9/13 00:02:00NA 3 1/9/13 00:03:00NA 4 1/9/13 00:04:00NA 5 1/9/13 00:06:00NA 6 1/9/13 00:07:00NA but they clearly are not as far as R is concerned, as shown by str: > str(lightdata) 'data.frame': 288 obs. of 3 variables: $ date : Factor w/ 1 level "1/9/13": 1 1 1 1 1 1 1 1 1 1 ... $ time : Factor w/ 288 levels "0:00:00","0:05:00",..: 1 2 3 4 5 6 7 8 9 10 ... $ light: num -0.000409 -0.000234 -0.000322 -0.000175 -0.000292 ... > str(empty) 'data.frame': 1152 obs. of 3 variables: $ date : Factor w/ 1 level "1/9/13": 1 1 1 1 1 1 1 1 1 1 ... $ time :Class 'times' atomic [1:1152] 0.000694 0.001389 0.002083 0.002778 0.004167 ... .. 
..- attr(*, "format")= chr "h:m:s" $ light: Factor w/ 1 level "NA": 1 1 1 1 1 1 1 1 1 1 ... In the first (original) dataframe, light is a factor, while in the dataframe of generated timestamps, the timestamps are actually still in fractions of a day. Presumably this is why rbind is not working? Can anyone help? By the way, I know I can use na.approx in zoo to do the eventual interpolation of the light data. It's getting there that has me stumped for now. Many thanks, Jon (new R user). __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
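Since rbind() needs compatible column classes, one hedged fix (column names here follow the str() output, which itself shows a date/datetime naming inconsistency that also needs reconciling) is to align the classes before binding:

```r
# Convert the chron 'times' column to "HH:MM:SS" character so it matches,
# and make light a real numeric NA rather than the string "NA"
empty$time      <- format(empty$time)  # fractions of a day -> "HH:MM:SS"
empty$light     <- NA_real_
lightdata$time  <- as.character(lightdata$time)
lightdata$light <- as.numeric(as.character(lightdata$light))
newdata <- rbind(lightdata, empty)
```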
[R] Count of Histogram Bins using Shingles with lattice
I know that I can get a count of histogram bins in base R with plot=FALSE. However, I'd like to do the same thing with lattice. The problem is that I've set up shingles, and I'd like to get the count within each bin within each shingle. plot=FALSE doesn't seem to do it.
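Lacking a lattice-native option, one workaround (a sketch with made-up example data) is to apply base hist() with plot=FALSE separately to the observations falling in each shingle interval:

```r
library(lattice)

x  <- rnorm(500)                             # variable to histogram
z  <- rnorm(500)                             # conditioning variable
sh <- equal.count(z, number = 4, overlap = 0.1)

breaks <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
counts_by_shingle <- lapply(levels(sh), function(iv) {
  # levels() of a shingle returns each interval's endpoints
  hist(x[z >= iv[1] & z <= iv[2]], breaks = breaks, plot = FALSE)$counts
})
```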
Re: [R] Loading a list into the environment
Thanks all! list2env was exactly what I was looking for. As an FYI (and please correct me if I'm wrong), if you want to load a list into the current environment, use: myvariables <- list(a=1:10,b=20) loadenv <- list2env(myvariables ,envir=environment()) a b --j On Fri, Feb 1, 2013 at 5:49 PM, Rui Barradas wrote: > Hello, > > Something like this? > > myfun <- function(x, envir = .GlobalEnv){ > nm <- names(x) > for(i in seq_along(nm)) > assign(nm[i], x[[i]], envir) > } > > myvariables <- list(a=1:10,b=20) > > myfun(myvariables) > a > b > > > Hope this helps, > > Rui Barradas > > Em 01-02-2013 22:24, Jonathan Greenberg escreveu: > > R-helpers: >> >> Say I have a list: >> >> myvariables <- list(a=1:10,b=20) >> >> Is there a way to load the list components into the environment as >> variables based on the component names? i.e. by applying this theoretical >> function to myvariables I would have the variables a and b loaded into the >> environment without having to explicitly define them. >> >> --j >> >> -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Loading a list into the environment
R-helpers: Say I have a list: myvariables <- list(a=1:10,b=20) Is there a way to load the list components into the environment as variables based on the component names? i.e. by applying this theoretical function to myvariables I would have the variables a and b loaded into the environment without having to explicitly define them. --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling of nonbinROC penalties - accurate classification with random data?
Accuracy for an ordinal gold standard should depend on the absolute values of the penalty matrix. So, I would like to ask, ought there to be some constraint on the values of the penalty matrix? For example, (a) should the penalty matrix always contain at least one penalty with a value of 1 and/or (b) should there be any other constraint on the sum of penalties in the matrix (e.g. should the matrix sum to some multiple of the number of categories), or (c) is one free to use arbitrarily-scaled penalty matrices? I apologise if I am wasting your time by making an obvious mistake. I am a clinician, not a statistician. So, I do not understand the mathematics. Thanks, in advance, for your help, Jonathan Williams
Re: [R] Adding a line to barchart
Great! This really helped! One quick follow-up -- is there a trick to placing a label wherever the line intersects the x-axis (either above or below the plot)? On Tue, Jan 22, 2013 at 11:49 PM, PIKAL Petr wrote: > Hi > This function adds line to each panel > > addLine <- function (a = NULL, b = NULL, v = NULL, h = NULL, ..., once = F) > { > tcL <- trellis.currentLayout() > k <- 0 > for (i in 1:nrow(tcL)) for (j in 1:ncol(tcL)) if (tcL[i, > j] > 0) { > k <- k + 1 > trellis.focus("panel", j, i, highlight = FALSE) > if (once) > panel.abline(a = a[k], b = b[k], v = v[k], h = h[k], > ...) > else panel.abline(a = a, b = b, v = v, h = h, ...) > trellis.unfocus() > } > } > > > addLine(v=2, col=2, lty=3) > > Petr > > > -Original Message- > > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- > > project.org] On Behalf Of Jonathan Greenberg > > Sent: Tuesday, January 22, 2013 11:42 PM > > To: r-help > > Subject: [R] Adding a line to barchart > > > > R-helpers: > > > > I need a quick help with the following graph (I'm a lattice newbie): > > > > require("lattice") > > npp=1:5 > > names(npp)=c("A","B","C","D","E") > > barchart(npp,origin=0,box.width=1) > > > > # What I want to do, is add a single vertical line positioned at x = 2 > > that lays over the bars (say, using a dotted line). How do I go about > > doing this? > > > > --j > > > > -- > > Jonathan A. 
Greenberg, PhD > > Assistant Professor > > Global Environmental Analysis and Remote Sensing (GEARS) Laboratory > > Department of Geography and Geographic Information Science University > > of Illinois at Urbana-Champaign > > 607 South Mathews Avenue, MC 150 > > Urbana, IL 61801 > > Phone: 217-300-1924 > > http://www.geog.illinois.edu/~jgrn/ > > AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > > guide.html > > and provide commented, minimal, self-contained, reproducible code. > -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
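For the follow-up question, one possibility (an untested sketch in the same trellis.focus style as Petr's helper; the function name is made up) is to print a label inside the panel at the line's x-position:

```r
library(lattice)

addLineLabel <- function(v, lab, row = 1, col = 1, ...) {
  trellis.focus("panel", col, row, highlight = FALSE)
  ylim <- current.panel.limits()$ylim
  # pos = 3 draws the text just above the bottom panel edge at x = v
  panel.text(x = v, y = ylim[1], labels = lab, pos = 3, ...)
  trellis.unfocus()
}
# e.g., after drawing the barchart and the line at x = 2:
# addLineLabel(v = 2, lab = "x = 2")
```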
[R] Adding a line to barchart
R-helpers: I need a quick help with the following graph (I'm a lattice newbie): require("lattice") npp=1:5 names(npp)=c("A","B","C","D","E") barchart(npp,origin=0,box.width=1) # What I want to do, is add a single vertical line positioned at x = 2 that lays over the bars (say, using a dotted line). How do I go about doing this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling of nonbinROC penalties
penalties in the matrix (e.g. should the matrix sum to some multiple of the number of categories), or (c) is one free to use arbitrarily-scaled penalty matrices for estimates of the accuracy of an ordinal gold standard? Thanks, in advance, for your help, Jonathan Williams [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Best way to coerce numerical data to a predetermined histogram bin?
Folks: Say I have a set of histogram breaks: breaks=c(1:10,15) # With bin ids: bin_ids=1:(length(breaks)-1) # and some data (note that some of it falls outside the breaks: data=runif(min=1,max=20,n=100) *** What is the MOST EFFICIENT way to "classify" data into the histogram bins (return the bin_ids) and, say, return NA if the value falls outside of the bins. By classify, I mean if the data value is greater than one break, and less than or equal to the next break, it gets assigned that bin's ID (note that length(breaks) = length(bin_ids)+1) Also note that, as per this example, the bins are not necessarily equal widths. I can, of course, cycle through each element of data, and then move through breaks, stopping when it finds the correct bin, but I feel like there is probably a faster (and more elegant) approach to this. Thoughts? --j -- Jonathan A. Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
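The usual vectorized answers here are cut() and findInterval(); cut() with labels = FALSE matches the stated semantics directly (right-closed intervals, NA outside the breaks):

```r
breaks <- c(1:10, 15)
data   <- runif(n = 100, min = 1, max = 20)

# Integer bin ids 1..(length(breaks)-1); values outside the breaks -> NA.
# Intervals are (b[i], b[i+1]] by default; a value exactly equal to the
# lowest break is NA unless include.lowest = TRUE is given.
bin_ids <- cut(data, breaks = breaks, labels = FALSE)
```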
[R] Anomalous outputs from rbeta when using two different random number seeds
Hi, in the code below, I am drawing 1000 samples from two beta distributions, each time using the same random number seed. Using set.seed(80) produces results I expect, in that the differences between the distributions are very small. Using set.seed(20) produces results I can't make sense of. Around half of the time, it behaves as with set.seed(80), but around half of the time, it behaves very differently, with a much wider distribution of differences between the two distributions. # Beta parameters #distribution 1 u1.a <- 285.14 u1.b <- 190.09 # distribution 2 u2.a <- 223.79 u2.b <- 189.11 #Good example: output is as expected set.seed(80); u1.good <- rbeta(1000, u1.a, u1.b) set.seed(80); u2.good <- rbeta(1000, u2.a, u2.b) #Bad example: output is different to expected set.seed(20); u1.bad <- rbeta(1000, u1.a, u1.b) set.seed(20); u2.bad <- rbeta(1000, u2.a, u2.b) # plot of distributions using set.seed(80), which behaves as expected plot(u2.good ~ u1.good, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70)) abline(0,1) # plot of distributions using set.seed(20), which is different to expected plot(u2.bad ~ u1.bad, ylim=c(0.45, 0.70), xlim=c(0.45, 0.70)) abline(0,1) # plot of differences when using set.seed(80) plot(u1.good - u2.good, ylim=c(-0.2, 0.2)) abline(h=0) # plot of differences when using set.seed(20) plot(u1.bad - u2.bad, ylim=c(-0.2, 0.2)) abline(h=0) Could you explain why using set.seed(20) produces this chaotic pattern of behaviour? 
Many thanks, Jon -- Dr Jon Minton Research Associate Health Economics & Decision Science School of Health and Related Research University of Sheffield Times Higher Education University of the Year Tel: +44(0)114 222 0836 email: j.min...@sheffield.ac.uk http://www.shef.ac.uk/scharr/sections/heds http://scharrheds.blogspot.co.uk/
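A common explanation for this kind of behaviour is that rbeta() uses rejection-type algorithms that consume a variable number of underlying uniform deviates per draw, so two different parameterizations can fall out of sync partway through the random-number stream; whether that fully explains this case would need checking. If aligned streams are needed (e.g. for common-random-numbers comparisons), inversion guarantees exactly one uniform per deviate:

```r
# Each qbeta(runif(...)) deviate consumes exactly one uniform,
# so both sequences are driven by identical underlying random numbers
set.seed(20); u1 <- qbeta(runif(1000), 285.14, 190.09)
set.seed(20); u2 <- qbeta(runif(1000), 223.79, 189.11)
plot(u1 - u2)  # differences now vary smoothly, no bimodal split
```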
[R] How to change smoothing constant selection procedure for Winters Exponential Smoothing models?
Hello all, I am looking for some help in understanding how to change the way R optimizes the smoothing constant selection process for the HoltWinters function. I'm a SAS veteran but very new to R and still learning my way around. Here is some sample data and the current HoltWinters code I'm using: rawdata <- c(294, 316, 427, 487, 441, 395, 473, 423, 389, 422, 458, 411, 433, 454, 551, 623, 552, 520, 553, 510, 565, 547, 529, 526, 550, 577, 588, 606, 595, 622, 603, 672, 733, 793, 890, 830) timeseries_01 <- ts(rawdata, frequency=12, start=c(2009,1)) plot.ts(timeseries_01) m <- HoltWinters(timeseries_01, alpha = NULL, beta = NULL, gamma = TRUE, seasonal = c("multiplicative"), start.periods = 2, l.start = NULL, b.start = NULL, s.start = NULL) p <- predict(m, 24, prediction.interval = TRUE) plot(m, p) My problem is that I disagree with how R is choosing these smoothing constants and I would like to explore how some of the other methodologies listed in the OPTIM function [such as Nelder-Mead, BFGS, CG, L-BFGS-B, SANN, and Brent], but it is unclear to me how I would go about doing this. For example, the above code results in the following constants: alpha: 0.7952587 beta : 0.01382988 gamma: 1 However, using alternate software, I find that... alpha: 0.990 beta : 0.001 gamma: 0.001 ...actually fit this series much better, thus I would like to see if I can adjust R to reproduce this method of optimizing the three smoothing constants. Can anyone help? Thank you, Jonathan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
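Two hedged notes on the question above: stats::HoltWinters() optimizes the constants internally with optim(method = "L-BFGS-B") and exposes only optim.start and optim.control (not the optimizer method itself), so swapping in Nelder-Mead or the other methods would mean writing the objective and calling optim() by hand. Constants found with other software can, however, simply be supplied directly:

```r
# Fix the smoothing constants at externally chosen values;
# parameters given explicitly are not optimized
m2 <- HoltWinters(timeseries_01,
                  alpha = 0.990, beta = 0.001, gamma = 0.001,
                  seasonal = "multiplicative")
```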
Re: [R] Selecting the "non-attribute" part of an object
Thanks for all of these useful answers. Thanks also to Ben Bolker, who told me offline that c() is a general way to access the "main" part of an object (not tested). I also tried: > identical(matrix(tm), matrix(tmm)) [1] TRUE which also works, but does _not_ solve the problem Rolf warns about below (to my disappointment). JD On Thu, Nov 15, 2012 at 6:47 PM, Rolf Turner wrote: > I think that what you are looking for is: > all.equal(tm,tmm,check.attributes=FALSE) > But BEWARE: > m <- matrix(1:36,4,9) > mm <- matrix(1:36,12,3) > all.equal(m,mm,check.attributes=FALSE) > gives TRUE!!! I.e. sometimes attributes really are vital characteristics. > cheers, > Rolf Turner > On 16/11/12 08:52, Jonathan Dushoff wrote: >> I have two matrices, generated by R functions that I don't understand. >> I want to confirm that they're the same, but I know that they have >> different attributes. >> If I want to compare the dimnames, I can say >>> identical(attr(tm, "dimnames"), attr(tmm, "dimnames")) >> [1] FALSE >> or even: >>> identical(dimnames(tm), dimnames(tmm)) >> [1] FALSE >> But I can't find any good way to compare the "main" part of objects. >> What I'm doing now is: >>> tm_new <- tm >>> tmm_new <- tmm >>> attributes(tm_new) <- attributes(tmm_new) <- NULL >>> identical(tm_new, tmm_new) >> [1] TRUE >> But that seems very inaesthetic, besides requiring that I create two >> pointless objects. >> I have read ?attributes, ?attr and some web introductions to how R >> objects work, but have not found an answer. >> Thanks for any help. >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
[R] Selecting the "non-attribute" part of an object
I have two matrices, generated by R functions that I don't understand. I want to confirm that they're the same, but I know that they have different attributes. If I want to compare the dimnames, I can say > identical(attr(tm, "dimnames"), attr(tmm, "dimnames")) [1] FALSE or even: > identical(dimnames(tm), dimnames(tmm)) [1] FALSE But I can't find any good way to compare the "main" part of objects. What I'm doing now is: > tm_new <- tm > tmm_new <- tmm > attributes(tm_new) <- attributes(tmm_new) <- NULL > identical(tm_new, tmm_new) [1] TRUE But that seems very inaesthetic, besides requiring that I create two pointless objects. I have read ?attributes, ?attr and some web introductions to how R objects work, but have not found an answer. Thanks for any help. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Best R textbook for undergraduates
R-helpers: I'm sure this question has been asked and answered through the ages, but given there are some new textbooks out there, I wanted to re-pose it. For a course that will cover the application of R for general computing and spatial modeling, what textbook would be best to introduce computing with R to *undergrads*? I understand Bivand and Pebesma's book is fine for spatial work, but it appears to be for more advanced users -- I'd like a companion textbook that is better for complete beginners to ALL forms of programming (e.g. they don't know what an object is, a loop is, an if-then statement, etc). Suggestions? In particular, I'd like to hear from those of you who have TAUGHT classes using R. Thanks! --jonathan -- Jonathan A. Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can I please be taken off the mailing list
[[alternative HTML version deleted]]
[R] Completely ignoring an error in a function...
The code base is a bit too complicated to paste in here, but the gist of my question is this: given I have a function myfunction <- function(x) { # Do something A # Do something B # Do something C } Say "#Do something B" returns this error: Error in cat(list(...), file, sep, fill, labels, append) : argument 2 (type 'list') cannot be handled by 'cat' A standard function would stop here. HOWEVER, I want, in this odd case, to say "keep going" to my function and have it proceeed to # Do something C. How do I accomplish this? I thought suppressWarnings() would do it but it doesn't appear to. Assume that debugging "Do something B" is out of the question. Why am I doing this? Because in my odd case, "Do something B" actually does what I needed it to, but returned an error that is irrelevant to my special case (it creates two files, failing on the second of the two files -- but the first file it creates is what I wanted and there is no current way to create that single file on its own without a lot of additional coding). --j -- Jonathan A. Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
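Note that suppressWarnings() only silences warnings; errors need try() or tryCatch(). Wrapping only the failing step lets execution continue past it. A minimal sketch (the stop() call stands in for the real "something B"):

```r
myfunction <- function(x) {
  # Do something A
  a <- x + 1
  # Do something B: the error is caught and ignored, not fatal
  try(stop("irrelevant error from step B"), silent = TRUE)
  # Do something C still runs
  a * 2
}
myfunction(1)  # returns 4
```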
Re: [R] Proposal: Package update log
Thanks to all for the responses and suggestions. I was primarily proposing a more detailed change log for packages on CRAN. To my mind, repositories like R-forge host packages more 'raw' than those on CRAN (i.e. CRAN seems to me to contain more 'finished' packages which occasionally are updated or added-to). Also, some packages on R-forge do not contain any information regarding changes/updates [I'm hesitant to offer an example because I'm really just a lemming in terms of R community stature...]. I guess what I'm saying is, the news(package = "yourPackageHere") function is not particularly useful currently because (in my *limited* experience) very few packages contain the news file and those which do, do not contain much in the way of description. Perhaps I'm being a bit too ambitious here, but I would just like to be able to see what has been changed (and why; if possible) each time a package is updated. It would seem, from my rather rudimentary understanding, that using current TeX/LaTeX based tools for the basis of package documentation lends itself to having a better, more organized, change log or news file based on the package manual table of contents (toc). For instance, it would be great if we had something like: news(package = "yourPackageHere", function = "functionOfInterest") which could display a log of changes/updates sequentially for the named function of interest. Admittedly, I have not created a package myself, but I do have some experience with LaTeX and it may be as simple as changing the preamble to existing TeX file templates or style files. In terms of enforcement; yes I agree it would require more work from the package authors, as well as managers/moderators of CRAN; but, if we expect each package to have a working help file, then why not a (meaningfully) working 'news' file. 
Respectfully, Jon Starkweather, PhD University of North Texas Research and Statistical Support http://www.unt.edu/rss/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Proposal: Package update log
I'm relatively new to R and would first like to sincerely thank all those who contribute to its development. Thank you. I would humbly like to propose a rule which creates a standard (i.e., strongly encouraged, mandatory, etc.) for authors to include a `change log' documenting specific changes for each update made to their package(s). The more detailed the description, the better; and it would be exceptionally useful if the document were available from the R-console (similar to the help function). In other words, I am suggesting that the `change log' file be included in the package(s) and preferably accessible from the R-console. I am aware that many packages available on CRAN have a `change log' or `news' page. However; not all packages have something like that and many which do, are not terribly detailed in conveying what has been updated or changed. I am also aware that package authors are not a particularly lazy group, sitting around with nothing to do. My proposal would likely add a non-significant amount of work to the already very generous (and appreciated) work performed by package authors, maintainers, etc. I do, however, believe it would be greatly appreciated and beneficial to have more detailed update information available from the R-console as some of us (users) update packages daily and are often left wondering what exactly has been updated. I did not post this to the R-devel list because I consider this proposal more of a documentation issue than a development issue. Also, I would like to see some discussion of this proposal from a varied pool of stakeholders (e.g., users and not just developers, package authors, package maintainers, etc.). 
Respectfully, Jon Starkweather, PhD University of North Texas Research and Statistical Support http://www.unt.edu/rss/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-sig-hpc] Quickest way to make a large "empty" file on disk?
Rui: Quick follow-up -- it looks like seek does do what I want (I see Simon suggested it some time ago) -- what do mean by "trash your disk"? What I'm trying to accomplish is getting parallel, asynchronous writes to a large binary image (just a binary file) working. Each node writes to a different sector of the file via mmap, "filling in" the values as the process runs, but the file needs to be pre-created before I can mmap it. Running a writeBin with a bunch of 0s would mean I'd basically have to write the file twice, but the seek/ff trick seems to be much faster. Do I risk doing some damage to my filesystem if I use seek? I see there is a strongly worded warning in the help for ?seek: "Use of seek on Windows is discouraged. We have found so many errors in the Windows implementation of file positioning that users are advised to use it only at their own risk, and asked not to waste the *R* developers' time with bug reports on Windows' deficiencies." --> there's no detail here on which errors people have experienced, so I'm not sure if doing something as simple as just "creating" a file using seek falls under the "discouraging" category. As a note, we are trying to work this up on both Windows and *nix systems, hence our wanting to have a single approach that works on both OSs. --j On Thu, Sep 27, 2012 at 3:49 PM, Rui Barradas wrote: > Hello, > > If you really need to trash your disk, why not use seek()? > > > fl <- file("Test.txt", open = "wb") > > seek(fl, where = 1024, origin = "start", rw = "write") > [1] 0 > > writeChar(character(1), fl, nchars = 1, useBytes = TRUE) > Warning message: > In writeChar(character(1), fl, nchars = 1, useBytes = TRUE) : > writeChar: more characters requested than are in the string - will > zero-pad > > close(fl) > > > File "Test.txt" is now 1Kb in size. 
>
> Hope this helps,
>
> Rui Barradas
>
> On 27-09-2012 20:17, Jonathan Greenberg wrote:
> > [...]
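For reference, the seek trick discussed above can be wrapped in a small helper. This is only a sketch of the idea from this thread (the helper name `preallocate` is made up here, and the ?seek warning about Windows still applies): seek past the intended end of the file, write a single zero byte, and the file takes on that length, sparsely where the filesystem supports it.

```r
# Sketch: pre-allocate a file of `size` bytes by seeking past the end
# and writing one zero byte (sparse where the filesystem supports it).
preallocate <- function(path, size) {
  fl <- file(path, open = "wb")
  on.exit(close(fl))
  seek(fl, where = size - 1, origin = "start", rw = "write")
  writeBin(as.raw(0), fl)  # the single real byte fixes the file length
  invisible(path)
}

f <- tempfile()
preallocate(f, 1024 * 1024)   # request 1 MiB
file.info(f)$size             # -> 1048576
```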
Re: [R] [R-sig-hpc] Quickest way to make a large "empty" file on disk?
Folks:

Asked this question some time ago, and found what appeared (at first) to be the best solution, but I'm now finding a new problem. First off, it seemed like ff as Jens suggested worked:

# outdata_ncells = the number of rows * number of columns * number of bands in an image:
out <- ff(vmode="double", length=outdata_ncells, filename=filename)
finalizer(out) <- close
close(out)

This was working fine until I attempted to set length to a VERY large number: outdata_ncells = 17711913600. This would create a file that is 131.964 GB. Big, but not obscenely so (and certainly not larger than the filesystem can handle). However, length appears to be restricted by .Machine$integer.max (I'm on a 64-bit Windows box):

> .Machine$integer.max
[1] 2147483647

Any suggestions on how to solve this problem for much larger file sizes?

--j

On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg wrote:
> Thanks, all! I'll try these out. I'm trying to work up something that is
> platform independent (if possible) for use with mmap. I'll do some tests
> on these suggestions and see which works best. I'll try to report back in
> a few days. Cheers!
>
> --j
>
> 2012/5/3 "Jens Oehlschlägel"
>
>> Jonathan,
>>
>> On some filesystems (e.g. NTFS, see below) it is possible to create
>> 'sparse' memory-mapped files, i.e. reserving the space without the cost
>> of actually writing initial values. Package 'ff' does this automatically
>> and also allows access to the file in parallel. Check the example below
>> and see how big-file creation is immediate.
>>
>> Jens Oehlschlägel
>>
>> > library(ff)
>> > library(snowfall)
>> > ncpus <- 2
>> > n <- 1e8
>> > system.time(
>> +   x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
>> + )
>>        User      System verstrichen
>>        0.01        0.00        0.02
>> > # check finalizer, with an explicit filename we should have a 'close' finalizer
>> > finalizer(x)
>> [1] "close"
>> > # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
>> > finalizer(x) <- "close"
>> > sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
>> R Version: R version 2.15.0 (2012-03-30)
>>
>> snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
>>
>> > sfLibrary(ff)
>> Library ff loaded.
>> Library ff loaded in cluster.
>>
>> Warnmeldung:
>> In library(package = "ff", character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
>>   'keep.source' is deprecated and will be ignored
>> > sfExport("x") # note: do not export the same ff multiple times
>> > # explicitly opening avoids a gc problem
>> > sfClusterEval(open(x, caching="mmeachflush")) # opening with 'mmeachflush'
>> instead of 'mmnoflush' is a bit slower but prevents OS write storms when
>> the file is larger than RAM
>> [[1]]
>> [1] TRUE
>>
>> [[2]]
>> [1] TRUE
>>
>> > system.time(
>> +   sfLapply( chunk(x, length=ncpus), function(i){
>> +     x[i] <- runif(sum(i))
>> +     invisible()
>> +   })
>> + )
>>        User      System verstrichen
>>        0.00        0.00       30.78
>> > system.time(
>> +   s <- sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i], c(0.05, 0.95)) )
>> + )
>>        User      System verstrichen
>>        0.00        0.00        4.38
>> > # for completeness
>> > sfClusterEval(close(x))
>> [[1]]
>> [1] TRUE
>>
>> [[2]]
>> [1] TRUE
>>
>> > csummary(s)
>>              5%  95%
>> Min.    0.04998 0.95
>> 1st Qu. 0.04999 0.95
>> Median  0.05001 0.95
>> Mean    0.05001 0.95
>> 3rd Qu. 0.05002 0.95
>> Max.    0.05003 0.95
>> > # stop slaves
>> > sfStop()
>>
>> Stopping cluster
>>
>> > # with the close finalizer we are responsible for deleting the file
>> > # explicitly (unless we want to keep it)
>> > delete(x)
>> [1] TRUE
>> > # remove r-side metadata
>> > rm(x)
>> > # truly free memory
>> > gc()
>>
>> Sent: Thursday, 3 May 2012, 00:23
>> From: "Jonathan Greenberg"
>> To: r-help, r-sig-...@r-project.org
>> Subject: [R-sig-hpc] Quickest way to make
Re: [R] list of funtions
Yes, I've seen it, and it's obviously the wrong thing to do or I'd be getting the result I'm looking for. But I can't see the correct way of doing it, i.e. I can't see any way of setting each element of the list to a function with a different 'form' value without using some 'i'-like variable in a loop. I don't think it's something obvious I've missed...

On 13 September 2012 18:06, Uwe Ligges wrote:
>
> On 13.09.2012 19:01, Jonathan Phillips wrote:
>> [...]
>>
>> Does anybody know how to stop the value of fs[[1]] being dependent on
>> the current value of 'i'?
>
> Er, you know that you have
>
> function(par) return(metaf(par,form=i,...))
>
> in your loop. If you want to have it independent of i, why do you
> specify it?
>
> Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] list of funtions
Hi,

I have a function called fitMicroProtein which takes a value called form; this can be any integer from 0-38. In a larger function I'm making (it's called Newton), the first thing I want to do is construct a list of functions where form is already set. So in pseudocode:

fs[[1]](...) <- fitMicroProtein(form=0,...)
fs[[2]](...) <- fitMicroProtein(form=1,...)
...

I've tried that and it doesn't work. Here's my code:

Newton <- function(metaf,par,niter,dealwith_NA,...)
{
    fs <- list()
    for(i in 0:(length(par)-1))
    {
        fs[[i+1]] <- function(par) return(metaf(par,form=i,...))
    }
    ...

and the problem is with the variable 'i'. If I use the debugger, I find that it is specifically this: when it makes fs[[1]] we have

fs[[1]] == function(par) return(metaf(par,form=0,...))

but the next thing it does is increment 'i', so fs[[1]] becomes

function(par) return(metaf(par,form=1,...))

where I want fs[[1]] to stay as

function(par) return(metaf(par,form=0,...))

Does anybody know how to stop the value of fs[[1]] being dependent on the current value of 'i'?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
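For what it's worth, the usual fix for this is to create each closure through a helper function, so every function captures its own copy of i; force() evaluates the argument before the loop variable can move on. A minimal sketch, using a made-up stand-in for metaf purely for illustration:

```r
# Stand-in for the poster's metaf, purely for illustration
metaf <- function(par, form) par + form

# force(i) evaluates i immediately, so each closure keeps the value
# of i it was created with instead of the loop's final value
make_fn <- function(i) {
  force(i)
  function(par) metaf(par, form = i)
}

fs <- lapply(0:4, make_fn)

fs[[1]](10)  # form = 0 -> 10
fs[[3]](10)  # form = 2 -> 12
```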
[R] Shading in prediction intervals
I have the following code for the minimum and maximum of my prediction interval:

> y.down=lines(x[x.order], set1.pred[,2][x.order], col=109)
> y.up=lines(x[x.order], set1.pred[,3][x.order], col=109)
domain=min(x):max(x)
polygon(c(domain,rev(domain)),c(y.up,rev(y.down)),col=109)

It doesn't seem to shade the right region; it gives me a trapezoid. Any help? Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
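One likely cause: lines() returns NULL invisibly, so y.up and y.down never hold the y-values, and min(x):max(x) need not line up with the x-coordinates of the bounds. A minimal sketch (with made-up data standing in for x and set1.pred) that shades between the sorted lower and upper bounds:

```r
# Made-up x values and prediction bounds standing in for set1.pred
set.seed(42)
x  <- runif(50, 0, 10)
lo <- 2 + 0.5 * x - 1   # lower prediction bound
hi <- 2 + 0.5 * x + 1   # upper prediction bound

ord <- order(x)         # sort by x so the polygon outline follows the curve
xs  <- x[ord]

plot(xs, (lo[ord] + hi[ord]) / 2, type = "n", xlab = "x", ylab = "y")
# Trace the upper bound left to right, then the lower bound right to left
polygon(c(xs, rev(xs)), c(hi[ord], rev(lo[ord])), col = "grey80", border = NA)
lines(xs, lo[ord], col = "red")
lines(xs, hi[ord], col = "red")
```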
[R] How do you learn/teach R?
Hi,

I am conducting a survey on how people learn and teach R. I think the results of the survey could lead to interesting insights that benefit the R community, and educational practice in particular. The data and results of the survey will be shared with everyone. Please help us by filling out the survey at http://www.Rcademy.org. It takes less than 5 minutes.

Thank you,
Jonathan Cornelissen
Doctoral researcher, KU Leuven

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] problem plotting in a grid
Hi all,

I'm trying to generate a grid of four plots. The first 2 appear just fine, but the final 2 will not appear in the grid, instead overwriting the first two. Any ideas on how to get them all in the same window would be greatly appreciated.

Cheers,
Jonathan

library(fields)
par(mfrow=c(2,2)) # 2x2 plot windows
plot(c(2,4),c(2,2)) # works fine
plot(c(2,4),c(2,2)) # works fine
x <- 1:4
y <- 5:10
z <- matrix(0,length(x),length(y))
z2 <- matrix(0,length(x),length(y))
for(i in 1:length(x)) {
  for (j in 1:length(y)) {
    z[i,j] <- sample(4:10,1)
    z2[i,j] <- sample(4:10,1)
  }
}
filled.contour(x,y,z,color.palette=topo.colors) # doesn't work
image.plot(x,y,z2,add=TRUE) # doesn't work

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
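The likely culprit: filled.contour() (and fields::image.plot() when drawing its color key) calls layout() internally, which wipes out par(mfrow). One workaround, sketched here without the color key, is to use image() plus contour(), which do respect mfrow:

```r
# image() + contour() respect par(mfrow); filled.contour() does not,
# because it calls layout() internally to place its color key
set.seed(1)
x <- 1:4
y <- 5:10
z  <- matrix(sample(4:10, length(x) * length(y), replace = TRUE),
             length(x), length(y))
z2 <- matrix(sample(4:10, length(x) * length(y), replace = TRUE),
             length(x), length(y))

par(mfrow = c(2, 2))
plot(c(2, 4), c(2, 2))                 # panel 1
plot(c(2, 4), c(2, 2))                 # panel 2
image(x, y, z, col = topo.colors(12))  # panel 3
contour(x, y, z, add = TRUE)           # contour lines over panel 3
image(x, y, z2, col = topo.colors(12)) # panel 4
```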
[R] fitting particular distributions
Hi there,

I have a question that is both about stats and R. Imagine two alternative stochastic processes:

a. Each subject has an exponentially distributed lifetime with parameter lambda. If I simulate 100 such subjects and rank their lifetimes, they'd look like this:

> plot(sort(rexp(100), decreasing=T))

b. The alternative process is slightly different. Imagine that the first subject obeys the same rule as (a) and therefore has an expected lifetime of 1/lambda. The second subject has half the lifetime of the first subject (i.e., 1/(2*lambda)), the third has one third of the lifetime of the first subject (i.e., 1/(3*lambda)), and so on. The distribution of the ranked data from process (b) would therefore be more concave than (a).

Now, here's my question: if I have a given dataset of subject lifetimes, how can I specify and test these alternative processes in R?

Any help will be greatly appreciated.

Jonathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
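One way to make the comparison concrete, as a sketch (assuming process (b) means a single exponential draw T1 scaled by 1/k for subject k): simulate both processes and compare their ranked-lifetime curves; with a real dataset one could then compare maximized log-likelihoods of the two candidate models.

```r
set.seed(123)
n <- 100
lambda <- 1

# Process (a): i.i.d. exponential lifetimes, ranked
a <- sort(rexp(n, rate = lambda), decreasing = TRUE)

# Process (b): subject k lives T1 / k, where T1 ~ Exp(lambda)
t1 <- rexp(1, rate = lambda)
b  <- t1 / seq_len(n)   # already in decreasing order

# Ranked lifetimes on a log scale; (b) drops off faster (more concave)
plot(a, type = "l", log = "y", ylab = "lifetime (log scale)")
lines(b, col = "red")
```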
[R] help with filled.contour() -
Dear all,

I can't figure out a way to have more than one filled.contour() plot on a single page. I tried layout() and par(), but the way filled.contour() is written seems to override those commands. Any suggestions would be really appreciated.

Jonathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
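One common workaround, sketched here under the assumption that the lattice package (shipped with R) is acceptable: lattice::levelplot() produces filled-contour-style panels that can be arranged on one page with print(..., split=), sidestepping filled.contour()'s internal layout() call. The data below is made up for illustration.

```r
library(lattice)

# Made-up surfaces on a 20 x 20 grid, purely for illustration
set.seed(1)
g <- expand.grid(x = 1:20, y = 1:20)
g$z1 <- sin(g$x / 3) + cos(g$y / 3)
g$z2 <- g$x * g$y / 100

p1 <- levelplot(z1 ~ x * y, data = g, col.regions = topo.colors(100))
p2 <- levelplot(z2 ~ x * y, data = g, col.regions = topo.colors(100))

# Two filled-contour-style panels side by side on one page:
# split = c(column, row, ncolumns, nrows)
print(p1, split = c(1, 1, 2, 1), more = TRUE)
print(p2, split = c(2, 1, 2, 1))
```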