Re: [R] Nonnormal Residuals and GAMs

2013-11-06 Thread COLLINL
>>> The default functional link for mgcv::gam is "log", so I doubt that
>> your theoretical understanding applies to GAM's in general. When Simon
>> Wood wrote his book on GAMs his first chapter was on linear models, his
>> second chapter was on generalized linear models at which point he had
>> written over 100 pages, and only then did he "introduce" GAMs. I think
>> you need to follow the same progression, and this forum is not the
>> correct one for statistics education. Perhaps pose your follow-up
>> questions to CrossValidated.com
>>
>> David, thank you for your advice, has the default changed for mgcv::gam?
>> Based upon the help pages for the version I have (1.7-27) I had thought
>> that the default family was gaussian() with link "identity".
>>
>> In any event I will look again at Simon Wood's book and consider
>> CrossValidated in the future.
>
> I may have gotten this wrong by only referring to my memory. I'm not able
> to tell by looking at either ?mgcv::gam or ?gam::gam pages where I
> picked up this notion.

Ok, thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonnormal Residuals and GAMs

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 5:44 PM, Collin Lynch wrote:

>> The default functional link for mgcv::gam is "log", so I doubt that
> your theoretical understanding applies to GAM's in general. When Simon
> Wood wrote his book on GAMs his first chapter was on linear models, his
> second chapter was on generalized linear models at which point he had
> written over 100 pages, and only then did he "introduce" GAMs. I think
> you need to follow the same progression, and this forum is not the
> correct one for statistics education. Perhaps pose your follow-up
> questions to CrossValidated.com
> 
> David, thank you for your advice, has the default changed for mgcv::gam?
> Based upon the help pages for the version I have (1.7-27) I had thought
> that the default family was gaussian() with link "identity".
> 
> In any event I will look again at Simon Wood's book and consider
> CrossValidated in the future.

I may have gotten this wrong by only referring to my memory. I'm not able to 
tell by looking at either ?mgcv::gam or ?gam::gam pages where I picked 
up this notion.
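
The default can be checked directly rather than from memory. A minimal sketch, assuming the mgcv package is installed (it ships with standard R distributions); it reads the default `family` argument from the function signature and evaluates it:

```r
library(mgcv)

# The default value of the 'family' argument, as an unevaluated call
formals(gam)$family

# Evaluate the default to inspect the family object it produces
fam <- eval(formals(gam)$family)
fam$family   # name of the error distribution
fam$link     # name of the link function
```

This only inspects the installed version, so it says nothing about defaults in other releases.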

-- 
David Winsemius
Alameda, CA, USA



Re: [R] plot a single frequency of a ts object

2013-11-06 Thread William Dunlap
> y2 <- ts(x2, frequency=4, start=c(1952,1))
> y2w<-ts(y2[seq(1,61,by=4)],frequency=1,start=1952)

I think it is simpler to compute y2w using the window() function, which
figures out the right call to seq() for you:
window(y2, frequency=1, start=c(1952,1)) # use c(1952,2) for springs, etc.
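
A small self-contained check of the suggestion above. The runif() values are simulated stand-ins for the poster's x2 (an assumption, as in Jim Lemon's reply); the point is only that window() with frequency=1 picks out the same first-quarter values as the explicit seq() call:

```r
set.seed(42)                      # simulated stand-in for the real x2
x2 <- runif(64, 2, 5)
y2 <- ts(x2, frequency = 4, start = c(1952, 1))

# Explicit indexing: every 4th observation, starting with 1952 Q1
y2w_seq <- ts(y2[seq(1, 61, by = 4)], frequency = 1, start = 1952)

# window() thins the series to one value per year, starting at Q1
y2w_win <- window(y2, frequency = 1, start = c(1952, 1))

all.equal(as.numeric(y2w_seq), as.numeric(y2w_win))
tsp(y2w_win)                      # start, end and frequency of the thinned series

# Springs: start the thinning at the second quarter instead
y2s <- window(y2, frequency = 1, start = c(1952, 2))
```

After plot(y1), either y2w_seq or y2w_win can be passed to lines().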

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Jim Lemon
> Sent: Wednesday, November 06, 2013 5:54 PM
> To: Stefano Sofia
> Cc: r-help@r-project.org
> Subject: Re: [R] plot a single frequency of a ts object
> 
> On 11/07/2013 04:56 AM, Stefano Sofia wrote:
> > Dear list users,
> > I transformed two vectors of seasonal data in ts objects of frequency 4:
> >
> > y1<- ts(x1, frequency=4, start=c(1952,1))
> > y2<- ts(x2, frequency=4, start=c(1952,1))
> >
> > In this way Qtr1 corresponds to Winters, Qtr2 corresponds to Springs and so 
> > on.
> > I would like to plot on the same graph both y1 and all the Winters of y2.
> > I am not able to find an easy and straightforward way to do that.
> > Could somebody please help me in this?
> 
> Hi Stefano,
> This may do what you want:
> 
> x1<-runif(64,1,4)
> x2<-runif(64,2,5)
> y1 <- ts(x1, frequency=4, start=c(1952,1))
> y2 <- ts(x2, frequency=4, start=c(1952,1))
> plot(y1,ylim=c(1,5))
> y2w<-ts(y2[seq(1,61,by=4)],frequency=1,start=1952)
> lines(y2w)
> 
> Jim
> 



Re: [R] plot a single frequency of a ts object

2013-11-06 Thread Jim Lemon

On 11/07/2013 04:56 AM, Stefano Sofia wrote:

Dear list users,
I transformed two vectors of seasonal data in ts objects of frequency 4:

y1<- ts(x1, frequency=4, start=c(1952,1))
y2<- ts(x2, frequency=4, start=c(1952,1))

In this way Qtr1 corresponds to Winters, Qtr2 corresponds to Springs and so on.
I would like to plot on the same graph both y1 and all the Winters of y2.
I am not able to find an easy and straightforward way to do that.
Could somebody please help me in this?


Hi Stefano,
This may do what you want:

x1<-runif(64,1,4)
x2<-runif(64,2,5)
y1 <- ts(x1, frequency=4, start=c(1952,1))
y2 <- ts(x2, frequency=4, start=c(1952,1))
plot(y1,ylim=c(1,5))
y2w<-ts(y2[seq(1,61,by=4)],frequency=1,start=1952)
lines(y2w)

Jim



Re: [R] Nonnormal Residuals and GAMs

2013-11-06 Thread Collin Lynch
> The default functional link for mgcv::gam is "log", so I doubt that
> your theoretical understanding applies to GAM's in general. When Simon
> Wood wrote his book on GAMs his first chapter was on linear models, his
> second chapter was on generalized linear models at which point he had
> written over 100 pages, and only then did he "introduce" GAMs. I think
> you need to follow the same progression, and this forum is not the
> correct one for statistics education. Perhaps pose your follow-up
> questions to CrossValidated.com

David, thank you for your advice, has the default changed for mgcv::gam?
Based upon the help pages for the version I have (1.7-27) I had thought
that the default family was gaussian() with link "identity".

In any event I will look again at Simon Wood's book and consider
CrossValidated in the future.

Best,
Collin.



Re: [R] Fitting multiple horizontal lines to data

2013-11-06 Thread MacQueen, Don
Possibly see the
   strucchange
package

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 11/6/13 9:19 AM, "Sashikanth Chandrasekaran"
 wrote:

>I am not trying to fit a horizontal line at every unique value of y. I am
>trying fit the y values with as few horizontal lines by trading off the
>number of horizontal lines with the error. The actual problem I am trying
>to solve is to smooth data in a time series. Here is a realistic example
>of y
>
>y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,
>148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,
>64.99,64.99,129.98,64.99,65,64.98,64.99)
>
>An example fit for y using multiple horizontal lines (may not be the best
>fit in terms of squared error or another error metric, but I have included
>the y value for concreteness)
>
>1. A horizontal line at approximately y=140 (to fit the first 13 values -
>134.45 to 148.77)
>2. A horizontal line at approximately y=110 (to fit the next 7 values -
>71.64 to 111.64)
>3. A horizontal line at approximately y=150 (to fit the next 2 values -
>157.26 to 143.8)
>4. A horizontal line at approximately y=65 (to fit the last 8 values -
>40.19 to 64.99)
>-sashi.
>



Re: [R] Function does not see variables outside the function

2013-11-06 Thread Zhong-Yuan Zhang
Dear John Fox:


I highly appreciate your help! Problem solved.

Best wishes always.


2013/11/6 John Fox 

> Dear Zhong-Yuan Zhang,
>
> R is lexically scoped. Pretending that you're using a different programming
> language is probably a bad idea.
>
> The findGlobals() function in the codetools package, which is part of the
> standard R distribution, can help you locate references to global variables
> (and functions) in a function. For example,
>
> > f <- function() g(a)
>
> > findGlobals(f)
> [1] "a" "g"
>
> > ff <- function() {a <- 10; g(a)}
>
> > findGlobals(ff)
> [1] "{"  "<-" "g"
>
> > fff <- function(a) g(a)
>
> > findGlobals(fff)
> [1] "g"
>
> I hope this helps,
>  John
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of Zhong-Yuan Zhang
> > Sent: Wednesday, November 06, 2013 7:32 AM
> > To: r-help@r-project.org
> > Subject: Re: [R] Function does not see variables outside the function
> >
> > Dear Experts:
> >
> > I very much appreciate your comments and help!
> >
> > Actually I am a newcomer from MATLAB. If a function
> > can see global variables, then it may output wrong results without
> > any error messages. For example, suppose there is a global variable named
> > v, and I write a function with one local variable x. If, in some line,
> > I mistype x as v, this would produce unexpected results without any
> > warning.
> >
> > In summary, I want to disable this ability to make debugging easier.
> >
> > Best.
> >
> >
> > 2013/11/5 Carl Witthoft 
> >
> > > Why would you want to impose this restriction?  Perhaps if you
> > explain what
> > > you are trying to do, we can suggest approaches that will satisfy
> > your
> > > specific needs.
> > > (note- one can always redefine whatever variables are to be
> > "excluded."
> > > E.g.
> > > to keep the body of a function from referring to 'foo' in the calling
> > > environment, just add the line 'foo<-NA' inside the function)
> > >
> > >
> > > Zhong-Yuan Zhang wrote
> > > >  In MATLAB, functions cannot see variables outside the
> > > >
> > > > functions.  However, in R, the functions can do that. Is there
> > > >
> > > > any settings that can disable this ability of functions?
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > > http://r.789695.n4.nabble.com/Function-does-not-see-variables-
> > outside-the-function-tp4679762p4679768.html
> > > Sent from the R help mailing list archive at Nabble.com.
> > >
> > >
> >
> >
> >
> > --
> > Zhong-Yuan Zhang (PhD.)
> > Associate Professor
> > School of Statistics
> > Central University of Finance and Economics
> > 39 South College Road, Haidian District, Beijing, P.R.China 100081
> > Email: zhyua...@gmail.com
> > Homepage: http://en.stat.cufe.edu.cn/zhongyuanzhang/
> >
>
>
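
John Fox's findGlobals() suggestion, quoted above, can be turned into a quick check for the typo scenario described in this thread. A minimal sketch, assuming the codetools package (part of the standard R distribution); the function names f_typo, f_ok and free_vars are hypothetical and used only for illustration:

```r
library(codetools)

v <- 100                        # a global that a typo might pick up by accident

f_typo <- function(x) x + v     # intended 'x + x'; 'v' is found in the global environment
f_ok   <- function(x) x + x

# merge = FALSE separates global functions from global variables
free_vars <- function(f) findGlobals(f, merge = FALSE)$variables

free_vars(f_typo)   # "v"
free_vars(f_ok)     # character(0)
```

Running such a check over one's own functions flags unintended references to globals without changing how R resolves names.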


-- 
Zhong-Yuan Zhang (PhD.)
Associate Professor
School of Statistics
Central University of Finance and Economics
39 South College Road, Haidian District, Beijing, P.R.China 100081
Email: zhyua...@gmail.com
Homepage: http://en.stat.cufe.edu.cn/zhongyuanzhang/



Re: [R] New fortune nomination ... was Re: R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread Achim Zeileis

On Thu, 7 Nov 2013, Rolf Turner wrote:


On 11/07/13 13:19, David Winsemius wrote:

Seen on StackOverflow. Any seconds?


"I would heed the warnings and diagnostics. They are there for a reason. 
The Ostrich algorithm does not help you."


   -- Dirk Eddelbuettel commenting on StackOverflow when a questioner said 
he had not run R CMD check because he suspected other problems would be 
found (Nov 2013)


Right on!!!


Yes! :-)

Now also in the devel-version on R-Forge.

Thanks,
Z


   cheers,

   Rolf





Re: [R] New fortune nomination ... was Re: R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread Rolf Turner

On 11/07/13 13:19, David Winsemius wrote:

Seen on StackOverflow. Any seconds?


"I would heed the warnings and diagnostics. They are there for a reason. The Ostrich 
algorithm does not help you."

   -- Dirk Eddelbuettel commenting on StackOverflow when a questioner said he 
had not run R CMD check because he suspected other problems would be found (Nov 
2013)


Right on!!!

cheers,

Rolf



Re: [R] Treatment effects on measurements through time: how to tell when (in time) treatment has a significant effect?

2013-11-06 Thread Jim Lemon

On 11/07/2013 07:46 AM, c_e_cressler wrote:

Hi,

The data (attached) I am looking at consists of measurements of growth rate
at different ages, for individuals in two treatments (control and infected).
What I want to know is whether and when (what age) the growth rate of
infected individuals is higher than the growth rate for control individuals.

The simplest way to approach this question is to just do a t-test at each
age, but because the growth rates at a given age depend on the growth rates
at previous ages before, that seems statistically invalid. I have looked at
some of the time series literature, but most of that seems more complicated
than what I am trying to do. What I would like to be able to say is
something like, "The growth rate of infected individuals is higher than
control individuals for ages 18-30."


Hi Clay,
If you calculate the mean growth rates:

inf_mean<-apply(as.matrix(inf.grates),1,
 mean,na.rm=TRUE)
cntl_mean<-apply(as.matrix(cntl.grates),1,
 mean,na.rm=TRUE)

and plot them:

plot(cntl_mean,col=3)
points(inf_mean,col=2)

It looks like the growth rate in the infected group is consistently 
greater. Testing the linear models:


summary(lm(cntl_mean~I(1:length(cntl_mean))))
summary(lm(inf_mean~I(1:length(inf_mean))))

looks like there is a significant effect. The proper comparison would be 
a mixed model with the individual scores, I think.
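
A rough sketch of what such a mixed model could look like, assuming the nlme package (a recommended package shipped with R). The poster's attachment is not available here, so the data below are simulated and every column name (id, age, treatment, growth) is hypothetical:

```r
library(nlme)

set.seed(1)
n_id <- 20                                   # 10 control + 10 infected individuals
ages <- 1:10

# Long format: one row per individual per age
dat <- expand.grid(age = ages, id = factor(seq_len(n_id)))
dat$treatment <- factor(ifelse(as.integer(dat$id) <= n_id / 2,
                               "control", "infected"))

# Simulated growth rates: individual-level offsets plus a treatment-by-age effect
u <- rnorm(n_id, sd = 0.5)
dat$growth <- 2 + 0.1 * dat$age +
  0.05 * dat$age * (dat$treatment == "infected") +
  u[as.integer(dat$id)] + rnorm(nrow(dat), sd = 0.3)

# Random intercept per individual accounts for repeated measures on the same animal
fit <- lme(growth ~ age * treatment, random = ~ 1 | id, data = dat)
summary(fit)$tTable
```

The age:treatment row of the table addresses whether the two groups diverge with age; it does not by itself identify the ages at which they differ.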


Jim



[R] New fortune nomination ... was Re: R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread David Winsemius
Seen on StackOverflow. Any seconds?


"I would heed the warnings and diagnostics. They are there for a reason. The 
Ostrich algorithm does not help you."

  -- Dirk Eddelbuettel commenting on StackOverflow when a questioner said he 
had not run R CMD check because he suspected other problems would be found (Nov 
2013)



Achim: I also have a fortunes spelling correction (at least for v 1.5-0):
Looking at:  fortune("calibration")  == fortune(277),  The author's last name 
needs to be Beleites.

-- 

David Winsemius
Alameda, CA, USA



Re: [R] cannot load MagAct96-98 - Extracurricular affiliation data

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 1:24 PM, email wrote:

> Hi:
> 
> I have installed the NetData package, and want to use the  MagAct96-98
> - Extracurricular affiliation data. But while loading the data, it's
> giving an error. Any help?
> 
> 
> install.packages("NetData")
> library(NetData)
> data('studentnets.magact96.97.98', package = "NetData")
> 
> Warning message:
> In data("studentnets.magact96.97.98", package = "NetData") :
>  data set ‘studentnets.magact96.97.98’ not found

I just loaded that package and looked at its Index. I don't see any item with 
that name. I see three items that appear as though they could be descendants or 
predecessors of such a file:

magact96    MagAct96-98 - Extracurricular affiliation data by year (1996-1998)
magact97    MagAct96-98 - Extracurricular affiliation data by year (1996-1998)
magact98    MagAct96-98 - Extracurricular affiliation data by year (1996-1998)
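
The same index lookup can be done from the console before calling data(). A minimal sketch using the built-in datasets package so that it runs anywhere; the helper names dataset_names and has_dataset are hypothetical:

```r
# Names of all data sets a package provides (the "Item" column of data(package = ))
dataset_names <- function(pkg) {
  items <- data(package = pkg)$results[, "Item"]
  sub(" .*$", "", items)          # e.g. "beaver1 (beavers)" -> "beaver1"
}

has_dataset <- function(name, pkg) name %in% dataset_names(pkg)

has_dataset("mtcars", "datasets")
has_dataset("studentnets.magact96.97.98", "datasets")

# With NetData installed, the analogous calls would be (not run here):
# dataset_names("NetData")
# data(magact96, package = "NetData")
```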


-- 

David Winsemius
Alameda, CA, USA



[R] cannot load MagAct96-98 - Extracurricular affiliation data

2013-11-06 Thread email
Hi:

I have installed the NetData package, and want to use the  MagAct96-98
- Extracurricular affiliation data. But while loading the data, it's
giving an error. Any help?


install.packages("NetData")
library(NetData)
data('studentnets.magact96.97.98', package = "NetData")

Warning message:
In data("studentnets.magact96.97.98", package = "NetData") :
  data set ‘studentnets.magact96.97.98’ not found

Thanks:
John



Re: [R] R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread Bert Gunter
Second (perhaps with the slight addition indicated)

-- Bert

... And **amen!** to the sentiment expressed.


On Wed, Nov 6, 2013 at 2:08 PM, Rolf Turner  wrote:
> On 11/07/13 10:57, David Winsemius wrote:
>
> 
>>
>> I think you need to add a statistician to your [PhD] committee. The 
>> difficulties
>> you are facing (of which you appear to be unaware) are not just related to
>> being new to R.
>
> 
>
> Fortune?
>
> cheers,
>
> Rolf
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374



Re: [R] hidden functions

2013-11-06 Thread Carl Witthoft
Why would you need to?  The whole point of "::" and ":::" is to specify the
origin of a function.
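
For readers unsure what counts as "hidden": a short sketch, using the stats package as an example, of how exported objects (reachable with "::") differ from unexported ones (reachable only with ":::"). The variable names are arbitrary:

```r
ns       <- asNamespace("stats")
exported <- getNamespaceExports("stats")

# Objects that live in the namespace but are not exported ("hidden")
hidden <- setdiff(ls(ns), exported)

"sd" %in% exported                             # sd() is exported
identical(stats::sd, get("sd", envir = ns))    # '::' fetches the namespace object
length(hidden) > 0                             # stats has unexported helpers
head(sort(hidden))                             # each is reachable as stats:::<name>
```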




--
View this message in context: 
http://r.789695.n4.nabble.com/hidden-functions-tp4679849p4679856.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Nonnormal Residuals and GAMs

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 12:46 PM, Collin Lynch wrote:

> Greetings, My question is more algorithmic than practical.  What I am
> trying to determine is, are the GAM algorithms used in the mgcv package
> affected by nonnormally-distributed residuals?
> 
> As I understand the theory of linear models the Gauss-Markov theorem
> guarantees that least-squares regression is optimal over all unbiased
> estimators iff the data meet the conditions linearity, homoscedasticity,
> independence, and normally-distributed residuals.  Absent the last
> requirement it is optimal but only over unbiased linear estimators.
> 
> What I am trying to determine is whether or not it is necessary to check
> for normally-distributed errors in a GAM from mgcv.  I know that the
> unsmoothed terms, if any, will be fitted by ordinary least-squares but I
> am unsure whether the default Penalized Iteratively Reweighted Least
> Squares method used in the package is also based upon this assumption or
> falls under any analogue to the Gauss-Markov Theorem.

The default functional link for mgcv::gam is "log", so I doubt that your 
theoretical understanding applies to GAM's in general. When Simon Wood wrote 
his book on GAMs his first chapter was on linear models, his second chapter was 
on generalized linear models at which point he had written over 100 pages, and 
only then did he "introduce" GAMs. I think you need to follow the same 
progression, and this forum is not the correct one for statistics education. 
Perhaps pose your follow-up questions to CrossValidated.com

-- 
David Winsemius
Alameda, CA, USA



Re: [R] R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread Rolf Turner

On 11/07/13 10:57, David Winsemius wrote:


I think you need to add a statistician to your committee. The 
difficulties you are facing (of which you appear to be unaware) are 
not just related to being new to R.



Fortune?

cheers,

Rolf



Re: [R] Fitting multiple horizontal lines to data

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 9:19 AM, Sashikanth Chandrasekaran wrote:

> I am not trying to fit a horizontal line at every unique value of y. I am
> trying fit the y values with as few horizontal lines by trading off the
> number of horizontal lines with the error. The actual problem I am trying
> to solve is to smooth data in a time series. Here is a realistic example of
> y
> 
> y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99)
> 
> An example fit for y using multiple horizontal lines (may not be the best
> fit in terms of squared error or another error metric, but I have included
> the y value for concreteness)
> 

The human brain searches for patterns and often finds them where there is no 
underlying mechanism. If you are asking for a regime-change method that is 
statistically based and will replicate your brain-driven pencil-and-paper 
methods you will probably be disappointed.

> plot(y)
> abline(h=c(140,110,150,65) )
> abline(v=c(13,20,22,30) ,col="red")


> 1. A horizontal line at approximately y=140 (to fit the first 13 values -
> 134.45 to 148.77)
> 2. A horizontal line at approximately y=110 (to fit the next 7 values -
> 71.64 to 111.64)
> 3. A horizontal line at approximately y=150 (to fit the next 2 values -
> 157.26 to 143.8)
> 4. A horizontal line at approximately y=65 (to fit the last 8 values -
> 40.19 to 64.99)
> -sashi.

If you want a method that is driven by the magnitude of the shift in adjacent 
values, then this would find some but not all of your proposed breakpoints:

> which( abs(diff(y)) >55)
[1] 11 13 22 25 26

You could perhaps refine that set of candidates by requiring that the next 
value have some other defining feature but I was unable to come up with a 
simple rule-set that agreed with your candidates. There are packages that do 
segmented regression but they are not generally set up to assume all regression 
coefficients are 0 and that you are only interested in
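
The trade-off described in the question (few horizontal lines versus small error) can be written as penalised least squares and solved exactly by dynamic programming in base R. This is only a sketch under stated assumptions, not a method proposed in the thread: the function name fit_steps and the penalty values are hypothetical, and the result depends heavily on the penalty chosen.

```r
y <- c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,
       148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,
       40.19,64.99,64.99,129.98,64.99,65,64.98,64.99)
n <- length(y)

# Prefix sums give the residual sum of squares of any segment in O(1)
cs  <- c(0, cumsum(y))
cs2 <- c(0, cumsum(y^2))
seg_cost <- function(i, j) {           # RSS of fitting mean(y[i:j]) to y[i:j]
  s <- cs[j + 1] - cs[i]
  (cs2[j + 1] - cs2[i]) - s^2 / (j - i + 1)
}

# Minimise: sum of segment RSS + penalty * (number of segments)
fit_steps <- function(penalty) {
  best <- c(0, rep(Inf, n))            # best[k + 1]: minimal cost for y[1..k]
  prev <- integer(n)                   # prev[k]: start of last segment for y[1..k]
  for (k in seq_len(n)) {
    for (i in seq_len(k)) {
      cost <- best[i] + seg_cost(i, k) + penalty
      if (cost < best[k + 1]) {
        best[k + 1] <- cost
        prev[k] <- i
      }
    }
  }
  starts <- integer(0)                 # backtrack from the last observation
  k <- n
  while (k > 0) {
    starts <- c(prev[k], starts)
    k <- prev[k] - 1
  }
  ends <- c(starts[-1] - 1, n)
  data.frame(start = starts, end = ends,
             level = mapply(function(i, j) mean(y[i:j]), starts, ends))
}

fit_steps(penalty = 2000)              # larger penalty -> fewer horizontal lines
```

The segments can be drawn over plot(y) with segments(fit$start, fit$level, fit$end, fit$level), where fit is the returned data frame.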



David Winsemius
Alameda, CA, USA



Re: [R] deSolve, unresolved namespace error

2013-11-06 Thread Thomas Petzoldt

On 11/6/2013 6:50 PM, Adam Clark wrote:

Addendum:

unloading and reloading deSolve.so does indeed fix the problem:

library.dynam.unload("deSolve", libpath=paste(.libPaths()[1],
"//deSolve", sep=""))
library.dynam("deSolve", package="deSolve", lib.loc=.libPaths()[1])

However, this is a little clunky, and seems like overkill. Does
anybody have an idea for a more elegant workaround?



Adam,

the solver lock is used for the ODEPACK solvers to prevent simultaneous
(i.e. nested) calls of the solver within R session, because the ODEPACK
algorithms use some global variables. The RK solvers do not use
global variables, so they have no lock.

If you use the ode solvers in the intended way, an internal call of:

 on.exit(.C("unlock_solver"))

should always unlock the solver, even if the function is exited due to
an error. However, your example may point to another problem in the
code, for which we need a minimal reproducible example.
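
The on.exit() pattern mentioned above can be illustrated in plain R. This is not deSolve's actual code, only a sketch of the idea with a hypothetical lock object and a hypothetical with_lock() helper: the lock is released whether the wrapped call returns normally or fails.

```r
lock <- new.env()
lock$held <- FALSE

with_lock <- function(fun) {
  if (lock$held) stop("solver is locked")   # refuse nested calls
  lock$held <- TRUE
  on.exit(lock$held <- FALSE)               # runs on normal exit and on error
  fun()
}

ok  <- with_lock(function() 42)
bad <- try(with_lock(function() stop("integration failed")), silent = TRUE)

lock$held   # FALSE: the lock was released despite the error
```

If a lock nevertheless stays set, as in the report, something bypassed this exit path, which is why a reproducible example is needed.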


As a first workaround, you may try to call:

.C("unlock_solver")

... but calling an internal .C package function outside a package is
generally not recommended, so you should definitely try to find the real
cause of your problem.

Thomas



PS: please move this thread to either R-Devel or (preferred) to the
specialised mailing list:

mailto:r-sig-dynamic-mod...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-dynamic-models



Re: [R] R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 10:07 AM, Henderson, Robin Michelle wrote:

> Hi,
> 
> I am a graduate student applying published R scripts to compare the 
> classification accuracy of 2 predictive models, one built using discriminant 
> function analysis and one using random forests (webpage link for these 
> scripts is provided below).  The purpose of these models is to predict the 
> biotic integrity of streams.  Specifically, I am trying to compare the 
> classification accuracy (i.e., prediction of group membership)of both the DFA 
> and RF models using k-fold crossvalidation for the following metrics: AUC 
> ROC, percent correctly classified, specificity, sensitivity, and Kappa.

Sensitivity, "accuracy" (= percent correct), and specificity are only defined 
when you establish a particular threshold for decision. There is no "sensitivity" 
or "specificity" that will accrue to a classification model. AUC is an effort 
at presenting such an overall value, but it has deficiencies and is insensitive 
to statistically significant differences in models.
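
A toy illustration of the threshold dependence, in base R. The truth and score vectors are made up for the example and the function name sens_spec is hypothetical; the same classifier scores give different sensitivity and specificity at each cut-off:

```r
truth <- c(1, 1, 1, 1, 0, 0, 0, 0)                    # 1 = positive class
score <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1)   # made-up predicted probabilities

sens_spec <- function(threshold) {
  pred <- score >= threshold                          # classify as positive at this cut-off
  c(threshold   = threshold,
    sensitivity = mean(pred[truth == 1]),             # true positives / all positives
    specificity = mean(!pred[truth == 0]),            # true negatives / all negatives
    accuracy    = mean(pred == (truth == 1)))
}

res <- t(sapply(c(0.25, 0.5, 0.75), sens_spec))
res
```

Here accuracy happens to be 0.75 at all three thresholds while sensitivity falls from 1 to 0.5 and specificity rises from 0.5 to 1, so a single reported value hides the trade-off.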

> I would also like to obtain the F statistic, Wilks lambda, MSE or RMSE for 
> the random forest models as the script does not contain code to get this data.

I doubt very much that is by accident or oversight on the part of the 
randomForest developers.

>  I think I need to use the caret package to obtain the classification 
> accuracy, but I keep getting error messages when I apply the train function 
> to my data.  As I am relatively new to R and my thesis committee is unable to 
> help as they are also unfamiliar with R, I thought it best to ask for help.

I think you need to add a statistician to your committee. The difficulties you 
are facing (of which you appear to be unaware) are not just related to being 
new to R.


>  Would someone be willing to help me?
> 
> 
> Thanks,
> Robin
> 
> http://www.epa.gov/wed/pages/models/rivpacs/rivpacs.htm
> 
> 
>> TrainDataDFAgrps2 <-predcal
>> TrainClassesDFAgrps2 <-grp.2;
>> DFAgrps2Fit1 <- train(TrainDataDFAgrps2, TrainClassesDFAgrps2,
> +  method = "lda",
> + tuneLength = 10,
> + trControl = trainControl(method = "cv"));
> Error in train.default(TrainDataDFAgrps2, TrainClassesDFAgrps2, method = 
> "lda",  :
>  wrong model type for regression

That error is pointing out that you are choosing a method that expects a 
particular form of outcome (continuous) and does not accept a categorical 
(possibly an R factor?) outcome. I suspect you may be using the `caret` 
package, but it's unclear. I think this is further evidence of the need for 
competent statistical consultation. You would be advised to study further in 
Venables and Ripley's MASS(v4) or in Hastie, Tibshirani, and Friedman's ESL(v2).
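
On the mechanics only: caret's train() decides between regression and classification from the class of the outcome, so a numeric vector of group codes is treated as a regression target. A minimal sketch of the conversion, using a short hypothetical stand-in for grp.2 and hypothetical level labels; the train() call is shown commented out because it needs caret and the poster's data:

```r
grp <- c(1, 2, 2, 2, 2, 1)              # hypothetical stand-in for grp.2 (numeric codes)
is.factor(grp)                          # FALSE: would be treated as a regression outcome

# Convert the codes to a factor so the outcome is categorical
grp_f <- factor(grp, levels = c(1, 2), labels = c("grp1", "grp2"))
is.factor(grp_f)                        # TRUE
table(grp_f)

# Not run (requires caret and the original predictors):
# DFAgrps2Fit1 <- train(TrainDataDFAgrps2, factor(TrainClassesDFAgrps2),
#                       method = "lda",
#                       trControl = trainControl(method = "cv"))
```

This addresses the error message only; it does not resolve the statistical concerns raised above.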

This link, found with a simple google search, suggests that the author of the 
cited code is at an academic institution only one state away from you: 
fw.oregonstate.edu/system/files/Van%20Sickle%20CV%20consult.pdf. He may be 
willing to offer assistance.

-- 
David.

> 
>> RFgrps2Fit1 <- train(TrainDataRFgrps2, TrainClassesRFgrps2,
> +  method = "rf",
> + tuneLength = 10,
> + trControl = trainControl(method = "cv"));
> There were 50 or more warnings (use warnings() to see the first 50)
> 
> Clip of predcal (same length as grp.2, but too much data to display all):
>> predcal
>      Reference_Test HUC12_AREA_HA_log10 ELEV_m M_Slp_sqt Precip_mm Temp_CX10
> 2370              R                 3.7  588.0       2.2      1751       148
>  559              R                 4.0  643.1       1.8      1674       141
> 2062              R                 4.0  643.1       1.8      1674       141
> 2467              R                 4.0  643.1       1.8      1674       141
> 1176              R                 3.9  694.3       2.4      1534       131
> 1840              R                 3.9  694.3       2.4      1534       131
> 2052              R                 3.9  694.3       2.4      1534       131
> 1174              R                 4.1  605.0       2.1      1382       138
> 1841              R                 4.1  605.0       2.1      1382       138
> 2051              R                 4.1  605.0       2.1      1382       138
> 1831              R                 4.1  363.9       1.7       937       156
> 
> 
> Grps.2:
> grp.2
>  [1] 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 1 1
> [45] 2 2 1 1 1 1 1 1 1 2 2 1 1 1 2 2 1 2 2 1 1 1 2 2 2 2 2 2 1 1 1 2 2 2 1 2 2 2 2 2 2 2 2 1
> [89] 1 2 2 2 2 2 1 1 2 2 2 1 2 1 2 2 1 2 1 1 2
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA


[R] MPICH2 Rmpi and doSNOW

2013-11-06 Thread Matthias Salvisberg
Hi

I have managed to install MPICH2 and Rmpi on my Windows 7 machine. I can
also run the following code

> library(Rmpi)
> mpi.spawn.Rslaves()
4 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 5 is running on: MyMaster 
slave1 (rank 1, comm 1) of size 5 is running on: MyMaster 
slave2 (rank 2, comm 1) of size 5 is running on: MyMaster 
slave3 (rank 3, comm 1) of size 5 is running on: MyMaster 
slave4 (rank 4, comm 1) of size 5 is running on: MyMaster 
> mpichhosts()
 master  slave1  slave2  slave3  slave4 
"localhost" "localhost" "localhost" "localhost" "localhost" 
> mpi.universe.size()
[1] 4
> mpi.close.Rslaves()
[1] 1

library(doSNOW)

But every time I try to set up a cluster via

cluster <- makeCluster(4, type = "MPI")

My computer hangs up and I have to close the R session.

Any advice how I get this running?
Thanks in advance

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multistate data w/out individual ID

2013-11-06 Thread Emilio A. Laca
Dear R help community,

I have a version of the data below generated by counting the number of 
individuals in each state every 2-3 time units for a series of cohorts (sets).
States are A, B, C, E and T.
It is known that the following transitions are possible A->B, B->C, C->E, B->T, 
C->T and E->T. 
The identity of the individuals in each set was not recorded.
I am interested in estimating the transition rates or probabilities and sojourn 
times as a function of covariates.
We assume that rates (probabilities) are independent of time.

Is there a package to estimate those quantities?
Is there a method to estimate those quantities?
Do you think a Bayesian approach integrating over all possible paths might work?
Everything I read about multi-state modeling packages seems to indicate that it 
is necessary to know the identity of individuals.

time,N,A,B,C,E,T,set,cov1,cov2
199,24,22,2,0,0,0,s1,1,4.65816667
201,24,22,2,0,0,0,s1,3,0.2992
203,24,22,1,1,0,0,s1,10,9.570125
205,24,21,2,1,0,0,s1,6,9.94370139
207,24,20,3,1,0,0,s1,4,1.34693056
209,24,15,8,1,0,0,s1,2,1.20429167
212,24,14,9,1,0,0,s1,9,8.3008125
214,24,12,10,2,0,0,s1,15,9.14613194
216,24,10,11,3,0,0,s1,8,8.94250694
219,24,4,12,8,0,0,s1,5,0.97334722
221,24,3,12,9,0,0,s1,7,7.29372917
223,24,3,6,12,2,1,s1,17,1.28570139
225,24,1,7,8,5,3,s1,16,1.41599306
227,24,1,4,5,11,3,s1,18,7.07947222
229,24,1,3,2,15,3,s1,14,6.09359028
231,24,0,4,0,17,3,s1,12,4.7067
233,24,0,4,0,17,3,s1,13,5.31577083
235,24,0,2,0,19,3,s1,11,4.62228472
199,24,19,5,0,0,0,s2,1,4.65816667
201,24,23,1,0,0,0,s2,3,0.2992
203,24,22,2,0,0,0,s2,10,9.570125
205,24,22,2,0,0,0,s2,6,9.94370139
207,23,21,2,0,0,0,s2,4,1.34693056
209,23,19,3,1,0,0,s2,2,1.20429167
212,23,12,10,1,0,0,s2,9,8.3008125
214,23,10,12,1,0,0,s2,15,9.14613194
216,23,3,18,2,0,0,s2,8,8.94250694
219,23,0,16,7,0,0,s2,5,0.97334722
221,23,0,14,8,0,1,s2,7,7.29372917
223,23,0,5,14,2,2,s2,17,1.28570139
225,23,0,0,18,2,3,s2,16,1.41599306
227,23,0,2,13,2,6,s2,18,7.07947222
229,23,0,0,6,7,10,s2,14,6.09359028
231,23,0,0,3,8,12,s2,12,4.7067
233,26,0,0,1,13,12,s2,13,5.31577083
235,28,0,0,0,16,12,s2,11,4.62228472
199,24,20,4,0,0,0,s3,1,4.65816667
201,24,22,2,0,0,0,s3,3,0.2992
203,24,18,5,1,0,0,s3,10,9.570125
205,24,16,7,1,0,0,s3,6,9.94370139
207,24,13,10,1,0,0,s3,4,1.34693056
209,24,8,12,4,0,0,s3,2,1.20429167
212,24,5,14,5,0,0,s3,9,8.3008125
214,24,4,10,10,0,0,s3,15,9.14613194
216,24,2,9,12,1,0,s3,8,8.94250694
219,24,0,8,13,3,0,s3,5,0.97334722
221,24,2,6,5,11,0,s3,7,7.29372917
223,22,0,5,4,12,1,s3,17,1.28570139
225,24,1,3,1,17,2,s3,16,1.41599306
227,24,0,2,1,18,3,s3,18,7.07947222
229,25,0,2,0,20,3,s3,14,6.09359028
231,24,0,2,0,19,3,s3,12,4.7067
233,24,0,1,1,19,3,s3,13,5.31577083
235,24,0,1,0,20,3,s3,11,4.62228472

Any comment will be appreciated.
Regards,

Emilio A. Laca, Professor
One Shields Avenue, 2306 PES Bldg.
Plant Sciences, Mail Stop 1
University of California
Davis, California  95616
eal...@ucdavis.edu
fax: (530) 752-4361
voice: (530) 754-4083
mobile: (530) 220-5315
Include the Mail stop in the address or mail will be delayed.



[R] Treatment effects on measurements through time: how to tell when (in time) treatment has a significant effect?

2013-11-06 Thread c_e_cressler
Hi, 

The data (attached) I am looking at consists of measurements of growth rate
at different ages, for individuals in two treatments (control and infected).
What I want to know is whether and when (what age) the growth rate of
infected individuals is higher than the growth rate for control individuals.

The simplest way to approach this question is to just do a t-test at each
age, but because the growth rates at a given age depend on the growth rates
at previous ages, that seems statistically invalid. I have looked at
some of the time series literature, but most of that seems more complicated
than what I am trying to do. What I would like to be able to say is
something like, "The growth rate of infected individuals is higher than
control individuals for ages 18-30."

Any help or insight would be greatly appreciated.

Thanks!
Clay

P.S. The data is provided as two matrices: the first column contains the ages
at which data was collected, and subsequent columns are the growth rate
trajectories for different individuals.
Attachments: cntl.grates.rda, inf.grates.rda



--
View this message in context: 
http://r.789695.n4.nabble.com/Treatment-effects-on-measurements-through-time-how-to-tell-when-in-time-treatment-has-a-significant--tp4679911.html
Sent from the R help mailing list archive at Nabble.com.



[R] Hanning window in r

2013-11-06 Thread Baro
Hi experts

I would like to apply a Hanning window to a time series in R. I am using the
e1071 package and its stft function like this:

s<-stft(datalist, win=min(80,floor(length(datalist)/10)),
inc=min(24,floor(length(datalist)/30)), coef=10, wtype="hanning.window")

I expected to get a vector of values with the same length as the original
time series, but with this command I get a 30x10 matrix.

Could someone explain what this matrix means, and what I should do to apply
a window function to a time series?



[R] Nonnormal Residuals and GAMs

2013-11-06 Thread Collin Lynch
Greetings, My question is more algorithmic than practical.  What I am
trying to determine is, are the GAM algorithms used in the mgcv package
affected by nonnormally-distributed residuals?

As I understand the theory of linear models the Gauss-Markov theorem
guarantees that least-squares regression is optimal over all unbiased
estimators iff the data meet the conditions linearity, homoscedasticity,
independence, and normally-distributed residuals.  Absent the last
requirement it is optimal but only over unbiased linear estimators.

What I am trying to determine is whether or not it is necessary to check
for normally-distributed errors in a GAM from mgcv.  I know that the
unsmoothed terms, if any, will be fitted by ordinary least-squares but I
am unsure whether the default Penalized Iteratively Reweighted Least
Squares method used in the package is also based upon this assumption or
falls under any analogue to the Gauss-Markov Theorem.

Thank you in advance for any help.

Sincerely,
Collin Lynch.



Re: [R] Fitting multiple horizontal lines to data

2013-11-06 Thread David Carlson
The changepoint package might give you a way to do this.
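
A minimal sketch of what that might look like, assuming the package's `cpt.mean()` with the PELT method and its default penalty (the penalty is where the number of lines would be traded off against the error, and I have not tuned it for this series):

```r
# Series from the original question.
y <- c(134.45, 141.82, 143.81, 141.81, 145, 141.61, 143.72, 145.71, 200, 175,
       140, 200, 148.77, 71.64, 111.57, 118.15, 119.15, 112.8, 111.64, 111.64,
       157.26, 143.8, 40.19, 64.99, 64.99, 129.98, 64.99, 65, 64.98, 64.99)

# cpt.mean() looks for shifts in the mean; PELT allows several changepoints.
# Guarded so the sketch does nothing if the package is not installed.
if (requireNamespace("changepoint", quietly = TRUE)) {
  fit <- changepoint::cpt.mean(y, method = "PELT")
  print(changepoint::cpts(fit))            # positions where a segment ends
  print(changepoint::param.est(fit)$mean)  # level of each horizontal line
}
```

A larger penalty should give fewer, longer segments; see ?cpt.mean for the `penalty` and `pen.value` arguments.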

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Sashikanth
Chandrasekaran
Sent: Wednesday, November 6, 2013 11:20 AM
To: r-help@r-project.org
Subject: [R] Fitting multiple horizontal lines to data

I am not trying to fit a horizontal line at every unique value of y. I am
trying to fit the y values with as few horizontal lines as possible, trading
off the number of horizontal lines against the error. The actual problem I am
trying to solve is to smooth data in a time series. Here is a realistic
example of y:

y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,
140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,
157.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99)

An example fit for y using multiple horizontal lines (may not be the best
fit in terms of squared error or another error metric, but I have included
the y value for concreteness)

1. A horizontal line at approximately y=140 (to fit the first 13 values -
134.45 to 148.77)
2. A horizontal line at approximately y=110 (to fit the next 7 values -
71.64 to 111.64)
3. A horizontal line at approximately y=150 (to fit the next 2 values -
157.26 to 143.8)
4. A horizontal line at approximately y=65 (to fit the last 8 values -
40.19 to 64.99)
-sashi.



[R] ggplot2 beginner question

2013-11-06 Thread Jon BR
Hello,
   I'm having fun exploring the pretty graphing options in R, although
I'm struggling to figure out how to do some simple things; would be
thankful if someone could point me toward relevant sections of the manual
or provide some starter code to get me going.

I'd like to extend what is offered in the manual here for stacked bar plots:

http://docs.ggplot2.org/current/geom_bar.html

For starters

library(ggplot2)
ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar()

Which makes a nice stacked barplot featuring counts on the y-axis. I'd like
to transform this to fraction or percentage, and (with some googling) came
up with this:

ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar(position = 'fill')

However, I prefer using  a line via frequency polygons.  Using counts, this
is:

ggplot(diamonds, aes(clarity, colour=cut)) + geom_freqpoly(aes(group = cut))

I'd like to adjust this to show fraction instead of counts on the y-axis
(as in the previous example), but this command is obviously incorrectly
constructed:

ggplot(diamonds, aes(clarity, colour=cut)) + geom_freqpoly(aes(group =
cut), position = 'fill')
Error: position_fill requires the following missing aesthetics: ymax

Any pointers would be appreciated.

Best,
Jonathan



[R] Matrix Question, calculate with 3 different matrices

2013-11-06 Thread rissa
Hi there

I've got a (I think) simple problem, but I just can't figure it out.

I've got 3 dataset (datasets attached below).
In 'Value' there is the measured mean value for every unit (SFZN, SGKN,
SGSN, SHLTN, SIK) and for every treatment (1 to 10).
In 'SE' is the Standard Error for every unit and every treatment.
In 'DF' is the degrees of freedom for every unit and every treatment.

First, I want to calculate the t-value for every unit and treatment based on
the null hypothesis Value = 0 and save it in a new matrix that has the same
form as SE etc. (value names in first column and Treatments in the header).
That means something like...

tvalue<-as.matrix(Value)/as.matrix(SE)

Next I want to calculate the critical t value (p=0.05, two-sided) in a
separate matrix (same form as SE etc). That means something like...

critical<-qt(0.05/2,as.matrix(DF))

And last, I want to compare 'tvalue' with 'critical' and give out
'significant' if abs(tvalue) > abs(critical) or 'not' if it is not true,
and save the results in a matrix (same form as SE).

I know it's just a matrix question, but I'm a bit stuck.

P.S. Another question (not related to the above)

As far as I know in R the command var() calculates the sample variance (that
means 1/(N-1) * sum((x - mean(x))^2)). Is there a way to tell R to calculate
the population variance (that means dividing by N instead of N-1)?



Kind regards

DF.txt   
SE.txt   
Value.txt   




--
View this message in context: 
http://r.789695.n4.nabble.com/Matrix-Question-calculate-with-3-different-matrices-tp4679907.html
Sent from the R help mailing list archive at Nabble.com.



[R] R help-classification accuracy of DFA and RF using caret

2013-11-06 Thread Henderson, Robin Michelle
Hi,

I am a graduate student applying published R scripts to compare the 
classification accuracy of 2 predictive models, one built using discriminant 
function analysis and one using random forests (webpage link for these scripts 
is provided below).  The purpose of these models is to predict the biotic 
integrity of streams.  Specifically, I am trying to compare the classification 
accuracy (i.e., prediction of group membership) of both the DFA and RF models 
using k-fold crossvalidation for the following metrics: AUC ROC, percent 
correctly classified, specificity, sensitivity, and Kappa. I would also like to 
obtain the F statistic, Wilks lambda, MSE or RMSE for the random forest models 
as the script does not contain code to get this data.  I think I need to use 
the caret package to obtain the classification accuracy, but I keep getting 
error messages when I apply the train function to my data.  As I am relatively 
new to R and my thesis committee is unable to help as they are also 
unfamiliar with R, I thought it best to ask for help.  Would someone be willing 
to help me?


Thanks,
Robin

http://www.epa.gov/wed/pages/models/rivpacs/rivpacs.htm


> TrainDataDFAgrps2 <-predcal
> TrainClassesDFAgrps2 <-grp.2;
> DFAgrps2Fit1 <- train(TrainDataDFAgrps2, TrainClassesDFAgrps2,
+  method = "lda",
+ tuneLength = 10,
+ trControl = trainControl(method = "cv"));
Error in train.default(TrainDataDFAgrps2, TrainClassesDFAgrps2, method = "lda", 
 :
  wrong model type for regression

> RFgrps2Fit1 <- train(TrainDataRFgrps2, TrainClassesRFgrps2,
+  method = "rf",
+ tuneLength = 10,
+ trControl = trainControl(method = "cv"));
There were 50 or more warnings (use warnings() to see the first 50)

Clip of predcal (same length as grp.2, but too much data to display all):
> predcal
     Reference_Test HUC12_AREA_HA_log10 ELEV_m M_Slp_sqt Precip_mm Temp_CX10
2370              R                 3.7  588.0       2.2      1751       148
559               R                 4.0  643.1       1.8      1674       141
2062              R                 4.0  643.1       1.8      1674       141
2467              R                 4.0  643.1       1.8      1674       141
1176              R                 3.9  694.3       2.4      1534       131
1840              R                 3.9  694.3       2.4      1534       131
2052              R                 3.9  694.3       2.4      1534       131
1174              R                 4.1  605.0       2.1      1382       138
1841              R                 4.1  605.0       2.1      1382       138
2051              R                 4.1  605.0       2.1      1382       138
1831              R                 4.1  363.9       1.7       937       156


Grps.2:
grp.2
  [1] 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 1 1
 [45] 2 2 1 1 1 1 1 1 1 2 2 1 1 1 2 2 1 2 2 1 1 1 2 2 2 2 2 2 1 1 1 2 2 2 1 2 2 2 2 2 2 2 2 1
 [89] 1 2 2 2 2 2 1 1 2 2 2 1 2 1 2 2 1 2 1 1 2









Re: [R] speed issue: gsub on large data frame

2013-11-06 Thread Carl Witthoft
If you could, please identify which responder's idea you used, as well as the
"strsplit" -- related code you ended up with.
That may help someone who browses the mail archives in the future.

Carl


SPi wrote
> I'll answer myself:
> using strsplit with fixed=TRUE took about 2 minutes!





--
View this message in context: 
http://r.789695.n4.nabble.com/speed-issue-gsub-on-large-data-frame-tp4679747p4679906.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Error message glmer using R: “ 'what' must be a character string or a function”

2013-11-06 Thread Ben Bolker
William Dunlap  tibco.com> writes:

 
> > You can reproduce the problem by having a data.frame (or anything
> > else) in your environment:
>
> I left out "called 'new'" in the above statement.  The example is correct.
 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> > -Original Message-

[snip]

> > You can reproduce the problem by having a data.frame (or anything
> > else) in your environment:
> >    new <- data.frame(ï..VAR1 = rep(c(TRUE,NA,FALSE), c(10,2,8)),
> >                      random=rep(1:3,len=20), clustno=rep(c(1:5),len=20),
> >                      validatedRS6=rep(0:1,len=20))
> >    model1 <- glmer(validatedRS6 ~ random + (1|clustno), data=new,
> >                    family=binomial(), nAGQ=3)
> >    # Error in do.call(new, c(list(Class = "glmResp", family = family), ll[setdiff(names(ll), :
> >    #   'what' must be a character string or a function
> > The problem is in the call
> >    do.call(new, list())
> > It finds your dataset 'new' (in .GlobalEnv), which is not a function or
> > the name of a function, not the function 'new' from the 'methods'
> > package.  Rename your dataset, so you do not have anything called 'new'
> > masking the one in package:methods, and things should work.  Write to
> > the maintainer of the package (use maintainer("lme4") for the address)
> > about the problem.
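
The masking can be seen without lme4 at all. A minimal base-R sketch (the data frame contents are arbitrary, and the exact wording of the error message may differ between R versions, so it is not reproduced here):

```r
# A data frame called 'new' masks methods::new when the symbol is
# looked up as an ordinary value rather than as a function.
new <- data.frame(a = 1:3)

# do.call() receives the data frame instead of the function and fails.
masked <- tryCatch(do.call(new, list("numeric")),
                   error = function(e) "error")

# Naming the function explicitly sidesteps the lookup problem.
ok <- do.call(methods::new, list("numeric"))

rm(new)  # remove the masking object again
```

Passing the name as a string, do.call("new", ...), is another way to avoid picking up a non-function object of the same name.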

For what it's worth, I saw this on StackOverflow:

[broken URL]
http://stackoverflow.com/questions/19801070/
  error-message-glmer-using-r-what-must-be-a-character-string-or-a-function

answered it there, and have fixed it on Github: 

https://github.com/lme4/lme4/commit/9c12f002821f9567d5454e2ce3b78076dabffb54

While it is not officially forbidden in 
http://www.r-project.org/posting-guide.html (to my surprise, I can't
even find any proscription against cross-posting to R mailing lists,
although there is a section about "which list to post to"), posting
to both r-help and Stack Overflow tends to lead to duplicated effort/
frustration.  Please choose one or the other (in my opinion it's
OK to cross-post after a few days if you don't get any responses
in one place, provided that you say that you've cross-posted and
ideally provide a reference link to the cross-post).

  Ben Bolker



Re: [R] speed issue: gsub on large data frame

2013-11-06 Thread SPi
I'll answer myself:
using strsplit with fixed=TRUE took about 2 minutes!



--
View this message in context: 
http://r.789695.n4.nabble.com/speed-issue-gsub-on-large-data-frame-tp4679747p4679905.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] speed issue: gsub on large data frame

2013-11-06 Thread SPi
Good idea! 

I'm trying your approach right now, but I am wondering if using str_split
(package: 'stringr') or strsplit is the right way to go in terms of speed? I
ran str_split over the text column of the data frame and it's processing for
2 hours now..? 

I did: 
splittedStrings<-str_split(dataframe$text, " ")

The $text column already contains cleaned text, so no double blanks etc or
unnecessary symbols. Just full words.




--
View this message in context: 
http://r.789695.n4.nabble.com/speed-issue-gsub-on-large-data-frame-tp4679747p4679904.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Questions about R

2013-11-06 Thread Marc Schwartz
On Nov 6, 2013, at 11:09 AM, Silvia Espinoza  wrote:

> Good morning. I am interested in downloading R.  I would appreciate if you
> can help me with the following questions, please.
> 
> 1.   Is R free, or I have to pay for support/maintenance, or it depends
> on the version? Is there a paid version?
> 


Yes, it is free, although there are commercial versions of R available, if you 
decide that you do need/want commercial support.

Some additional info on commercial versions here:

  http://cran.r-project.org/doc/FAQ/R-FAQ.html#What-is-R_002dplus_003f


None of this has any effect on your ability to use R in a commercial setting, 
though there are some CRAN packages that do have such limitations: 

  
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Can-I-use-R-for-commercial-purposes_003f



> 2.   How safe is it to work with data using R? Is there any risk that
> someone else can have access to the information?


That is outside of the scope of R and is dependent upon the security of the 
computer system(s) and possibly networks, upon and over which R is running and 
where your data is stored and managed.

Regards,

Marc Schwartz


> Thanks in advance for your attention and for any help you can provide me.
> 
> Silvia Espinoza



Re: [R] R 3.0.2 - How to create intervals and group another variable in those intervals?

2013-11-06 Thread jim holtman
You need to read up on the use of the "apply" functions: this is just
another 'tapply' call with the 'speed' in it.

> n <- 1000
> x <- data.frame(speed = runif(n, 0, 85.53)
+  , spacing = rnorm(n)
+  )
> # create interval for speed
> int <- seq(0, max(x$speed) + 4.5, by = 4.5)
> # split the data and find average of spacing
> tapply(x$spacing, cut(x$speed, int), mean)
     (0,4.5]      (4.5,9]     (9,13.5]    (13.5,18]    (18,22.5]
-0.260755737 -0.017766405 -0.097255963  0.234719308 -0.038908267
   (22.5,27]    (27,31.5]    (31.5,36]    (36,40.5]    (40.5,45]
-0.004559798 -0.065556109 -0.144327936 -0.135491878 -0.120387573
   (45,49.5]    (49.5,54]    (54,58.5]    (58.5,63]    (63,67.5]
 0.033821289 -0.110058896 -0.107173009 -0.091258106 -0.198911676
   (67.5,72]    (72,76.5]    (76.5,81]    (81,85.5]
-0.138334909 -0.127941501  0.198943106  0.136345405
>
> # find the average speed
> tapply(x$speed, cut(x$speed, int), mean)
  (0,4.5]   (4.5,9]  (9,13.5] (13.5,18] (18,22.5] (22.5,27] (27,31.5]
 2.276792  6.804097 11.024388 15.699358 20.184667 24.644743 29.050843
(31.5,36] (36,40.5] (40.5,45] (45,49.5] (49.5,54] (54,58.5] (58.5,63]
34.017265 38.288514 42.617572 47.357247 51.808874 56.162820 60.719576
(63,67.5] (67.5,72] (72,76.5] (76.5,81] (81,85.5]
65.202813 70.055034 74.147431 79.070495 83.376027
>
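
If the interval midpoints are wanted for the x-axis (as in the original question) rather than the mean speed within each interval, they can be taken from the break points directly. A minimal sketch on simulated data like the above; the seed and the names `bins`, `avg_spacing` and `mids` are just choices for this example:

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
n <- 1000
x <- data.frame(speed = runif(n, 0, 85.53), spacing = rnorm(n))

# Fixed-width breaks of 4.5 covering the observed speeds.
int <- seq(0, max(x$speed) + 4.5, by = 4.5)
bins <- cut(x$speed, int)

# Mean spacing per speed interval.
avg_spacing <- tapply(x$spacing, bins, mean)

# Midpoint of each interval: left break plus half the bin width.
mids <- head(int, -1) + diff(int) / 2
```

Something like plot(mids, avg_spacing, type = "b") would then give the average spacing against the midpoint of each speed interval.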

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Nov 6, 2013 at 12:36 PM, umair durrani  wrote:
> Thanks Jim. You now provided me with the average of spacing but how can I
> find the average of speed for every speed interval? I am a newbie in R,
> sorry for my stupid questions
>
>
> Umair Durrani
> email: umairdurr...@outlook.com
>
>
>> Date: Wed, 6 Nov 2013 12:22:50 -0500
>> Subject: Re: [R] R 3.0.2 - How to create intervals and group another
>> variable in those intervals?
>> From: jholt...@gmail.com
>> To: umairdurr...@outlook.com
>> CC: r-help@r-project.org
>>
>> should have had in the script:
>>
>>
>> int <- seq(0, max(x$speed) + 4.5, by = 4.5)
>>
>>
>>
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
>>
>> On Wed, Nov 6, 2013 at 12:19 PM, jim holtman  wrote:
>> > Is this what you are after:
>> >
>> >
>> >> n <- 1000
>> >> x <- data.frame(speed = runif(n, 0, 85.53)
>> > + , spacing = rnorm(n)
>> > + )
>> >> # create interval for speed
>> >> int <- seq(0, max(x) + 4.5, by = 4.5)
>> >> # split the data and find average of spacing
>> >> tapply(x$spacing, cut(x$speed, int), mean)
>> > (0,4.5] (4.5,9] (9,13.5] (13.5,18] (18,22.5]
>> > (22.5,27] (27,31.5] (31.5,36]
>> > 0.27840783 -0.08349567 -0.10659408 -0.01476840 -0.08773255
>> > -0.06643826 0.21873016 0.17627232
>> > (36,40.5] (40.5,45] (45,49.5] (49.5,54] (54,58.5]
>> > (58.5,63] (63,67.5] (67.5,72]
>> > -0.16568350 -0.15458191 -0.04909331 -0.01179396 0.25022296
>> > -0.27553812 -0.14927483 -0.21000177
>> > (72,76.5] (76.5,81] (81,85.5]
>> > -0.09884137 -0.08459709 0.02864456
>> >>
>> >>
>> >
>> > plotting is left as an exercise for the reader, but given the averages
>> > above, you can find the midpoints easily.
>> >
>> > Jim Holtman
>> > Data Munger Guru
>> >
>> > What is the problem that you are trying to solve?
>> > Tell me what you want to do, not how you want to do it.
>> >
>> >
>> > On Wed, Nov 6, 2013 at 10:48 AM, umair durrani
>> >  wrote:
>> >> I have two columns for speed ('Smoothed velocity') and Spacing. What I
>> >> want to do is to first create the intervals of speed (minimum value=0, max
>> >> value= 85.53), group the Spacing values falling in a particular Speed
>> >> interval, find the average of the Spacing for an interval and finally plot
>> >> the average spacing of each interval against the mid-point of the Speed
>> >> interval. I want to have fixed intervals of 4.5 feet per second, i.e. 
>> >> 0-4.5,
>> >> 4.5-9,..xx-85.53.After hours of search I found a function for creating
>> >> intervals called classIntervals() but I can't figure out how to create 
>> >> fixed
>> >> intervals of 4.5. Here is what I tried:classIntervals(s21[,'Smoothed
>> >> velocity'], style='fixed', fixedBreaks=4.5)But the results were unexpected
>> >> and there was a Warning message:In classIntervals(s21[, "Smoothed
>> >> velocity"], style = "fixed", fixedBreaks = 4.5) :
>> >> variable range greater than fixedBreaksEven after intervals are
>> >> created, I need to group spacing and find the avg. for every interval. How
>> >> can I do this? I have tried what I could, please help
>> >>
>> >> Umair Durrani
>> >>
>> >> email: umairdurr...@outlook.com
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> __
>> >> R-help@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contai

Re: [R] Remove from the mailing list

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 8:33 AM, Mario Garrido wrote:

> Dear administrators,
> I tried to stop receving mails from your webpage to my current mail (
> gaiarr...@usal.es). I try to do it through your webpage but i was not
> able. I would appreciate to be removed from the mailing list.

If you were unable to login with your current address it might have been because 
there was not an exact match for the domain name. This happens when an 
institution changes their domain name and forwards mail for a time to a new 
address. In this case, however, you are subscribed with that address. The 
moderators do not have delete rights to subscriptions, nor can they generate 
password reminders. Your password should be sent on a regular basis. (From the 
web page where you subscribed: "Once a month, your password will be emailed to 
you as a reminder.")


> 
> Thanks in advance.
> 
> 
> -- 
> Mario Garrido Escudero

David Winsemius
Alameda, CA, USA



Re: [R] Questions about R

2013-11-06 Thread Barry Rowlingson
On Wed, Nov 6, 2013 at 5:44 PM, Ye Lin  wrote:
> You can get details at http://www.r-project.org/
>
> But to answer your question: Yes it is free

 But there is also a paid version. Send me $1000 and I will send you R
on a USB stick, complete with all the source code.

 Seriously, other companies do supply support and extensions for R at
a cost, and although I can legally sell you a copy of R for $1000
nobody bothers charging for R because the license can't stop you
giving your copy away.

 As for your security/data safety question, well your operating system
is probably the weaker link in that chain. However if you are running
R in a client-server fashion then you should make sure the data is
encrypted - and then the weakest link is possession of the private
half of the encryption key, which is your responsibility.

Barry



[R] plot a single frequency of a ts object

2013-11-06 Thread Stefano Sofia
Dear list users,
I transformed two vectors of seasonal data in ts objects of frequency 4:

y1 <- ts(x1, frequency=4, start=c(1952,1))
y2 <- ts(x2, frequency=4, start=c(1952,1))

In this way Qtr1 corresponds to Winters, Qtr2 corresponds to Springs and so on.
I would like to plot on the same graph both y1 and all the Winters of y2.
I am not able to find an easy and straightforward way to do that.
Could somebody please help me in this?

Thank you
Stefano Sofia








Re: [R] deSolve, unresolved namespace error

2013-11-06 Thread Adam Clark
Addendum:
unloading and reloading deSolve.so does indeed fix the problem:

library.dynam.unload("deSolve", libpath=paste(.libPaths()[1], "//deSolve",
sep=""))
library.dynam("deSolve", package="deSolve", lib.loc=.libPaths()[1])

However, this is a little clunky, and seems like overkill. Does anybody
have an idea for a more elegant workaround?
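
A minimal sketch of how the two calls above might be wrapped so they can be
repeated inside a loop. The helper name reload_dll is hypothetical, and the
sketch assumes the package lives in the first entry of .libPaths(), as in the
calls above. Whether a DLL can be reloaded cleanly depends on the operating
system, so this is a workaround and not a fix.

```r
# Hypothetical helper: unload and reload a package's compiled library.
# 'pkg' is the package name, 'lib' the library tree it is installed in.
reload_dll <- function(pkg, lib = .libPaths()[1]) {
  pkg_dir <- file.path(lib, pkg)                 # replaces the paste("//...") hack
  library.dynam.unload(pkg, libpath = pkg_dir)   # drop the currently loaded DLL
  library.dynam(pkg, package = pkg, lib.loc = lib)
  # Report whether the DLL is listed as loaded again
  invisible(pkg %in% names(getLoadedDLLs()))
}

# Usage (not run here, since it needs deSolve to be installed and loaded):
# reload_dll("deSolve")
```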



On Wed, Nov 6, 2013 at 11:20 AM, Adam Clark  wrote:

> I'm having trouble running the "ode" function from the "deSolve" package.
>
> I am running RStudio under Ubuntu 13.1
>
> I am running ode() on compiled code that returns delta values using the .C
> convention. While I can include an example of the code, I suspect that it
> will not be helpful, since the problem is not replicable among systems
> (e.g. Solaris or Mac).
>
> When I call ode() on my compiled code, it occasionally will return:
>
> Error in .C("unlock_solver") :
>   "unlock_solver" not resolved from current namespace (deSolve)
>
>
> All subsequent calls to ode() return the same error message, regardless of
> what I run. The problem is resolved only if I restart RStudio. However,
> since I am running many iterations of this command, I would like to find a
> way to resolve this without restarting RStudio.
>
> If I call: is.loaded("unlock_solver", PACKAGE="deSolve"), R returns TRUE.
>
> If I call: .C("unlock_solver"), R returns:
> list()
>
> If I first unload "deSolve" using
> detach("package:deSolve", unload=TRUE), deSolve disappears from my
> search() space, but  is.loaded("unlock_solver", PACKAGE="deSolve") still
> returns TRUE. If I reload deSolve, the problem persists.
>
> Based on what I have read in the help files on namespace conventions in
> packages,  I suspect that the problem is that the .C function
> "unlock_solver" is not correctly loaded, or was unloaded but not marked as
> unloaded.
>
> I have two guesses for what is going on:
> 1) "unlock_solver" was loaded using the library.dynam() function, but
> unloaded using the dyn.unload() function. As I understand it, this would
> leave a blank entry in the namespace, leading R to think that
> "unlock_solver" is loaded, even though the function no longer does
> anything. However, even when deSolve is working correctly,
> .C("unlock_solver") returns a blank list, so this may not be the case.
>
> 2) Some call deep in deSolve is not pointed towards the right package, and
> therefore cannot find "unlock_solver".
>
> From the source code for deSolve, posted at
> https://r-forge.r-project.org/scm/viewvc.php/pkg/deSolve/src/deSolve_utils.c?view=log&root=desolve&pathrev=344,
>  "unlock_solver"
> seems to be a pretty simple function, inside the deSolve_utils.c file:
>
> void unlock_solver(void) {solver_locked = 0;}
>
> This command is a pretty recent addition to deSolve (it appeared somewhere 
> between revision 319 and 324), and is meant to "prevent nesting of solvers 
> that have global variables", according to the change annotation.
>
>
> In any case, I'd like to find a way to specifically unload "unlock_solver"
> and reload it, or barring that, unload as many of the DLL's associated with
> deSolve as possible and reload them. I suspect that this will solve my
> problems.
>
> Many thanks, and sorry if this is a silly question,
> Adam
>
> --
> Adam Clark
> University of Minnesota, EEB
> 100 Ecology Building
> 1987 Upper Buford Circle
> St. Paul, MN 55108
> (857)-544-6782
>



-- 
Adam Clark
University of Minnesota, EEB
100 Ecology Building
1987 Upper Buford Circle
St. Paul, MN 55108
(857)-544-6782



Re: [R] Questions about R

2013-11-06 Thread Ye Lin
You can get details at http://www.r-project.org/

But to answer your question: Yes it is free


On Wed, Nov 6, 2013 at 9:09 AM, Silvia Espinoza  wrote:

> Good morning. I am interested in downloading R.  I would appreciate if you
> can help me with the following questions, please.
>
> 1.   Is R free, or I have to pay for support/maintenance, or it depends
> on the version? Is there a paid version?
>
> 2.   How safe is it to work with data using R? Is there any risk that
> someone else can have access to the information?
>
> Thanks in advance for your attention and for any help you can provide me.
>
> Silvia Espinoza
>


Re: [R] Basic question: why does a scatter plot of a variable against itself works like this?

2013-11-06 Thread Barry Rowlingson
Interestingly, fitting an LM with x on both sides gives a warning, and
then drops it from the RHS, leaving you with just an intercept:

> lm(x~x,data=d)

Call:
lm(formula = x ~ x, data = d)

Coefficients:
(Intercept)
  4

Warning messages:
1: In model.matrix.default(mt, mf, contrasts) :
  the response appeared on the right-hand side and was dropped
2: In model.matrix.default(mt, mf, contrasts) :
  problem with term 1 in model.matrix: no columns are assigned

There's no numerical problem fitting a line through the points:

 > d$xx=d$x
 > lm(x~xx,data=d)

Call:
lm(formula = x ~ xx, data = d)

Coefficients:
(Intercept)           xx
  5.128e-16    1.000e+00

It seems to be R saying "Ummm did you really mean to do this? It's kinda dumb".

I suppose this could occur if you had a nested loop over all columns
in a data frame, fitting an LM with every column, and didn't skip if
i==j

Except of course it doesn't:

 - fit with two indexes set to one:

> i=1;j=1
> lm(d[,i]~d[,j])

Call:
lm(formula = d[, i] ~ d[, j])

Coefficients:
(Intercept)       d[, j]
  5.128e-16    1.000e+00

- fit with two ones:

> lm(d[,1]~d[,1])

Call:
lm(formula = d[, 1] ~ d[, 1])

Coefficients:
(Intercept)
  4

Warning messages:
1: In model.matrix.default(mt, mf, contrasts) :
  the response appeared on the right-hand side and was dropped
2: In model.matrix.default(mt, mf, contrasts) :
  problem with term 1 in model.matrix: no columns are assigned

Obviously this can all be explained in terms of R (or lm's, or
model.matrix's) evaluation schemes, but it seems far from intuitive.
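
A small sketch of why the four fits differ, assuming the response is matched
against the right-hand side by the expression as written and not by value,
which is what the outputs above suggest. The helper same_sides is
hypothetical; it only compares the two unevaluated sides of a formula.

```r
# Hypothetical helper: are the two sides of a formula the same expression?
# f[[2]] is the left-hand side, f[[3]] the right-hand side; nothing is evaluated.
same_sides <- function(f) identical(f[[2]], f[[3]])

r1 <- same_sides(x ~ x)             # same symbol on both sides
r2 <- same_sides(x ~ xx)            # different symbols, even if the values match
r3 <- same_sides(d[, 1] ~ d[, 1])   # identical calls
r4 <- same_sides(d[, i] ~ d[, j])   # different calls, even when i == j

print(c(r1, r2, r3, r4))
```

The two cases where the sides are textually identical are exactly the ones
that produced the warning above.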

Barry



On Wed, Nov 6, 2013 at 4:59 PM, William Dunlap  wrote:
> It probably happens because plot(formula) makes one call to terms(formula) to
> analyze the formula.  terms() says there is one variable in the formula,
> the response, so plot(x~x) is the same as plot(seq_along(x), x).
> If you give it plot(~x), terms() also says there is one variable, but
> no response, so you get the same plot as plot(x, rep(1,length(x))).
> This is also the reason that plot(y1+y2 ~ x1+x2) makes one plot of the sum
> of y1 and y2 for each term on the right side instead of 4 plots,
> plot(x1,y1), plot(x1,y2), plot(x2,y1), and plot(x2,y2).
>
> One could write a plot function that called terms separately on the left and
> right sides of the formula.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf Of Tal Galili
>> Sent: Wednesday, November 06, 2013 8:40 AM
>> To: r-help@r-project.org
>> Subject: [R] Basic question: why does a scatter plot of a variable against
>> itself works like this?
>>
>> Hello all,
>>
>> I just noticed the following behavior of plot:
>> x <- c(1,2,9)
>> plot(x ~ x) # this is just like doing:
>> plot(x)
>> # when maybe we would like it to give this:
>> plot(x ~ c(x))
>> # the same as:
>> plot(x ~ I(x))
>>
>> I was wondering if there is some reason for this behavior.
>>
>>
>> Thanks,
>> Tal
>>
>>
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>> --
>>


Re: [R] MPICH2 Rmpi and doSNOW

2013-11-06 Thread Matthias Salvisberg
Hi

I have managed to install MPICH2 and Rmpi on my Windows 7 machine. I can
also run the following code:

> library(Rmpi)
> mpi.spawn.Rslaves()
4 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 5 is running on: MyMaster
slave1 (rank 1, comm 1) of size 5 is running on: MyMaster
slave2 (rank 2, comm 1) of size 5 is running on: MyMaster
slave3 (rank 3, comm 1) of size 5 is running on: MyMaster
slave4 (rank 4, comm 1) of size 5 is running on: MyMaster
> mpichhosts()
     master      slave1      slave2      slave3      slave4
"localhost" "localhost" "localhost" "localhost" "localhost"
> mpi.universe.size()
[1] 4
> mpi.close.Rslaves()
[1] 1

library(doSNOW)

But every time I try to set up a cluster via

cluster <- makeCluster(4, type = "MPI")

my computer hangs and I have to close the R session.

Any advice on how to get this running?

Thanks in advance

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rmpi_0.6-3

loaded via a namespace (and not attached):
[1] tools_3.0.1



Re: [R] deSolve, unresolved namespace error

2013-11-06 Thread Prof Brian Ripley

On 06/11/2013 17:20, Adam Clark wrote:

> I'm having trouble running the "ode" function from the "deSolve" package.
>
> I am running RStudio under Ubuntu 13.1
>
> I am running ode() on compiled code that returns delta values using the .C
> convention. While I can include an example of the code, I suspect that it
> will not be helpful, since the problem is not replicable among systems
> (e.g. Solaris or Mac).
>
> When I call ode() on my compiled code, it occasionally will return:
>
> Error in .C("unlock_solver") :
>   "unlock_solver" not resolved from current namespace (deSolve)
>
> All subsequent calls to ode() return the same error message, regardless of
> what I run. The problem is resolved only if I restart RStudio. However,
> since I am running many iterations of this command, I would like to find a
> way to resolve this without restarting RStudio.
>
> If I call: is.loaded("unlock_solver", PACKAGE="deSolve"), R returns TRUE.
>
> If I call: .C("unlock_solver"), R returns:
> list()
>
> If I first unload "deSolve" using
> detach("package:deSolve", unload=TRUE), deSolve disappears from my search()
> space, but  is.loaded("unlock_solver", PACKAGE="deSolve") still returns
> TRUE. If I reload deSolve, the problem persists.
>
> Based on what I have read in the help files on namespace conventions in
> packages,  I suspect that the problem is that the .C function
> "unlock_solver" is not correctly loaded, or was unloaded but not marked as
> unloaded.
>
> I have two guesses for what is going on:
> 1) "unlock_solver" was loaded using the library.dynam() function, but
> unloaded using the dyn.unload() function. As I understand it, this would

Who said it was unloaded?  That it is not by default is explicit in ?detach.

> leave a blank entry in the namespace, leading R to think that
> "unlock_solver" is loaded, even though the function no longer does
> anything. However, even when deSolve is working correctly,
> .C("unlock_solver") returns a blank list, so this may not be the case.
>
> 2) Some call deep in deSolve is not pointed towards the right package, and
> therefore cannot find "unlock_solver".
>
> From the source code for deSolve, posted at
> https://r-forge.r-project.org/scm/viewvc.php/pkg/deSolve/src/deSolve_utils.c?view=log&root=desolve&pathrev=344,
> "unlock_solver"
> seems to be a pretty simple function, inside the deSolve_utils.c file:
>
> void unlock_solver(void) {solver_locked = 0;}
>
> This command is a pretty recent addition to deSolve (it appeared
> somewhere between revision 319 and 324), and is meant to "prevent
> nesting of solvers that have global variables", according to the
> change annotation.
>
> In any case, I'd like to find a way to specifically unload "unlock_solver"
> and reload it, or barring that, unload as many of the DLL's associated with
> deSolve as possible and reload them. I suspect that this will solve my
> problems.

I am not sure why you think it is reasonable to do that.  Clearly the
designers of deSolve did not think so, as it does not have an .onUnload
action.

> library(deSolve)
> names(getLoadedDLLs())
[1] "base"      "utils"     "methods"   "grDevices" "graphics"  "stats"
[7] "deSolve"   "tools"
> detach("package:deSolve", unload=TRUE)
> names(getLoadedDLLs())
[1] "base"      "utils"     "methods"   "grDevices" "graphics"  "stats"
[7] "deSolve"   "tools"

> Many thanks, and sorry if this is a silly question,

It is really an R-devel question: see the posting guide.  In particular
OSes differ in how (or even if) they can reload DLLs, and the details
are way too technical for R-help.

In any case, there are no reproduction instructions here: see the
posting guide.

> Adam




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] Questions about R

2013-11-06 Thread Silvia Espinoza
Good morning. I am interested in downloading R.  I would appreciate it if you
could help me with the following questions, please.

1.   Is R free, or do I have to pay for support/maintenance, or does it depend
on the version? Is there a paid version?

2.   How safe is it to work with data using R? Is there any risk that
someone else can have access to the information?

Thanks in advance for your attention and for any help you can provide me.

Silvia Espinoza



[R] Fitting multiple horizontal lines to data

2013-11-06 Thread Sashikanth Chandrasekaran
I am not trying to fit a horizontal line at every unique value of y. I am
trying to fit the y values with as few horizontal lines as possible, trading
off the number of horizontal lines against the error. The actual problem I am
trying to solve is to smooth data in a time series. Here is a realistic
example of y:

y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99)

An example fit for y using multiple horizontal lines (may not be the best
fit in terms of squared error or another error metric, but I have included
the y value for concreteness)

1. A horizontal line at approximately y=140 (to fit the first 13 values -
134.45 to 148.77)
2. A horizontal line at approximately y=110 (to fit the next 7 values -
71.64 to 111.64)
3. A horizontal line at approximately y=150 (to fit the next 2 values -
157.26 to 143.8)
4. A horizontal line at approximately y=65 (to fit the last 8 values -
40.19 to 64.99)
-sashi.
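
One way to frame the trade-off described above, offered as a sketch and not
as an answer from the thread: charge a fixed penalty per horizontal line, add
the squared error of each segment around its mean, and pick the segmentation
with the lowest total by dynamic programming. The names fit_steps and penalty
are hypothetical, and the penalty value used at the end is an arbitrary
starting point that would need tuning.

```r
# Sketch: penalised least-squares fit of horizontal segments.
# Total cost = sum of squared errors + penalty * (number of segments).
fit_steps <- function(y, penalty) {
  n   <- length(y)
  cs  <- c(0, cumsum(y))      # prefix sums of y
  cs2 <- c(0, cumsum(y^2))    # prefix sums of y^2
  # Squared error of fitting y[i..j] with its mean
  sse <- function(i, j) {
    s <- cs[j + 1] - cs[i]
    (cs2[j + 1] - cs2[i]) - s^2 / (j - i + 1)
  }
  best <- numeric(n + 1)      # best[k + 1]: lowest cost for y[1..k]
  last <- integer(n)          # last[k]: start of the final segment for y[1..k]
  for (k in seq_len(n)) {
    starts <- seq_len(k)
    costs  <- best[starts] + vapply(starts, sse, numeric(1), j = k) + penalty
    last[k]     <- which.min(costs)
    best[k + 1] <- costs[last[k]]
  }
  # Walk back through 'last' to recover the segment boundaries
  starts <- integer(0)
  k <- n
  while (k > 0) {
    starts <- c(last[k], starts)
    k <- last[k] - 1
  }
  ends <- c(starts[-1] - 1, n)
  data.frame(start = starts, end = ends,
             level = mapply(function(i, j) mean(y[i:j]), starts, ends))
}

y <- c(134.45, 141.82, 143.81, 141.81, 145, 141.61, 143.72, 145.71, 200, 175,
       140, 200, 148.77, 71.64, 111.57, 118.15, 119.15, 112.8, 111.64, 111.64,
       157.26, 143.8, 40.19, 64.99, 64.99, 129.98, 64.99, 65, 64.98, 64.99)

# A larger penalty gives fewer lines; a smaller one follows the data more closely.
print(fit_steps(y, penalty = 2000))
```

Each row of the result is one horizontal line: the index range it covers and
its level. The search is quadratic in the length of y, which is fine for short
series but may need a faster method for long ones.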



Re: [R] R 3.0.2 - How to create intervals and group another variable in those intervals?

2013-11-06 Thread jim holtman
I should have had this in the script:


int <- seq(0, max(x$speed) + 4.5, by = 4.5)



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Nov 6, 2013 at 12:19 PM, jim holtman  wrote:
> Is this what you are after:
>
>
>> n <- 1000
>> x <- data.frame(speed = runif(n, 0, 85.53)
> + , spacing = rnorm(n)
> + )
>> # create interval for speed
>> int <- seq(0, max(x) + 4.5, by = 4.5)
>> # split the data and find average of spacing
>> tapply(x$spacing, cut(x$speed, int), mean)
>     (0,4.5]     (4.5,9]    (9,13.5]   (13.5,18]   (18,22.5]   (22.5,27]   (27,31.5]   (31.5,36]
>  0.27840783 -0.08349567 -0.10659408 -0.01476840 -0.08773255 -0.06643826  0.21873016  0.17627232
>   (36,40.5]   (40.5,45]   (45,49.5]   (49.5,54]   (54,58.5]   (58.5,63]   (63,67.5]   (67.5,72]
> -0.16568350 -0.15458191 -0.04909331 -0.01179396  0.25022296 -0.27553812 -0.14927483 -0.21000177
>   (72,76.5]   (76.5,81]   (81,85.5]
> -0.09884137 -0.08459709  0.02864456
>>
>>
>
> plotting is left as an exercise for the reader, but given the averages
> above, you can find the midpoints easily.
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
>
> On Wed, Nov 6, 2013 at 10:48 AM, umair durrani  
> wrote:
>> I have two columns, speed ('Smoothed velocity') and Spacing. I want to first
>> create intervals of speed (minimum value = 0, maximum value = 85.53), group
>> the Spacing values falling in each speed interval, find the average Spacing
>> for each interval, and finally plot the average spacing of each interval
>> against the midpoint of the speed interval. I want fixed intervals of
>> 4.5 feet per second, i.e. 0-4.5, 4.5-9, ..., xx-85.53.
>>
>> After hours of searching I found a function for creating intervals called
>> classIntervals(), but I can't figure out how to create fixed intervals of
>> 4.5. Here is what I tried:
>>
>> classIntervals(s21[,'Smoothed velocity'], style='fixed', fixedBreaks=4.5)
>>
>> But the results were unexpected and there was a warning message:
>>
>> In classIntervals(s21[, "Smoothed velocity"], style = "fixed", fixedBreaks = 4.5) :
>>   variable range greater than fixedBreaks
>>
>> Even after the intervals are created, I need to group the spacing and find
>> the average for every interval. How can I do this? I have tried what I
>> could; please help.
>>
>> Umair Durrani
>>
>> email: umairdurr...@outlook.com
>>


Re: [R] R 3.0.2 - How to create intervals and group another variable in those intervals?

2013-11-06 Thread jim holtman
Is this what you are after:


> n <- 1000
> x <- data.frame(speed = runif(n, 0, 85.53)
+ , spacing = rnorm(n)
+ )
> # create interval for speed
> int <- seq(0, max(x) + 4.5, by = 4.5)
> # split the data and find average of spacing
> tapply(x$spacing, cut(x$speed, int), mean)
    (0,4.5]     (4.5,9]    (9,13.5]   (13.5,18]   (18,22.5]   (22.5,27]   (27,31.5]   (31.5,36]
 0.27840783 -0.08349567 -0.10659408 -0.01476840 -0.08773255 -0.06643826  0.21873016  0.17627232
  (36,40.5]   (40.5,45]   (45,49.5]   (49.5,54]   (54,58.5]   (58.5,63]   (63,67.5]   (67.5,72]
-0.16568350 -0.15458191 -0.04909331 -0.01179396  0.25022296 -0.27553812 -0.14927483 -0.21000177
  (72,76.5]   (76.5,81]   (81,85.5]
-0.09884137 -0.08459709  0.02864456
>
>

plotting is left as an exercise for the reader, but given the averages
above, you can find the midpoints easily.
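
For completeness, a sketch of the midpoint and plotting step, assuming the
same simulated data frame as above and breaks computed from the speed column
only. The axis labels and the set.seed() call are additions for illustration.

```r
set.seed(1)   # only so the simulated example is repeatable
n <- 1000
x <- data.frame(speed = runif(n, 0, 85.53), spacing = rnorm(n))

# Breaks every 4.5 units, covering the largest observed speed
int <- seq(0, max(x$speed) + 4.5, by = 4.5)

# Average spacing within each speed interval
avg <- tapply(x$spacing, cut(x$speed, int), mean)

# Midpoint of each interval: left edge plus half the interval width
mids <- head(int, -1) + diff(int) / 2

plot(mids, avg, type = "b",
     xlab = "Speed (interval midpoint)", ylab = "Average spacing")
```

An interval with no observations shows up as NA in avg and is skipped in the
plot.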

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Nov 6, 2013 at 10:48 AM, umair durrani  wrote:
> I have two columns, speed ('Smoothed velocity') and Spacing. I want to first
> create intervals of speed (minimum value = 0, maximum value = 85.53), group
> the Spacing values falling in each speed interval, find the average Spacing
> for each interval, and finally plot the average spacing of each interval
> against the midpoint of the speed interval. I want fixed intervals of
> 4.5 feet per second, i.e. 0-4.5, 4.5-9, ..., xx-85.53.
>
> After hours of searching I found a function for creating intervals called
> classIntervals(), but I can't figure out how to create fixed intervals of
> 4.5. Here is what I tried:
>
> classIntervals(s21[,'Smoothed velocity'], style='fixed', fixedBreaks=4.5)
>
> But the results were unexpected and there was a warning message:
>
> In classIntervals(s21[, "Smoothed velocity"], style = "fixed", fixedBreaks = 4.5) :
>   variable range greater than fixedBreaks
>
> Even after the intervals are created, I need to group the spacing and find
> the average for every interval. How can I do this? I have tried what I
> could; please help.
>
> Umair Durrani
>
> email: umairdurr...@outlook.com
>


[R] deSolve, unresolved namespace error

2013-11-06 Thread Adam Clark
I'm having trouble running the "ode" function from the "deSolve" package.

I am running RStudio under Ubuntu 13.1

I am running ode() on compiled code that returns delta values using the .C
convention. While I can include an example of the code, I suspect that it
will not be helpful, since the problem is not replicable among systems
(e.g. Solaris or Mac).

When I call ode() on my compiled code, it occasionally will return:

Error in .C("unlock_solver") :
  "unlock_solver" not resolved from current namespace (deSolve)


All subsequent calls to ode() return the same error message, regardless of
what I run. The problem is resolved only if I restart RStudio. However,
since I am running many iterations of this command, I would like to find a
way to resolve this without restarting RStudio.

If I call: is.loaded("unlock_solver", PACKAGE="deSolve"), R returns TRUE.

If I call: .C("unlock_solver"), R returns:
list()

If I first unload "deSolve" using
detach("package:deSolve", unload=TRUE), deSolve disappears from my search()
space, but  is.loaded("unlock_solver", PACKAGE="deSolve") still returns
TRUE. If I reload deSolve, the problem persists.

Based on what I have read in the help files on namespace conventions in
packages,  I suspect that the problem is that the .C function
"unlock_solver" is not correctly loaded, or was unloaded but not marked as
unloaded.

I have two guesses for what is going on:
1) "unlock_solver" was loaded using the library.dynam() function, but
unloaded using the dyn.unload() function. As I understand it, this would
leave a blank entry in the namespace, leading R to think that
"unlock_solver" is loaded, even though the function no longer does
anything. However, even when deSolve is working correctly,
.C("unlock_solver") returns a blank list, so this may not be the case.

2) Some call deep in deSolve is not pointed towards the right package, and
therefore cannot find "unlock_solver".

From the source code for deSolve, posted at
https://r-forge.r-project.org/scm/viewvc.php/pkg/deSolve/src/deSolve_utils.c?view=log&root=desolve&pathrev=344,
"unlock_solver"
seems to be a pretty simple function, inside the deSolve_utils.c file:

void unlock_solver(void) {solver_locked = 0;}

This command is a pretty recent addition to deSolve (it appeared
somewhere between revision 319 and 324), and is meant to "prevent
nesting of solvers that have global variables", according to the
change annotation.


In any case, I'd like to find a way to specifically unload "unlock_solver"
and reload it, or barring that, unload as many of the DLL's associated with
deSolve as possible and reload them. I suspect that this will solve my
problems.

Many thanks, and sorry if this is a silly question,
Adam

-- 
Adam Clark
University of Minnesota, EEB
100 Ecology Building
1987 Upper Buford Circle
St. Paul, MN 55108
(857)-544-6782



Re: [R] convert one digit numbers to two digits one

2013-11-06 Thread Barry Rowlingson
All these suggestions of using 'sprintf' might be right but you might
be doing it wrong...

If you are working with times, then use the date/time classes and the
handy functions for working on them. Which means the lubridate
package, most likely.

Are these times part of a calendar time, or are they just clock times
without reference to any day, or are they durations in hours and
minutes?
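
As a rough illustration of the date/time route, here is a sketch in base R
(not lubridate), assuming the values are clock times on a known day. The
calendar date below is a placeholder chosen only so that a POSIXct value can
be built.

```r
hour   <- 12
minute <- 3

# Build a proper date-time; the date itself is a placeholder assumption
stamp <- ISOdatetime(2013, 11, 6, hour, minute, 0, tz = "UTC")

# Formatting pads the minutes automatically
label <- format(stamp, "%H:%M")
print(label)

# Arithmetic also works on the time itself: add 57 minutes (in seconds)
later <- format(stamp + 57 * 60, "%H:%M")
print(later)
```

The padding then comes from the format string, and carrying over into the
next hour is handled by the class instead of by string manipulation.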



On Wed, Nov 6, 2013 at 4:34 PM, Marc Schwartz  wrote:
> On Nov 6, 2013, at 10:25 AM, Alaios  wrote:
>
>> Hi all,
>> the following returns the hour and the minutes
>>
>> paste(DataSet$TimeStamps[selectedInterval$start,4], 
>> DataSet$TimeStamps[selectedInterval$start,5],sep=":")
>> [1] "12:3"
>>
>> the problem is that from these two I want to create a time stamp so 12:03. 
>> The problem is that the number 3 is not converted to 03. Is there an easy 
>> way when I have one digit integer to add a zero in the front? Two digits 
>> integers are working fine so far, 12:19, or 12:45 would appear correctly
>>
>> I would like to thank you in advance for your help
>>
>> Regards
>> Alex
>
>
> This is an example where using ?sprintf gives you more control:
>
>> sprintf("%02d:%02d", 12, 3)
> [1] "12:03"
>
>> sprintf("%02d:%02d", 9, 3)
> [1] "09:03"
>
>
> The syntax '%02d' tells sprintf to print the integer and pad with leading 
> zeroes to two characters where needed.
>
> Regards,
>
> Marc Schwartz
>


[R] R 3.0.2 - How to create intervals and group another variable in those intervals?

2013-11-06 Thread umair durrani
I have two columns, speed ('Smoothed velocity') and Spacing. I want to first
create intervals of speed (minimum value = 0, maximum value = 85.53), group
the Spacing values falling in each speed interval, find the average Spacing
for each interval, and finally plot the average spacing of each interval
against the midpoint of the speed interval. I want fixed intervals of
4.5 feet per second, i.e. 0-4.5, 4.5-9, ..., xx-85.53.

After hours of searching I found a function for creating intervals called
classIntervals(), but I can't figure out how to create fixed intervals of 4.5.
Here is what I tried:

classIntervals(s21[,'Smoothed velocity'], style='fixed', fixedBreaks=4.5)

But the results were unexpected and there was a warning message:

In classIntervals(s21[, "Smoothed velocity"], style = "fixed", fixedBreaks = 4.5) :
  variable range greater than fixedBreaks

Even after the intervals are created, I need to group the spacing and find
the average for every interval. How can I do this? I have tried what I could;
please help.

Umair Durrani

email: umairdurr...@outlook.com
  


[R] Multiple String word replacements: Performance Issue

2013-11-06 Thread Simon Pickert
Dear experts,
I’ve been on this for weeks now, and couldn’t find a solution. Sorry for the
long description. I figured I would post many details so you get the whole
problem, although it’s not hard to grasp.

**Situation:**
Data frame consisting of 4 million entries (total size: 250 MB). Two columns: 
`ID` and `TEXT`. Text strings are each up to 200 characters.


**Task:**
Preprocessing the text strings

Example Data:


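+-----+------------------------------------------+
| ID  | Text                                     |
+-----+------------------------------------------+
| 123 | $AAPL is up +5%                          |
| 456 | $MSFT , $EBAY doing great.  www.url.com  |
  ..
+-----+------------------------------------------+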
Should become

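+-----+---------------------------------------------------+--------------+-------------+--------------+
| ID  | Text clean                                        | First Ticker | All Ticker  | Ticker Count |
+-----+---------------------------------------------------+--------------+-------------+--------------+
| 123 | [ticker] is up [positive_percentage]              | $aapl        | $aapl       | 1            |
| 456 | [ticker] [ticker] doing great [url] [pos_emotion] | $msft        | $msft,$ebay | 2            |
  ..
+-----+---------------------------------------------------+--------------+-------------+--------------+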



**Problem:**
It takes too long. On my 8GB RAM Dual-Core machine: Cancelled after 1 day. On a 
70GB 8-Core Amazon EC2 instance: Cancelled after 1 day.


**Details:**
I am basically 

 - Counting how often certain words appear in one string
 - Write this number into a new column (COUNT)
 - Replace this (counted) word
 - Replace other words (which I don't need to count before)
 - Replace some regular expressions

The vectors which are used as patterns look like this:

"\\bWORD1\\b|\\bWORD2\\b|\\bWORD3\\b|\\bWORD4\\b..."

Thus, those 'replacement vectors' are character vectors of length 1, each 
containing up to 800 words
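A minimal sketch of how such a pattern can be built and applied to a whole column at once, since `gsub()` and `gregexpr()` are vectorised over their text argument. The word list, the sample strings and the `[signal]` placeholder are hypothetical; this is an illustration under those assumptions, not a tested fix for the full data set.

```r
# Hypothetical word list and sample text; the real vectors hold up to 800 words.
words <- c("buy", "sell", "hold")
text  <- c("Buy now, sell later", "nothing here")

# One alternation wrapped in a single pair of word boundaries.
pattern <- paste0("\\b(?:", paste(words, collapse = "|"), ")\\b")

# Count the matches per string, then replace them, each in one vectorised call.
hits   <- gregexpr(pattern, text, perl = TRUE, ignore.case = TRUE)
counts <- lengths(regmatches(text, hits))
clean  <- gsub(pattern, "[signal]", text, perl = TRUE, ignore.case = TRUE)
```

Whether the PCRE engine (`perl = TRUE`) is actually faster on 800-word alternations would need to be timed on a sample of the real data.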



**Main code:**

library("parallel")
library("stringr")

preprocessText<-function(x){
  
  # Replace the 'html-and'
  arguments<-list(pattern="\\&\\;",replacement="and",x=x, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)  
  
  # Remove some special characters
   
arguments<-list(pattern="[^-[:alnum:]\\'\\:\\/\\$\\%\\.\\,\\+\\-\\#\\@\\_\\!\\?+[:space:]]",replacement="",x=y,
 ignore.case=TRUE)
  y<-do.call(gsub, arguments)  
  
  # Lowercase 
  arguments<-list(string=y,pattern=tolower(rep_ticker))
  first<-do.call(str_match,arguments)  
  
  # Identify signal words and count them
  # Need to be done in parts, because otherwise R can't handle this many at
  # once
  arguments<-list(string=x, pattern=rep_words_part1)
  t1<-do.call(str_extract_all,arguments)
   
  arguments<-list(string=x, pattern=rep_words_part2)
  t2<-do.call(str_extract_all,arguments)
  
  arguments<-list(string=x, pattern=rep_words_part3)
  t3<-do.call(str_extract_all,arguments)
  
  arguments<-list(string=x, pattern=rep_words_part4)
  t4<-do.call(str_extract_all,arguments)
  
  count=length(t1[[1]])+length(t2[[1]])+length(t3[[1]])+length(t4[[1]])
  signal_words=c(t1[[1]],t2[[1]],t3[[1]],t4[[1]])
  

  # Replacements
  
  arguments<-list(pattern=rep_wordsA,replacement="[ticker]",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments) 
   
  arguments<-list(pattern=rep_wordB_part1,replacement="[ticker] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   

  arguments<-list(pattern=rep_wordB_part2,replacement="[ticker] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   

  arguments<-list(pattern=rep_wordB_part3,replacement="[ticker2] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   

  arguments<-list(pattern=rep_wordB_part4,replacement="[ticker2] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
  
  arguments<-list(pattern=rep_email,replacement=" [email_address] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
  
  arguments<-list(pattern=rep_url,replacement=" [url] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
 
  arguments<-list(pattern=rep_wordC,replacement=" [wordC] ",x=y, 
ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
  
  # Some regular expressions
  arguments<-list(pattern="\\+[[:digit:]]*.?[[:digit:]]+%",
                  replacement=" [positive_percentage] ",x=y, ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
  
  arguments<-list(pattern="-[[:digit:]]*.?[[:digit:]]+%",
                  replacement=" [negative_percentage] ",x=y, ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
  
  arguments<-list(pattern="[[:digit:]]*.?[[:digit:]]+%",
                  replacement=" [percentage] ",x=y, ignore.case=TRUE)
  y<-do.call(gsub, arguments)   
  
  arguments

[R] SNPRelate- problem performing PCA

2013-11-06 Thread Danica Fabrigar
Hi, 
I have a dataset containing approximately 2 million SNPs. The data is in PLINK 
format. The data conversion was successful except for a warning message "Nan 
introduced due to coercion". My problem is that I get an error message when I 
perform a PCA:

"Principal Component Analysis (PCA) on SNP genotypes:
Removing 67 non-autosomal SNPs.
Error in snpgdsPCA(genofile) : There is no SNP!"


Here is my script:
#load BED files
bed2L <- "/home/2L_hwe_cleaned.bed"
bim2L <- "/home/2L_hwe_cleaned.bim"
fam2L <- "/home/2L_hwe_cleaned.fam"

#convert format
snpgdsBED2GDS(bed2L, bim2L, fam2L, "chr2L")

#open file
genofile <- openfn.gds("chr2L")

#perform pca
pca <- snpgdsPCA(genofile)


I would appreciate some help.
Thanks,
Danica


Re: [R] Basic question: why does a scatter plot of a variable against itself works like this?

2013-11-06 Thread William Dunlap
It probably happens because plot(formula) makes one call to terms(formula) to
analyze the formula.  terms() says there is one variable in the formula,
the response, so plot(x~x) is the same as plot(seq_along(x), x).
If you give it plot(~x), terms() also says there is one variable, but
no response, so you get the same plot as plot(x, rep(1,length(x))).
This is also the reason that plot(y1+y2 ~ x1+x2) makes one plot of the sum of
y1 and y2 for each term on the right side instead of 4 plots: plot(x1,y1),
plot(x1,y2), plot(x2,y1), and plot(x2,y2).

One could write a plot function that called terms separately on the left and
right sides of the formula.
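A rough sketch of both points, under stated assumptions: the first lines check that the formula machinery sees a single variable in x ~ x, and the helper `xy_from_formula()` is a hypothetical illustration (not part of base R or of plot()) of evaluating the two sides of a formula separately.

```r
x <- c(1, 2, 9)

# The formula x ~ x mentions only one distinct variable ...
vars <- all.vars(x ~ x)

# ... so the model frame has one column, while I(x) counts as a second term.
n_same <- ncol(model.frame(x ~ x))
n_asis <- ncol(model.frame(x ~ I(x)))

# Hypothetical helper: evaluate the left and right sides independently.
xy_from_formula <- function(formula) {
  env <- environment(formula)
  list(x = eval(formula[[3]], env),  # right-hand side
       y = eval(formula[[2]], env))  # left-hand side
}

xy <- xy_from_formula(x ~ x)
```

With this, plot(xy$x, xy$y) would draw x against itself. The helper only handles simple two-sided formulas and ignores any data argument.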

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Tal Galili
> Sent: Wednesday, November 06, 2013 8:40 AM
> To: r-help@r-project.org
> Subject: [R] Basic question: why does a scatter plot of a variable against 
> itself works like
> this?
> 
> Hello all,
> 
> I just noticed the following behavior of plot:
> x <- c(1,2,9)
> plot(x ~ x) # this is just like doing:
> plot(x)
> # when maybe we would like it to give this:
> plot(x ~ c(x))
> # the same as:
> plot(x ~ I(x))
> 
> I was wondering if there is some reason for this behavior.
> 
> 
> Thanks,
> Tal
> 
> 
> 
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> --
> 
>   [[alternative HTML version deleted]]
> 


Re: [R] Basic question: why does a scatter plot of a variable against itself works like this?

2013-11-06 Thread Marc Schwartz

On Nov 6, 2013, at 10:40 AM, Tal Galili  wrote:

> Hello all,
> 
> I just noticed the following behavior of plot:
> x <- c(1,2,9)
> plot(x ~ x) # this is just like doing:
> plot(x)
> # when maybe we would like it to give this:
> plot(x ~ c(x))
> # the same as:
> plot(x ~ I(x))
> 
> I was wondering if there is some reason for this behavior.
> 
> 
> Thanks,
> Tal


Hi Tal,

In your example:

  plot(x ~ x)

the formula method of plot() is called, which essentially does the following 
internally:

> model.frame(x ~ x)
  x
1 1
2 2
3 9

Note that there is only a single column in the result. Thus, the plot is based 
upon 'y' = c(1, 2, 9), while 'x' = 1:3, which is NOT the row names for the 
resultant data frame, but the indices of the vector elements in the 'x' column. 

This is just like:

  plot(c(1, 2, 9))


On the other hand:

> model.frame(x ~ c(x))
  x c(x)
1 1    1
2 2    2
3 9    9

> model.frame(x ~ I(x))
  x I(x)
1 1    1
2 2    2
3 9    9


In both of the above cases, you get two columns of data back, thus the result 
is essentially:

  plot(c(1, 2, 9), c(1, 2, 9))


Regards,

Marc Schwartz



Re: [R] Error message glmer using R: “ 'what' must be a character string or a function”

2013-11-06 Thread William Dunlap
> You can reproduce the problem by having a data.frame (or anything else) in 
> your
> environment:

I left out "called 'new'" in the above statement.  The example is correct.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of William Dunlap
> Sent: Wednesday, November 06, 2013 8:04 AM
> To: EmmaB; r-help@r-project.org
> Subject: Re: [R] Error message glmer using R: “ 'what' must be a character 
> string or a
> function”
> 
> You can reproduce the problem by having a data.frame (or anything else) in 
> your
> environment:
> 
> new <- data.frame(ï..VAR1 = rep(c(TRUE,NA,FALSE), c(10,2,8)),
> random=rep(1:3,len=20), clustno=rep(c(1:5),len=20), 
> validatedRS6=rep(0:1,len=20))
> model1<- glmer(validatedRS6 ~ random + (1|clustno), data=new, 
> family=binomial(),
> nAGQ=3)
> # Error in do.call(new, c(list(Class = "glmResp", family = family), 
> ll[setdiff(names(ll),  :
> #  'what' must be a character string or a function
> 
> The problem is in the call
>do.call(new, list())
> It finds your dataset 'new' (in .GlobalEnv), which is not a function or the 
> name of a
> function,
> not the function 'new' from the 'methods' package.  Rename your dataset, so 
> you do not
> have anything called 'new' masking the one in package:methods, and things 
> should work.
> 
> Write to the maintainer of the package (use maintainer("lme4") for the 
> address) about
> the
> problem.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> > Behalf
> > Of EmmaB
> > Sent: Tuesday, November 05, 2013 4:48 PM
> > To: r-help@r-project.org
> > Subject: Re: [R] Error message glmer using R: “ 'what' must be a character 
> > string or a
> > function”
> >
> > > str(new)
> > 'data.frame':   1214 obs. of  4 variables:
> >  $ ï..VAR1 : logi  NA NA NA NA NA NA ...
> >  $ random  : int  1 1 1 1 1 1 1 1 1 1 ...
> >  $ clustno : int  1 1 1 1 1 1 1 1 1 1 ...
> >  $ validatedRS6: int  0 0 0 0 0 0 0 0 0 0 ...
> >
> >
> >
> > --
> > View this message in context: 
> > http://r.789695.n4.nabble.com/Error-message-glmer-
> > using-R-what-must-be-a-character-string-or-a-function-tp4679829p4679836.html
> > Sent from the R help mailing list archive at Nabble.com.
> >


[R] Basic question: why does a scatter plot of a variable against itself works like this?

2013-11-06 Thread Tal Galili
Hello all,

I just noticed the following behavior of plot:
x <- c(1,2,9)
plot(x ~ x) # this is just like doing:
plot(x)
# when maybe we would like it to give this:
plot(x ~ c(x))
# the same as:
plot(x ~ I(x))

I was wondering if there is some reason for this behavior.


Thanks,
Tal



Contact
Details:---
Contact me: tal.gal...@gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--



[R] resdiuals of random model estimated by plm function

2013-11-06 Thread alfonso . carfora

Hi all,


I have estimated a random-effects panel model using the plm function.

I have a question about the vector of residuals obtained from the
$residuals component of the fitted object.


example:

data("Produc", package = "plm")
zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,  
model="random", data = Produc, index = c("state","year"))


res<-zz$residuals # vector of the residuals.

Is the vector res the sum of the idiosyncratic (eit) and individual
(ui) components, or is it only the idiosyncratic (eit) component?


Thanks
Alfonso



Re: [R] convert one digit numbers to two digits one

2013-11-06 Thread Marc Schwartz
On Nov 6, 2013, at 10:25 AM, Alaios  wrote:

> Hi all,
> the following returns the hour and the minutes
> 
> paste(DataSet$TimeStamps[selectedInterval$start,4], 
> DataSet$TimeStamps[selectedInterval$start,5],sep=":")
> [1] "12:3"
> 
> the problem is that from these two I want to create a time stamp so 12:03. 
> The problem is that the number 3 is not converted to 03. Is there an easy way 
> when I have one digit integer to add a zero in the front? Two digits integers 
> are working fine so far, 12:19, or 12:45 would appear correctly
> 
> I would like to thank you in advance for your help
> 
> Regards
> Alex


This is an example where using ?sprintf gives you more control:

> sprintf("%02d:%02d", 12, 3)
[1] "12:03"

> sprintf("%02d:%02d", 9, 3)
[1] "09:03"


The syntax '%02d' tells sprintf to print the integer and pad with leading 
zeroes to two characters where needed.
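Because sprintf() is vectorised, the same format string can be applied to whole columns of hours and minutes. A minimal sketch with made-up vectors (`hours` and `minutes` are hypothetical stand-ins for the TimeStamps columns), with formatC() shown as an alternative for padding a single vector:

```r
# Hypothetical hour and minute columns.
hours   <- c(12L, 9L, 23L)
minutes <- c(3L, 45L, 0L)

# One call pads every element to two digits.
stamps <- sprintf("%02d:%02d", hours, minutes)

# formatC() pads a single vector in the same way.
padded_minutes <- formatC(minutes, width = 2, flag = "0")
```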

Regards,

Marc Schwartz



Re: [R] convert one digit numbers to two digits one

2013-11-06 Thread Bert Gunter
(Assuming I understand) tons of ways of doing this.

So I'll just point out the
?nchar
function, which you can use to count characters in your tail end and
paste a "0" if there's only one, e.g. via ifelse() .
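A small sketch of that idea, assuming the minutes are already available as a character vector (`mins` is a made-up example):

```r
# Hypothetical minute values as character strings.
mins <- c("3", "19", "45")

# Prepend "0" only where the string has a single character.
padded <- ifelse(nchar(mins) == 1, paste0("0", mins), mins)

# Combine with an hour to form the time stamps.
stamps <- paste("12", padded, sep = ":")
```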

-- Bert

On Wed, Nov 6, 2013 at 8:25 AM, Alaios  wrote:
> Hi all,
> the following returns the hour and the minutes
>
> paste(DataSet$TimeStamps[selectedInterval$start,4], 
> DataSet$TimeStamps[selectedInterval$start,5],sep=":")
> [1] "12:3"
>
> the problem is that from these two I want to create a time stamp so 12:03. 
> The problem is that the number 3 is not converted to 03. Is there an easy way 
> when I have one digit integer to add a zero in the front? Two digits integers 
> are working fine so far, 12:19, or 12:45 would appear correctly
>
> I would like to thank you in advance for your help
>
> Regards
> Alex



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374



[R] Remove from the mailing list

2013-11-06 Thread Mario Garrido
Dear administrators,
I tried to stop receiving mail from your list at my current address
(gaiarr...@usal.es). I tried to do it through your webpage but was not
able to. I would appreciate being removed from the mailing list.

Thanks in advance.


-- 
Mario Garrido Escudero
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca.
Agrícola
Fac. de Farmacia
Campus Unamuno
Universidad de Salamanca

gaiarr...@usal.es



Re: [R] convert one digit numbers to two digits one

2013-11-06 Thread David Winsemius

On Nov 6, 2013, at 8:25 AM, Alaios wrote:

> Hi all,
> the following returns the hour and the minutes
> 
> paste(DataSet$TimeStamps[selectedInterval$start,4], 
> DataSet$TimeStamps[selectedInterval$start,5],sep=":")
> [1] "12:3"
> 
> the problem is that from these two I want to create a time stamp so 12:03. 
> The problem is that the number 3 is not converted to 03. Is there an easy way 
> when I have one digit integer to add a zero in the front? Two digits integers 
> are working fine so far, 12:19, or 12:45 would appear correctly
> 

?sprintf  # other options are linked from that page.

> 
>   [[alternative HTML version deleted]]
Sigh.
-- 

David Winsemius
Alameda, CA, USA



[R] convert one digit numbers to two digits one

2013-11-06 Thread Alaios
Hi all,
the following returns the hour and the minutes

paste(DataSet$TimeStamps[selectedInterval$start,4], 
DataSet$TimeStamps[selectedInterval$start,5],sep=":")
[1] "12:3"

the problem is that from these two I want to create a time stamp so 12:03. The 
problem is that the number 3 is not converted to 03. Is there an easy way when 
I have one digit integer to add a zero in the front? Two digits integers are 
working fine so far, 12:19, or 12:45 would appear correctly

I would like to thank you in advance for your help

Regards
Alex


Re: [R] WriteBin problem

2013-11-06 Thread Jeff Newmiller
Sorry, Carl, but you missed the boat on both responses.

Using readBin to read what they wrote won't help the OP if they don't 
understand what they are writing.  Nor is byte 0 the EOF marker on any 
operating system I have ever used. (It does happen to be the string terminator 
for in-memory strings in the C language, and it is actually not unusual to find 
NUL-termination used in strings that are stored in binary files. The OP does 
seem to be confused about the difference between binary files and text files, 
both of which can be affected by the disk format, CPU type, and operating 
system that are in use.)

I recommend that the OP use raw vectors (see ?raw) if they want to read and 
write binary data. I also recommend using the hexView package to manipulate the 
contents of raw vectors.
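A minimal base-R sketch of working with raw vectors for this case (no hexView calls are shown, and the variable names are made up): charToRaw() gives the bytes of the text alone, while writeBin() on a character string appends the terminating zero byte that the original poster saw.

```r
# Bytes of the characters only.
bytes_text <- charToRaw("100")

# writeBin() writes a C-style string, so a trailing 00 byte is added.
bytes_bin <- writeBin("100", raw())

# Dropping the zero byte recovers the plain character codes.
payload <- bytes_bin[bytes_bin != as.raw(0)]
text    <- rawToChar(payload)
```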
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Carl Witthoft  wrote:
>
>First of all,  use "readBin" to verify you get the desired data back.  
>Second, that '00' is, I believe the  character you'll find at the
>end
>of any file.
>
>
>Harutyun Khachatryan wrote
>> Dear R project officials,
>> 
>> I have found that in R 3.0.1 version "writeBin" function of "base"
>package
>> might not work correctly. For command writeBin("100",raw()) it
>answers "31
>> 30 30 00" the last double 0 is differs from
>> http://www.branah.com/ascii-converter there ascii codes are "31 30
>30". So
>> is it normal having double 0-s after ascii codes and what it means? 
>
>
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/WriteBin-problem-tp4679853p4679855.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Error message glmer using R: “ 'what' must be a character string or a function”

2013-11-06 Thread William Dunlap
You can reproduce the problem by having a data.frame (or anything else) in your
environment:

new <- data.frame(ï..VAR1 = rep(c(TRUE,NA,FALSE), c(10,2,8)), 
random=rep(1:3,len=20), clustno=rep(c(1:5),len=20), 
validatedRS6=rep(0:1,len=20))
model1<- glmer(validatedRS6 ~ random + (1|clustno), data=new, 
family=binomial(), nAGQ=3)
# Error in do.call(new, c(list(Class = "glmResp", family = family), 
ll[setdiff(names(ll),  : 
#  'what' must be a character string or a function

The problem is in the call
   do.call(new, list())
It finds your dataset 'new' (in .GlobalEnv), which is not a function or the
name of a function, not the function 'new' from the 'methods' package.
Rename your dataset, so you do not have anything called 'new' masking the one
in package:methods, and things should work.

Write to the maintainer of the package (use maintainer("lme4") for the
address) about the problem.
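The masking can be shown without lme4. In this sketch (the data frame contents and result names are made up), passing the bare symbol new to do.call() picks up the data frame, whereas a character string or an explicit methods::new is resolved to the function:

```r
library(methods)

# A data frame called 'new' hides methods::new when 'new' is looked up as a value.
new <- data.frame(a = 1)

# The bare symbol evaluates to the data frame, so do.call() fails.
err <- tryCatch(do.call(new, list("numeric")),
                error = function(e) conditionMessage(e))

# A string is looked up as a function name, skipping the data frame.
ok_string <- do.call("new", list("numeric"))

# Naming the namespace explicitly avoids the lookup problem altogether.
ok_qualified <- do.call(methods::new, list("numeric"))

rm(new)
```

Renaming the data frame, as suggested above, remains the simplest workaround from the user's side.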

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of EmmaB
> Sent: Tuesday, November 05, 2013 4:48 PM
> To: r-help@r-project.org
> Subject: Re: [R] Error message glmer using R: “ 'what' must be a character 
> string or a
> function”
> 
> > str(new)
> 'data.frame':   1214 obs. of  4 variables:
>  $ ï..VAR1 : logi  NA NA NA NA NA NA ...
>  $ random  : int  1 1 1 1 1 1 1 1 1 1 ...
>  $ clustno : int  1 1 1 1 1 1 1 1 1 1 ...
>  $ validatedRS6: int  0 0 0 0 0 0 0 0 0 0 ...
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Error-message-glmer-
> using-R-what-must-be-a-character-string-or-a-function-tp4679829p4679836.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] CRAN mirror for R in India: new one at WBUT, how do we get listed in the CRAN website?

2013-11-06 Thread Uwe Ligges

See

http://cran.r-project.org/mirror-howto.html

Best,
Uwe Ligges

On 06.11.2013 13:34, Abhinav Kashyap wrote:



[[alternative HTML version deleted]]



Re: [R] WriteBin problem

2013-11-06 Thread Carl Witthoft

First of all, use "readBin" to verify you get the desired data back.
Second, that '00' is, I believe, the EOF character you'll find at the end
of any file.


Harutyun Khachatryan wrote
> Dear R project officials,
> 
> I have found that in R 3.0.1 version "writeBin" function of "base" package
> might not work correctly. For command writeBin("100",raw()) it answers "31
> 30 30 00" the last double 0 is differs from
> http://www.branah.com/ascii-converter there ascii codes are "31 30 30". So
> is it normal having double 0-s after ascii codes and what it means? 





--
View this message in context: 
http://r.789695.n4.nabble.com/WriteBin-problem-tp4679853p4679855.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Function does not see variables outside the function

2013-11-06 Thread John Fox
Dear Zhong-Yuan Zhang,

R is lexically scoped. Pretending that you're using a different programming
language is probably a bad idea. 

The findGlobals() function in the codetools package, which is part of the
standard R distribution, can help you locate references to global variables
(and functions) in a function. For example,

> f <- function() g(a)

> findGlobals(f)
[1] "a" "g"

> ff <- function() {a <- 10; g(a)}

> findGlobals(ff)
[1] "{"  "<-" "g"

> fff <- function(a) g(a)

> findGlobals(fff)
[1] "g"
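For the misspelling scenario described in the question, the same function can be asked to report global variables separately from global functions via merge = FALSE. A small sketch, where the function `h` and the stray name `v` are made up for illustration:

```r
library(codetools)

# 'v' was typed where the argument 'x' was intended.
h <- function(x) {
  y <- x + 1
  y * v
}

# With merge = FALSE the result is a list with $functions and $variables.
globals     <- findGlobals(h, merge = FALSE)
stray_names <- globals$variables
```

Any name that turns up in the variables component and was not meant to be global is a candidate typo.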

I hope this helps,
 John

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Zhong-Yuan Zhang
> Sent: Wednesday, November 06, 2013 7:32 AM
> To: r-help@r-project.org
> Subject: Re: [R] Function does not see variables outside the function
> 
> Dear Experts:
> 
> I very much appreciate your comments and help!
> 
> Actually I am a newcomer from MATLAB. If a function can see global
> variables, then it may output wrong results without any error messages.
> For example, suppose there is a global variable named v, and I write a
> function with one local variable x. If, in some line, I misspell x as v,
> that would produce unexpected results without any warning.
> 
> In summary, I want to disable this ability to make debugging easier.
> 
> Best.
> 
> 
> 2013/11/5 Carl Witthoft 
> 
> > Why would you want to impose this restriction?  Perhaps if you
> explain what
> > you are trying to do, we can suggest approaches that will satisfy
> your
> > specific needs.
> > (note- one can always redefine whatever variables are to be
> "excluded."
> > E.g.
> > to keep the body of a function from referring to 'foo' in the calling
> > environment, just add the line 'foo<-NA' inside the function)
> >
> >
> > Zhong-Yuan Zhang wrote
> > >  In MATLAB, functions cannot see variables outside the
> > >
> > > functions.  However, in R, the functions can do that. Is there
> > >
> > > any settings that can disable this ability of functions?
> > >
> > >
> >
> >
> >
> >
> >
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/Function-does-not-see-variables-
> outside-the-function-tp4679762p4679768.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> >
> 
> 
> 
> --
> Zhong-Yuan Zhang (PhD.)
> Associate Professor
> School of Statistics
> Central University of Finance and Economics
> 39 South College Road, Haidian District, Beijing, P.R.China 100081
> Email: zhyua...@gmail.com
> Homepage: http://en.stat.cufe.edu.cn/zhongyuanzhang/
> 
>   [[alternative HTML version deleted]]
> 


Re: [R] variable standardization in manova() call

2013-11-06 Thread Michael Friendly

On 11/4/2013 10:45 AM, Sergio Fonda wrote:

Hi,
I'm not able to get information about the following question:

is the variables standardization a default option in manova() (stats package)?
Or if you want to compare variables with different units or scales and
rather different variances, you have to previously standardize the
variables ?



If you mean the response variables, manova() does not require equal
variances and does not standardize.
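A short illustration, using the built-in iris data rather than the poster's data (the choice of variables and the factor of 10 are arbitrary assumptions): rescaling one response changes its coefficients by the same factor, which shows that nothing is standardized, while the multivariate test statistic is unaffected by the change of units.

```r
# Fit on the original scale (centimetres).
fit_raw <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

# Same model with one response converted to millimetres.
iris_mm    <- transform(iris, Sepal.Length = 10 * Sepal.Length)
fit_scaled <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris_mm)

# The coefficients for the rescaled response grow by the same factor ...
coef_ratio <- coef(fit_scaled)[, "Sepal.Length"] / coef(fit_raw)[, "Sepal.Length"]

# ... but the Pillai trace for Species does not change.
pillai_raw    <- summary(fit_raw)$stats["Species", "Pillai"]
pillai_scaled <- summary(fit_scaled)$stats["Species", "Pillai"]
```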


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [R] grnn issue

2013-11-06 Thread Cyril Auburtin
Could you please approve this question and let it be sent in R_help?

addendum on the question: it also fails when using a data.frame in guess,
so I have no clue

> guess(grnn, data.frame(x1=c(2), x2=c(4)))
Error in (X - Xa) %*% t(X - Xa) :
  requires numeric/complex matrix/vector arguments



2013/11/6 Cyril Auburtin 

> I'm trying grnn package, and reproduced the example (
> http://cran.r-project.org/web/packages/grnn/grnn.pdf), I tried the
> example with another x input column in the dataset:
>
> but I'm getting the following error  "Error in Ya * patterns1 :
> non-conformable arrays", though I took care to pass an input of length 2
>
> n <- 100
> set.seed(1)
>
> x1 <- runif(n, -2, 2)
> x2 = x1^2
> y0 <- x1 * x2
>
> epsilon <- rnorm(n, 0, .1)
> y <- y0 + epsilon
> grnn <- learn(data.frame(y,x1, x2))
> grnn <- smooth(grnn,sigma=0.1)
> guess(grnn, c(2,4))
>
> # Error in Ya * patterns1 : non-conformable arrays
>



Re: [R] WriteBin problem

2013-11-06 Thread Duncan Murdoch

On 06/11/2013 3:35 AM, Harutyun Khachatryan wrote:

Dear R project officials,

I have found that in R 3.0.1 version "writeBin" function of "base" package might not work correctly. For 
command writeBin("100",raw()) it answers "31 30 30 00" the last double 0 is differs from 
http://www.branah.com/ascii-converter there ascii codes are "31 30 30". So is it normal having double 0-s after ascii 
codes and what it means?



From ?writeBin:

"|readBin| and |writeBin| read and write C-style zero-terminated 
character strings."
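A brief sketch of what that means in practice (the variable names are made up): the terminator written by writeBin() is what lets readBin() find the end of the string again, and writeChar() with eos = NULL is one way to get the bytes without it.

```r
# Write a zero-terminated string into a raw vector.
z <- writeBin("100", raw())

# readBin() reads up to the terminator and returns the original string.
back <- readBin(z, what = character(), n = 1)

# writeChar() with eos = NULL omits the terminator.
no_nul <- writeChar("100", raw(), eos = NULL)
```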


Duncan Murdoch



Re: [R] Function does not see variables outside the function

2013-11-06 Thread Zhong-Yuan Zhang
Dear Experts:

I very much appreciate your comments and help!

Actually I am a newcomer from MATLAB. If a function can see global
variables, then it may output wrong results without any error messages.
For example, suppose there is a global variable named v, and I write a
function with one local variable x. If, in some line, I misspell x as v,
that would produce unexpected results without any warning.

In summary, I want to disable this ability to make debugging easier.

Best.


2013/11/5 Carl Witthoft 

> Why would you want to impose this restriction?  Perhaps if you explain what
> you are trying to do, we can suggest approaches that will satisfy your
> specific needs.
> (note- one can always redefine whatever variables are to be "excluded."
> E.g.
> to keep the body of a function from referring to 'foo' in the calling
> environment, just add the line 'foo<-NA' inside the function)
>
>
> Zhong-Yuan Zhang wrote
> >  In MATLAB, functions cannot see variables outside the
> >
> > functions.  However, in R, the functions can do that. Is there
> >
> > any settings that can disable this ability of functions?



-- 
Zhong-Yuan Zhang (PhD.)
Associate Professor
School of Statistics
Central University of Finance and Economics
39 South College Road, Haidian District, Beijing, P.R.China 100081
Email: zhyua...@gmail.com
Homepage: http://en.stat.cufe.edu.cn/zhongyuanzhang/



[R] CRAN mirror for R in India: new one at WBUT, how do we get listed in the CRAN website?

2013-11-06 Thread Abhinav Kashyap


[[alternative HTML version deleted]]



Re: [R] Fraud and Anomaly detection packages in R...

2013-11-06 Thread Baro
Take a look at the zoo and outliers packages; they may help you.


On Wed, Nov 6, 2013 at 3:00 AM, Hossam Hassanien  wrote:

> Hello,
>
> hope all are well. I just wanted some help with finding some of the
> packages that could be used for Fraud and Anomaly detection purposes. I
> would be more than thankful if some packages might be referenced which
> could be in the telecommunications industry.
>
> Regards,
> Hossam Hassanien
>
> ---
> *Hossam El-Din Hassanien Mohammed*
>



[R] Fraud and Anomaly detection packages in R...

2013-11-06 Thread Hossam Hassanien
Hello,

hope all are well. I just wanted some help with finding some of the
packages that could be used for Fraud and Anomaly detection purposes. I
would be more than thankful if some packages might be referenced which
could be in the telecommunications industry.

Regards,
Hossam Hassanien

---
*Hossam El-Din Hassanien Mohammed*



[R] WriteBin problem

2013-11-06 Thread Harutyun Khachatryan
Dear R project officials,

I have found that in R 3.0.1 the "writeBin" function of the "base" package
might not work correctly. For the command writeBin("100",raw()) it returns
"31 30 30 00"; the trailing 00 differs from
http://www.branah.com/ascii-converter, where the ASCII codes are "31 30 30".
So is it normal to have a 00 after the ASCII codes, and what does it mean?

Thank you in advance.
Regards, Harutyun Khachatryan.



[R] hidden functions

2013-11-06 Thread Mikis Stasinopoulos
Is there a way which I can use a hidden function say f() of a package say 
"foo", (that is, f()is non exported in the NAMESPACE of "foo")  within the 
package "foo" without using foo:::f()

Mikis
   
Prof Mikis Stasinopoulos
d.stasinopou...@londonmet.ac.uk







Re: [R] Help on error (Error: could not find function "kernelUD")

2013-11-06 Thread jwd
On Tue, 5 Nov 2013 16:22:26 -0700
"Angela Dwyer"  wrote:

You didn't forget to load the library did you?  The bit of output you
provide doesn't show a "library(adehabitat)" line.  That needs to be run
before the function can be found.
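
For reference, a minimal sketch of the check. It uses the tools package (which ships with R) and its toTitleCase function as stand-ins, on the assumption that adehabitat may not be installed on every machine; the same pattern applies with library(adehabitat) and kernelUD.

```r
# A function from an installed but unattached package can be called by its
# bare name only after library() puts the package on the search path.
library(tools)                               # stand-in for library(adehabitat)

attached <- "package:tools" %in% search()    # TRUE once the package is attached
found <- exists("toTitleCase")               # stand-in for exists("kernelUD")
print(c(attached = attached, found = found))

toTitleCase("home range")                    # the call now resolves
```

If library() itself fails with "there is no package called ...", the package still has to be installed first.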

JWDougherty



[R] help.start hangs

2013-11-06 Thread hoffmann

Hi,

After help.start() and the first jump into a package, the next jump
keeps trying endlessly. Then I go back to the R window, where '> Making
'packages.html' ... done' is displayed but no prompt. So I bring
back the prompt with ^C^C. Then, lo and behold, the prompt comes back and
the browser (Thunderbird) shows the desired package page.


Is there a remedy for this?


sessionInfo()

R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] C

attached base packages:
 [1] tools     tcltk     stats4    splines   parallel  datasets  compiler
 [8] graphics  grDevices stats     grid      utils     methods   base

other attached packages:
 [1] survival_2.37-4    spatial_7.3-7      rpart_4.1-3        nnet_7.3-7
 [5] mgcv_1.7-26        nlme_3.1-111       foreign_0.8-55     codetools_0.2-8
 [9] cluster_1.14.4     class_7.3-9        boot_1.3-9         Matrix_1.0-14
[13] MASS_7.3-29        KernSmooth_2.23-10 cwhmisc_4.2        lattice_0.20-23




Who is willing to give me advice?

TIA  -- Christian

--
Christian W. Hoffmann,
CH - 8915 Hausen am Albis, Switzerland
Rigiblickstrasse 15 b, Tel.+41-44-7640853
(!! c-w.hoffm...@sunrise.ch, )
mailto: christ...@echoffmann.ch
home: www.echoffmann.ch



Re: [R] ascii-grid export

2013-11-06 Thread Barry Rowlingson
On Mon, Nov 4, 2013 at 7:27 AM, Enzo Cocca  wrote:
> yes barry I really need this.
>
> I tried to use raster or rgdal but with poor results.
>
> I have this function:
>
> VGM_PARAM_A3 <- gstat(id="bos_bison",
> formula=combusto~1,locations=~coord_x+coord_y, data=archezoology_table,
> nmax = 10)
>
> VGM_PARAM_A3 <- gstat(VGM_PARAM_A3, model=vgm(1, "Sph", 5, 0),
> fill.all=TRUE)
>
> ESV_A3 <- variogram(VGM_PARAM_A3, map=True, with=0.1, cutoff=9)
>
> VARMODEL_A3 = fit.lmc(ESV_A3, VGM_PARAM_A3)
>
> plot(ESV_A3, threshold = 5, col.regions = bpy.colors(), xlab=, ylab=,
> main="Map - A3")
> png("C:\Users\User\pyarchinit_R_folder\A3 semivariogram_map.png",
> width=1, height=1, res=400)
>
> I make a png file but how can I convert it in ascii-grid?

Why do you want to make an ascii-grid out of this? The variogram map
isn't in geographical coordinates; it's in coordinate differences in x
and y.

The return value when map=TRUE doesn't seem to be too well documented,
but looks like it is a list with a 'map' element that is a spatial
pixels data frame. Here's an example using the demo data (I can't run
your code because I don't have your data, please try and make your
problems easily reproducible):

require(sp)
require(gstat)
data(meuse)
coordinates(meuse)=~x+y
v=variogram(log(zinc)~1, meuse,map=TRUE,cutoff=900,width=10)
class(v$map)
[1] "SpatialPixelsDataFrame"
attr(,"package")
[1] "sp"

Now that can be written using rgdal's writeGDAL function.

However, you need the AAIGrid driver to work properly:

> writeGDAL(v$map,"vmap.ag",driver="AAIGrid")
Error in .local(.Object, ...) : Dataset copy failed

raster package to the rescue:

> require(raster)
> writeRaster(raster(v$map),"v.asc","ascii")
class   : RasterLayer

[etc]

When I look at the file, I have an ESRI grid file:

$ head -5 v.asc
NCOLS 181
NROWS 181
XLLCORNER -905
YLLCORNER -905
CELLSIZE 10
[+ data]
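
For anyone curious what that file format amounts to, here is a minimal sketch of writing the same kind of header and data block by hand in base R. The function write_ascii_grid and its arguments are made up for illustration and are not part of raster or rgdal; in practice writeRaster() as above is the easier route.

```r
# Write a numeric matrix as an ESRI ASCII grid: six header lines, then one
# line of space-separated values per matrix row. The first matrix row is
# written first, so it is the top (northernmost) row of the grid.
write_ascii_grid <- function(m, file, xll, yll, cellsize, nodata = -9999) {
  stopifnot(is.matrix(m), is.numeric(m))
  header <- c(
    sprintf("NCOLS %d", ncol(m)),
    sprintf("NROWS %d", nrow(m)),
    sprintf("XLLCORNER %s", format(xll)),
    sprintf("YLLCORNER %s", format(yll)),
    sprintf("CELLSIZE %s", format(cellsize)),
    sprintf("NODATA_value %s", format(nodata))
  )
  m[is.na(m)] <- nodata                      # missing cells get the NODATA code
  rows <- apply(m, 1, paste, collapse = " ") # one text line per matrix row
  writeLines(c(header, rows), file)
  invisible(file)
}

# Tiny 2 x 3 example with one missing cell
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 2, byrow = TRUE)
out <- tempfile(fileext = ".asc")
write_ascii_grid(m, out, xll = -15, yll = -10, cellsize = 10)
grid_lines <- readLines(out)
cat(grid_lines, sep = "\n")
```

The lower-left corner and cell size are all the georeferencing the format carries, which is why the variogram map's offsets of -905 appear there as if they were map coordinates.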

Now, that's assuming you wanted to write the data, not a pretty image
picture like you get when you plot the variogram map. And there's
still the mystery of why you want to write a non-geographic
coordinate-based dataset to a geographic data format...

 And you should probably have asked this on R-sig-geo where the
geographRs (including the authors of gstat) hang out.

 Hope this helps anyway.

Barry
