Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread Jim Lemon
Perhaps if you go back to the example that I sent, you will notice
that those vectors of logical values (which_cols, which_rows) were
among the results. Have you tried:

names(X)[which_cols]

to see whether it is what you want?
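
For example, a quick sketch that rebuilds the toy objects from the earlier message (X and which_cols are assumed to be constructed as shown there):

```r
# Rebuild the example data frame and the logical column index, then
# use it to pull out the names of the all-zero columns.
X <- data.frame(A = c(0, 1:10), B = c(0, 2:10, 9),
                C = c(0, -1, 3:11), D = rep(0, 11))
which_cols <- vapply(X, function(x) all(x == 0), logical(1))
names(X)[which_cols]   # "D" - the only all-zero column
```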

Jim

On Thu, Mar 31, 2016 at 2:42 PM, Norman Pat  wrote:
> Hi Jim,
>
> I want to have such a thing
> names(X)[which_cols] where means=0
> then it should print all the features with zero mean
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread David Winsemius

> On Mar 30, 2016, at 6:39 PM, Norman Pat  wrote:
> 
> Hi David,
> 
> > Please find the  attached data sample.
> 
> No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.
> I attached it. Anyway I have attached it again (sample train.xlsx).

I didn't say you didn't attach it. I only said there was nothing attached. 
There's a difference. The mail-server strips most attachments. I _told_ you to 
read certain documents. You are not demonstrating that you are capable of 
following basic instructions. 

-- 
David Winsemius


> 
> Who is assigning you this task? Homework? (Read the Posting Guide.)
> This is my new job role so I have to do that. I know some basic R 
> 
> > 1. How to Identify features (names) that have all zeros?
> 
> That's generally pretty simple if "names" refers to columns in a data frame.
> You mean such as something like names(data.nrow(means==0))
> 
> > 2. How to remove features that have all zeros from the dataset?
> 
> But maybe you mean to process by rows?
> in a column(feature) 
> 
> > 3. How to identify features (names) that have outliers such as 9,-1 in
> > the data frame.
> Please refer to the attached excel file
> 
> > 4. How to remove outliers?
> 
> You could start by defining "outliers" in something other than vague 
> examples. If this is data from a real-life data gathering effort, then 
> defining outliers would start with an explanation of the context.
> By looking at data I need to find the outliers
> 
> Thanks 
> 
> 
> On Thu, Mar 31, 2016 at 12:20 PM, David Winsemius  
> wrote:
> 
> > On Mar 30, 2016, at 3:56 PM, Norman Pat  wrote:
> >
> > Hi team
> >
> > I am new to R so please help me to do this task.
> >
> > Please find the  attached data sample.
> 
> No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.
> 
> > But in the original data frame I
> > have 350 features and 40 observations.
> >
> > I need to carryout these tasks.
> 
> Who is assigning you this task? Homework? (Read the Posting Guide.)
> 
> > 1. How to Identify features (names) that have all zeros?
> 
> That's generally pretty simple if "names" refers to columns in a dataframe.
> 
> >
> > 2. How to remove features that have all zeros from the dataset?
> 
> But maybe you mean to process by rows?
> 
> 
> > 3. How to identify features (names) that have outliers such as 9,-1 in
> > the data frame.
> >
> > 4. How to remove outliers?
> 
> You could start by defining "outliers" in something other than vague 
> examples. If this is data from a real-life data gathering effort, then 
> defining outliers would start with an explanation of the context.
> 
> 
> >
> >
> > Many thanks
> 
> Please at least do the following "homework".
> 
> 
> David Winsemius
> Alameda, CA, USA
> 
> 
> 

David Winsemius
Alameda, CA, USA



Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread Norman Pat
Hi Jim,
Thanks for your reply. I know this basic stuff in R.

But here is what I want to know: say you have a data frame X with 300 features.
From those 300 features I need to pull out the name of each feature
that has zero values for all the observations in that sample.

Here I am looking for a package or a function to do that.

And how do I know whether there are abnormal values in each feature? Say
I have 300 features and 10 observations. It is hard to check everything
in the Excel file, so instead I am looking for a package that does the
work.

I hope you understood.

Thanks a lot

Cheers


On Thu, Mar 31, 2016 at 1:13 PM, Jim Lemon  wrote:

> Hi Norman,
> To check whether all values of an object (say "x") fulfill a certain
> condition (==0):
>
> all(x==0)
>
> If your object (X) is indeed a data frame, you can only do this by
> column, so if you want to get the results:
>
> X<-data.frame(A=c(0,1:10),B=c(0,2:10,9),
>  C=c(0,-1,3:11),D=rep(0,11))
> all_zeros<-function(x) return(all(x==0))
> which_cols<-unlist(lapply(X,all_zeros))
>
> If your data frame (or a subset) contains all numeric values, you can
> finesse the problem like this:
>
> which_rows<-apply(as.matrix(X),1,all_zeros)
>
> What you get is a list of logical (TRUE/FALSE) values from lapply, so
> it has to be unlisted to get a vector of logical values like you get
> with "apply".
>
> You can then use that vector to index (subset) the original data frame
> by logically inverting it with ! (NOT):
>
> X[,!which_cols]
> X[!which_rows,]
>
> Your "outliers" look suspiciously like missing values from certain
> statistical packages. If you know the values you are looking for, you
> can do something like:
>
> NA9<-X==9
>
> and then "remove" them by replacing those values with NA:
>
> X[NA9]<-NA
>
> Be aware that all these hackles (diminutive of hacks) are pretty
> specific to this example. Also remember that if this is homework, your
> karma has just gone down the cosmic sinkhole.
>
> Jim
>
>
> On Thu, Mar 31, 2016 at 9:56 AM, Norman Pat  wrote:
> > Hi team
> >
> > I am new to R so please help me to do this task.
> >
> > Please find the  attached data sample. But in the original data frame I
> > have 350 features and 40 observations.
> >
> > I need to carryout these tasks.
> >
> > 1. How to Identify features (names) that have all zeros?
> >
> > 2. How to remove features that have all zeros from the dataset?
> >
> > 3. How to identify features (names) that have outliers such as 9,-1
> in
> > the data frame.
> >
> > 4. How to remove outliers?
> >
> >
> > Many thanks
>




Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread Norman Pat
Hi David,

> Please find the  attached data sample.

No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.
*I attached it. Anyway I have attached it again (sample train.xlsx).*

Who is assigning you this task? Homework? (Read the Posting Guide.)
*This is my new job role, so I have to do it. I know some basic R.*

> 1. How to Identify features (names) that have all zeros?

That's generally pretty simple if "names" refers to columns in a data frame.
*You mean such as something like names(data.nrow(means==0))*

> 2. How to remove features that have all zeros from the dataset?

But maybe you mean to process by rows?
*in a column(feature) *

> 3. How to identify features (names) that have outliers such as 9,-1 in
> the data frame.
*Please refer to the attached excel file*

> 4. How to remove outliers?

You could start by defining "outliers" in something other than vague
examples. If this is data from a real-life data gathering effort, then
defining outliers would start with an explanation of the context.
*I need to find the outliers by looking at the data.*

*Thanks *


On Thu, Mar 31, 2016 at 12:20 PM, David Winsemius 
wrote:

>
> > On Mar 30, 2016, at 3:56 PM, Norman Pat  wrote:
> >
> > Hi team
> >
> > I am new to R so please help me to do this task.
> >
> > Please find the  attached data sample.
>
> No. Nothing attached. Please read the Rhelp Info page and the Posting
> Guide.
>
> > But in the original data frame I
> > have 350 features and 40 observations.
> >
> > I need to carryout these tasks.
>
> Who is assigning you this task? Homework? (Read the Posting Guide.)
>
> > 1. How to Identify features (names) that have all zeros?
>
> That's generally pretty simple if "names" refers to columns in a dataframe.
>
> >
> > 2. How to remove features that have all zeros from the dataset?
>
> But maybe you mean to process by rows?
>
>
> > 3. How to identify features (names) that have outliers such as 9,-1
> in
> > the data frame.
> >
> > 4. How to remove outliers?
>
> You could start by defining "outliers" in something other than vague
> examples. If this is data from a real-life data gathering effort, then
> defining outliers would start with an explanation of the context.
>
>
> >
> >
> > Many thanks
>
> Please at least do the following "homework".
>
>
> David Winsemius
> Alameda, CA, USA
>
>


Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread Jim Lemon
How about:

# if a data frame
names(X)[which_cols]

# and if you have rownames:
rownames(X)[which_rows]

My note about hackles was that packages generally don't know what
values are "abnormal" unless you specify them. Just like us. So you
have to specify what the range of "normal" values are, or what
specific values are "abnormal". There is a package named "outliers",
and while it would identify the 9 value in the example I used, it
wouldn't do so for the -1.
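
For instance, a base-R sketch of specifying the "normal" range yourself; the valid range c(0, 8) here is an assumption you must replace with your own:

```r
# Hypothetical data (same shape as the earlier example). Declare a valid
# range, flag everything outside it, and list the affected features.
X <- data.frame(A = c(0, 1:10), B = c(0, 2:10, 9),
                C = c(0, -1, 3:11), D = rep(0, 11))
valid_range <- c(0, 8)                    # an assumption; choose your own
abnormal <- X < valid_range[1] | X > valid_range[2]
names(X)[colSums(abnormal) > 0]           # features containing flagged values
```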

Jim


On Thu, Mar 31, 2016 at 1:30 PM, Norman Pat  wrote:
> Hi Jim,
> Thanks for your reply. I know these basic stuffs in R.
>
> But I want to know let say you have a data frame X with 300 features.
> From that 300 features I need to pullout the names of each feature
> that has zero values for all the observations in that sample.
>
> Here I am looking for a package or a function to do that.
>
> And how do I know whether there are abnormal values for each feature. Let
> say
> I have 300 features and 10 observations. It is hard to look everything
> in the excel file. Instead of that I am looking for a package that does the
> work.
>
> I hope you understood.
>
> Thanks a lot
>
> Cheers
>
>
> On Thu, Mar 31, 2016 at 1:13 PM, Jim Lemon  wrote:
>>
>> Hi Norman,
>> To check whether all values of an object (say "x") fulfill a certain
>> condition (==0):
>>
>> all(x==0)
>>
>> If your object (X) is indeed a data frame, you can only do this by
>> column, so if you want to get the results:
>>
>> X<-data.frame(A=c(0,1:10),B=c(0,2:10,9),
>>  C=c(0,-1,3:11),D=rep(0,11))
>> all_zeros<-function(x) return(all(x==0))
>> which_cols<-unlist(lapply(X,all_zeros))
>>
>> If your data frame (or a subset) contains all numeric values, you can
>> finesse the problem like this:
>>
>> which_rows<-apply(as.matrix(X),1,all_zeros)
>>
>> What you get is a list of logical (TRUE/FALSE) values from lapply, so
>> it has to be unlisted to get a vector of logical values like you get
>> with "apply".
>>
>> You can then use that vector to index (subset) the original data frame
>> by logically inverting it with ! (NOT):
>>
>> X[,!which_cols]
>> X[!which_rows,]
>>
>> Your "outliers" look suspiciously like missing values from certain
>> statistical packages. If you know the values you are looking for, you
>> can do something like:
>>
>> NA9<-X==9
>>
>> and then "remove" them by replacing those values with NA:
>>
>> X[NA9]<-NA
>>
>> Be aware that all these hackles (diminutive of hacks) are pretty
>> specific to this example. Also remember that if this is homework, your
>> karma has just gone down the cosmic sinkhole.
>>
>> Jim
>>
>>
>> On Thu, Mar 31, 2016 at 9:56 AM, Norman Pat  wrote:
>> > Hi team
>> >
>> > I am new to R so please help me to do this task.
>> >
>> > Please find the  attached data sample. But in the original data frame I
>> > have 350 features and 40 observations.
>> >
>> > I need to carryout these tasks.
>> >
>> > 1. How to Identify features (names) that have all zeros?
>> >
>> > 2. How to remove features that have all zeros from the dataset?
>> >
>> > 3. How to identify features (names) that have outliers such as 9,-1
>> > in
>> > the data frame.
>> >
>> > 4. How to remove outliers?
>> >
>> >
>> > Many thanks
>
>



Re: [R] ts or xts with high-frequency data within a year

2016-03-30 Thread William Dunlap via R-help
decompose() wants frequency(Y) to be more than 1 - really an integer
frequency - so it can return a vector of that length containing the
repeating pattern (the "figure").

frequency(Y) is 1/3600 so you get the error (which might be better worded):

  > plot(decompose(Y))
  Error in decompose(Y) : time series has no or less than 2 periods
  >  frequency(Y)
  [1] 0.000278

Use a ts object and make the frequency relative to the period of interest
(e.g., 24 for hourly data if you are interested in the daily pattern).
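
A minimal sketch with simulated hourly data (the numbers are illustrative only):

```r
# Two weeks of simulated hourly readings with a daily cycle.
set.seed(1)
hours <- 1:(24 * 14)
x <- 10 + sin(2 * pi * hours / 24) + rnorm(length(hours), sd = 0.2)
Y <- ts(x, frequency = 24)   # 24 observations per period of interest (a day)
d <- decompose(Y)
length(d$figure)             # 24: one value of the repeating pattern per hour
```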




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 30, 2016 at 5:37 PM, Ryan Utz  wrote:

> Bill, Josh, and Bert,
>
> Thanks for your responses. I still can't quite get this when I use actual
> dates. Here's an example of what is going wrong:
>
> X=as.data.frame(1:6000)
> X[2]=seq.POSIXt(ISOdate(2015,11,1),by='hour',length.out=6000)
> X[3]=sample(100,size=6000,replace=T)
>
> Y=xts(X[,3],order.by=X[,2])
> decompose(Y)
>
> Z=ts(X[,2],frequency=24*365)
> plot(decompose(Z))
>
> When I specify an actual date/time (rather than just a number as Bill
> posited), it does not like anything short of a year. This seems like I'm
> overlooking something obvious, but I can't get this for the life of me...
>
> Thanks for your time,
> r
>
>
> On Wed, Mar 30, 2016 at 1:03 PM, William Dunlap  wrote:
>
>> You said you specified frequency=96 when you constructed the time
>> series, but when I do that the decomposition looks reasonable:
>>
>> > time <- seq(0,9,by=1/96) # 15-minute intervals, assuming time unit is
>> day
>> > measurement <- sqrt(time) + 1/(1.2+sin(time*2*pi)) +
>> rnorm(length(time),0,.3)
>> > plot(decompose(ts(measurement, frequency=96)))
>>
>> How is your code different from the above?
>>
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Wed, Mar 30, 2016 at 8:03 AM, Ryan Utz  wrote:
>>
>>> Hello,
>>>
>>> I have a time series that represents data sampled every 15-minutes. The
>>> data currently run from November through February, 8623 total readings.
>>> There are definitely daily periodic trends and non-stationary long-term
>>> trends. I would love to decompose this using basic time series analysis.
>>>
>>> However, every time I attempt decomposition, I get the
>>>
>>> Error in decompose( ) : time series has no or less than 2 periods
>>>
>>> Is it only possible to do basic time-series analysis if you have a year
>>> or
>>> more worth of data? That seems absurd to me, since there is definite
>>> periodicity and the data are a time series. I have tried every manner of
>>> specifying frequency= with no luck (96 does not work). All manner of
>>> searching for help has turned up fruitless.
>>>
>>> Can I only do this after I wait another year or two?
>>>
>>> Thanks,
>>> Ryan
>>>
>>> --
>>>
>>> Ryan Utz, Ph.D.
>>> Assistant professor of water resources
>>> *chatham**UNIVERSITY*
>>> Home/Cell: (724) 272-7769
>>>
>>> [[alternative HTML version deleted]]
>>>
>>>
>>
>>
>
>
> --
>
> Ryan Utz, Ph.D.
> Assistant professor of water resources
> *chatham**UNIVERSITY*
> Home/Cell: (724) 272-7769
>
>




Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread Jim Lemon
Hi Norman,
To check whether all values of an object (say "x") fulfill a certain
condition (==0):

all(x==0)

If your object (X) is indeed a data frame, you can only do this by
column, so if you want to get the results:

X<-data.frame(A=c(0,1:10),B=c(0,2:10,9),
 C=c(0,-1,3:11),D=rep(0,11))
all_zeros<-function(x) return(all(x==0))
which_cols<-unlist(lapply(X,all_zeros))

If your data frame (or a subset) contains all numeric values, you can
finesse the problem like this:

which_rows<-apply(as.matrix(X),1,all_zeros)

What you get is a list of logical (TRUE/FALSE) values from lapply, so
it has to be unlisted to get a vector of logical values like you get
with "apply".

You can then use that vector to index (subset) the original data frame
by logically inverting it with ! (NOT):

X[,!which_cols]
X[!which_rows,]

Your "outliers" look suspiciously like missing values from certain
statistical packages. If you know the values you are looking for, you
can do something like:

NA9<-X==9

and then "remove" them by replacing those values with NA:

X[NA9]<-NA

Be aware that all these hackles (diminutive of hacks) are pretty
specific to this example. Also remember that if this is homework, your
karma has just gone down the cosmic sinkhole.
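
Tying the steps above together, one self-contained sketch (same toy data; a sketch, not a general recipe):

```r
# Recode the sentinel values 9 and -1 to NA, then drop incomplete rows
# and all-zero columns in one pass.
X <- data.frame(A = c(0, 1:10), B = c(0, 2:10, 9),
                C = c(0, -1, 3:11), D = rep(0, 11))
X[X == 9 | X == -1] <- NA            # recode both sentinel values at once
keep_cols <- !vapply(X, function(x) all(x == 0, na.rm = TRUE), logical(1))
X_clean <- X[complete.cases(X), keep_cols]  # drop NA rows, all-zero cols
dim(X_clean)                                # 7 rows, 3 columns here
```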

Jim


On Thu, Mar 31, 2016 at 9:56 AM, Norman Pat  wrote:
> Hi team
>
> I am new to R so please help me to do this task.
>
> Please find the  attached data sample. But in the original data frame I
> have 350 features and 40 observations.
>
> I need to carryout these tasks.
>
> 1. How to Identify features (names) that have all zeros?
>
> 2. How to remove features that have all zeros from the dataset?
>
> 3. How to identify features (names) that have outliers such as 9,-1 in
> the data frame.
>
> 4. How to remove outliers?
>
>
> Many thanks



Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread Jordan Meyer
I strongly suggest checking out some R tutorials. Most of these tasks are
basic data management that are likely covered in just about any tutorial.
I'm afraid that this isn't the appropriate forum for such basics.
On Mar 30, 2016 9:14 PM, "Norman Pat"  wrote:

> Hi team
>
> I am new to R so please help me to do this task.
>
> Please find the  attached data sample. But in the original data frame I
> have 350 features and 40 observations.
>
> I need to carryout these tasks.
>
> 1. How to Identify features (names) that have all zeros?
>
> 2. How to remove features that have all zeros from the dataset?
>
> 3. How to identify features (names) that have outliers such as 9,-1 in
> the data frame.
>
> 4. How to remove outliers?
>
>
> Many thanks
>




Re: [R] R how to find outliers and zero mean columns?

2016-03-30 Thread David Winsemius

> On Mar 30, 2016, at 3:56 PM, Norman Pat  wrote:
> 
> Hi team
> 
> I am new to R so please help me to do this task.
> 
> Please find the  attached data sample.

No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.

> But in the original data frame I
> have 350 features and 40 observations.
> 
> I need to carryout these tasks.

Who is assigning you this task? Homework? (Read the Posting Guide.)

> 1. How to Identify features (names) that have all zeros?

That's generally pretty simple if "names" refers to columns in a dataframe.

> 
> 2. How to remove features that have all zeros from the dataset?

But maybe you mean to process by rows?


> 3. How to identify features (names) that have outliers such as 9,-1 in
> the data frame.
> 
> 4. How to remove outliers?

You could start by defining "outliers" in something other than vague examples. 
If this is data from a real-life data gathering effort, then defining outliers 
would start with an explanation of the context.


> 
> 
> Many thanks

Please at least do the following "homework".


David Winsemius
Alameda, CA, USA



[R] R how to find outliers and zero mean columns?

2016-03-30 Thread Norman Pat
Hi team

I am new to R, so please help me with this task.

Please find the attached data sample. But in the original data frame I
have 350 features and 40 observations.

I need to carry out these tasks:

1. How to identify features (names) that have all zeros?

2. How to remove features that have all zeros from the dataset?

3. How to identify features (names) that have outliers such as 9, -1 in
the data frame.

4. How to remove outliers?


Many thanks


Re: [R] ts or xts with high-frequency data within a year

2016-03-30 Thread Ryan Utz
Bill, Josh, and Bert,

Thanks for your responses. I still can't quite get this when I use actual
dates. Here's an example of what is going wrong:

X=as.data.frame(1:6000)
X[2]=seq.POSIXt(ISOdate(2015,11,1),by='hour',length.out=6000)
X[3]=sample(100,size=6000,replace=T)

Y=xts(X[,3],order.by=X[,2])
decompose(Y)

Z=ts(X[,2],frequency=24*365)
plot(decompose(Z))

When I specify an actual date/time (rather than just a number as Bill
posited), it does not like anything short of a year. This seems like I'm
overlooking something obvious, but I can't get this for the life of me...

Thanks for your time,
r


On Wed, Mar 30, 2016 at 1:03 PM, William Dunlap  wrote:

> You said you specified frequency=96 when you constructed the time
> series, but when I do that the decomposition looks reasonable:
>
> > time <- seq(0,9,by=1/96) # 15-minute intervals, assuming time unit is day
> > measurement <- sqrt(time) + 1/(1.2+sin(time*2*pi)) +
> rnorm(length(time),0,.3)
> > plot(decompose(ts(measurement, frequency=96)))
>
> How is your code different from the above?
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Mar 30, 2016 at 8:03 AM, Ryan Utz  wrote:
>
>> Hello,
>>
>> I have a time series that represents data sampled every 15-minutes. The
>> data currently run from November through February, 8623 total readings.
>> There are definitely daily periodic trends and non-stationary long-term
>> trends. I would love to decompose this using basic time series analysis.
>>
>> However, every time I attempt decomposition, I get the
>>
>> Error in decompose( ) : time series has no or less than 2 periods
>>
>> Is it only possible to do basic time-series analysis if you have a year or
>> more worth of data? That seems absurd to me, since there is definite
>> periodicity and the data are a time series. I have tried every manner of
>> specifying frequency= with no luck (96 does not work). All manner of
>> searching for help has turned up fruitless.
>>
>> Can I only do this after I wait another year or two?
>>
>> Thanks,
>> Ryan
>>
>> --
>>
>> Ryan Utz, Ph.D.
>> Assistant professor of water resources
>> *chatham**UNIVERSITY*
>> Home/Cell: (724) 272-7769
>>
>> [[alternative HTML version deleted]]
>>
>>
>
>


-- 

Ryan Utz, Ph.D.
Assistant professor of water resources
*chatham**UNIVERSITY*
Home/Cell: (724) 272-7769




Re: [R] Convergence issues when using ns splines (pkg: spline) in Cox model (coxph) even when changing coxph.control

2016-03-30 Thread Göran Broström



On 2016-03-30 23:06, David Winsemius wrote:



On Mar 29, 2016, at 1:47 PM, Jennifer Wu, Miss
 wrote:

Hi,

I am currently using R v3.2.3 and on Windows 10 OS 64Bit.

I am having convergence issues when I use coxph with a interaction
term (glarg*bca_py) and interaction term with the restricted cubic
spline (glarg*bca_time_ns). I use survival and spline package to
create the Cox model and cubic splines respectively. Without the
interaction term and/or spline, I have no convergence problem. I
read some forums about changing the iterations and I have but it
did not work. I was just wondering if I am using the inter.max and
outer.max appropriately. I read the survival manual, other R-help
and stackoverflow pages and it suggested changing the iterations
but it doesn't specify what is the max I can go. I ran something
similar in SAS and did not run into a convergence problem.

This is my code:

bca_time_ns <- ns(ins_ca$bca_py, knots=3, Boundary.knots=range(2,5,10))
test <- ins_ca$glarg*ins_ca$bca_py
test1 <- ins_ca$glarg*bca_time_ns


In your `coxph` call the variable 'bca_py' is the survival time and


Right David: I didn't notice that the 'missing main effect' in fact was 
part of the survival object! And as you say: Time to rethink the whole 
model.


Göran


yet here you are constructing not just one but two interactions (one
of which is a vector but the other one a matrix) between 'glarg' and
your survival times. Is this some sort of effort to identify a
violation of proportionality over the course of a study?

Broström sagely points out that these interactions are not in the
data-object and subsequent efforts to refer to them may be confounded
by the multiple environments from which data would be coming into the
model. Better to have everything come in from the data-object.

The fact that SAS did not have a problem with this rather
self-referential or circular model may be a poor reflection on SAS
rather than on the survival package. Unlike Therneau or Broström who
asked for data, I suggest the problem lies with the model
construction and you should be reading what Therneau has written
about identification of non-proportionality and identification of
time dependence of effects. See Chapter 6 of his "Modeling Survival
Data".
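
As a side note on the quoted ns() call (a sketch, not part of the original exchange): in splines::ns(), `knots=` takes interior knot *locations*, not a count, and range(2,5,10) collapses to c(2,10), so the 5 there contributes nothing.

```r
library(splines)  # ships with base R
# knots = 5 places one interior knot at 5; the basis then has
# length(knots) + 1 = 2 columns.
b <- ns(seq(2, 10, length.out = 50), knots = 5, Boundary.knots = c(2, 10))
dim(b)             # 50 x 2
range(2, 5, 10)    # c(2, 10): the middle value is ignored
```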




Re: [R] Boosting Algorithm for Regression (Adaboost.R2)

2016-03-30 Thread Bert Gunter
https://cran.r-project.org/web/views/MachineLearning.html

What do you mean by "prediction interval"? I doubt that this has a clear
meaning in the boosting context. You might want to follow up this
statistical question on a statistics or machine-learning list such as
stats.stackexchange.com.
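
One common, if crude, workaround is an empirical interval built from residual quantiles around a point prediction. A base-R sketch (this is an assumption about what you need, and it is not Adaboost.R2 itself; the linear model is just a stand-in for any regressor):

```r
# Empirical ~90% interval from residual quantiles around a point prediction.
set.seed(42)
x <- runif(200)
y <- 2 * x + rnorm(200, sd = 0.3)
fit <- lm(y ~ x)                                  # stand-in for any regressor
res_q <- quantile(residuals(fit), c(0.05, 0.95))  # residual quantiles
pred <- predict(fit, newdata = data.frame(x = 0.5))
c(lower = pred + res_q[[1]], upper = pred + res_q[[2]])
```

This ignores model uncertainty, so treat it as a rough lower bound on the true interval width.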


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Mar 30, 2016 at 2:13 PM, Majid Javanmard
 wrote:
> Hello
>
> I am new to R , Is there any code to run Adaboost.R2 in r ?!
> I wrote a code from example of gbm package, but I can not have prediction
> interval would you help me ?!
>
>
> library(gbm)
> mm <- read.table("E:/bagg.txt",TRUE)
> xnam <- paste("x", 1:50, sep="")
> fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+")))
> gbm1 <- gbm(fmla, data=mm, n.trees=100, distribution="gaussian",
> interaction.depth=3, bag.fraction=0.5, train.fraction=1.0, shrinkage=0.1,
> keep.data=TRUE)
> pred <- predict(gbm1,n.trees=100)
> pred <- as.data.frame(pred)
>
>
> Thanks
>
> [[alternative HTML version deleted]]
>



[R] Boosting Algorithm for Regression (Adaboost.R2)

2016-03-30 Thread Majid Javanmard
Hello

I am new to R. Is there any code to run Adaboost.R2 in R?
I wrote some code based on an example from the gbm package, but I cannot
obtain a prediction interval. Could you help me?


library(gbm)
mm <- read.table("E:/bagg.txt",TRUE)
xnam <- paste("x", 1:50, sep="")
fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+")))
gbm1 <- gbm(fmla, data=mm, n.trees=100, distribution="gaussian",
interaction.depth=3, bag.fraction=0.5, train.fraction=1.0, shrinkage=0.1,
keep.data=TRUE)
pred <- predict(gbm1,n.trees=100)
pred <- as.data.frame(pred)


Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convergence issues when using ns splines (pkg: spline) in Cox model (coxph) even when changing coxph.control

2016-03-30 Thread David Winsemius

> On Mar 29, 2016, at 1:47 PM, Jennifer Wu, Miss  
> wrote:
> 
> Hi,
> 
> I am currently using R v3.2.3 and on Windows 10 OS 64Bit.
> 
> I am having convergence issues when I use coxph with a interaction term 
> (glarg*bca_py) and interaction term with the restricted cubic spline 
> (glarg*bca_time_ns). I use survival and spline package to create the Cox 
> model and cubic splines respectively. Without the interaction term and/or 
> spline, I have no convergence problem. I read some forums about changing the 
> iterations and I have but it did not work. I was just wondering if I am using 
> the inter.max and outer.max appropriately. I read the survival manual, other 
> R-help and stackoverflow pages and it suggested changing the iterations but 
> it doesn't specify what is the max I can go. I ran something similar in SAS 
> and did not run into a convergence problem.
> 
> This is my code:
> 
> bca_time_ns <- ns(ins_ca$bca_py, knots=3, Boundary.knots=range(2,5,10))
> test <- ins_ca$glarg*ins_ca$bca_py
> test1 <- ins_ca$glarg*bca_time_ns

In your `coxph` call the variable 'bca_py' is the survival time and yet here 
you are constructing not just one but two interactions (one of which is a 
vector but the other one a matrix) between 'glarg' and your survival times. Is 
this some sort of effort to identify a violation of proportionality over the 
course of a study?

Broström sagely points out that these interactions are not in the data-object 
and subsequent efforts to refer to them may be confounded by the multiple 
environments from which data would be coming into the model. Better to have 
everything come in from the data-object.

The fact that SAS did not have a problem with this rather self-referential or 
circular model may be a poor reflection on SAS rather than on the survival 
package. Unlike Therneau or Broström who asked for data, I suggest the problem 
lies with the model construction and you should be reading what Therneau has 
written about identification of non-proportionality and identification of time 
dependence of effects. See Chapter 6 of his "Modeling Survival Data".

-- 

David Winsemius
Alameda, CA, USA
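A standard way to probe the non-proportionality David points to, instead of hand-building interactions with the survival time, is cox.zph() on scaled Schoenfeld residuals. A minimal sketch on the survival package's bundled lung data (not the poster's data):

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# cox.zph tests proportional hazards per covariate; small p-values
# flag effects that drift with time
zp <- cox.zph(fit)
print(zp)
plot(zp)   # one panel per covariate: estimated beta(t) against time

# If an effect truly varies with time, model it with tt(), e.g.:
# coxph(Surv(time, status) ~ age + tt(sex), data = lung,
#       tt = function(x, t, ...) x * log(t))
```

This keeps the time dependence inside coxph's own machinery rather than feeding the response back in as a covariate.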
> 
> coxit <- coxph.control(iter.max=1, outer.max=1)
> 
> bca<-coxph(Surv(bca_py,bca) ~ glarg + test + test1 + age + calyr + diab_dur + 
> hba1c + adm_met + adm_sulfo + adm_tzd + adm_oth +
>med_statin + med_aspirin + med_nsaids + bmi_cat + ccscat + alc + 
> smk, data=ins_ca, control=coxit, ties=c("breslow"))
> 
> 
> This is the error message I get:
> 
> Warning message:
> In fitter(X, Y, strats, offset, init, control, weights = weights,  :
>  Ran out of iterations and did not converge
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Bagging Question

2016-03-30 Thread Majid Javanmard
Hello

here is code implementing bagging that I copied from the net
(http://www.r-bloggers.com/improve-predictive-performance-in-r-with-bagging/):

set.seed(10)
y<-c(1:1000)
x1<-c(1:1000)*runif(1000,min=0,max=2)
x2<-c(1:1000)*runif(1000,min=0,max=2)
x3<-c(1:1000)*runif(1000,min=0,max=2)

lm_fit<-lm(y~x1+x2+x3)
summary(lm_fit)

set.seed(10)
all_data<-data.frame(y,x1,x2,x3)
positions <- sample(nrow(all_data),size=floor((nrow(all_data)/4)*3))
training<- all_data[positions,]
testing<- all_data[-positions,]

lm_fit<-lm(y~x1+x2+x3,data=training)
predictions<-predict(lm_fit,newdata=testing)
error<-sqrt((sum((testing$y-predictions)^2))/nrow(testing))

library(foreach)
length_divisor<-4
iterations<-1000
predictions<-foreach(m=1:iterations,.combine=cbind) %do% {
  training_positions <- sample(nrow(training),
size=floor((nrow(training)/length_divisor)))
  train_pos<-1:nrow(training) %in% training_positions
  lm_fit<-lm(y~x1+x2+x3,data=training[train_pos,])
  predict(lm_fit,newdata=testing)
}
predictions<-rowMeans(predictions)
error<-sqrt((sum((testing$y-predictions)^2))/nrow(testing))


1) How can I rank the Training and Testing sets in sequence in a column?
2) How can I have a prediction interval for each predicted value?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
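For question 2 above, the foreach loop already returns a matrix with one row per testing observation and one column per bagging iteration; if that matrix is kept before rowMeans() overwrites it, row-wise quantiles give an approximate percentile interval per predicted value (a bootstrap-style band, not an exact prediction interval). A self-contained sketch with a simulated stand-in for that matrix:

```r
# Stand-in for the foreach result: one row per testing observation,
# one column per bagging iteration (simulated here for self-containment)
set.seed(1)
pred_matrix <- matrix(rnorm(250 * 1000, mean = 500, sd = 25),
                      nrow = 250, ncol = 1000)

# Row-wise percentile band plus the usual bagged point estimate
bag_interval <- t(apply(pred_matrix, 1,
                        quantile, probs = c(0.025, 0.5, 0.975)))
colnames(bag_interval) <- c("lower95", "median", "upper95")
bag_point <- rowMeans(pred_matrix)
head(cbind(bag_point, bag_interval))
```

In the original code, assign the foreach result to a separate name before calling rowMeans() so the per-iteration columns survive.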


Re: [R] Accented characters, windows

2016-03-30 Thread Jan Kacaba
Duncan, thank you for your reply. My encoding is:

> Sys.getlocale('LC_CTYPE')
[1] "Czech_Czech Republic.1250"

In RStudio I use UTF-8. I tried also other recommended encodings but some
characters are still misrepresented.

I've found a solution to this. To display strings correctly in RStudio I have
to convert them:
iconv(x, "CP1250", "UTF-8")

If I want to write the string into a file:
zz=file("myfile.txt", "w", encoding="UTF-8")
cat(x,file = zz, sep = "\n")

It seems there is no need to use iconv() if I just need to write the string to
a file.

I hope there is no problem processing strings with other functions like
paste, strsplit, grep though.

Derek

2016-03-30 0:56 GMT+02:00 Duncan Murdoch :

> On 29/03/2016 5:39 PM, Jan Kacaba wrote:
>
>> I have problem with accented characters. My OS is Win 8.1 and I'm using
>> RStudio.
>>
>> I make string :
>> av="ěščřž"
>>
>> When I call "av" I get result bellow.
>>
>>> av
>>>
>> [1] "ìšèøž"
>>
>> The resulting characters are different. I have similar problem when I
>> write
>> string to a file. In RGUI if I call "av" it prints characters correctly,
>> but using "write" function to print string in a file results in the same
>> problem.
>>
>> Can you please help me how to deal with it?
>>
>
> You don't say what code page you're using.
>
> R in Windows has a long standing problem that it works mainly in the local
> code page, rather than working in UTF-8 as most other systems do.  (This is
> due to the fact that when the internationalization was put in, UTF-8 was
> exotic, rather than ubiquitous as it is now.)  So R can store UTF-8 strings
> on any system, but for display it converts them to the local code page, and
> that conversion can lose information if the characters aren't supported
> locally.
>
> With your string, I don't see the same thing as you, I see
>
> "ešcrž"
>
> which is also incorrect, but looks a little closer, because it does a
> better approximation in my code page.
>
> So if you think my result is better than yours, you could change your
> system to code page 437 as I'm using, but that will probably cause you
> worse problems.
>
> Probably the only short term solution that would be satisfactory is to
> stop using Windows.  At some point in the future the internal character
> handling in R needs an overhaul, but that's a really big, really thankless
> job.  Perhaps Microsoft/Revolution will donate some programmer time to do
> it, but more likely, it will wait for volunteers in R Core to do it.  I
> don't think it will happen in 2016.
>
> Duncan Murdoch
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
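The write/read pair in Jan's message can be made symmetric: open the connection with an explicit encoding on the way out, and declare the same encoding on the way back in. A minimal round-trip sketch:

```r
x <- "ěščřž"                           # Czech accented characters

f  <- tempfile(fileext = ".txt")
zz <- file(f, open = "w", encoding = "UTF-8")   # convert on write
cat(x, file = zz, sep = "\n")
close(zz)

# Declare the file's encoding on read so R marks the string as UTF-8
y <- readLines(f, encoding = "UTF-8")
Encoding(y)        # should report "UTF-8" for the non-ASCII string
identical(x, y)    # round-trip check
```

If the round-trip fails on Windows, the local code page conversion Duncan describes is usually the culprit, not the file I/O itself.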

Re: [R] difficult to find index value

2016-03-30 Thread Ista Zahn
FAQ 7.31, I think. Here are a couple of things you can try.

close_enough <- function(x, y) isTRUE(all.equal(x, y))

periodlimint<-seq(from=0.1, to=50, by=0.1)

indexAtest <- which(sapply(periodlimint, close_enough, y = 0.7))

match( as.character(.7), periodlimint)

Best,
Ista

On Wed, Mar 30, 2016 at 1:35 AM, Rubel Das via R-help
 wrote:
>  Dear R group member,
> I tried for a couple of hours to figure out the solution to the following:
> the match function behaves strangely for the values 0.7 and 1.7.
>
>> periodlimint <- seq(from=0.1, to=50, by=0.1)
>> indexAtest <- match(.6, periodlimint)
>> indexAtest
> [1] 6
>> indexAtest <- match(.7, periodlimint)
>> indexAtest
> [1] NA
>> indexAtest <- match(.8, periodlimint)
>> indexAtest
> [1] 8
>> indexAtest <- match(1.7, periodlimint)
>> indexAtest
> [1] NA
>
> It would be helpful if you could provide your comments.
>
> Regards,
> Rubel
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
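The root cause (R FAQ 7.31) is that seq(0.1, 50, by=0.1) does not contain 0.7 exactly: 0.1 has no finite binary representation, so the 7th element equals 0.7 only up to rounding. A short demonstration plus two robust lookups:

```r
periodlimint <- seq(from = 0.1, to = 50, by = 0.1)

periodlimint[7] == 0.7        # FALSE: accumulated binary rounding
periodlimint[7] - 0.7         # a tiny nonzero difference, not 0

# Robust option 1: compare within a tolerance
which(abs(periodlimint - 0.7) < 1e-8)

# Robust option 2: move to an integer grid before matching
match(7, round(periodlimint * 10))
```

Either option also explains why 0.6 and 0.8 happened to match: their accumulated rounding cancelled out, while 0.7 and 1.7 were off by one ulp.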


Re: [R] ts or xts with high-frequency data within a year

2016-03-30 Thread William Dunlap via R-help
You said you specified frequency=96 when you constructed the time
series, but when I do that the decomposition looks reasonable:

> time <- seq(0,9,by=1/96) # 15-minute intervals, assuming time unit is day
> measurement <- sqrt(time) + 1/(1.2+sin(time*2*pi)) +
rnorm(length(time),0,.3)
> plot(decompose(ts(measurement, frequency=96)))

How is your code different from the above?



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 30, 2016 at 8:03 AM, Ryan Utz  wrote:

> Hello,
>
> I have a time series that represents data sampled every 15-minutes. The
> data currently run from November through February, 8623 total readings.
> There are definitely daily periodic trends and non-stationary long-term
> trends. I would love to decompose this using basic time series analysis.
>
> However, every time I attempt decomposition, I get the
>
> Error in decompose( ) : time series has no or less than 2 periods
>
> Is it only possible to do basic time-series analysis if you have a year or
> more worth of data? That seems absurd to me, since there is definite
> periodicity and the data are a time series. I have tried every manner of
> specifying frequency= with no luck (96 does not work). All manner of
> searching for help has turned up fruitless.
>
> Can I only do this after I wait another year or two?
>
> Thanks,
> Ryan
>
> --
>
> Ryan Utz, Ph.D.
> Assistant professor of water resources
> *chatham**UNIVERSITY*
> Home/Cell: (724) 272-7769
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R logo size in package information tab of Rstudio

2016-03-30 Thread Duncan Murdoch

On 29/03/2016 3:42 PM, Marc Girondot via R-help wrote:

Two different sizes of R logo are shown in Rstudio in the Help at the
package level.

For example, numderiv shows a nice discreet logo located at (in MacosX):
/Library/Frameworks/R.framework/Versions/3.3/Resources/doc/html/logo.jpg
whereas packrat shows a huge logo located at:
/Library/Frameworks/R.framework/Versions/3.3/Resources/doc/html/Rlogo.svg

The choice between both depends on the path indicated in the file
Index.html located in html folder for each package.

It would be better to have an svg version of Rlogo.svg of the same size
as logo.jpg (I converted Rlogo.svg to the same size as logo.jpg and it
looks much better with the new version).


I've now managed to duplicate this.  It's an RStudio bug, not an R bug.  
They are not using the R.css file that sets the size of the logo (or 
perhaps they're using an older version of it).


I'm not completely up to date on RStudio, so it may be that you can fix 
it by updating.  I've bcc'd support at RStudio to let them know about 
this; not sure if that will get through.


Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R-es] Mann-Whitney con datos temporales

2016-03-30 Thread Rubén Fernández-Casal
Hello everyone,

Just to note that, besides the spatio-temporal dependence in these data,
there will be at least a temporal trend (the means are not constant). So
Carlos's proposal seems better to me, although it assumes that the
temperature curves differ only by a constant.

Estimating the dependence from such a small data set would be difficult,
so to begin with I would look at what happens assuming independent
errors. The estimate of the location effect would be reliable, and the
tests approximate (bearing in mind that the variance is probably
underestimated, so the "real" p-values should be somewhat larger).

I don't know what the final goal is, but it might be advisable to
consider more days in order to model the process better...

Best regards, Rubén.
On 28/3/2016 16:57, "Javier Martínez-López" 
wrote:

> Hi everyone,
>
> we want to compare two sites that are very far apart with respect to
> the temperature at each site, using hourly means over a given period.
> There is only one sensor's worth of measurements at each site, and we
> want to know whether the differences between the sites/curves are
> significant. We have used a Mann-Whitney U test with the function
> wilcox.test (paired=F), since the values are not normal (n = 24; 24 h
> based on per-minute means). Do you think this is correct, or would we
> be violating some assumption of the test, given that these are
> temporal data and/or we have no sensor replicates?
>
> Many thanks and regards,
>
> Javier
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
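One practical caveat for the approach discussed above: hourly means from a single sensor are serially correlated, while wilcox.test assumes independent observations, so its p-value can be optimistic. A sketch for checking that dependence before trusting the test (simulated temperatures, not the posters' data):

```r
set.seed(42)
h <- 1:24
siteA <- 15 + 5 * sin(2 * pi * h / 24) + rnorm(24, sd = 0.5)
siteB <- 17 + 5 * sin(2 * pi * h / 24) + rnorm(24, sd = 0.5)

# The test as used in the original question
wilcox.test(siteA, siteB, paired = FALSE)

# Strong low-lag autocorrelation means the effective sample size is
# well below n = 24, so treat the p-value with caution
acf(siteA, lag.max = 6, plot = FALSE)
```

With a shared daily cycle in both series, the autocorrelation is driven by the trend Rubén mentions; removing it first (or modelling it) is what makes the remaining comparison honest.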

Re: [R] fftImg() error: fftw_access_func

2016-03-30 Thread Eric Handler
Took me a few days to get a reply from the student. The code he is running is:

library(ripa)
library(jpeg)   # readJPEG() comes from the 'jpeg' package, not ripa

img <- readJPEG("arthurs-seat.jpg")
hist(fftImg(img))

I see the same error as the student on many systems. When I run
require(ripa) on the machine I mentioned before:

> require('ripa')
Loading required package: ripa
Loading required package: tcltk
Loading required package: parallel

>img<-readJPEG( "arthurs-seat.jpg" )
>hist(fftImg(img))
Error in .C("fftw_access_func", as.complex(img), as.integer(w),
as.integer(h),  :
  "fftw_access_func" not available for .C() for package "ripa"

If I should take the conversation off of r-help as I loop in the
package maintainer, please let me know.

Thanks,
Eric

--
Eric Handler
Academic Information Associate - Science Division
Macalester College - Saint Paul, MN
Olin-Rice 124
Office: 651-696-6016
View my calendar: http://goo.gl/SbxLOu

On Sat, Mar 26, 2016 at 2:23 PM, David Winsemius  wrote:
>
>> On Mar 26, 2016, at 8:42 AM, Shelby Leonard via R-help 
>>  wrote:
>>
>> So do i need to resend the email to someone else? sorry i am just confused
>
> Perhaps.
>
> The code needed to find the maintainer is at the end of this scrape of a
> console "dialog"; scroll to the bottom if you want the proper address for
> reporting problems with package:ripa,
>
>  but _first_ you should be checking to see if you have all the system 
> dependencies:
>
>
>> require(ripa)
> Loading required package: ripa
> Loading required package: tcltk
> Loading required package: parallel
>
> Attaching package: ‘ripa’
>
> The following object is masked from ‘package:rms’:
>
> contrast
>
> The following object is masked from ‘package:Hmisc’:
>
> zoom
>
>> packageDescription("ripa")
> Package: ripa
> Version: 2.0-2
> Date: 2014-05-29
> Title: R Image Processing and Analysis
> Authors@R: c(person("Talita", "Perciano", role = c("aut", "cre"),
>email = "talitaperci...@gmail.com"), person("Alejandro", "C
>Frery", role = "ctb", email = "acfr...@pq.cnpq.br"))
> Maintainer: Talita Perciano 
> Depends: R (>= 2.8.1), tcltk, parallel
> Suggests: e1071, rggobi, reshape, methods, jpeg, png, tkrplot,
>fftw, foreach, doSNOW
> Enhances: doMC
> SystemRequirements: BWidget, Tktable, Img, libjpeg
>
> # 
> # My guess is that you do not have all of the systemRequirements on this 
> machine.
> # Or if you do, then perhaps they are not in a directory in which the package 
> expects to find them.
> # =
>
> Description: A package including various functions for image
>processing and analysis. With this package is possible to
>process and analyse RGB, LAN (multispectral) and AVIRIS
>(hyperspectral) images. This packages also provides
>functions for reading JPEG files, extracted from the
>archived 'rimage' package.
> License: GPL (>= 2) | file LICENSE
> Imports: Rcpp (>= 0.11.0)
> LinkingTo: Rcpp
> URL: http://www.r-project.org
> Packaged: 2014-05-30 20:18:57 UTC; Talita Perciano
> Author: Talita Perciano [aut, cre], Alejandro C Frery [ctb]
> NeedsCompilation: yes
> Repository: CRAN
> Date/Publication: 2014-05-31 01:32:57
> Built: R 3.2.0; x86_64-apple-darwin13.4.0; 2015-04-21 02:07:55
>UTC; unix
>
> -- File: 
> /Library/Frameworks/R.framework/Versions/3.2/Resources/library/ripa/Meta/package.rds
>> ? fftImg
>>   data(logo)
>>   plot(normalize(fftImg(logo)))
> Error in .C("fftw_access_func", as.complex(img), as.integer(w), 
> as.integer(h),  :
>   "fftw_access_func" not available for .C() for package "ripa"
>
>> require(fftw)
> Loading required package: fftw
>
> # Tried that thinking (Incorrectly) the missing routine might be supplied in 
> that package.
>
>> data(logo)
>>   plot(normalize(fftImg(logo)))
> Error in .C("fftw_access_func", as.complex(img), as.integer(w), 
> as.integer(h),  :
>   "fftw_access_func" not available for .C() for package "ripa"
>> maintainer('ripa')
> [1] "Talita Perciano "
>
> --
>
> David
>
>
>>
>>On Saturday, March 26, 2016 9:21 AM, John Kane  
>> wrote:
>>
>>
>> It would be helpful if you actually supplied your code and a minimal data 
>> set to for people to examine.
>>
>> Please have a look at 
>> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>>  and/or http://adv-r.had.co.nz/Reproducibility.html.
>>
>> I'm not clear if you are reporting a general R problem or a specific package 
>> with the ripa package.  If it looks like it is the latter you probably bring 
>> it to the attention of the package maintainer who may or may not monitor 
>> this mailing group.
>>
>>
>> John Kane
>> Kingston ON Canada
>>
>>
>>> -Original Message-
>>> From: ehand...@macalester.edu
>>> Sent: Fri, 25 Mar 2016 15:09:42 -0500
>>> To: r-help@r-project.org
>>> Subject: [R] fftImg() error: fftw_access_func
>>>
>>> Hello-
>>>
>>> My name is Eric Handler 
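While the broken .C entry point in 'ripa' is being sorted out, base R's fft() already handles matrices (it computes a multidimensional DFT), so the student's histogram can be approximated without the package. A sketch with random pixels standing in for the JPEG; the exact scaling fftImg() applies is an assumption here, and the log-magnitude spectrum is just a common display choice:

```r
img  <- matrix(runif(128 * 128), nrow = 128)  # stand-in greyscale image
spec <- fft(img)                              # 2-D discrete Fourier transform

# Histogram of the log-magnitude spectrum
hist(log1p(Mod(spec)), main = "log-magnitude spectrum")
```

For a real image, replace the random matrix with the output of jpeg::readJPEG() (converted to greyscale if needed).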

[R] Multinomial mixed models with glmmADMB

2016-03-30 Thread Ana María Prieto
Dear r-helpers,

I want to run a multinomial mixed effects model with the glmmADMB package
of R. I have read the available information of the programm but i couldn't
find which family or link has to be used for multinomial data. In the
examples are only shown models with Poisson, negative binomial and
truncated binomial /poisson families.

Use of this package for multinomial mixed models has already been published
(http://www.sciencedirect.com/science/article/pii/S0378112715007288).

Thanks a lot in advance for your answers.

Ana

-- 
Ana María Prieto Ramírez
PhD student
Departement of Conservation Biology
Center for Environmental Research UFZ, Leipzig
Zoologisches Forschungsmuseum Alexander Koenig
University of Bonn

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
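glmmADMB's documented families do not include a multinomial, so one option is the classical "Poisson trick": expand each multinomial observation into one count row per category, add an observation-level factor to absorb the totals, and fit a Poisson mixed model. The sketch below uses lme4's glmer on simulated data; it is a generic workaround, not glmmADMB's own interface, and the grouping factor grp is hypothetical:

```r
library(lme4)

set.seed(1)
n <- 60                                           # multinomial observations
d <- data.frame(obs = factor(rep(1:n, each = 3)), # one row per category
                cat = factor(rep(c("A", "B", "C"), n)),
                grp = factor(rep(1:10, each = 18)))
d$count <- rpois(nrow(d), lambda = c(A = 5, B = 3, C = 2)[as.character(d$cat)])

# (1 | obs) absorbs each observation's multinomial total; the 'cat'
# coefficients then carry the relative category log-rates
fit <- glmer(count ~ cat + (1 | grp) + (1 | obs),
             family = poisson, data = d)
summary(fit)
```

Strictly, the trick uses the observation factor as a fixed effect; the random-intercept version above is a common approximation when the number of observations is large.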

[R] Problems with pooling Multiply Imuputed datasets, of a multilevel logistic model, using (MICE)

2016-03-30 Thread Jonathan Halls via R-help
I am having problems with the MICE package in R, particularly with pooling the
imputed data sets.
I am running a multilevel binomial logistic regression, with Level1 - topic 
(participant response to 10 questions on different topics, e.g. T_Darkness, 
T_Day) nested within Level2 - individuals. 
The model is created using R2MLwiN; the formula is

> fit1 <- runMLwiN(c(probit(T_Darkness, cons), probit(T_Day, cons),
> probit(T_Light, cons), probit(T_Night, cons), probit(T_Rain, cons),
> probit(T_Rainbows, cons), probit(T_Snow, cons), probit(T_Storms, cons),
> probit(T_Waterfalls, cons), probit(T_Waves, cons)) ~ 1,
> D = c("Mixed", "Binomial", "Binomial", "Binomial", "Binomial",
> "Binomial", "Binomial", "Binomial", "Binomial", "Binomial", "Binomial"),
> estoptions = list(EstM = 0), data = data)

Unfortunately, there is missing data in all of the Level1 (topic)
responses. I have been using the mice package ([CRAN][1]) to multiply
impute the missing values.

I can fit the model to the imputed datasets, using the formula

> fitMI <- with(MI.Data, runMLwiN(c(probit(T_Darkness, cons),
> probit(T_Day, cons), probit(T_Light, cons), probit(T_Night, cons),
> probit(T_Rain, cons), probit(T_Rainbows, cons), probit(T_Snow, cons),
> probit(T_Storms, cons), probit(T_Waterfalls, cons),
> probit(T_Waves, cons)) ~ 1,
> D = c("Mixed", "Binomial", "Binomial", "Binomial", "Binomial",
> "Binomial", "Binomial", "Binomial", "Binomial", "Binomial", "Binomial"),
> estoptions = list(EstM = 0), data = data))

However, when I come to pool the analyses with the call

> pool(fitMI)

it fails with the error:

Error in pool(with(tempData, runMLwiN(c(probit(T_Darkness, cons), probit(T_Day, :
  Object has no coef() method.
I am not sure why it says there is no coefficient, as the analyses of the
individual MI datasets provide both fixed parts (coefficients) and random
parts (covariances).
Any help with what is going wrong would be much appreciated. I should warn you 
that this is my first foray into using R and multilevel modelling. Also I know 
there is a MlwiN package ([REALCOM][2]) that can do this but I don't have the 
background to use the MLwiN software outside of R.
thanks
johnny

 R reproducible example

Libraries used:

> library(R2MLwiN)
> library(mice)

Subset of data:
 > T_Darkness <- c(0, 1, 0, 0, 0, 0, 0, 1, 0, 0, NA, 0, 0, 0, NA, 1, 0, NA,NA, 
 > 1, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
 > 0, 0, 0, 0, 0, 0, 0, 1, NA, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, NA, 1, 0) 
> T_Day <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 
> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, NA, 0, 0, 0, 
> 0, NA, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, NA, NA, 0) 
> T_Light <- c(0, 0, NA, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 
> 0, 0, 1, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> 0, 0, 0, 0, 1, NA, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0) 
> T_Night <- c(0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 
> 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 
> 0, 0,NA, 0, NA, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0) 
> T_Rain <- c(1, 0, 0, 1, 1, 0, 0, NA, 0, 1, 0, 0, 1, 0, 0, 0, 0, NA, 0, 0, 1, 
> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, NA, 0, 0, 0, 0, 1, 0, 
> 0, 0, NA, 1, NA, 0, 0, 0, 0, 1, NA, 1, 0, 0, 0, 0, 1, NA, 0, 0) 
> T_Rainbows <- c(1, 1, 1, 1, 0, 1, 0, 1, 0, 1, NA, 1, 1, 0, 0, 1, 0, NA, 0, 1, 
> 0, NA, 0, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0, NA, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 
> 0, 1, 0, 1, 1, 1, 1, NA, 1, 0, 1, NA, 0, 0, 1, 0, 1, 1, 1, 0, 1) 
> T_Snow <- c(0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, NA, 0, 0, 1, 0, 0, 0, 0, 0, 0, 
> 0, 0, 1, 1, 0, 0, 0, NA, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
> 0, 0, NA, 0, 0, 1, NA, 1, 0, 1, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0) 
> T_Storms <- c(0, 0, 0, 1, 1, 1, 0, 1, 0, 1, NA, 0, 0, 0, 0, 1, 0, NA, 0, 0, 
> 1, 0, 0, NA, 1, 1, NA, 0, 0, NA, 0, 1, 0, NA, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 
> 0, 0, 1, 0, 0, 0, 1, 0, NA, 1, 0, NA, 0, 0, 0, 1, 1, 0, 1, NA, NA, 1) 
> T_Waterfalls <- c(0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 
> 0, 1, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, NA, 0, 
> 0, 0, 0, 0, NA, 0, 1, 0, NA, 1, 0, 1, 0, 0, 0, NA, 0, 0, 0, NA, NA, 0) 
> T_Waves <- c(0, 1, 0, 1, 1, 0, 1, NA, 0, 0, NA, 0, 0, 0, NA, 1, 0, 0, 0, 0, 
> 1, 0, NA, 0, NA, 0, 0, NA, 0, 0, 0, 0, 0, 0, NA, 1, 0, 0, 0, 1, 0, 0, NA, 0, 
> 1, 0, 0, 0, 0, 0, 1, 1, NA, 1, 1, NA, 0, 0, 0, NA, 0, 0, 0, NA, 0, 0) 
> data <- data.frame (T_Darkness, T_Day, T_Light, T_Night, T_Rain, T_Rainbows, 
> T_Snow, T_Storms, T_Waterfalls, T_Waves) 
> data$cons <- 1

Data imputed using mice with
 > MI.Data <- mice(data,m=5,maxit=50,meth='pmm',seed=500)

[[alternative HTML version deleted]]
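When pool() rejects a fit class for lacking a coef() method, Rubin's rules can be applied by hand to whatever estimates and squared standard errors can be extracted from each imputed fit (how to pull those out of a runMLwiN object is left as an assumption here). A generic sketch:

```r
# coefs: list of m coefficient vectors; vars: list of m squared-SE vectors
pool_rubin <- function(coefs, vars) {
  m    <- length(coefs)
  qbar <- Reduce(`+`, coefs) / m                  # pooled point estimates
  ubar <- Reduce(`+`, vars)  / m                  # within-imputation variance
  b    <- Reduce(`+`, lapply(coefs, function(q) (q - qbar)^2)) / (m - 1)
  tot  <- ubar + (1 + 1 / m) * b                  # Rubin's total variance
  cbind(estimate = qbar, se = sqrt(tot))
}

# Toy check: three imputations of two parameters
pool_rubin(list(c(1.0, 2.0), c(1.2, 1.9), c(0.9, 2.1)),
           list(c(0.04, 0.09), c(0.05, 0.08), c(0.04, 0.10)))
```

This sidesteps mice's pool() entirely; only the extraction of per-fit estimates and variances is model-specific.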

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] ts or xts with high-frequency data within a year

2016-03-30 Thread Ryan Utz
Sorry about not providing code; I didn't think to just simulate dummy code.

Here's a situation where I have <1 year of data, hourly time sampling, and
the error that I get using my actual data:

###

library(xts)   # provides xts(); not loaded in the original snippet

X=as.data.frame(1:6000)
X[2]=seq.POSIXt(ISOdate(2015,11,1),by='hour',length.out=6000)
X[3]=sample(100,size=6000,replace=T)

Y=xts(X[,3],order.by=X[,2])
decompose(Y)

Z=ts(X[,2],start=c(2015,11),frequency=24*365)
plot(decompose(Z))

###

Am I missing something obvious here? I hope so...

-- 

Ryan Utz, Ph.D.
Assistant professor of water resources
*chatham**UNIVERSITY*
Home/Cell: (724) 272-7769

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
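Two things likely break the snippet above: Z is built from the timestamp column X[,2] instead of the measurements X[,3], and frequency=24*365 declares a yearly period that under a year of data cannot cover twice. With hourly data and a daily cycle, frequency = 24 decomposes fine (a sketch, assuming the daily cycle is the seasonality of interest):

```r
set.seed(1)
X <- data.frame(id = 1:6000)
X[2] <- seq.POSIXt(ISOdate(2015, 11, 1), by = "hour", length.out = 6000)
X[3] <- sample(100, size = 6000, replace = TRUE)

# Hourly readings, daily seasonality: 24 observations per period,
# and 6000/24 = 250 periods, comfortably more than the 2 required
Z <- ts(X[, 3], frequency = 24)
plot(decompose(Z))
```

For 15-minute readings, as in the first message of the thread, the daily frequency would be 96 instead of 24.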


[R] difficult to find index value

2016-03-30 Thread Rubel Das via R-help
 Dear R group memberI tried couple of hours to figure out the solution of 
following.match function behaves strange from value 0.7 and 1.7

> periodlimint<-seq(from=0.1, to=50, by=0.1)> indexAtest<-match( .6, 
> periodlimint)> indexAtest[1] 6> periodlimint<-seq(from=0.1, to=50, by=0.1)> 
> indexAtest<-match( .7, periodlimint)> indexAtest[1] NA> 
> periodlimint<-seq(from=0.1, to=50, by=0.1)> indexAtest<-match( .8, 
> periodlimint)> indexAtest[1] 8> periodlimint<-seq(from=0.1, to=50, by=0.1)> 
> indexAtest<-match( 1.7, periodlimint)> indexAtest[1] NA
it will be helpful if you provide your comment
RegardsRubel 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ts or xts with high-frequency data within a year

2016-03-30 Thread Joshua Ulrich
On Wed, Mar 30, 2016 at 10:29 AM, Bert Gunter  wrote:
> I "think" the problem is that you failed to set the "frequency"
> attribute of your time series, so it defaults to 1. A time series with
> one observation per period cannot be decomposed, since the error term
> is confounded with the "seasonality", which is essentially your error
> message.
>
> Again, a guess, as you provided no code.
>
Another guess is that you're running into issues with converting xts
to/from ts.  That currently doesn't work well, so you should convert
to zoo and make sure your frequency attribute makes sense before
attempting to decompose.

> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Mar 30, 2016 at 8:18 AM, Bert Gunter  wrote:
>> Code please.
>>
>> Reproducible example?(e.g. 1st 100 values)
>>
>> "PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code."
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Mar 30, 2016 at 8:03 AM, Ryan Utz  wrote:
>>> Hello,
>>>
>>> I have a time series that represents data sampled every 15-minutes. The
>>> data currently run from November through February, 8623 total readings.
>>> There are definitely daily periodic trends and non-stationary long-term
>>> trends. I would love to decompose this using basic time series analysis.
>>>
>>> However, every time I attempt decomposition, I get the
>>>
>>> Error in decompose( ) : time series has no or less than 2 periods
>>>
>>> Is it only possible to do basic time-series analysis if you have a year or
>>> more worth of data? That seems absurd to me, since there is definite
>>> periodicity and the data are a time series. I have tried every manner of
>>> specifying frequency= with no luck (96 does not work). All manner of
>>> searching for help has turned up fruitless.
>>>
>>> Can I only do this after I wait another year or two?
>>>
>>> Thanks,
>>> Ryan
>>>
>>> --
>>>
>>> Ryan Utz, Ph.D.
>>> Assistant professor of water resources
>>> *chatham**UNIVERSITY*
>>> Home/Cell: (724) 272-7769
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com
R/Finance 2016 | www.rinfinance.com



Re: [R] installing packages

2016-03-30 Thread James Henson
To All,
Thanks for your help.

I uninstalled R, the 3.2 library, and RStudio, then reinstalled R and RStudio.
Now the newly installed packages are moved from the temp files into the
R-3.2.4revised library.

> .libPaths()
[1] "C:/Users/james_henson/Desktop/Documents/R/win-library/3.2"
[2] "C:/Program Files/R/R-3.2.4revised/library"
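A sketch of one way to avoid moving packages by hand: make sure a writable library comes first in .libPaths(), so install.packages() writes straight into it. (On Windows, the "unable to move temporary installation" warning is often caused by antivirus software locking files, so this is a workaround rather than a guaranteed fix. In real use 'lib' would be Sys.getenv("R_LIBS_USER"), i.e. the .../R/win-library/3.2 folder; a temp directory is used here only so the example runs anywhere.)

```r
# Ensure a writable library directory exists and is first on the search
# path, so install.packages() targets it directly (no manual moves).
lib <- file.path(tempdir(), "win-library-3.2")   # stand-in for R_LIBS_USER
dir.create(lib, recursive = TRUE, showWarnings = FALSE)
.libPaths(c(lib, .libPaths()))   # prepend for this session
.libPaths()[1]                   # the new default install location
# install.packages("nlme")       # would now install into 'lib'
```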





On Mon, Mar 21, 2016 at 9:57 PM, Jeff Newmiller 
wrote:

> I hope not. That directory is not for working in. suggestion to restart R
> sounds most likely to fix the issue.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 21, 2016 2:10:01 PM PDT, KMNanus  wrote:
>>
>> Have you set your working directory to the “3.2” folder?
>> Ken
>> kmna...@gmail.com
>> 914-450-0816 (tel)
>> 347-730-4813 (fax)
>>
>>
>>
>>  On Mar 21, 2016, at 5:07 PM, James Henson  wrote:
>>>
>>>  Dear R community,
>>>
>>>  When I install or update a package, R prints the warning below.  I go to the
>>>  ‘downloaded_packages’ folder in the Temp file and manually move the new or
>>>  updated package to the folder ‘3.2’.   How can I instruct R to download new
>>>  and updated packages into the ‘3.2’ folder?
>>>
>>>  Warning in install.packages :
>>>
>>>   unable to move temporary installation
>>>  
>>> ‘C:\Users\james_henson\Desktop\Documents\R\win-library\3.2\file1c5c6f1731c8\nlme’
>>>  to ‘C:\Users\james_henson\Desktop\Documents\R\win-library\3.2\nlme
>>>
>>>
>>>
>>>  The downloaded binary packages are in
>>>
>>>
>>>  C:\Users\james_henson\AppData\Local\Temp\RtmpIZmUa3\downloaded_packages
>>>
>>>
>>>
>>>  Thank for your help.
>>>
>>>  James F. Henson
>>>
>>>   [[alternative HTML version deleted]]
>>>
>>>
>>
>>
>>

[[alternative HTML version deleted]]


Re: [R] ts or xts with high-frequency data within a year

2016-03-30 Thread Bert Gunter
I "think" the problem is that you failed to set the "frequency"
attribute of your time series, so it defaults to 1. A time series with
one observation per period cannot be decomposed, since the error term
is confounded with the "seasonality", which is essentially your error
message.

Again, a guess, as you provided no code.
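For what it's worth, frequency = 96 (one day of 15-minute readings) does work with decompose() on a simulated series of the same length, which supports the diagnosis that the frequency attribute was never actually set. A sketch, since the real data were not posted:

```r
set.seed(1)
n <- 8623                                     # same length as the poster's series
daily <- rep(sin(2 * pi * (0:95) / 96), length.out = n)   # daily cycle
x  <- 10 + 0.0005 * seq_len(n) + 2 * daily + rnorm(n, sd = 0.3)
xt <- ts(x, frequency = 96)   # 96 observations per period (one day)
dec <- decompose(xt)          # works: the series spans roughly 90 periods
```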

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Mar 30, 2016 at 8:18 AM, Bert Gunter  wrote:
> Code please.
>
> Reproducible example? (e.g. the first 100 values)
>
> "PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code."
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Mar 30, 2016 at 8:03 AM, Ryan Utz  wrote:
>> Hello,
>>
>> I have a time series that represents data sampled every 15-minutes. The
>> data currently run from November through February, 8623 total readings.
>> There are definitely daily periodic trends and non-stationary long-term
>> trends. I would love to decompose this using basic time series analysis.
>>
>> However, every time I attempt decomposition, I get the
>>
>> Error in decompose( ) : time series has no or less than 2 periods
>>
>> Is it only possible to do basic time-series analysis if you have a year or
>> more worth of data? That seems absurd to me, since there is definite
>> periodicity and the data are a time series. I have tried every manner of
>> specifying frequency= with no luck (96 does not work). All manner of
>> searching for help has turned up fruitless.
>>
>> Can I only do this after I wait another year or two?
>>
>> Thanks,
>> Ryan
>>
>> --
>>
>> Ryan Utz, Ph.D.
>> Assistant professor of water resources
>> *chatham**UNIVERSITY*
>> Home/Cell: (724) 272-7769
>>
>> [[alternative HTML version deleted]]
>>



Re: [R] ts or xts with high-frequency data within a year

2016-03-30 Thread Bert Gunter
Code please.

Reproducible example? (e.g. the first 100 values)

"PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code."

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Mar 30, 2016 at 8:03 AM, Ryan Utz  wrote:
> Hello,
>
> I have a time series that represents data sampled every 15-minutes. The
> data currently run from November through February, 8623 total readings.
> There are definitely daily periodic trends and non-stationary long-term
> trends. I would love to decompose this using basic time series analysis.
>
> However, every time I attempt decomposition, I get the
>
> Error in decompose( ) : time series has no or less than 2 periods
>
> Is it only possible to do basic time-series analysis if you have a year or
> more worth of data? That seems absurd to me, since there is definite
> periodicity and the data are a time series. I have tried every manner of
> specifying frequency= with no luck (96 does not work). All manner of
> searching for help has turned up fruitless.
>
> Can I only do this after I wait another year or two?
>
> Thanks,
> Ryan
>
> --
>
> Ryan Utz, Ph.D.
> Assistant professor of water resources
> *chatham**UNIVERSITY*
> Home/Cell: (724) 272-7769
>
> [[alternative HTML version deleted]]
>



[R] ts or xts with high-frequency data within a year

2016-03-30 Thread Ryan Utz
Hello,

I have a time series that represents data sampled every 15-minutes. The
data currently run from November through February, 8623 total readings.
There are definitely daily periodic trends and non-stationary long-term
trends. I would love to decompose this using basic time series analysis.

However, every time I attempt decomposition, I get the

Error in decompose( ) : time series has no or less than 2 periods

Is it only possible to do basic time-series analysis if you have a year or
more's worth of data? That seems absurd to me, since there is definite
periodicity and the data are a time series. I have tried every manner of
specifying frequency= with no luck (96 does not work). All manner of
searching for help has proven fruitless.

Can I only do this after I wait another year or two?

Thanks,
Ryan

-- 

Ryan Utz, Ph.D.
Assistant professor of water resources
*chatham**UNIVERSITY*
Home/Cell: (724) 272-7769

[[alternative HTML version deleted]]



Re: [R-es] Mann-Whitney con datos temporales

2016-03-30 Thread Javier Martínez-López
In our case these are series from experimental plots in Almería and Madrid,
and even though independence holds neither for the differences nor for the
residuals of the smoothed series, as I mentioned in my last email, I
nevertheless understand, as Carlos says, that although we are overestimating
the degrees of freedom, the test is not entirely inappropriate, at least as
a rough guide. The same happens in Carlos's airport/Colmenar example, where
applying runs.test (from the randtests package) to the curves and to the
differences shows that they are autocorrelated. I have also computed the
RMSE to quantify the differences between curves, which gives an idea of
their dissimilarity; perhaps that is the best option. Thanks again and
regards,
Javier
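For reference, the RMSE Javier mentions can be computed directly in base R from the 24-hour temperature series quoted later in this thread (a sketch; the numbers are exactly those Carlos posted for Barajas and Colmenar):

```r
aeropuerto <- c(10.9, 12.7, 14.9, 15.8, 17.5, 18.5, 18.8, 18.4, 17.9, 17.4,
                16.1, 14.9, 13.6, 12.8, 11.5, 10.5, 9.9, 9.8, 9.8, 9.7,
                9.4, 8.8, 9.9, 11)
colmenar   <- c(8.4, 9.4, 10, 11.2, 12.5, 14.3, 14.1, 14.3, 13.5, 12.9,
                12.2, 11.4, 10.3, 8.4, 7.3, 7, 7.1, 6.6, 6.4, 6.3,
                6.2, 5.9, 5.7, 6.2)

# root mean squared difference between the two curves
rmse <- sqrt(mean((aeropuerto - colmenar)^2))
rmse
```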

2016-03-30 9:24 GMT+02:00 José Trujillo Carmona :
> My apologies to the list members for my clumsiness and the mix-up of
> the messages.
>
> I usually just hit "reply" to the messages I answer, and on this list I
> have to remember to click "reply to list".
>
> I sent my last reply only to Carlos, and he has answered pointing out the
> procedural slip and explaining better under which quite plausible
> situation the difference would indeed remove the autocorrelation.
>
> I think there is no objection to his current view: if the temporal
> effect is exactly the same, not shifted in time or in intensity (always
> 4 degrees, with the difference taken at the same time), then
> differencing would indeed remove the autocorrelation.
>
> As I said in my previous message, finding out which case we are in is
> easy: take the difference and test for independence.
>
> Regards.
>
>
> On 30/03/16 at 01:07, Carlos J. Gil Bellosta wrote:
>> Hello, how are you?
>>
>> You wrote only to me. I don't know whether you meant to send the
>> message to the list or not.
>>
>> In any case, we agree. Under your assumptions, I have nothing to
>> object to.
>>
>> I had another structure in mind for the problem: that of someone who
>> says "in Colmenar it is [always] 4 degrees colder than in Madrid".
>> That is, if the temperature in Madrid is 12 degrees, in Colmenar it
>> will be around 8. Put differently, T_c = T_m - N(4, sigma).
>>
>> I don't know how far apart the original poster's sensors are, but
>> _my_ probabilistic structure can be justified in some cases. For
>> Madrid and Colmenar, for example. I downloaded the temperatures for
>> the last 24 hours in Madrid (Barajas) and Colmenar
>> 
>> (the original problem also had a series of 24 measurements); look:
>>
>> aeropuerto <-
>> read.csv("/home/carlos/Downloads/ultimosdatos_3129_datos-horarios.csv", skip
>> = 2, fileEncoding = "latin1")
>> aeropuerto <- aeropuerto[,2]
>>
>> colmenar <-
>> read.csv("/home/carlos/Downloads/ultimosdatos_3191E_datos-horarios.csv",
>> skip = 2, fileEncoding = "latin1")
>> colmenar <- colmenar[,2]
>>
>> temperaturas <-
>> structure(list(aeropuerto = c(10.9, 12.7, 14.9, 15.8, 17.5, 18.5,
>> 18.8, 18.4, 17.9, 17.4, 16.1, 14.9, 13.6, 12.8, 11.5, 10.5, 9.9,
>> 9.8, 9.8, 9.7, 9.4, 8.8, 9.9, 11), colmenar = c(8.4, 9.4, 10,
>> 11.2, 12.5, 14.3, 14.1, 14.3, 13.5, 12.9, 12.2, 11.4, 10.3, 8.4,
>> 7.3, 7, 7.1, 6.6, 6.4, 6.3, 6.2, 5.9, 5.7, 6.2)), .Names = c("aeropuerto",
>> "colmenar"), row.names = c(NA, -24L), class = "data.frame")
>>
>> plot(aeropuerto, ylim = c(min(colmenar), max(aeropuerto)), type = "l")
>> lines(colmenar, col = "red")
>>
>> And if you take differences you will see that they do not seem to
>> follow any temporal pattern.
>>
>> Now then, can I do a t-test? Almost surely it is not fully justified,
>> because I am overestimating the degrees of freedom (note that I could
>> have as many as I wanted by taking, for example, temperature readings
>> every nanosecond). But it would not be a "terribly bad" solution. I
>> could even take the conservative side and underestimate the number of
>> degrees of freedom (i.e., use a Student's t with a couple of degrees
>> of freedom and still find significant differences).
>>
>> The other alternative would be to build a model that fits temperature
>> as a function of hour and location (e.g., using a GAM) and check
>> whether the location coefficient is significantly different from
>> zero. Again, all of the above under _my_ assumptions, which surely do
>> not hold if the locations are Madrid and Santander.
>>
>> That said, we do not know which assumptions (yours or mine) are more
>> credible in the case behind the question. We have not been told!
>>
>> Un saludo,
>>
>> Carlos J. Gil Bellosta
>> http://www.datanalytics.com
>>
>>
>>
>>
>> On 29 March 2016 at 17:33, José Trujillo Carmona
>>  wrote:
>>
>> I do not agree with Carlos.
>>
>> If the temporal structure came from a deterministic model,
>> as if time were a variable

Re: [R] convergence issues with coxph

2016-03-30 Thread Therneau, Terry M., Ph.D.
Failure to converge in a coxph model is very rare.  If the program does not make it in 20 
iterations it likely will never converge, so your control argument will do little.


Without the data set I have no way to guess what is happening.  My first question, 
however, is to ask how many events you have, e.g. table(bca).  I count 19 covariates on 
the right-hand side, and a good rule of thumb is that one should have at least 5-10
endpoints per covariate for simple numerical stability and 10-20 for statistical 
stability.  That means 100-200 events.  Most medical data sets have fewer than this. A 
data set with 5000 rows and 4 death counts as "4" in this calculation by the way.
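A sketch of that check on simulated data ('ins_ca' and 'bca' are hypothetical stand-ins for the poster's objects):

```r
set.seed(42)
ins_ca <- data.frame(bca = rbinom(5000, 1, 0.01))   # 5000 rows, few events
table(ins_ca$bca)                # events vs. non-events
events <- sum(ins_ca$bca)        # number of events, not number of rows
p <- 19                          # covariates on the right-hand side
events / p                       # aim for at least 10-20 events per covariate
```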


  I am always interested in data sets that push the boundaries of the code and can look 
deeper if you want to send me a copy.  Details of how things are coded can matter, e.g., 
centered covariates.  Otherwise there is little we can do.


Terry Therneau


On 03/30/2016 05:00 AM, r-help-requ...@r-project.org wrote:

I am having convergence issues when I use coxph with a interaction term 
(glarg*bca_py) and interaction term with the restricted cubic spline 
(glarg*bca_time_ns). I use survival and spline package to create the Cox model 
and cubic splines respectively. Without the interaction term and/or spline, I 
have no convergence problem. I read some forums about changing the iterations 
and I have but it did not work. I was just wondering if I am using the 
iter.max and outer.max appropriately. I read the survival manual, other R-help 
and stackoverflow pages and it suggested changing the iterations but it doesn't 
specify what is the max I can go. I ran something similar in SAS and did not 
run into a convergence problem.

This is my code:

bca_time_ns <- ns(ins_ca$bca_py, knots=3, Boundary.knots=range(2,5,10))
test <- ins_ca$glarg*ins_ca$bca_py
test1 <- ins_ca$glarg*bca_time_ns

coxit <- coxph.control(iter.max=1, outer.max=1)

bca<-coxph(Surv(bca_py,bca) ~ glarg + test + test1 + age + calyr + diab_dur + 
hba1c + adm_met + adm_sulfo + adm_tzd + adm_oth +
 med_statin + med_aspirin + med_nsaids + bmi_cat + ccscat + alc + smk, 
data=ins_ca, control=coxit, ties=c("breslow"))


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compute the Gini coefficient

2016-03-30 Thread Achim Zeileis

On Wed, 30 Mar 2016, Erich Neuwirth wrote:




On 30 Mar 2016, at 02:53, Marine Regis  wrote:

Hello,

I would like to build a Lorenz curve and calculate a Gini coefficient in order 
to find how many parasites the top 20% most infected hosts support.

Here is my data set:

Number of parasites per host:
parasites = c(0,1,2,3,4,5,6,7,8,9,10)

Number of hosts associated with each number of parasites given above:
hosts = c(18,20,28,19,16,10,3,1,0,0,0)

To represent the Lorenz curve:
I manually calculated the cumulative percentage of parasites and hosts:

cumul_parasites <- cumsum(parasites)/max(cumsum(parasites))
cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
plot(cumul_hosts, cumul_parasites, type = "l")



Your values in hosts are frequencies. So you need to calculate

cumul_hosts = cumsum(hosts)/sum(hosts)
cumul_parasites = cumsum(hosts*parasites)/sum(hosts*parasites)


That's what I thought as well, but Marine explicitly said that the 'hosts' 
are _not_ weights. Hence I was confused about what this would actually mean.


Using the "ineq" package you can also do
plot(Lc(parasites, hosts))


The Lorenz curves starts at (0,0), so to draw it, you need to extend these 
vectors

cumul_hosts = c(0,cumul_hosts)
cumul_parasites = c(0,cumul_parasites)

plot(cumul_hosts, cumul_parasites, type = "l")


The Gini coefficient can be calculated as
library(reldist)
gini(parasites,hosts)


If you want to check, you can "recreate" the original data (number of parasites 
for each host) with

num_parasites = rep(parasites,hosts)

and
gini(num_parasites)

will also give you the Gini coefficient you want.








From this Lorenz curve, how can I calculate the Gini coefficient with the function "gini" 
in R (package reldist) given that the vector "hosts" is not a vector of weights ?


Thank you very much for your help.
Have a nice day
Marine


[[alternative HTML version deleted]]








Re: [R] Compute the Gini coefficient

2016-03-30 Thread Erich Neuwirth

> On 30 Mar 2016, at 02:53, Marine Regis  wrote:
> 
> Hello,
> 
> I would like to build a Lorenz curve and calculate a Gini coefficient in 
> order to find how many parasites the top 20% most infected hosts support.
> 
> Here is my data set:
> 
> Number of parasites per host:
> parasites = c(0,1,2,3,4,5,6,7,8,9,10)
> 
> Number of hosts associated with each number of parasites given above:
> hosts = c(18,20,28,19,16,10,3,1,0,0,0)
> 
> To represent the Lorenz curve:
> I manually calculated the cumulative percentage of parasites and hosts:
> 
> cumul_parasites <- cumsum(parasites)/max(cumsum(parasites))
> cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
> plot(cumul_hosts, cumul_parasites, type = "l")


Your values in hosts are frequencies. So you need to calculate

cumul_hosts = cumsum(hosts)/sum(hosts)
cumul_parasites = cumsum(hosts*parasites)/sum(hosts*parasites)

The Lorenz curves starts at (0,0), so to draw it, you need to extend these 
vectors

cumul_hosts = c(0,cumul_hosts)
cumul_parasites = c(0,cumul_parasites)

plot(cumul_hosts, cumul_parasites, type = "l")


The Gini coefficient can be calculated as
library(reldist)
gini(parasites,hosts)


If you want to check, you can "recreate" the original data (number of parasites 
for each host) with

num_parasites = rep(parasites,hosts)

and
gini(num_parasites)

will also give you the Gini coefficient you want.
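As a cross-check that needs no extra package, the Gini coefficient can also be computed from the frequencies in base R as one minus twice the trapezoidal area under the Lorenz curve (a sketch using the data from the question):

```r
parasites <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
hosts     <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# Lorenz curve coordinates, starting at (0, 0)
cumul_hosts     <- c(0, cumsum(hosts) / sum(hosts))
cumul_parasites <- c(0, cumsum(hosts * parasites) / sum(hosts * parasites))

# Gini = 1 - 2 * area under the Lorenz curve (trapezoidal rule)
area <- sum(diff(cumul_hosts) *
            (head(cumul_parasites, -1) + tail(cumul_parasites, -1)) / 2)
gini <- 1 - 2 * area
gini
```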



> 

>> From this Lorenz curve, how can I calculate the Gini coefficient with the 
>> function "gini" in R (package reldist) given that the vector "hosts" is not 
>> a vector of weights ?
> 
> Thank you very much for your help.
> Have a nice day
> Marine
> 
> 
>   [[alternative HTML version deleted]]
> 




Re: [R] Convergence issues when using ns splines (pkg: spline) in Cox model (coxph) even when changing coxph.control

2016-03-30 Thread Göran Broström

Hi Jennifer,

see below.

On 2016-03-29 22:47, Jennifer Wu, Miss wrote:

Hi,

I am currently using R v3.2.3 and on Windows 10 OS 64Bit.

I am having convergence issues when I use coxph with a interaction
term (glarg*bca_py) and interaction term with the restricted cubic
spline (glarg*bca_time_ns).


Comment on interactions: (i) You should not create interaction terms 
'manually' but rather in the formula in the call to coxph ('*' means 
different things in a formula and on the command line). That assures 
that you get the main effects included. In your formula, you are missing 
the main effects 'bca_py' and 'bca_time_ns'. Is that intentional? In a 
few exceptional cases it may make sense to drop a main effect, but 
generally not. (ii) Convergence problems may also occur with 
interactions of variables that are not centered, so try to center 
involved covariates.
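A sketch of points (i) and (ii) on simulated data — the variables here (glarg, age) are stand-ins for the poster's, and for illustration the spline interaction is taken over a centered covariate rather than over follow-up time:

```r
library(survival)   # ships with R
library(splines)    # for ns()

set.seed(1)
n <- 500
d <- data.frame(
  time  = rexp(n, rate = 0.2),    # follow-up time
  event = rbinom(n, 1, 0.4),      # event indicator
  glarg = rbinom(n, 1, 0.5),
  age   = rnorm(n, 60, 10)
)
d$age_c <- d$age - mean(d$age)    # (ii) center the covariate

# (i) '*' inside the formula adds both main effects and the interaction
fit <- coxph(Surv(time, event) ~ glarg * ns(age_c, df = 3), data = d)
length(coef(fit))                 # 1 + 3 + 3 = 7 coefficients
```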


Your 'coxph.control' values have nothing to do with your convergence 
problem.


And try to provide reproducible code, in your case, the data set. If it 
is not possible, maybe you can scale it down to include only the 
problematic variables (with some fake values, if necessary) and just a 
few cases, but enough to show your problem.


Göran Broström

> I use survival and spline package to

create the Cox model and cubic splines respectively. Without the
interaction term and/or spline, I have no convergence problem. I read
some forums about changing the iterations and I have but it did not
work. I was just wondering if I am using the iter.max and outer.max
appropriately. I read the survival manual, other R-help and
stackoverflow pages and it suggested changing the iterations but it
doesn't specify what is the max I can go. I ran something similar in
SAS and did not run into a convergence problem.

This is my code:

bca_time_ns <- ns(ins_ca$bca_py, knots=3,
Boundary.knots=range(2,5,10)) test <- ins_ca$glarg*ins_ca$bca_py
test1 <- ins_ca$glarg*bca_time_ns

coxit <- coxph.control(iter.max=1, outer.max=1)

bca<-coxph(Surv(bca_py,bca) ~ glarg + test + test1 + age + calyr +
diab_dur + hba1c + adm_met + adm_sulfo + adm_tzd + adm_oth +
med_statin + med_aspirin + med_nsaids + bmi_cat + ccscat + alc + smk,
data=ins_ca, control=coxit, ties=c("breslow"))


This is the error message I get:

Warning message: In fitter(X, Y, strats, offset, init, control,
weights = weights,  : Ran out of iterations and did not converge



[[alternative HTML version deleted]]






Re: [R] Filtering based on the occurrence

2016-03-30 Thread Giorgio Garziano

# your code
Subject<- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
dates <- seq(as.Date('2011-01-01'),as.Date('2011-01-12'), by = 1)
deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
df <- data.frame(Subject, dates, deps)
df
final <- c("2 2011-01-02 B", "2 2011-01-03 C", "3 2011-01-05 D", "3 2011-01-06 A",
           "4 2011-01-07 F", "4 2011-01-08 G", "5 2011-01-10 F", "5 2011-01-11 A",
           "5 2011-01-12 D")

# here below my code
dep.list <- c("B", "D", "F")
sel.row <- NULL
for (dep in dep.list) {
  f <- which(df$deps == dep)
  sel.row <- c(sel.row, c(f, f + 1))
}
sel.row[sel.row > nrow(df)] <- NA
sel.row <- na.omit(sel.row)
df.sel <- df[sel.row, ]
df.sel.ord <- df.sel[order(df.sel$dates), ]

# showing and comparing with final
> df.sel.ord
   Subject      dates deps
2        2 2011-01-02    B
3        2 2011-01-03    C
5        3 2011-01-05    D
6        3 2011-01-06    A
7        4 2011-01-07    F
8        4 2011-01-08    G
10       5 2011-01-10    F
11       5 2011-01-11    A
12       5 2011-01-12    D

> data.frame(final)
            final
1 2 2011-01-02 B
2 2 2011-01-03 C
3 3 2011-01-05 D
4 3 2011-01-06 A
5 4 2011-01-07 F
6 4 2011-01-08 G
7 5 2011-01-10 F
8 5 2011-01-11 A
9 5 2011-01-12 D

--

Best

GG


[[alternative HTML version deleted]]



Re: [R] Filtering based on the occurrence

2016-03-30 Thread Jim Lemon
Hi Farnoosh,
Despite my deep suspicion that this answer will solve a useless
problem, try this:

last_subject<-0
keep_deps<-c("B","D","F")
keep_rows<-NULL
for(rowindex in 1:dim(df)[1]) {
 if(df[rowindex,"Subject"] != last_subject) {
  last_subject<-df[rowindex,"Subject"]
  start_keeping<-0
 }
 if(df[rowindex,"deps"] %in% keep_deps) start_keeping<-1
 if(start_keeping) keep_rows<-c(keep_rows,rowindex)
}
final<-matrix(unlist(lapply(df[keep_rows,],as.character)),ncol=3)
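The same "keep everything from the first B/D/F onward, within each subject" rule can also be written without an explicit loop, using ave() (a sketch on the data from the question):

```r
Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
dates   <- seq(as.Date("2011-01-01"), as.Date("2011-01-12"), by = 1)
deps    <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
df <- data.frame(Subject, dates, deps)

keep_deps <- c("B", "D", "F")
# running count of target departments within each subject:
# positive from the first occurrence onward
hit <- ave(as.integer(df$deps %in% keep_deps), df$Subject, FUN = cumsum)
final2 <- df[hit > 0, ]
final2
```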

I find it terribly hard to ignore puzzles.

Jim


On Wed, Mar 30, 2016 at 10:52 AM, Farnoosh Sheikhi via R-help
 wrote:
> Hello,
> I have a data set similar to the one below and I wanted to keep the observations
> after the first occurrence of these departments: "B", "D", "F". For example, for
> ID=2, the observation with deps=B and anything after will be kept in the
> data. For ID=3, observations with deps=D and anything after will be included.
>
> Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
> dates <- seq(as.Date('2011-01-01'), as.Date('2011-01-12'), by = 1)
> deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
> df <- data.frame(Subject, dates, deps)
> df
>
> The final data should look like this:
> final <- c("2 2011-01-02 B", "2 2011-01-03 C", "3 2011-01-05 D", "3 2011-01-06 A",
>            "4 2011-01-07 F", "4 2011-01-08 G", "5 2011-01-10 F", "5 2011-01-11 A",
>            "5 2011-01-12 D")
>
> Thank you tons for your help.
> Farnoosh
>
>
> [[alternative HTML version deleted]]
>



Re: [R-es] Mann-Whitney con datos temporales

2016-03-30 Thread José Trujillo Carmona
My apologies to the list members for my clumsiness and the mix-up of 
the messages.

I usually just hit "reply" to the messages I answer, and on this list I 
have to remember to click "reply to list".

I sent my last reply only to Carlos, and he has answered pointing out the 
procedural slip and explaining better under which quite plausible 
situation the difference would indeed remove the autocorrelation.

I think there is no objection to his current view: if the temporal 
effect is exactly the same, not shifted in time or in intensity (always 
4 degrees, with the difference taken at the same time), then 
differencing would indeed remove the autocorrelation.

As I said in my previous message, finding out which case we are in is 
easy: take the difference and test for independence.

Regards.


El 30/03/16 a las 01:07, Carlos J. Gil Bellosta escribió:
> Hola, ¿qué tal?
>
> Me has escrito solo a mí. No sé si querías mandar el mensaje a la 
> lista o no.
>
> En cualquier caso, estamos de acuerdo. Bajo tus hipótesis, no tengo 
> nada que objetar.
>
> Yo tenía en mente otra estructura para el problema: la del que dice 
> "en Colmenar [siempre] hace 4 grados menos que en Madrid". Es decir, 
> que si la temperatura de Madrid es de 12 grados, en Colmenar estará 
> haciendo alrededor de 8. De otra manera, T_c = T_m - N(4, sigma).
>
> No sé cómo de lejos estarán los sensores del tipo que ha escrito la 
> pregunta, pero _mi_ estructura probabilística puede justificarse en 
> algunos casos. Para Madrid y Colmenar, por ejemplo. He bajado las 
> temperaturas de las últimas 24 horas en Madrid (Barajas) y Colmenar 
> 
>  
> (en el problema original también había una serie de 24 medidas) y mira:
>
> aeropuerto <- 
> read.csv("/home/carlos/Downloads/ultimosdatos_3129_datos-horarios.csv", skip 
> = 2, fileEncoding = "latin1")
> aeropuerto <- aeropuerto[,2]
>
> colmenar <- 
> read.csv("/home/carlos/Downloads/ultimosdatos_3191E_datos-horarios.csv", 
> skip = 2, fileEncoding = "latin1")
> colmenar <- colmenar[,2]
>
> temperaturas <-
> structure(list(aeropuerto = c(10.9, 12.7, 14.9, 15.8, 17.5, 18.5,
> 18.8, 18.4, 17.9, 17.4, 16.1, 14.9, 13.6, 12.8, 11.5, 10.5, 9.9,
> 9.8, 9.8, 9.7, 9.4, 8.8, 9.9, 11), colmenar = c(8.4, 9.4, 10,
> 11.2, 12.5, 14.3, 14.1, 14.3, 13.5, 12.9, 12.2, 11.4, 10.3, 8.4,
> 7.3, 7, 7.1, 6.6, 6.4, 6.3, 6.2, 5.9, 5.7, 6.2)), .Names = c("aeropuerto",
> "colmenar"), row.names = c(NA, -24L), class = "data.frame")
>
> plot(aeropuerto, ylim = c(min(colmenar), max(aeropuerto)), type = "l")
> lines(colmenar, col = "red")
>
> Y si tomas diferencias verás que no parecen seguir ningún tipo de 
> patrón temporal.
>
> Ahora bien, ¿puedo hacer un t-test? Casi seguro que no se justifica 
> del todo por el hecho de que sobreestimo los grados de libertad 
> (piensa que podría tener tantos como quisiera tomando, por ejemplo, 
> medidas de temperatura cada nanosegundo). Pero no sería una solución 
> "tremendamente mala". Incluso podría ponerme en el lado conservador de 
> infraestimar el número de grados de libertad (i.e., usar una t de 
> Student con un par de grados de libertad y aún así encontrar 
> diferencias significativas).
>
> La otra alternativa sería crear un modelo que ajuste la temperatura en 
> función de la hora y la ubicación (p.e., usando GAM) y viendo si mi 
> coeficiente de la ubicación es significativamente distinto de cero. De 
> nuevo, todo lo anterior, bajo _mis_ hipótesis. Que seguro que no se 
> cumplen si las ubicaciones son Madrid y Santander.
>
> Ahora bien, no sabemos cuáles (las tuyas o las mías) son más creíbles 
> en el caso que da lugar a la pregunta. ¡No nos lo han dicho!
>
> Un saludo,
>
> Carlos J. Gil Bellosta
> http://www.datanalytics.com
>
>
>
>
> On 29 March 2016 at 17:33, José Trujillo Carmona
>  wrote:
>
> I do not agree with Carlos.
>
> If the temporal structure were given by a deterministic model, as if
> time were an extrinsic variable, and with the same function and the
> same parameters, Carlos would be right.
>
> But if the temporal structure is stochastic in nature, such as an
> ARIMA model for example, then it is not true that differencing
> removes the structure.
>
> A case in point. I will restrict myself to the MA(1) model, where it
> is easiest to prove; every ARIMA model can be expressed as an
> MA(inf), so what I say generalizes.
>
> In the MA(1) model the structure of the observations is:
>
> X(t) = m1 + e(t) + q e(t-1)
>
> where m1 is the mean of the series (for a model residual it is
> normally zero) and e(1), e(2), ... e(t) are white noise.
>
> A second series with the same structure (coefficient q) would be
> given by:
>
> Y(t) = m2 + f(t) + q f(t-1)
>
> where f(1), f(2), ... f(t) are likewise white noise
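
[Editorial note: the MA(1) setup above can be sketched in R. This is a
minimal illustration, not code from the thread; the coefficient q = 0.6,
the means, and the series length are made-up values.]

```r
# Two MA(1) series sharing the same coefficient q but different means,
# as in X(t) = m1 + e(t) + q*e(t-1) and Y(t) = m2 + f(t) + q*f(t-1).
set.seed(42)
q <- 0.6
x <- 10 + arima.sim(model = list(ma = q), n = 200)  # m1 = 10
y <- 12 + arima.sim(model = list(ma = q), n = 200)  # m2 = 12

# A naive t-test treats the observations as independent, overstating
# the effective degrees of freedom:
t.test(x, y)

# Differencing does not remove the stochastic structure: the
# differenced series is MA(2) and still autocorrelated at low lags.
acf(diff(x), plot = FALSE)
```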

Re: [R] how can I count data points outside the main plot line?

2016-03-30 Thread PIKAL Petr
Hi Raz

Keep your responses on the R-help list.
Did you try the residuals() function?


DNase1 <- subset(DNase, Run == 1)
DNase1[12,3]<-0.1
fm1DNase1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1)
resid(fm1DNase1)
[1] -3.273213e-02 -3.173213e-02 -7.798226e-03 -4.798226e-03 -7.665433e-05
[6]  8.923346e-03  4.956802e-02  4.656802e-02  9.936101e-02  9.436101e-02
[11]  2.233609e-01 -6.956391e-01  1.334304e-01  1.634304e-01 -2.106923e-02
[16] -4.106923e-02
attr(,"label")
[1] "Residuals"
plot(resid(fm1DNase1))

You can use the threshold for selecting values departing from your model.

> sum(abs(resid(fm1DNase1))>0.4)
[1] 1
> sum(abs(resid(fm1DNase1))>0.2)
[1] 2
> sum(abs(resid(fm1DNase1))>0.1)
[1] 4
>
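
[Editorial note: building on Petr's example, which() returns the positions
of the departing points, not just their count. A sketch reusing the same
modified DNase1 data; the 0.2 threshold is as arbitrary as in the example
above.]

```r
# Reuse Petr's setup: take Run 1 of the DNase data and inject one
# outlying density value, then fit the self-starting logistic model.
DNase1 <- subset(DNase, Run == 1)
DNase1[12, 3] <- 0.1
fm1DNase1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1)

# Indices of the points whose residuals exceed the threshold:
out <- which(abs(resid(fm1DNase1)) > 0.2)
DNase1[out, ]                 # inspect the offending rows
length(out) / nrow(DNase1)    # ratio of outliers to all points
```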

Cheers
Petr


From: raz [mailto:barvazd...@gmail.com]
Sent: Wednesday, March 30, 2016 8:31 AM
To: PIKAL Petr 
Subject: Re: [R] how can I count data points outside the main plot line?

Hi Petr,
Thanks for your reply. If you have time, could you elaborate on your
suggestions? I don't understand what you meant.
Thanks
Raz

On Tue, Mar 29, 2016 at 12:14 PM, PIKAL Petr 
> wrote:
Hi

Did you try residuals() and/or influence.measures()?

Cheers
Petr


-Original Message-
From: R-help 
[mailto:r-help-boun...@r-project.org] On 
Behalf Of raz
Sent: Tuesday, March 29, 2016 10:51 AM
To: r-help@r-project.org
Subject: [R] how can I count data points outside the main plot line?

How can I count data points that lie outside of the main plot line?
I have a plot in which most data points form a sigmoid curve, but some are
spread throughout the plot area and do not fit the curve. I would like to
count them, to know the ratio between the points on the main curve and the
points that do not fit. Any ideas?

Thanks,

Raz

--
\m/

__
R-help@r-project.org mailing list -- To 
UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately 
accept such offer; The sender of this e-mail (offer) excludes any acceptance of 
the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is concluded only upon an 
express mutual agreement on all its aspects.
- the sender of this e-mail informs that he/she is not authorized to enter into 
any contracts on behalf of the company except for cases in which he/she is 
expressly authorized to do so in writing, 

Re: [R] R logo size in package information

2016-03-30 Thread Marc Girondot via R-help

Le 30/03/2016 06:18, Jeff Newmiller a écrit :
You are not clarifying yet. If this requires RStudio to reproduce then 
this question doesn't belong here. I am not yet convinced that RStudio 
IS required, but every time you mention it the water gets muddier.

I will try to be brief:

There is a bad interaction between the Rlogo.svg installed by R and
RStudio, so there are two solutions: either RStudio changes the way it
scales Rlogo.svg, or Rlogo.svg is scaled differently during the R install.
Here is an Rlogo.svg correctly scaled to replace the original version that
is located at
/Library/Frameworks/R.framework/Versions/3.3/Resources/doc/html/Rlogo.svg
for the R 3.3 Mac OS X version:

http://www.ese.u-psud.fr/epc/conservation/CRAN/Rlogo.svg

Marc
