[R] Unordered combinations with repetition

2015-06-09 Thread Thomas Chesney
Does anyone know of a function that will return all unordered combinations of n 
elements from a list with repetition?

The combs function in caTools will do this without repetition:

combs(1:2, 2)

     [,1] [,2]
[1,]    1    2

What I'd like is:

1 1
1 2
2 2

Thank you,

Thomas Chesney



This message and any attachment are intended solely for the addressee
and may contain confidential information. If you have received this
message in error, please send it back to me, and immediately delete it. 

Please do not use, copy or disclose the information contained in this
message or in any attachment.  Any views or opinions expressed by the
author of this email do not necessarily reflect the views of the
University of Nottingham.

This message has been checked for viruses but the contents of an
attachment may still contain software viruses which could damage your
computer system, you are advised to perform your own checks. Email
communications with the University of Nottingham may be monitored as
permitted by UK legislation.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unordered combinations with repetition

2015-06-09 Thread WRAY NICHOLAS
You could try expand.grid -- you'd prob need to modify what's beneath

a=c(0,1,2)
b=c(0,1)
c=c(0,1)
y<-list()
y[[1]]<-a
y[[2]]<-b
y[[3]]<-c
expand.grid(y)

This code gives all combinations
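
Note that expand.grid returns every ordered tuple, so for the original question the rows still need filtering. A minimal sketch for pairs, assuming the column names first and second (my own labels, not from the thread) and keeping only rows in non-decreasing order:

```r
# All ordered pairs drawn from 1:2, then keep one representative per
# unordered pair by requiring first <= second.
g <- expand.grid(first = 1:2, second = 1:2)
unordered <- g[g$first <= g$second, ]
print(unordered, row.names = FALSE)
```

This leaves the three rows asked for (1 1, 1 2, 2 2). The grid itself has n^k rows before filtering, so this route is only practical for small inputs.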




Re: [R] Unordered combinations with repetition

2015-06-09 Thread Thomas Chesney
Thank you Nicholas.

I've found that urnsamples in the prob package does it too:

urnsamples(1:2, size = 2, replace = TRUE, ordered = FALSE)

Thomas



Re: [R] load a very big .RData - error reading from connection

2015-06-09 Thread Jim Lemon
Hi carol,
Have you tried renaming the file to something like "my.RData"? And
just how big is it?

Jim


On Tue, Jun 9, 2015 at 5:50 AM, carol white via R-help wrote:
> Hi, how is it possible to load a very big .RData file? It can't be loaded
> because it's very big, and the following error message is displayed:
>
> load(".RData")
> Error: error reading from connection
> Thanks
> Carol


Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail

2015-06-09 Thread Meyners, Michael
Thanks Martin. 
Yep, I understand it is documented and my code wasn't as it should've been -- 
the confusion comes from the fact that it worked ok for hundreds of situations 
that seem very much alike, but one situation breaks. I agree that you typically 
can't be sure about having only numerical data in the data frame, but I was 
sure I had by design (numeric results of simulations, so no factors or anything 
else) and was then sloppy in passing the rows of the data frame to ecdf. So 
wondering what makes this situation different from all the others I had... 
Anyway, point taken and working solution found, so all fine :-)
Cheers, Michael

> -----Original Message-----
> From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
> Sent: Montag, 8. Juni 2015 16:43
> To: Meyners, Michael
> Cc: r-help@r-project.org
> Subject: Re: [R] mismatch between match and unique causing ecdf (well,
> approxfun) to fail
> 
> 
> > Aehm, adding on this: I incorrectly *assumed* without testing that
> > rounding would help; it doesn't:
> > ecdf(round(test2,0))  # a rounding that is way too rough for my application...
> > #Error in xy.coords(x, y) : 'x' and 'y' lengths differ
> >
> > Digging deeper: The initially mentioned call to unique() is not very
> > helpful, as test2 is a data frame, so I get what I deserve, an unchanged
> > data frame with 1 row. Still, the issue remains and can even be
> > simplified further:
> >
> > > ecdf(data.frame(a=3, b=4))
> > Empirical CDF
> > Call: ecdf(data.frame(a = 3, b = 4))
> >  x[1:2] =  3,  4
> >
> > works ok, but
> >
> > > ecdf(data.frame(a=3, b=3))
> > Error in xy.coords(x, y) : 'x' and 'y' lengths differ
> >
> > doesn't (same for a=b=1 or 2, so likely the same for any a=b).
> > Instead,
> >
> > > ecdf(c(a=3, b=3))
> > Empirical CDF
> > Call: ecdf(c(a = 3, b = 3))
> >  x[1:1] =  3
> >
> > does the trick. From ?ecdf, I get that x should be a numeric vector -
> > apparently, my misuse of the function by applying it to a row of a data
> > frame (i.e. a data frame with one row). In all my other (dozens of) cases
> > that worked ok, though, but not for this particular one. A simple unlist()
> > helps:
> 
> You were lucky.   To use a one-row data frame instead of a
> numerical vector will typically *not* work unless ... well, you are lucky.
> 
> No, do *not*  pass data frame rows instead of numeric vectors.
> 
> >
> > > ecdf(unlist(data.frame(a=3, b=3)))
> > Empirical CDF
> > Call: ecdf(unlist(data.frame(a = 3, b = 3)))
> >  x[1:1] =  3
> >
> > Yet, I'm even more confused than before: in my other data, there were
> > also duplicated values in the vector (1-row-data frame), and it never
> > caused any issue. For this particular example, it does. I must be missing
> > something fundamental...
> >
> 
> well.. I'm confused about why you are confused, but if you are thinking
> about passing rows of data frames as numeric vectors, this means you are
> sure that your data frame only contains "classical numbers" (no factors, no
> 'Date's, no...).
> 
> In such a case, transform your data frame to a numerical matrix
> *once* preferably using  data.matrix() instead of just
> as.matrix() but in this case it should not matter.
> Then *check* the result and then work with that matrix from then on.
> 
> All other things probably will continue to leave you confused ..
> ;-)
> 
> Martin Maechler,
> ETH Zurich
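
The advice above (convert the data frame to a numeric matrix once, check it, then work with rows of the matrix) can be sketched as follows. The data frame sims and its values are made up for illustration; only data.matrix() and ecdf() come from the thread.

```r
# Hypothetical simulation results: one row per run, numeric columns only.
sims <- data.frame(a = c(3, 1), b = c(3, 2), c = c(4, 2))

# Convert once, check the result, then work with the matrix from then on.
m <- data.matrix(sims)
stopifnot(is.matrix(m), is.numeric(m))

# m[1, ] is a plain (named) numeric vector, so duplicated values are fine.
Fn <- ecdf(m[1, ])
Fn(3)   # proportion of the values in row 1 that are <= 3
```

Because a matrix row is an ordinary numeric vector, ecdf() receives the input that ?ecdf documents, whether or not the row contains ties.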



Re: [R] Summarizing data based on Date

2015-06-09 Thread Shivi82
Hi Petr 

I researched a lot over the net and R manual as well based on which I
revamped my code and came to the code as:
test$CR_DT <- as.Date(test$CR_DT, '%d-%b-%y')

iii<- aggregate(test$CHG_WT,list(format(test$CR_DT,"%m")),FUN=sum)

However it still gives me the error as below:
Error in Summary.factor(c(1L, 1L, 1L, 3286L, 1646L, 3241L, 1L, 1L, 1307L,  : 
  ‘sum’ not meaningful for factors. 

Could you guide me on how to achieve the desired output? Thanks.



--
View this message in context: 
http://r.789695.n4.nabble.com/Summarizing-data-based-on-Date-tp4708328p4708384.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] load a very big .RData - error reading from connection

2015-06-09 Thread carol white via R-help
Yes, and it doesn't help. The file is 600 MB.
Thanks
Carol 



Re: [R] Summarizing data based on Date

2015-06-09 Thread David L Carlson
What does the following command print out?

str(test)

The error message indicates that test$CHG_WT is not numeric.
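
If str(test) does show CHG_WT as a factor, one common fix is to convert it through character before summing. The sketch below uses a small made-up stand-in for test (the real data was not posted), with dates already of class Date so that the example does not depend on the locale:

```r
# Hypothetical stand-in for 'test': CHG_WT was read in as a factor.
test <- data.frame(
  CR_DT  = as.Date(c("2015-03-11", "2015-03-13", "2015-04-02")),
  CHG_WT = factor(c("770", "3.73", "70"))
)
str(test)

# Go through character first; as.numeric() on the factor alone would
# return the internal level codes, not the weights.
test$CHG_WT <- as.numeric(as.character(test$CHG_WT))

# Sum the weights by month, as in the original aggregate() call.
by_month <- aggregate(test$CHG_WT,
                      list(month = format(test$CR_DT, "%m")),
                      FUN = sum)
by_month
```

It is also worth checking why the column became a factor in the first place; a stray non-numeric entry in the input file is a frequent cause, and as.numeric(as.character()) will turn such entries into NA with a warning.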

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


Re: [R] load a very big .RData - error reading from connection

2015-06-09 Thread Boris Steipe
There are several possible reasons and you have really told us nothing that
might help isolate the problem. 600 MB is large, but not "very" large. R and
your OS should not be expected to have a problem with files of that size. First
of all, you'll need to document why you expect this should work in the first
place.

- is the file where you think it is?  ?dir
- do you actually have read access? ?file.access
- was the file produced by a program that writes correctly formatted .RData
files?
- does this work with other files that are similarly formatted?

etc.
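
The first checks in the list above can be run in a few lines. This is only a sketch: it writes a small stand-in file to a temporary path so that it runs anywhere, and the names path, checks and loaded are mine. Point path at the real file to use it.

```r
# Stand-in file so the sketch is self-contained; replace 'path' as needed.
path <- tempfile(fileext = ".RData")
x <- 1:10
save(x, file = path)

checks <- list(
  exists   = file.exists(path),                          # is it where we think?
  readable = unname(file.access(path, mode = 4) == 0),   # read permission?
  size_mb  = file.info(path)$size / 1024^2               # size on disk
)
str(checks)

# Load into a separate environment and capture the error message, if any.
e <- new.env()
loaded <- tryCatch(load(path, envir = e),
                   error = function(err) conditionMessage(err))
loaded
```

On success, loaded holds the names of the restored objects; on failure it holds the error text, which is more useful to post than the bare statement that the file cannot be loaded.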


B.



Re: [R] Unordered combinations with repetition

2015-06-09 Thread William Dunlap
> combnWithRepetition <- function(n, k) combn(n+k-1, k) - seq(from=0, len=k)
> combnWithRepetition(2, 2)
     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    1    2    2
> combnWithRepetition(3, 2)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    2    2    3
[2,]    1    2    3    2    3    3
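
The function works because subtracting 0, 1, ..., k-1 from each strictly increasing column of combn(n+k-1, k) gives a non-decreasing column with entries in 1..n, and the mapping is one-to-one. A hedged sketch that applies the same idea to an arbitrary vector and returns one combination per row, as in the original question; the name combs_with_rep is mine and does not come from any package:

```r
# Unordered combinations with repetition of k elements drawn from x,
# one combination per row. Built on the combn() shift shown above.
combs_with_rep <- function(x, k) {
  n <- length(x)
  # k x choose(n+k-1, k) matrix of indices into x, each column non-decreasing
  idx <- combn(n + k - 1, k) - seq(from = 0, length.out = k)
  # Transpose so each combination is a row, then map indices to elements
  matrix(x[t(idx)], ncol = k)
}

combs_with_rep(c("a", "b"), 2)
nrow(combs_with_rep(1:3, 2))   # equals choose(3 + 2 - 1, 2)
```

The row layout matches the output requested in the first message of the thread; drop the transpose if the column layout of combn() is preferred.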



Bill Dunlap
TIBCO Software
wdunlap tibco.com



[R] how to reach a txt file like this?

2015-06-09 Thread Ye Lin
Hey All, I have a txt data file that looks like this:

[{“ID”:“A”,“Name":"Tom", "Age":"18"},{“ID”:“B”,“Name":"Jim", "Age":"19"}]


How can I read this into R as a data frame? I have used readLines to read
all the lines but don't know how to deal with column names and inputs.

Thanks for your help!


Re: [R] Summarizing data based on Date

2015-06-09 Thread John Kane
Hi,

As David said, have a look at str(test). You have a factor in there, or else
that weird "list(format(test$CR_DT,"%m"))" command in aggregate() is mucking
things up.  What is "list(format(test$CR_DT,"%m"))" intended to do?  No, a
quick test says it is mucking something else up and not giving us the factor
problem.

Here is your sample data and what I think you are trying to do. Note that
the data is supplied in dput() format, which is the preferred way to supply
sample data to the R-help list.  See
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
and http://adv-r.had.co.nz/Reproducibility.html for more information.  I used
lubridate's dmy() function rather than as.Date() to format the dates.

dat1  <-  structure(list(dd = structure(c(1426204800, 142776, 1426377600, 
1426550400, 1426550400, 1426032000, 1426032000, 1426723200), tzone = "UTC", 
class = c("POSIXct", 
"POSIXt")), wt = c(0, 0, 0, 770, 3.73, 70, 10, 500)), .Names = c("dd", 
"wt"), row.names = c(NA, -8L), class = "data.frame")

str(dat1)

aggregate(dat1$wt, list(dat1$dd), sum)


John Kane
Kingston ON Canada



Re: [R] Unordered combinations with repetition

2015-06-09 Thread William Dunlap
That combnWithRepetition (based on combn) can use much
less memory (and time) than the algorithm in prob:::urnsamples.default
with replace=TRUE, ordered=FALSE.  Perhaps urnsamples()
could be updated to use combn instead of unique(as.matrix(expand.grid())).

See the urn chapter in Feller vol. 1.
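
To put rough numbers on the memory point: a grid of all ordered tuples has n^k rows before deduplication, while the combn route only ever builds choose(n+k-1, k) columns. The sketch below compares the two counts and includes a small grid-then-deduplicate function in the spirit of the unique(as.matrix(expand.grid())) approach mentioned above. It is my own illustration (via_grid is a made-up name) and not the actual code in the prob package.

```r
# Sizes of the intermediate objects for n = 10 elements, k = 5 draws.
n <- 10; k <- 5
c(ordered_tuples = n^k, unordered_with_rep = choose(n + k - 1, k))

# Grid-then-deduplicate route, for comparison with the combn() route:
# build every ordered tuple, sort within each row, drop duplicate rows.
via_grid <- function(n, k) {
  g <- as.matrix(expand.grid(rep(list(seq_len(n)), k)))
  sorted <- t(matrix(apply(g, 1, sort), nrow = k))
  unique(sorted)
}

nrow(via_grid(4, 3)) == choose(4 + 3 - 1, 3)
```

Both routes give the same set of combinations; the difference is only in how much is generated and then thrown away.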

Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [R] how to reach a txt file like this?

2015-06-09 Thread John McKown
On Tue, Jun 9, 2015 at 11:24 AM, Ye Lin wrote:

> Hey All, I have a txt data file that looks like this:
>
> [{“ID”:“A”,“Name":"Tom", "Age":"18"},{“ID”:“B”,“Name":"Jim", "Age":"19"}]
>
>
> How can I read this into R as a data frame? I have used readLines to read
> all the lines but don't know how to deal with column names and inputs.
>

That looks like a JSON array of objects to me. I would look into
"jsonlite", "rjson", or "RJSONIO" on CRAN. You'll need to review them to
see which best meets your needs.
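
As a rough sketch of the jsonlite route: the sample as posted mixes curly and straight quotes, which may just be an artifact of the HTML mail, but a JSON parser will reject curly quotes, so the example normalizes them first. It assumes jsonlite is installed and skips the parsing step otherwise; the variable names are mine.

```r
# The sample as posted, with the curly quotes written as \u escapes.
raw <- '[{\u201cID\u201d:\u201cA\u201d,\u201cName":"Tom", "Age":"18"},{\u201cID\u201d:\u201cB\u201d,\u201cName":"Jim", "Age":"19"}]'

# Replace left and right curly double quotes with plain double quotes.
clean <- gsub("\u201c|\u201d", "\"", raw)

# An array of objects with the same keys becomes a data frame.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  people <- jsonlite::fromJSON(clean)
  people$Age <- as.integer(people$Age)   # values arrive as strings
  str(people)
}
```

For a file on disk, paste(readLines("yourfile.txt"), collapse = "") would produce the raw string; the file name here is a placeholder.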


>
> Thanks for your help!
>
> [[alternative HTML version deleted]]
>

Please change to plain text. In many cases HTML displays poorly due to
the list trying to change it for you to plain text. And, in that case,
you'll likely be ignored.


-- 
Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

My sister opened a computer store in Hawaii. She sells C shells down by the
seashore.
If someone tell you that nothing is impossible:
Ask him to dribble a football.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown


Re: [R] Cross-over Data with Kenward-Roger correction

2015-06-09 Thread Ben Bolker
knouri  yahoo.com> writes:

> 
> Dear all: for the following data, a two-period, two-treatment (A=1 vs. B=2)
> cross-over is fitted
> using the following SAS code.
> data one;

[snip]

> run;
> proc mixed data=one method=reml;
> class Sbj Per Trt;
>    model PEF = Per Trt /ddfm=kr;
>    repeated Trt / sub=Sbj type=un r;
>    lsmeans Trt / cl alpha=0.05;
>    estimate 'B vs. A' Trt -1  1 / alpha=0.1 cl;
> run;

> (where the kr option is for the Kenward-Roger method). I need to use R to
> reproduce results similar to what the above SAS code generates.
> I have used several R functions, including lme and lmer, with no success
> so far. Any advice will be greatly appreciated. Sincerely, Keramat


This is more appropriate for r-sig-mixed-mod...@r-project.org.
Please post followups there.

The lmerTest and lsmeans packages will probably be useful.

As a statistical point, I don't understand why you can't just
do a paired t-test on these data??

dat <- read.table(header=TRUE,text=
"Sbj Seq Per Trt PEF
1 1 1 1 310
1 1 2 2 270
4 1 1 1 310
4 1 2 2 260
6 1 1 1 370
6 1 2 2 300
7 1 1 1 410
7 1 2 2 390
10 1 1 1 250
10 1 2 2 210
11 1 1 1 380
11 1 2 2 350
14 1 1 1 330
14 1 2 2 365
2 2 1 2 370
2 2 2 1 385
3 2 1 2 310
3 2 2 1 400
5 2 1 2 380
5 2 2 1 410
9 2 1 2 290
9 2 2 1 320
12 2 1 2 260
12 2 2 1 340
13 2 1 2 90
13 2 2 1 220")

library(lmerTest)
library(ggplot2); theme_set(theme_bw())
ggplot(dat,aes(x=Per,y=PEF,colour=factor(Trt)))+geom_point()+
geom_line(colour="gray",aes(group=Sbj))+
scale_x_continuous(breaks=c(1,2))

m1 <- lmer(PEF~Per+Trt +(Trt|Sbj), data=dat)

## warning about unidentifiability
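
A minimal sketch of the paired t-test suggested above, using the same data. It assumes the period effect can be ignored, which is an assumption of this sketch and exactly what the Per term in the mixed model would otherwise account for. The object names a, b and tt are mine.

```r
dat <- read.table(header = TRUE, text = "
Sbj Seq Per Trt PEF
1 1 1 1 310
1 1 2 2 270
4 1 1 1 310
4 1 2 2 260
6 1 1 1 370
6 1 2 2 300
7 1 1 1 410
7 1 2 2 390
10 1 1 1 250
10 1 2 2 210
11 1 1 1 380
11 1 2 2 350
14 1 1 1 330
14 1 2 2 365
2 2 1 2 370
2 2 2 1 385
3 2 1 2 310
3 2 2 1 400
5 2 1 2 380
5 2 2 1 410
9 2 1 2 290
9 2 2 1 320
12 2 1 2 260
12 2 2 1 340
13 2 1 2 90
13 2 2 1 220")

# One row per subject under each treatment, aligned by subject id.
a <- dat[dat$Trt == 1, ]
b <- dat[dat$Trt == 2, ]
a <- a[order(a$Sbj), ]
b <- b[order(b$Sbj), ]
stopifnot(identical(a$Sbj, b$Sbj))

# Within-subject comparison of treatment B (2) against treatment A (1).
tt <- t.test(b$PEF, a$PEF, paired = TRUE)
tt
```

The result will not match the SAS output, since no period term or Kenward-Roger correction is involved; it is only a quick check on the size and direction of the treatment difference.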


Re: [R] web scraping image

2015-06-09 Thread boB Rudis
You can also do it with rvest & httr (but that does involve some "parsing"):

library(httr)
library(rvest)

url <- "http://nwis.waterdata.usgs.gov/nwis/peak?site_no=12144500&agency_cd=USGS&format=img"
html(url) %>%
  html_nodes("img") %>%
  html_attr("src") %>%
  paste0("http://nwis.waterdata.usgs.gov", .) %>%
  GET(write_disk("12144500.gif")) -> status

Very readable and can be made programmatic pretty easily, too. Plus:
avoids direct use of the XML library. Future versions will no doubt
swap xml2 for XML as well.

-Bob
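
For comparison, the string-based idea quoted further down (find the token following img src= in the page source and prepend the server URL) can be sketched in base R without any parsing package. The page text below is a made-up stand-in built around the relative path quoted in the thread, and extract_img_src is a hypothetical helper name; a regular expression like this is fragile compared with a real HTML parser.

```r
# Made-up page source containing the relative image path from the thread.
page <- '<html><body><img src="/nwisweb/data/img/USGS.12144500.19581112.20140309..0.peak.pres.gif" alt="peak"></body></html>'

# Return the src attribute of every <img> tag found in the text.
extract_img_src <- function(html_text) {
  tags <- regmatches(
    html_text,
    gregexpr('<img[^>]*src="[^"]*"', html_text, ignore.case = TRUE)
  )[[1]]
  sub('.*src="([^"]*)"', "\\1", tags, ignore.case = TRUE)
}

# Prepend the server URL to get something download.file() could fetch.
img_url <- paste0("http://nwis.waterdata.usgs.gov", extract_img_src(page))
img_url
```

With a live page, paste(readLines(url), collapse = "") would supply the text, and download.file(img_url, "12144500.gif", mode = "wb") would save the image.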


On Mon, Jun 8, 2015 at 2:09 PM, Curtis DeGasperi
 wrote:
> Thanks to Jim's prompting, I think I came up with a fairly painless way to
> parse the HTML without having to write any parsing code myself using the
> function getHTMLExternalFiles in the XML package. A working version of the
> code follows:
>
> ## Code to process USGS peak flow data
>
> require(dataRetrieval)
> require(XML)
>
> ## Need to start with list of gauge ids to process
>
> siteno <- c('12142000','12134500','12149000')
>
> lstas <-length(siteno) #length of locator list
>
> print(paste('Processing...',siteno[1],' ',siteno[1], sep = ""))
>
> datall <-  readNWISpeak(siteno[1])
>
> for (a in 2:lstas) {
>   # Print station being processed
>   print(paste('Processing...',siteno[a], sep = ""))
>
>   dat<-  readNWISpeak(siteno[a])
>
>   datall <- rbind(datall,dat)
>
> }
>
> write.csv(datall, file = "usgs_peaks.csv")
>
> # Retrieve ascii text files and graphics
> for (a in 1:lstas) {
>
>   print(paste('Processing...',siteno[a], sep = ""))
>
>   graphic.url <-
> paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=',siteno[a],'&agency_cd=USGS&format=img',
> sep = "")
>   usgs.img <- getHTMLExternalFiles(graphic.url)
>   graphic.img <- paste('http://nwis.waterdata.usgs.gov',usgs.img, sep = "")
>
>   peakfq.url <-
> paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=',siteno[a],'&agency_cd=USGS&format=hn2',
> sep = "")
>   tab.url  <- 
> paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=',siteno[a],'&agency_cd=USGS&format=rdb',
> sep = "")
>
>   graphic.fn <- paste('graphic_',siteno[a],'.gif', sep = "")
>   peakfq.fn <- paste('peakfq_',siteno[a],'.txt', sep = "")
>   tab.fn  <- paste('tab_',siteno[a],'.txt', sep = "")
>   download.file(graphic.img,graphic.fn,mode='wb')
>   download.file(peakfq.url,peakfq.fn)
>   download.file(tab.url,tab.fn)
> }
>
>> --
>>
>> Message: 34
>> Date: Fri, 5 Jun 2015 08:59:04 +1000
>> From: Jim Lemon 
>> To: Curtis DeGasperi 
>> Cc: r-help mailing list 
>> Subject: Re: [R] web scraping image
>> Message-ID:
>> <
> ca+8x3fv0ajw+e22jayv1gfm6jr_tazua5fwgd3t_mfgfqy2...@mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> Hi Chris,
>> I don't have the packages you are using, but tracing this indicates
>> that the page source contains the relative path of the graphic, in
>> this case:
>>
>> /nwisweb/data/img/USGS.12144500.19581112.20140309..0.peak.pres.gif
>>
>> and you already have the server URL:
>>
>> nwis.waterdata.usgs.gov
>>
>> getting the path out of the page source isn't difficult, just split
>> the text at double quotes and get the token following "img src=". If I
>> understand the arguments of "download.file" correctly, the path is the
>> graphic.fn argument and the server URL is the graphic.url argument. I
>> would paste them together and display the result to make sure that it
>> matches the image you want. When I did this, the correct image
>> appeared in my browser. I'm using Google Chrome, so I don't have to
>> prepend the http://
>>
>> Jim
>>
>> On Fri, Jun 5, 2015 at 2:31 AM, Curtis DeGasperi
>>  wrote:
>>> I'm working on a script that downloads data from the USGS NWIS server.
>>> dataRetrieval makes it easy to quickly get the data in a neat tabular
>>> format, but I was also interested in getting the tabular text files -
>>> also fairly easy for me using download.file.
>>>
>>> However, I'm not skilled enough to work out how to download the nice
>>> graphic files that can be produced dynamically from the USGS NWIS
>>> server (for example:
>>>
>>> http://nwis.waterdata.usgs.gov/nwis/peak?site_no=12144500&agency_cd=USGS&format=img
>>> )
>>>
>>> My question is how do I get the image from this web page and save it
>>> to a local directory? scrapeR returns the information from the page
>>> and I suspect this is a possible solution path, but I don't know what
>>> the next step is.
>>>
>>> My code provided below works from a list I've created of USGS flow
>>> gauging stations.
>>>
>>> Curtis
>>>
>>> ## Code to process USGS daily flow data for high and low flow analysis
>>> ## Need to start with list of gauge ids to process
>>> ## Can't figure out how to automate download of images
>>>
>>> require(dataRetrieval)
>>> require(data.table)
>>> require(scrapeR)
>>>
>>> df <- read.csv("usgs_stations.csv", header=TRUE)
>>>
>>> lstas <-length(df$siteno) #length of locator list
>>>
>>> print(paste('Processing...',df$name[1],' ',df$siteno[1], sep 

[R] Cross tabulation with top one variable and side as multiple variables

2015-06-09 Thread jagadishpchary
Hi:

I have a large data set with many variables, and I need to check how they
vary from year to year. To do so, I have to cross tabulate
the year variable across the top (constant) and all the remaining variables down the side
(I have attached the cross tabulation report). I have searched the forums, but the
syntax I could find for cross tabulation covers only 2 or 3 variables. So I
would like to request code that can print the data in the same way as
in the attachment.



--
View this message in context: 
http://r.789695.n4.nabble.com/Cross-tabulation-with-top-one-variable-and-side-as-multiple-variables-tp4708379.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subsetting a dataframe

2015-06-09 Thread Ryan Derickson
There are lots of ways to do this; I use %in% with bracket notation [row, column].
The empty column argument below returns all columns, but you could put
conditional logic there as well.

df[df$rows %in% test_rows, ]
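
A minimal sketch of the %in% idiom, assuming the example data frame from the question is built with base R's data.frame(). The object name matched is introduced here only for illustration.

```r
# Example data from the question, built with base R's data.frame().
df <- data.frame(
  dd      = c(1, 2, 3),
  rows    = c("A1", "A2", "A3"),
  columns = c("B1", "B2", "B3"),
  numbers = c(400, 500, 600),
  stringsAsFactors = FALSE
)
test_rows <- c("A1", "A3")

# Keep only the rows whose 'rows' value appears in test_rows;
# the empty column slot keeps every column.
matched <- df[df$rows %in% test_rows, ]
print(matched)
```

Negating the condition, df[!(df$rows %in% test_rows), ], would return the complementary set of rows.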



On Mon, Jun 8, 2015 at 6:44 PM, Bogdan Tanasa  wrote:

> Dear all,
>
> would appreciate your suggestions on subsetting a dataframe : please let's
> consider an example dataframe df:
>
> dd<-c(1,2,3)
> rows<-c("A1","A2","A3")
> columns<-c("B1","B2","B3")
> numbers <- c(400, 500, 600)
> df <- data.frame(dd,rows,columns, numbers)
>
> and a vector : test_rows <-c("A1","A3") ;
>
> how could I subset the dataframe df function of vector test_rows, in such a
> way that only the lines of dataframe df (df$rows) that match the elements
> of test_rows ("A1" and "A3") are listed ?
>
> thank you very much,
>
> -- bogdan
>


Re: [R] how to reach a txt file like this?

2015-06-09 Thread Boris Steipe
This is (almost) JSON data (but see NOTE below); there are several packages 
that deal with JSON, jsonlite for example.

R > data <- '[{"ID":"A", "Name":"Tom", "Age":"18"},{"ID":"B", "Name":"Jim", "Age":"19"}]'

R > install.packages("jsonlite")
R > library(jsonlite)

R > myDf <- fromJSON(data, simplifyDataFrame=TRUE)
R > str(myDf)
'data.frame':   2 obs. of  3 variables:
 $ ID  : chr  "A" "B"
 $ Name: chr  "Tom" "Jim"
 $ Age : chr  "18" "19"



NOTE: some of the quotation marks in your example are messed up (curly quotes 
mixed with straight ones), and some of your commas and colons seem to come from 
an Asian font, i.e. they are non-ASCII Unicode characters rather than ASCII. You 
will need to replace all the non-ASCII characters that are syntactically 
important; otherwise parsing breaks.
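
One possible way to do that clean-up in base R is sketched below. The input string is hypothetical: it assumes the offending characters are curly double quotes and a full-width comma and colon, which may not match the real file exactly.

```r
# Hypothetical input: curly quotes plus a full-width colon and comma,
# written with \u escapes so the script itself stays ASCII.
raw <- "[{\u201cID\u201d\uff1a\u201cA\u201d\uff0c\u201cName\u201d:\"Tom\"}]"

# Offending characters and their ASCII counterparts, position by position.
bad  <- c("\u201c", "\u201d", "\uff0c", "\uff1a")
good <- c("\"",     "\"",     ",",      ":")

clean <- raw
for (i in seq_along(bad)) {
  # fixed = TRUE treats each pattern as a literal string, not a regex.
  clean <- gsub(bad[i], good[i], clean, fixed = TRUE)
}
print(clean)
```

The cleaned string can then be handed to fromJSON() as shown above. If other non-ASCII punctuation turns up, it would need its own entry in the two vectors.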

Cheers,
Boris


On Jun 9, 2015, at 12:24 PM, Ye Lin  wrote:

> Hey All, I have a txt data file that looks like this:
> 
> [{“ID”:“A”,“Name":"Tom", "Age":"18"},{“ID”:“B”,“Name":"Jim", "Age":"19"}]
> 
> 
> How can I read this into R as a data frame? I have used readLines to read
> all the lines but don't know how to deal with column names and inputs.
> 
> Thanks for your help!
> 

[R] Warning message when using lmer function

2015-06-09 Thread li li
I got the following warning message when using the lmer function.
Does anyone know what it implies? Thanks!

Warning message:
In anova(model, ddf = "lme4") : bytecode version mismatch; using eval



Re: [R] Cross tabulation with top one variable and side as multiple variables

2015-06-09 Thread David Winsemius

On Jun 9, 2015, at 1:40 AM, jagadishpchary wrote:

> Hi:
> 
> I have a huge data with lot of variables and I need to check the trend
> variations from year to year. In order to do so, I have to cross tabulate
> the year variable as top (constant) and all the remaining variables as side
> (attached the cross tabulation report). I have searched the forums but the
> syntax I could find for cross tabulation is between 2 or 3 variables. So i
> would request to provide a code which can print the data in the same way as
> in the attached.   

I think you will find that people on this list expect you to provide data in 
the form of text rather than pictures. When I looked at the request there were 
two routes I considered: 1) combine margin.table with ftable, or 2) investigate 
one (but not both) of the plyr or dplyr packages.
> 
> 
> View this message in context: 
> http://r.789695.n4.nabble.com/Cross-tabulation-with-top-one-variable-and-side-as-multiple-variables-tp4708379.html
> Sent from the R help mailing list archive at Nabble.com.

Nabble is neither the R help mailing list nor its archive.

Nabble also removes this message from replies. You should read the material 
about the list.

> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



[R] graphs, need urgent help (deadline :( )

2015-06-09 Thread Rosa Oliveira
Hi,

Another naive question (I’m pretty sure :( )


I’m trying to plot a multiple line graph:

region  sample  factora  factorb  factorc
0.1     10      0.895    0.903    0.378
0.2     10      0.811    0.865    0.688
0.1     20      0.735    0.966    0.611
0.2     20      0.777    0.732    0.653
0.1     30      0.600    0.778    0.694
0.2     30      0.466    174.592  0.461
0.1     40      0.446    0.432    0.693
0.2     40      0.392    0.294    0.686


The first column should be the independent variable; the second should give 
a bold line for sample 10 and a dashed line for sample 20.
The other variables are outcomes for each of those scenarios, so 
the 3rd, 4th and 5th columns should be blue, red and green 
respectively. 


In summary :)

I should have a graph with the region on the x-axis and the factor on the 
y-axis.
Lines:
1 - blue and bold for region 0.1, sample 10 and factor a
2 - blue and dashed for region 0.2, sample 10 and factor a
3 - red and bold for region 0.1, sample 10 and factor b
4 - red and dashed for region 0.2, sample 10 and factor b
5 - green and bold for region 0.1, sample 10 and factor c
6 - green and dashed for region 0.2, sample 10 and factor c

Even though the independent variable is nominal, I need to plot a line graph.

Can anyone help me please?
I have my data as a CSV file, so I first read that file (that I know how to do 
:)).

But I have it in that format.

Best,
RO



Kind regards,
Rosa Oliveira

-- 



Rosa Celeste dos Santos Oliveira, 

E-mail: rosit...@gmail.com
Tlm: +351 939355143 
Linkedin: https://pt.linkedin.com/in/rosacsoliveira

"Many admire, few know"
Hippocrates



Re: [R] Cross tabulation with top one variable and side as multiple variables

2015-06-09 Thread Jeff Newmiller
There are two issues here... calculation and presentation. The table function 
from base R can work with many variables. If your data set is so large that you 
have problems with memory then you could investigate data.table or sqldf 
packages, which perform the computations but do not present the data in cross 
tabulation form. You could use table or perhaps the tables package to render 
the data into the desired form.
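
A rough base R sketch of that idea: build one two-way table per "side" variable with table(), keep the year across the top, and stack the pieces. The data frame survey and its columns are made up for illustration; the real data and the attached report layout were not posted as text.

```r
# Hypothetical survey data; only the layout matters here.
survey <- data.frame(
  year   = c(2013, 2013, 2014, 2014, 2015, 2015),
  gender = c("F", "M", "F", "F", "M", "M"),
  region = c("N", "S", "N", "N", "S", "N"),
  stringsAsFactors = FALSE
)

# Every column except the year goes down the side.
side_vars <- setdiff(names(survey), "year")

# One two-way table per side variable, with year across the top.
blocks <- lapply(side_vars, function(v) {
  tab <- table(survey[[v]], survey$year)
  matrix(tab, nrow = nrow(tab),
         dimnames = list(paste(v, rownames(tab), sep = ": "), colnames(tab)))
})

# Stack the per-variable blocks into a single report.
crosstab <- do.call(rbind, blocks)
print(crosstab)
```

Because every block is tabulated against the same year vector, the columns line up without extra work. Percentages could be derived from the counts with prop.table() if the report needs them.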
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On June 9, 2015 1:40:53 AM PDT, jagadishpchary  
wrote:
>Hi:
>
>I have a huge data with lot of variables and I need to check the trend
>variations from year to year. In order to do so, I have to cross
>tabulate
>the year variable as top (constant) and all the remaining variables as
>side
>(attached the cross tabulation report). I have searched the forums but
>the
>syntax I could find for cross tabulation is between 2 or 3 variables.
>So i
>would request to provide a code which can print the data in the same
>way as
>in the attached. 
> 
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Cross-tabulation-with-top-one-variable-and-side-as-multiple-variables-tp4708379.html
>Sent from the R help mailing list archive at Nabble.com.
>


[R] Different random intercepts but same random slope for groups

2015-06-09 Thread li li
Hi all,
  I'd like to fit a random intercept and random slope model. In my
data, there are three groups. I want to have different random
intercept for each group but the same random slope effect for all
three groups. I used the following R command.
However, there seems to be some problem. Any suggestions?



mod2 <- lmer(result  ~ group*time+(0+group1+ group2 +
group3+time|lot), na.action=na.omit, data=alldata)

> summary(mod2)
Model is not identifiable...
summary from lme4 is returned
some computational error has occurred in lmerTest
Linear mixed model fit by REML ['merModLmerTest']
Formula: result ~ group * time + (0 + group1 + group2 + group3 + time |
lot)
   Data: alldata

REML criterion at convergence: 807.9

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.0112 -0.3364  0.0425  0.2903  3.2017

Random effects:
 Groups   Name   Variance Std.Dev. Corr
 lot      group1  0.0     0.000
          group2 86.20156 9.284      NaN
          group3 55.91479 7.478      NaN  0.06
          time    0.02855 0.169      NaN -0.99  0.10
 Residual        39.91968 6.318
Number of obs: 119, groups:  lot, 15

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         100.1566     2.5108   39.89
group  group2        -2.9707     3.7490   -0.79
group  group3        -0.0717     2.8144   -0.03
time                 -0.1346     0.1780   -0.76
group  group2 :time   0.1450     0.2939    0.49
group  group3:time    0.1663     0.2152    0.77

Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.147314 (tol = 0.002, component 2)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate  Hessian with 2 negative eigenvalues



[R] more complex by with data.table???

2015-06-09 Thread Ramiro Barrantes
Hello,

I am trying to do something that I am able to do with the "by" function within 
data.frame but can't figure out how to achieve with data.table.

Consider

library(data.table)
dt<-data.table(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
myFunction <- function(x) { mean(x) }

I am aware that I can do something like:

dt[, .(meanVar1=myFunction(var1)) ,by=.(name)]

but how could I do the equivalent of:

df<-data.frame(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
myFunction <- function(x) { mean(x) }

columnNames <- c("var1","var2","var3")
result <- by(df, df$name, function(x) {
  output <- c()
  for (col in columnNames) {
    output[col] <- myFunction(x[, col])
  }
  output
})
do.call(rbind, result)

Thanks in advance,
Ramiro



Re: [R] Different random intercepts but same random slope for groups

2015-06-09 Thread Thierry Onkelinx
Your model is too complex for the data. This gives you two options: a)
simplify the model and b) get more data.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-06-09 21:57 GMT+02:00 li li :

> Hi all,
>   I'd like to fit a random intercept and random slope model. In my
> data, there are three groups. I want to have different random
> intercept for each group but the same random slope effect for all
> three groups. I used the following R command.
> However, there seems to be some problem. Any suggestions?
>
>
>
> mod2 <- lmer(result  ~ group*time+(0+group1+ group2 +
> group3+time|lot), na.action=na.omit, data=alldata)
>
> > summary(mod2)
> Model is not identifiable...
> summary from lme4 is returned
> some computational error has occurred in lmerTest
> Linear mixed model fit by REML ['merModLmerTest']
> Formula: result ~ group * time + (0 + group1 + group2 + group3 + time |
> lot)
>Data: alldata
>
> REML criterion at convergence: 807.9
>
> Scaled residuals:
> Min  1Q  Median  3Q Max
> -3.0112 -0.3364  0.0425  0.2903  3.2017
>
> Random effects:
>  Groups   Name Variance Std.Dev. Corr
>  lot  group1   0.0 0.000
>   group2   86.20156 9.284  NaN
>   group3 55.91479 7.478  NaN  0.06
>   time  0.02855 0.169  NaN -0.99  0.10
>  Residual  39.91968 6.318
> Number of obs: 119, groups:  lot, 15
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 100.1566 2.5108   39.89
> group  group2-2.9707 3.7490   -0.79
> group  group3   -0.0717 2.8144   -0.03
> time -0.1346 0.1780   -0.76
> group  group2 :time   0.1450 0.29390.49
> group  group3:time0.1663 0.21520.77
>
> Warning messages:
> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.147314 (tol = 0.002,
> component 2)
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge: degenerate  Hessian with 2 negative eigenvalues
>


Re: [R] Cross tabulation with top one variable and side as multiple variables

2015-06-09 Thread John Kane
We probably need a better idea of what the raw data look like and 
perhaps a somewhat better idea of what the analysis is meant to show.  Have a look at 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
 and http://adv-r.had.co.nz/Reproducibility.html for some suggestions. In 
particular, see the discussion about dput() for the best way to provide sample 
data to the help list.


John Kane
Kingston ON Canada


> -Original Message-
> From: p.jagad...@inrhythm-inc.com
> Sent: Tue, 9 Jun 2015 01:40:53 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] Cross tabulation with top one variable and side as multiple
> variables
> 
> Hi:
> 
> I have a huge data with lot of variables and I need to check the trend
> variations from year to year. In order to do so, I have to cross tabulate
> the year variable as top (constant) and all the remaining variables as
> side
> (attached the cross tabulation report). I have searched the forums but
> the
> syntax I could find for cross tabulation is between 2 or 3 variables. So
> i
> would request to provide a code which can print the data in the same way
> as
> in the attached.
> 
> 
> 
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Cross-tabulation-with-top-one-variable-and-side-as-multiple-variables-tp4708379.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] Different random intercepts but same random slope for groups

2015-06-09 Thread Bert Gunter
Thierry:

I don't think so. It looks to me like her syntax/understanding is confused.
I think the call should be:

mod2 <- lmer(result  ~ group*time+(group + time|lot), na.action=na.omit,
data=alldata)

Her request for "the same random slope for each group" -- I assume it's for
time -- means to me that the time slope will vary "randomly" by lot only;
the slope would be the same for all groups within the lot.

Of course, I may be wrong also. If so, I suggest that she follow the
posting guide and post at least head(alldata) using dput() to enable folks
to understand the structure of her data. And only on r-sig-mixed-models --
crossposting is frowned upon here and the mixed models list is the best bet
for this sort of question anyway.

As always, corrections and criticism welcome.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom."
   -- Clifford Stoll

On Tue, Jun 9, 2015 at 1:49 PM, Thierry Onkelinx 
wrote:

> Your model is too complex for the data. This gives you two options: a)
> simplify the model and b) get more data.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2015-06-09 21:57 GMT+02:00 li li :
>
> > Hi all,
> >   I'd like to fit a random intercept and random slope model. In my
> > data, there are three groups. I want to have different random
> > intercept for each group but the same random slope effect for all
> > three groups. I used the following R command.
> > However, there seems to be some problem. Any suggestions?
> >
> >
> >
> > mod2 <- lmer(result  ~ group*time+(0+group1+ group2 +
> > group3+time|lot), na.action=na.omit, data=alldata)
> >
> > > summary(mod2)
> > Model is not identifiable...
> > summary from lme4 is returned
> > some computational error has occurred in lmerTest
> > Linear mixed model fit by REML ['merModLmerTest']
> > Formula: result ~ group * time + (0 + group1 + group2 + group3 + time |
> > lot)
> >Data: alldata
> >
> > REML criterion at convergence: 807.9
> >
> > Scaled residuals:
> > Min  1Q  Median  3Q Max
> > -3.0112 -0.3364  0.0425  0.2903  3.2017
> >
> > Random effects:
> >  Groups   Name Variance Std.Dev. Corr
> >  lot  group1   0.0 0.000
> >   group2   86.20156 9.284  NaN
> >   group3 55.91479 7.478  NaN  0.06
> >   time  0.02855 0.169  NaN -0.99  0.10
> >  Residual  39.91968 6.318
> > Number of obs: 119, groups:  lot, 15
> >
> > Fixed effects:
> > Estimate Std. Error t value
> > (Intercept) 100.1566 2.5108   39.89
> > group  group2-2.9707 3.7490   -0.79
> > group  group3   -0.0717 2.8144   -0.03
> > time -0.1346 0.1780   -0.76
> > group  group2 :time   0.1450 0.29390.49
> > group  group3:time0.1663 0.21520.77
> >
> > Warning messages:
> > 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
> :
> >   Model failed to converge with max|grad| = 0.147314 (tol = 0.002,
> > component 2)
> > 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,
> :
> >   Model failed to converge: degenerate  Hessian with 2 negative
> eigenvalues
> >


Re: [R] A-priori contrasts with type III sums of squares in R

2015-06-09 Thread John Fox
Dear Rachel,

How about this (using the data and model you sent originally)?

> linearHypothesis(EpiLM, "GzrTreatpresence = 0")
Linear hypothesis test

Hypothesis:
GzrTreatpresence = 0

Model 1: restricted model
Model 2: log_EpiChla ~ TempTreat * GzrTreat * ShadeTreat

  Res.Df     RSS Df  Sum of Sq      F Pr(>F)
1     25 0.12665
2     24 0.12623  1 0.00042195 0.0802 0.7794

> linearHypothesis(EpiLM, "GzrTreatimmigration = 0")
Linear hypothesis test

Hypothesis:
GzrTreatimmigration = 0

Model 1: restricted model
Model 2: log_EpiChla ~ TempTreat * GzrTreat * ShadeTreat

  Res.Df     RSS Df  Sum of Sq     F Pr(>F)
1     25 0.12623
2     24 0.12623  1 5.0931e-06 0.001 0.9754

Note that this tests main-effect contrasts in a model that includes
interactions to which the main effect is marginal. You should probably think
about whether you really want to do that.
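
To illustrate what such a single-df test amounts to, here is a base R sketch on simulated data (not the data from this thread, and without the interactions). It assumes one three-level factor g with the same two named contrasts, and checks that the F for "gpresence = 0" from a restricted-versus-full comparison equals the squared t statistic of that coefficient. All object names are made up for the example.

```r
set.seed(42)
# Simulated stand-in for the real data: one three-level factor and a response.
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 10)),
  y = rnorm(30)
)

# A-priori contrasts, named as in the thread.
contrasts(d$g) <- cbind(presence = c(1, -2, 1), immigration = c(1, 0, -1))

full <- lm(y ~ g, data = d)

# Squared t statistic of the "presence" coefficient.
t_presence <- summary(full)$coefficients["gpresence", "t value"]
f_from_t <- t_presence^2

# The same hypothesis as a comparison against the model without that column.
X <- model.matrix(full)
X0 <- X[, colnames(X) != "gpresence", drop = FALSE]
restricted <- lm(y ~ X0 - 1, data = d)
f_from_anova <- anova(restricted, full)$F[2]

print(c(f_from_t, f_from_anova))
```

The two numbers agree because a one-coefficient hypothesis in a linear model gives F = t^2. The coefficient name is the factor name followed by the contrast column name, which is why the hypotheses above are written as "GzrTreatpresence" and "GzrTreatimmigration".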

BTW, the slides to which you refer are for *multivariate* linear models
(including repeated measures); you're using a univariate linear model.

Best,
 John

> -Original Message-
> From: Rachael Blake [mailto:bl...@nceas.ucsb.edu]
> Sent: June-09-15 5:14 PM
> To: John Fox; r-help@r-project.org
> Subject: Re: [R] A-priori contrasts with type III sums of squares in R
> 
> Thank you for replying, John!
> 
> I am not using treatment contrasts in this analysis.  I am specifying
>options(contrasts=c("contr.sum", "contr.poly"))
> earlier in my code in order to get interpretable results from the Type
> III SS.  However, I did not include that code in the example because it
> is not related to my initial question, and those contrasts are not of
> interest to me.  My interest is in my a-priori specified contrasts:
>   contrasts(All09$GzrTreat) <- cbind('presence'=c(1,-2,1),
> 'immigration'=c(1,0,-1))
> 
> I have made a valiant attempt to use linearHypothesis(), based on the
> example provided here
> https://web.warwick.ac.uk/statsdept/user2011/TalkSlides/Contributed/17Aug_1705_FocusV_4-Multivariate_1-Fox.pdf
> as well as other places.   I have tried two different ways of specifying
> my contrast matrix, but I keep getting error messages that I can not
> resolve.   My code based on that powerpoint presentation is as follows
> (still using the data included in my initial question):
> 
>  options(contrasts=c("contr.sum", "contr.poly"))
>  EpiLM <- lm(log_EpiChla~TempTreat*GzrTreat*ShadeTreat, All09)
>  Anova(EpiLM, type="III")
>  class(EpiLM)
>  contrasts(All09$GzrTreat) <- cbind('presence'=c(1,-2,1), 'immigration'=c(1,0,-1))
>  con <- contrasts(All09$GzrTreat) ; con
>  EpiLM2 <- update(EpiLM)
>  rownames(coef(EpiLM2))
>  linearHypothesis(model=EpiLM2, hypothesis.matrix=c("presence","immigration"), verbose=T)  # first attempt to implement
>  linearHypothesis(model=EpiLM2, hypothesis.matrix=con, verbose=T)  # second attempt to implement
> 
> 
> Thanks again for your reply.
> 
> -Rachael
> 
> 
> On 6/6/2015 12:35 PM, John Fox wrote:
> > Dear Rachel,
> >
> > Anova() won't give you a breakdown of the SS for each term into 1 df
> > components (there is no split argument, as you can see if you look at
> > ?Anova). Because, with the exception of GzrTreat, your contrasts are
> not
> > orthogonal in the row basis of the design (apparently you're using the
> > default "contr.treatment" coding), you also won't get sensible type-
> III
> > tests from Anova(). If you formulated the contrasts for the other
> factors
> > properly (using, e.g., contr.sum), you could get single df tests from
> > linearHypothesis() in the car package.
> >
> > I hope this helps,
> >   John
> >
> > ---
> > John Fox, Professor
> > McMaster University
> > Hamilton, Ontario, Canada
> > http://socserv.socsci.mcmaster.ca/jfox/
> >
> >
> >
> >
> >> -Original Message-
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> Rachael
> >> Blake
> >> Sent: June-05-15 6:32 PM
> >> To: r-help@r-project.org
> >> Subject: [R] A-priori contrasts with type III sums of squares in R
> >>
> >> I am analyzing data using a factorial three-way ANOVA with a-priori
> >> contrasts and type III sums of squares. (Please don't comment about
> type
> >> I SS vs. type III SS. That's not the point of my question.  I have
> read
> >> at length about the choice between types of SS and have made my
> >> decision.) I get the contrasts like I need using summary.aov(),
> however
> >> that uses type I SS. When I use the Anova() function from
> library(car)
> >> to get type III SS, I don't get the contrasts. I have also tried
> using
> >> drop1() with the lm() model, but I get the same results as Anova()
> >> (without the contrasts).
> >>
> >> Please advise on a statistical method in R to analyze data using
> >> factorial ANOVA with a-priori contrasts and type 

Re: [R] Different random intercepts but same random slope for groups

2015-06-09 Thread Ben Bolker
li li  gmail.com> writes:

> 

[snip]

>   I'd like to fit a random intercept and random slope model. In my
> data, there are three groups. I want to have different random
> intercept for each group but the same random slope effect for all
> three groups. I used the following R command.
> However, there seems to be some problem. Any suggestions?

  Please do not cross-post to more than one R list (in this
case, r-sig-mixed-models is more appropriate, and you've already
gotten some answers there).
 
  Ben Bolker



Re: [R] more complex by with data.table???

2015-06-09 Thread jim holtman
try this:

> dt[
+ , {
+ result <- list()
+ for (i in names(.SD)){
+ result[[i]] <- myFunction(unlist(.SD[, i, with = FALSE]))
+ }
+ result
+   }
+ , by = name
+ ]
   name var1 var2 var3
1:    a  2.0   22   42
2:    b  7.5   28   48
>



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Tue, Jun 9, 2015 at 4:22 PM, Ramiro Barrantes <
ram...@precisionbioassay.com> wrote:

> Hello,
>
> I am trying to do something that I am able to do with the "by" function
> within data.frame but can't figure out how to achieve with data.table.
>
> Consider
>
>
> dt<-data.table(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
> myFunction <- function(x) { mean(x) }
>
> I am aware that I can do something like:
>
> dt[, .(meanVar1=myFunction(var1)) ,by=.(name)]
>
> but how could I do the equivalent of:
>
>
> df<-data.frame(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
> myFunction <- function(x) { mean(x) }
>
> columnNames <- c("var1","var2","var3")
> result <- by(df, df$name, function(x) {
>   output <- c()
>   for(col in columnNames) {
>     output[col] <- myFunction(x[,col])
>   }
>   output
> })
> do.call(rbind,result)
>
> Thanks in advance,
> Ramiro
>


Re: [R] more complex by with data.table???

2015-06-09 Thread Ista Zahn
Hi Ramiro,

There is a demonstration of this on the data.table wiki at
https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html.
You can do

dt[, lapply(.SD, mean), by=name]

or

dt[, as.list(colMeans(.SD)), by=name]

BTW, there are pretty straightforward ways to do this in base R as well, e.g.,

data.frame(t(sapply(split(df[-1], df$name), colMeans)))
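
For a general myFunction (not only mean), the same pattern can be sketched as follows. This is a minimal sketch, assuming myFunction returns a single value per column; base_result and dt_result are illustrative names, and the data.table part only runs if that package is installed.

```r
# Sample data from the original question
df <- data.frame(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)
myFunction <- function(x) mean(x)   # any function returning one value per column
columnNames <- c("var1", "var2", "var3")

# Base R: split the chosen columns by group, then apply myFunction to each column
base_result <- t(sapply(split(df[columnNames], df$name),
                        function(g) sapply(g, myFunction)))
print(base_result)

# data.table: same idea, restricted to the chosen columns via .SDcols
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df)
  dt_result <- dt[, lapply(.SD, myFunction), by = name, .SDcols = columnNames]
  print(dt_result)
}
```

Using .SDcols keeps the grouping column and any unrelated columns out of .SD, which matters once the table has non-numeric columns.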

Best,
Ista




[R] %OSn in time formats: is it only valid for formatting, but invalid for parsing?

2015-06-09 Thread Brent via R-help
Consider this R code:
time = as.POSIXct(1433867059, origin = "1970-01-01")
print(time)
print( as.numeric(time) )

timeFormat = "%Y-%m-%d %H:%M:%OS3"
tz = "EST"

timestamp = format(time, format = timeFormat, tz = tz)
print(timestamp)

timeParsed = as.POSIXct(timestamp, format = timeFormat, tz = tz)
print(timeParsed)
print( as.numeric(timeParsed) )


If I paste that into Rgui on my Windows box, which is running the latest 
(3.2.0) stable release, I get this:
> time = as.POSIXct(1433867059, origin = "1970-01-01")
> print(time)
[1] "2015-06-09 12:24:19 EDT"
> print( as.numeric(time) )
[1] 1433867059
> 
> timeFormat = "%Y-%m-%d %H:%M:%OS3"
> tz = "EST"
> 
> timestamp = format(time, format = timeFormat, tz = tz)
> print(timestamp)
[1] "2015-06-09 11:24:19.000"
> 
> timeParsed = as.POSIXct(timestamp, format = timeFormat, tz = tz)
> print(timeParsed)
[1] NA
> print( as.numeric(timeParsed) )
[1] NA

Notice how the time format, which ends with %OS3, produces the correct time 
stamp (a 3 digit millisecond resolution).

However, that same time format FAILS IN THE OPPOSITE DIRECTION: it cannot parse 
that time stamp back into the original POSIXct value; it barfs and parses NA.

Anyone know what is going on?

A web search found this link 
https://stackoverflow.com/questions/19062178/how-to-convert-specific-time-format-to-timestamp-in-r
where one of the commenters, Waldir Leoncio, in the first answer, appears to 
describe the same parsing bug with %OS3 that I do:
"use, for example, strptime(y, "%d.%m.%Y %H:%M:%OS3"), but it doesn't work for 
me. Henrik noted that the function's help page, ?strptime states that the %OS3 
bit is OS-dependent. I'm using an updated Ubuntu 13.04 and using %OS3 yields 
NA."

The help page mentioned in the quote above likely is
https://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html
which is unfortunately terse, merely saying
"Specific to R is %OSn, which for output gives the seconds truncated to 0 <= n 
<= 6 decimal places (and if %OS is not followed by a digit, it uses the setting 
of getOption("digits.secs"), or if that is unset, n = 3). Further, for strptime 
%OS will input seconds including fractional seconds. Note that %S ignores (and 
not rounds) fractional parts on output. "

That final sentence about strptime (i.e. parsing) is subtle: it says "for 
strptime %OS".  Note the absence of an 'n': it says %OS instead of %OSn.

Does that mean that %OSn can NOT be used for parsing, only for formatting?

That is what I have empirically found, but is it expected behavior or a bug?

Very annoying if expected behavior, since that means that I need different time 
formats for formatting and parsing.  Have never seen that before in any other 
language's date API.
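
The help text quoted above suggests a possible workaround, sketched here on the assumption that plain %OS (no digit) accepts fractional seconds on input on your platform: keep %OS3 for formatting and use %OS for parsing. The names fmt_out and fmt_in are illustrative, and UTC is used only to keep the example self-contained.

```r
fmt_out <- "%Y-%m-%d %H:%M:%OS3"  # formatting: seconds with 3 decimal places
fmt_in  <- "%Y-%m-%d %H:%M:%OS"   # parsing: %OS without a digit

# A time with a fractional part, so the round trip is actually exercised
time <- as.POSIXct(1433867059.25, origin = "1970-01-01", tz = "UTC")

stamp  <- format(time, format = fmt_out, tz = "UTC")
parsed <- as.POSIXct(stamp, format = fmt_in, tz = "UTC")

print(stamp)
print(as.numeric(parsed) - as.numeric(time))  # should be within a millisecond
```

Since the help page describes this area as OS-dependent, it is worth checking the result on each platform you care about.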



Re: [R] Summarizing data based on Date

2015-06-09 Thread Shivi82
HI All,

I am able to get the desired result. Thanks for extending help.
While reading the csv file I made some changes:

test <- read.csv("Testdata.csv", header=TRUE, stringsAsFactors = FALSE,
strip.white = TRUE)
With this, character variables were not changed to factors.

Then aggregation was simple:
aggregate(test$CHG_WT, list(test$CR_DT), sum)

However, the output is not sorted by date, and the column names
appear very different from those in the data:

    Group.1      x
1  1-Mar-15 909791
2 10-Mar-15 822436
3 11-Mar-15 848609
4 12-Mar-15 924842
5 13-Mar-15 895270
6 14-Mar-15  93238
7  2-Mar-15 731600

Can you please suggest why the column names are so different and how I
could sort based on dates? I added the sort option to the above syntax:
aggregate(test$CHG_WT, list(test$CR_DT), sum, sort(test$CR_DT, decreasing =
TRUE))

But it gave me an error: 
Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Thanks All. 
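
Some background on the three symptoms, followed by a minimal sketch. Group.1 and x are the default names aggregate() uses when the grouping list is unnamed and the data is a bare vector. The dates sort as text because CR_DT is still a character column. The extra fourth argument is passed on to FUN, so sum() receives a character vector, hence the error. The rows below are made up for illustration; only the column names CR_DT and CHG_WT come from the post, and the "%d-%b-%y" format assumes an English locale for the month abbreviations.

```r
# Made-up rows in the same shape as the posted data
test <- data.frame(
  CR_DT  = c("1-Mar-15", "10-Mar-15", "2-Mar-15", "1-Mar-15", "2-Mar-15"),
  CHG_WT = c(100, 200, 300, 50, 25),
  stringsAsFactors = FALSE
)

# Convert the text dates to Date so they sort chronologically
test$CR_DT <- as.Date(test$CR_DT, format = "%d-%b-%y")

# The formula interface keeps the original column names in the result
daily <- aggregate(CHG_WT ~ CR_DT, data = test, FUN = sum)

# Sort after aggregating, newest date first
daily <- daily[order(daily$CR_DT, decreasing = TRUE), ]
print(daily)
```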






Re: [R] graphs, need urgent help (deadline :( )

2015-06-09 Thread Don McKenzie
The answer lies in learning to use the help (and knowing where to start).  Did 
you look at the tutorial that comes with the R installation?

?plot
?lines

?par   

In the last, look for the descriptions of “col” and “lty”.

Using plot() and lines(), and subsetting the four unique values of “sample”, 
you can create your lines.

Here is a crude start, assuming your columns are part of a data frame called 
“my.data”.   Untested...

plot(my.data$region[my.data$sample==10],my.data$factora[my.data$sample==10],col=4)
 # blue line, not dashed
.
.
.
lines(my.data$region[my.data$sample==20],my.data$factorb[my.data$sample==20],col=2,lty=2)
   # red dashed line
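
A somewhat fuller sketch of the same idea, under several assumptions: the data frame is rebuilt by hand from the values quoted below, only samples 10 and 20 are drawn, and the y-axis is limited to 0..1 so that the 174.592 value (sample 30) does not flatten everything else. Note that plot() draws points by default; type = "l" or lines() is needed to get lines.

```r
# Values copied from the table in the original question
my.data <- data.frame(
  region  = rep(c(0.1, 0.2), times = 4),
  sample  = rep(c(10, 20, 30, 40), each = 2),
  factora = c(0.895, 0.811, 0.735, 0.777, 0.600, 0.466, 0.446, 0.392),
  factorb = c(0.903, 0.865, 0.966, 0.732, 0.778, 174.592, 0.432, 0.294),
  factorc = c(0.378, 0.688, 0.611, 0.653, 0.694, 0.461, 0.693, 0.686)
)

s10 <- my.data[my.data$sample == 10, ]
s20 <- my.data[my.data$sample == 20, ]
cols <- c(factora = "blue", factorb = "red", factorc = "green")

# Empty frame first, then one line per factor and sample
plot(range(my.data$region), c(0, 1), type = "n",
     xlab = "region", ylab = "factor value")
for (f in names(cols)) {
  lines(s10$region, s10[[f]], col = cols[[f]], lty = 1, lwd = 2)  # bold, solid
  lines(s20$region, s20[[f]], col = cols[[f]], lty = 2)           # dashed
}
legend("bottomleft", legend = c("sample 10", "sample 20"),
       lty = c(1, 2), lwd = c(2, 1))
```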


> On Jun 9, 2015, at 10:36 AM, Rosa Oliveira  wrote:
> 
> Hi,
> 
> another naive question (i’m pretty sure :( )
> 
> 
> I’m trying to plot a multiple line graph:
> 
> region  sample  factora  factorb  factorc
> 0.1     10      0.895    0.903    0.378
> 0.2     10      0.811    0.865    0.688
> 0.1     20      0.735    0.966    0.611
> 0.2     20      0.777    0.732    0.653
> 0.1     30      0.600    0.778    0.694
> 0.2     30      0.466    174.592  0.461
> 0.1     40      0.446    0.432    0.693
> 0.2     40      0.392    0.294    0.686
> 
> 
> 
> The first column should be the independent variable; the second should 
> give a bold line for sample 10 and a dashed line for sample 20.

What about the other two values of “sample”?  

> The other variables are outcomes for each of the first scenarios, so the 
> 3rd, 4th and 5th columns should be blue, red and green, 
> respectively. 
> 
> 
> Summary :)
> 
> The graph should have the region on the x-axis and the factor on the 
> y-axis.
> Lines:
>   1 - blue and bold for region 0.1, sample 10 and factor a
>   2 - blue and dash for region 0.2, sample 10 and factor a
>   3 - red and bold for region 0.1, sample 10 and factor b
>   4 - red and dash for region 0.2, sample 10 and factor b
>   5 - green and bold for region 0.1, sample 10 and factor c
>   6 - green and dash for region 0.2, sample 10 and factor c

Not consistent with what you said above. These are no longer lines, but points.
> 
> Although the independent variable is nominal, I should plot a line graph.
> 
> Can anyone help me please?
> I have my data as a csv file, so I first read that file (that I know how to 
> do :)).
> 
> But I have it in that format.
> 
> Best,
> RO
> 
> 
> 
> Atenciosamente,
> Rosa Oliveira
> 
> -- 
> 
> 
> 
> Rosa Celeste dos Santos Oliveira, 
> 
> E-mail: rosit...@gmail.com
> Tlm: +351 939355143 
> Linkedin: https://pt.linkedin.com/in/rosacsoliveira
> 
> "Many admire, few know"
> Hippocrates
> 
> 

Re: [R] A-priori contrasts with type III sums of squares in R

2015-06-09 Thread Rachael Blake

Thank you for replying, John!

I am not using treatment contrasts in this analysis.  I am specifying
  options(contrasts=c("contr.sum", "contr.poly"))
earlier in my code in order to get interpretable results from the Type 
III SS.  However, I did not include that code in the example because it 
is not related to my initial question, and those contrasts are not of 
interest to me.  My interest is in my a-priori specified contrasts:
 contrasts(All09$GzrTreat) <- cbind('presence'=c(1,-2,1), 
'immigration'=c(1,0,-1))


I have made a valiant attempt to use linearHypothesis(), based on the 
example provided here

https://web.warwick.ac.uk/statsdept/user2011/TalkSlides/Contributed/17Aug_1705_FocusV_4-Multivariate_1-Fox.pdf
as well as other places.   I have tried two different ways of specifying 
my contrast matrix, but I keep getting error messages that I can not 
resolve.   My code based on that powerpoint presentation is as follows 
(still using the data included in my initial question):


options(contrasts=c("contr.sum", "contr.poly"))
EpiLM <- lm(log_EpiChla~TempTreat*GzrTreat*ShadeTreat, All09)
Anova(EpiLM, type="III")
class(EpiLM)
contrasts(All09$GzrTreat) <- cbind('presence'=c(1,-2,1), 
'immigration'=c(1,0,-1))

con <- contrasts(All09$GzrTreat) ; con
EpiLM2 <- update(EpiLM)
rownames(coef(EpiLM2))
linearHypothesis(model=EpiLM2, 
hypothesis.matrix=c("presence","immigration"), verbose=T)  # first attempt to implement
linearHypothesis(model=EpiLM2, hypothesis.matrix=con, 
verbose=T)  # second attempt to implement
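
For comparison, here is a minimal sketch on made-up data (not the All09 data) of how named contrasts show up as model coefficients. It rests on two assumptions: that coefficient names are the factor name followed by the contrast column name, so in the model above they should look like "GzrTreatpresence" rather than "presence", and that linearHypothesis() accepts a character hypothesis naming such a coefficient. The variables d, Gzr and Temp are illustrative; the car part only runs if the package is installed.

```r
set.seed(1)
# Made-up balanced data: a 3-level factor crossed with a 2-level factor
d <- data.frame(
  Gzr  = factor(rep(c("I", "N", "R"), each = 12)),
  Temp = factor(rep(rep(c("Not Warm", "Warmed"), each = 6), times = 3)),
  y    = rnorm(36)
)

# A-priori contrasts for Gzr, sum-to-zero coding for the other factor
contrasts(d$Gzr)  <- cbind(presence = c(1, -2, 1), immigration = c(1, 0, -1))
contrasts(d$Temp) <- contr.sum(2)

fit <- lm(y ~ Temp * Gzr, data = d)

# coef() returns a named vector, so use names() rather than rownames()
coef_names <- names(coef(fit))
print(coef_names)

# Single-df t tests for the two main-effect contrasts
print(summary(fit)$coefficients[c("Gzrpresence", "Gzrimmigration"), ])

# The same hypotheses through car, if available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::linearHypothesis(fit, "Gzrpresence = 0"))
  print(car::linearHypothesis(fit, "Gzrimmigration = 0"))
}
```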



Thanks again for your reply.

-Rachael


On 6/6/2015 12:35 PM, John Fox wrote:

Dear Rachel,

Anova() won't give you a breakdown of the SS for each term into 1 df
components (there is no split argument, as you can see if you look at
?Anova). Because, with the exception of GzrTreat, your contrasts are not
orthogonal in the row basis of the design (apparently you're using the
default "contr.treatment" coding), you also won't get sensible type-III
tests from Anova(). If you formulated the contrasts for the other factors
properly (using, e.g., contr.sum), you could get single df tests from
linearHypothesis() in the car package.

I hope this helps,
  John

---
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/





-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rachael
Blake
Sent: June-05-15 6:32 PM
To: r-help@r-project.org
Subject: [R] A-priori contrasts with type III sums of squares in R

I am analyzing data using a factorial three-way ANOVA with a-priori
contrasts and type III sums of squares. (Please don't comment about type
I SS vs. type III SS. That's not the point of my question.  I have read
at length about the choice between types of SS and have made my
decision.) I get the contrasts like I need using summary.aov(), however
that uses type I SS. When I use the Anova() function from library(car)
to get type III SS, I don't get the contrasts. I have also tried using
drop1() with the lm() model, but I get the same results as Anova()
(without the contrasts).

Please advise on a statistical method in R to analyze data using
factorial ANOVA with a-priori contrasts and type III SS as shown in my
example below.

Sample data:
  DF <- structure(list(Code = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L,
3L,
  3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L,
9L,
  9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L), .Label = c("A",
  "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"), class =
  "factor"), GzrTreat = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L,
  3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  2L,
2L,
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), contrasts = structure(c(1,
  -2, 1, 1, 0, -1), .Dim = c(3L, 2L), .Dimnames = list(c("I",
  "N", "R"), NULL)), .Label = c("I", "N", "R"), class = "factor"),
  BugTreat = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
  1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label =
  c("Immigration", "Initial", "None"), class = "factor"), TempTreat =
  structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L,
  2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
  1L, 1L, 1L, 1L, 1L), .Label = c("Not Warm", "Warmed"), class =
  "factor"), ShadeTreat = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L,
  2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
  1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L), .Label =
c("Light",
  "Shaded"), class = "factor"), EpiChla = c(0.268482353, 0.423119608,
  0.579507843, 0.738839216, 0.727856863, 0.523960784, 0.405801961,
  0.335964706

Re: [R] graphs, need urgent help (deadline :( )

2015-06-09 Thread Rosa Oliveira
Dear Don and all,

I’ve read the tutorial and tried several pieces of code before posting :)
I’m really naive.

What I am trying to do is something like the graph in the picture I drew.

Is it clearer now?

Atenciosamente,
Rosa Oliveira

-- 



Rosa Celeste dos Santos Oliveira, 

E-mail: rosit...@gmail.com 
Tlm: +351 939355143 
Linkedin: https://pt.linkedin.com/in/rosacsoliveira 


"Many admire, few know"
Hippocrates



Re: [R] graphs, need urgent help [from Rosa Oliveira]

2015-06-09 Thread Don McKenzie
The R function plot() will draw the first line and the two axes.  You need to 
tell it which subsample of your data to plot, as in my example below.
So start with those two observations for which “sample” = 10.  But if you want 
separate lines for each unique value of “sample”, your lines will connect
only two data points, because you have only two instances of each of those 
unique values, unlike the lines in your hand-drawn graph.

Another issue is that you have a huge outlier (value very much larger than the 
others) in the 6th row of “factorb”.  Is this an error?  If not, your other 
lines will be indistinguishable when you try to plot everything.

Part of the reason no one else has responded may be that it appears that you 
are confused about your own data in a way that makes it very difficult for 
us to help you.  Can you get some basic advice from someone local?  I or 
someone else on the list could give you the code line-by-line that we THINK you 
need,
but it could be wrong, given the inconsistencies in what you have shown us, and 
that would make everything worse.

>> plot(my.data$region[my.data$sample==10],my.data$factora[my.data$sample==10],col=4)
>>  # blue line, not dashed

Did you try plotting just this line?  What happened?



Re: [R] graphs, need urgent help [from Rosa Oliveira]

2015-06-09 Thread Rosa Oliveira
Dear Don,

I have done the plot and the lines, and it’s fine.
I’ll have 10 values of sample. The data are still being generated (by simulation), which is 
why there is that huge outlier and why the other points are missing.

The graph I drew is just an example to illustrate what I have to get, 
but of course with 10 points in sample and all the other specifics.

Best,
RO

Atenciosamente,
Rosa Oliveira

-- 



Rosa Celeste dos Santos Oliveira, 

E-mail: rosit...@gmail.com
Tlm: +351 939355143 
Linkedin: https://pt.linkedin.com/in/rosacsoliveira

"Many admire, few know"
Hippocrates


[R] How to validate the cluster analysis?

2015-06-09 Thread My List
All,

I am new to the world of statistics. I am interested in finding out which
validation techniques are used for a cluster analysis. Any point of
reference or site would be helpful. I have read about the clValid package
and about the cluster.stats() function in the fpc package.
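
As one concrete example of an internal validation measure, here is a minimal sketch of the average silhouette width for several candidate numbers of clusters. The choices here are assumptions made for illustration: k-means as the clustering method, the built-in iris measurements as data, and the silhouette() function from the recommended cluster package. As far as I know, cluster.stats() in fpc reports a comparable average silhouette width among its other statistics.

```r
library(cluster)  # provides silhouette()

x <- scale(iris[, 1:4])  # standardise the four numeric columns
d <- dist(x)             # Euclidean distances between observations

set.seed(42)
avg_sil <- sapply(2:6, function(k) {
  km <- kmeans(x, centers = k, nstart = 20)
  # Mean silhouette width over all observations for this k
  mean(silhouette(km$cluster, d)[, "sil_width"])
})
names(avg_sil) <- 2:6

print(round(avg_sil, 3))
```

A higher average silhouette width suggests better separated clusters, but it is only one criterion and does not replace checking whether the clusters make sense for the problem.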

Thanks in Advance,
Harmeet

PS: I have marked this mail to both help and devel list. Is it ok?
