Re: [R] Rendo and dataMultilevelIV

2018-08-02 Thread Daniel Nordlund

On 8/2/2018 7:11 PM, cjg15 wrote:

Hi - Does anyone know what the variables CID and SID are in the
dataMultilevelIV dataset?

The example from page 18-19 of
https://cran.r-project.org/web/packages/REndo/REndo.pdf has

formula1 <- y ~ X11 + X12 + X13 + X14 + X15 + X21 + X22 + X23 + X24 + X31 +
X32 + X33 + (1 + X11 | CID) + (1|SID)

what exactly are the (1 + X11|CID) and (1|SID) terms?

does (1|SID) mean random intercepts for SID, and SID is student ID?

Thanks in advance, Chris


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Did you read pages 9-10 of the document you provided a link to above 
(which describes the dataMultilevelIV dataset)?
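For readers puzzled by the syntax itself: these are lme4-style random-effects terms, so (1 | SID) is a random intercept for each level of SID and (1 + X11 | CID) is a correlated random intercept and slope for X11 grouped by CID. A minimal sketch of the same notation in lme4 (using its bundled sleepstudy data; the exact mapping to REndo's formula interface is an assumption worth checking against the package manual):

```r
# lme4-style random-effects terms:
#   (1 | g)      random intercept for each level of grouping factor g
#   (1 + x | g)  correlated random intercept and random slope for x, by g
library(lme4)

# sleepstudy ships with lme4: reaction time vs. days, grouped by Subject
m <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
summary(m)
```

Pages 9-10 of the same PDF describe which grouping identifier (CID or SID) sits at which level of the simulated hierarchy.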



Dan

--
Daniel Nordlund
Port Townsend, WA  USA



Re: [R] F-test where the coefficients in the H_0 is nonzero

2018-08-02 Thread Annaert Jan
You can easily test linear restrictions using the function linearHypothesis() 
from the car package.
There are several ways to set up the null hypothesis, but a straightforward one 
here is:
 
> library(car)
> x <- rnorm(10)
> y <- x+rnorm(10)
> linearHypothesis(lm(y~x), c("(Intercept)=0", "x=1"))
Linear hypothesis test

Hypothesis:
(Intercept) = 0
x = 1

Model 1: restricted model
Model 2: y ~ x

  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     10 10.6218                           
2      8  9.0001  2    1.6217 0.7207 0.5155


Jan

From: R-help  on behalf of John 

Date: Thursday, 2 August 2018 at 10:44
To: r-help 
Subject: [R] F-test where the coefficients in the H_0 is nonzero

Hi,

   I try to run the regression
   y = beta_0 + beta_1 x
   and test H_0: (beta_0, beta_1) =(0,1) against H_1: H_0 is false
   I believe I can run the regression
   (y-x) = beta_0 + beta_1' x
   and do the regular F-test (using the lm function) where the hypothesized
coefficients are all zero.

   Is there any function in R that deals with the case where the
coefficients are nonzero?

John



[R] Rendo and dataMultilevelIV

2018-08-02 Thread cjg15
Hi - Does anyone know what the variables CID and SID are in the
dataMultilevelIV dataset?

The example from page 18-19 of
https://cran.r-project.org/web/packages/REndo/REndo.pdf has

formula1 <- y ~ X11 + X12 + X13 + X14 + X15 + X21 + X22 + X23 + X24 + X31 +
X32 + X33 + (1 + X11 | CID) + (1|SID)

what exactly are the (1 + X11|CID) and (1|SID) terms?

does (1|SID) mean random intercepts for SID, and SID is student ID?

Thanks in advance, Chris



Re: [R] how to align data

2018-08-02 Thread Jim Lemon
Hi Petr,
I recently had to align the minima of deceleration events to form an
aggregate "braking profile" for different locations. It seems as
though you are looking for something like:

find_increase <- function(x, surround = 10) {
  inc_index <- which.max(diff(x))   # position of the largest single jump
  indices <- (inc_index - surround):(inc_index + surround)
  nneg <- sum(indices < 1)
  # keep x up to the end of the window; indexing past the end of x
  # pads with NA automatically
  newx <- x[1:max(indices)]
  # pad the start with NA if the window begins before the first element
  if (nneg > 0) newx <- c(rep(NA, nneg), newx)
  return(newx)
}
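A minimal way to apply a breakpoint-shifting idea to your example data, sliding y2 left so its rise lines up with y1's. This is a sketch with ad hoc tuning values (k, delta are assumptions, and the largest-diff approach alone can be fooled by noise in the ramp); the segmented/strucchange packages you mention give principled breakpoint estimates.

```r
x <- 1:100
set.seed(42)
y1 <- c(runif(20) + 1, 1.2 * x[1:80] + runif(80))
y2 <- c(runif(40) + 1, 1.2 * x[1:60] + runif(60))

# index where y first exceeds its early plateau by delta;
# k and delta are ad hoc choices for this example
first_rise <- function(y, k = 10, delta = 2) {
  which(y > median(y[1:k]) + delta)[1]
}

# shift y2 left so both rises start at (roughly) the same x
shift <- first_rise(y2) - first_rise(y1)   # assumes shift > 0
plot(x, y1)
points(x[-(1:shift)] - shift, y2[-(1:shift)], col = 2)
```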

Jim

On Thu, Aug 2, 2018 at 10:23 PM, PIKAL Petr  wrote:
> Dear all
>
> Before I start to reinvent the wheel, I would like to ask whether you have
> an easy solution for aligning data
>
> I have something like this
> x<-1:100
> set.seed(42)
> y1<-c(runif(20)+1, 1.2*x[1:80]+runif(80))
> y2<-c(runif(40)+1, 1.2*x[1:60]+runif(60))
>
> plot(x,y1)
> points(x,y2, col=2)
>
> with y increase starting at various x.
>
> I would like to align the data so that the increase starts at the same x
> point, something like
>
> plot(x,y1)
> points(x[-(1:20)]-20,y2[-(1:20)], col=2)
>
> I consider using strucchange or segmented packages to find break(s) and 
> "shift" x values according to this break. But maybe somebody already did 
> similar task (aligning several vectors according to some common breakpoint) 
> and could offer better or simpler solution.
>
> Best regards.
> Petr


Re: [R] kSamples ad.test question

2018-08-02 Thread Bert Gunter
You may get a response here, but as this is primarily a statistical
question, not a question about R programming, it is off topic here. I
would suggest that you post it on stats.stackexchange.com or another
statistics site instead. There is a large literature on this sort of thing.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Aug 2, 2018 at 12:00 PM, Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Dear All,
>
> once we run the following code, the test gives the expected result: the
> samples are from a common distribution...
>
>
> library(kSamples)
>
> u1 <- sample(rnorm(500,10,1),20,replace = TRUE)
> u2 <- sample(rnorm(500,10,1),20,replace = TRUE)
> u3 <- sample(rnorm(500,10,1),20,replace = TRUE)
> u4 <- sample(rnorm(500,10,1),20,replace = TRUE)
> u5 <- sample(rnorm(500,10,1),20,replace = TRUE)
>
> ad.test(u1, u2, u3,u4,u5, method = "exact", dist = FALSE, Nsim = 1000)
>
> next, if I change "u5" to:
>
> u5 <- sample(rnorm(500,20,1),20,replace = TRUE)
>
> the test again gives what we expect, i.e. the samples are not from a
> common distribution. My question is: would you know of a way to
> automatically identify "u5", the distribution that is "responsible"
> for the result showing that the samples are not from a common
> distribution?
>
> much appreciate your help,
>
>
> Andras
>


[R] kSamples ad.test question

2018-08-02 Thread Andras Farkas via R-help
Dear All,

once we run the following code, the test gives the expected result: the
samples are from a common distribution...


library(kSamples)

u1 <- sample(rnorm(500,10,1),20,replace = TRUE)
u2 <- sample(rnorm(500,10,1),20,replace = TRUE)
u3 <- sample(rnorm(500,10,1),20,replace = TRUE)
u4 <- sample(rnorm(500,10,1),20,replace = TRUE)
u5 <- sample(rnorm(500,10,1),20,replace = TRUE)

ad.test(u1, u2, u3,u4,u5, method = "exact", dist = FALSE, Nsim = 1000)

next, if I change "u5" to:

u5 <- sample(rnorm(500,20,1),20,replace = TRUE)

the test again gives what we expect, i.e. the samples are not from a common
distribution. My question is: would you know of a way to automatically
identify "u5", the distribution that is "responsible" for the result showing
that the samples are not from a common distribution?

much appreciate your help,


Andras 
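One pragmatic (if informal) way to flag the offending sample is a leave-one-out loop: re-run ad.test with each sample removed and see which removal raises the p-value most. A sketch under the assumptions that kSamples::ad.test accepts a single list of samples and stores the asymptotic p-value in the last column of the `ad` component of its result (check str(res) on your installed version):

```r
library(kSamples)

set.seed(123)
samples <- list(u1 = rnorm(20, 10), u2 = rnorm(20, 10), u3 = rnorm(20, 10),
                u4 = rnorm(20, 10), u5 = rnorm(20, 20))  # u5 differs

# p-value of the k-sample AD test with sample i left out
loo_p <- sapply(seq_along(samples), function(i) {
  res <- ad.test(samples[-i], method = "asymptotic")
  res$ad[1, ncol(res$ad)]   # asymptotic p-value (column position assumed)
})
names(loo_p) <- names(samples)

# the sample whose removal makes the remaining ones look most alike
names(which.max(loo_p))
```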



Re: [R] Combinations of true/false values where one pair is mutually exclusive

2018-08-02 Thread Bert Gunter
Logic:

!(E == "fail" & F == "fail")   <==>

(E == "pass" | F == "pass")
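The same identity in logical (TRUE = pass) form gives a one-line filter over the full grid; a sketch with six generic variables:

```r
# all 2^6 pass/fail combinations as logicals, TRUE meaning "pass"
scenarios <- expand.grid(rep(list(c(TRUE, FALSE)), 6))
names(scenarios) <- LETTERS[1:6]

# De Morgan: !(E fails & F fails)  <=>  E passes | F passes
scenarios <- subset(scenarios, E | F)
nrow(scenarios)   # 48 of the original 64 rows remain
```

Inside subset(), E and F resolve to the data frame's columns, not to base R's TRUE/FALSE aliases.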


-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Aug 2, 2018 at 8:57 AM, Sarah Goslee  wrote:

> Given that clarification, I'd just generate the full set and remove
> the ones you aren't interested in, as in:
>
>
> scenarios <- expand.grid(A = c("pass", "fail"), B = c("pass", "fail"), C =
> c("pass", "fail"), D = c("pass", "fail"), E = c("pass", "fail"), F =
> c("pass", "fail"))
>
>
> scenarios <- subset(scenarios, !(E == "fail" & F == "fail"))
>
> Sarah
>
> On Thu, Aug 2, 2018 at 11:41 AM, R Stafford 
> wrote:
> > Thank you for pointing that out, I realize not only did I use the wrong
> > language but I did not describe the situation accurately.  I do need to
> > address the situation where both variables E and F actually pass, that is
> > the majority case, one or the other can fail, but there can never be a
> > situation where E and F both fail.  I do not know a specific term for that
> > situation, but you are correct that mutual exclusivity is wrong.  While I
> > can generate a list of all possible combinations with the expand.grid
> > function (which I am not committed to by the way), it would be very helpful
> > if I could exclude the combinations where E and F both fail.  I am not sure
> > where to go from here, but the solution does not have to be elegant or even
> > efficient because I do not need to scale higher than 6 variables.
> >
> >
> >
> > On Thu, Aug 2, 2018 at 7:26 AM, S Ellison 
> wrote:
> >
> >> > On Thu, Aug 2, 2018 at 11:20 AM, R Stafford 
> >> > wrote:
> >> > > But I have the extra condition that if E is true, then F must be
> >> false, and
> >> > > vice versa,
> >>
> >> Question: Does 'vice versa' mean
> >> a) "if E is False, F must be True"
> >> or
> >> b) "if F is True, E must be False"?
> >> ... which are not the same.
> >>
> >> b) (and mutual exclusivity in general) does not rule out the condition
> >> "E False, F False", which would not be addressed by the
> >> pass/fail equivalent of F <- !E
> >>
> >>
> >>
>


Re: [R] Combinations of true/false values where one pair is mutually exclusive

2018-08-02 Thread Sarah Goslee
Given that clarification, I'd just generate the full set and remove
the ones you aren't interested in, as in:


scenarios <- expand.grid(A = c("pass", "fail"), B = c("pass", "fail"), C =
c("pass", "fail"), D = c("pass", "fail"), E = c("pass", "fail"), F =
c("pass", "fail"))


scenarios <- subset(scenarios, !(E == "fail" & F == "fail"))

Sarah

On Thu, Aug 2, 2018 at 11:41 AM, R Stafford  wrote:
> Thank you for pointing that out, I realize not only did I use the wrong
> language but I did not describe the situation accurately.  I do need to
> address the situation where both variables E and F actually pass, that is
> the majority case, one or the other can fail, but there can never be a
> situation where E and F both fail.  I do not know a specific term for that
> situation, but you are correct that mutual exclusivity is wrong.   While I
> can generate a list of all possible combinations with the expand.grid
> function (which I am not committed to by the way), it would be very helpful
> if I could exclude the combinations where E and F both fail.  I am not sure
> where to go from here, but the solution does not have to be elegant or even
> efficient because I do not need to scale higher than 6 variables.
>
>
>
> On Thu, Aug 2, 2018 at 7:26 AM, S Ellison  wrote:
>
>> > On Thu, Aug 2, 2018 at 11:20 AM, R Stafford 
>> > wrote:
>> > > But I have the extra condition that if E is true, then F must be
>> false, and
>> > > vice versa,
>>
>> Question: Does 'vice versa' mean
>> a) "if E is False, F must be True"
>> or
>> b) "if F is True, E must be False"?
>> ... which are not the same.
>>
>> b) (and mutual exclusivity in general) does not rule out the condition "E
>> False, F False", which would not be addressed by the
>> pass/fail equivalent of F <- !E
>>
>>
>>



Re: [R] Combinations of true/false values where one pair is mutually exclusive

2018-08-02 Thread MacQueen, Don via R-help
From what I can tell, the simplest way is to
   First generate all the combinations
   Then exclude those you don't want.

Here's an example, with only three variables (D, E, and F), that excludes those 
where E and F both fail

> tmp <- c('p','f')
> X <- expand.grid(D=tmp, E=tmp, F=tmp)
> X <- subset(X, !(E=='f' & F=='f'))
> X
  D E F
1 p p p
2 f p p
3 p f p
4 f f p
5 p p f
6 f p f


--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 8/2/18, 8:41 AM, "R-help on behalf of R Stafford" 
 wrote:

Thank you for pointing that out, I realize not only did I use the wrong
language but I did not describe the situation accurately.  I do need to
address the situation where both variables E and F actually pass, that is
the majority case, one or the other can fail, but there can never be a
situation where E and F both fail.  I do not know a specific term for that
situation, but you are correct that mutual exclusivity is wrong.   While I
can generate a list of all possible combinations with the expand.grid
function (which I am not committed to by the way), it would be very helpful
if I could exclude the combinations where E and F both fail.  I am not sure
where to go from here, but the solution does not have to be elegant or even
efficient because I do not need to scale higher than 6 variables.



On Thu, Aug 2, 2018 at 7:26 AM, S Ellison  wrote:

> > On Thu, Aug 2, 2018 at 11:20 AM, R Stafford 
> > wrote:
> > > But I have the extra condition that if E is true, then F must be
> false, and
> > > vice versa,
>
> Question: Does 'vice versa' mean
> a) "if E is False, F must be True"
> or
> b) "if F is True, E must be False"?
> ... which are not the same.
>
> b) (and mutual exclusivity in general) does not rule out the condition "E
> False, F False", which would not be addressed by the
> pass/fail equivalent of F <- !E
>
>
>
>


Re: [R] Combinations of true/false values where one pair is mutually exclusive

2018-08-02 Thread R Stafford
Thank you for pointing that out, I realize not only did I use the wrong
language but I did not describe the situation accurately.  I do need to
address the situation where both variables E and F actually pass, that is
the majority case, one or the other can fail, but there can never be a
situation where E and F both fail.  I do not know a specific term for that
situation, but you are correct that mutual exclusivity is wrong.   While I
can generate a list of all possible combinations with the expand.grid
function (which I am not committed to by the way), it would be very helpful
if I could exclude the combinations where E and F both fail.  I am not sure
where to go from here, but the solution does not have to be elegant or even
efficient because I do not need to scale higher than 6 variables.



On Thu, Aug 2, 2018 at 7:26 AM, S Ellison  wrote:

> > On Thu, Aug 2, 2018 at 11:20 AM, R Stafford 
> > wrote:
> > > But I have the extra condition that if E is true, then F must be
> false, and
> > > vice versa,
>
> Question: Does 'vice versa' mean
> a) "if E is False, F must be True"
> or
> b) "if F is True, E must be False"?
> ... which are not the same.
>
> b) (and mutual exclusivity in general) does not rule out the condition "E
> False, F False", which would not be addressed by the
> pass/fail equivalent of F <- !E
>
>
>
>


Re: [R] Philip Morris International - Windows10 migration assessment

2018-08-02 Thread Bert Gunter
R is free and open source. Your queries are inappropriate for this list,
which is about help for programming in R.  Please go here and follow the
relevant links to answer your questions:

https://www.r-project.org/

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Aug 2, 2018 at 1:45 AM, Flament, Kevin 
wrote:

> Dear R Project team,
>
> I am representing the System Toxicology department of Philip Morris
> International in the scope of a Windows 10 migration project.
> This project is currently at the end of the assessment phase. We would
> require an answer to this email by the end of this week.
>
> I would like to ask you some questions related to the following software
> we are using at PMI :
>
> R for Windows 3.0.2
> R for Windows 3.1.2
>
> Are all of those software compatible with Windows 10 Enterprise (version
> 10.0.16299 build 16299)?
> Are all of those software compatible with Windows 10 LTSC (version
> 14393.2399)?
> Are those software compatible with a 32 and/or 64 bit version of Windows
> 10?
> Are those application dependent on MS Office?
> Have those applications any pre-requisites? Such as Java, .Net, Oracle.
>
> If the software are not compatible, do you have a new version of this
> software compatible with the mentioned version of Windows 10?
> What would be the cost of such migration (instrument replacement if
> necessary, licence cost, ...)?
>
> If the software are not compatible and a new version is not yet available,
> could you give me an estimation for the future release availability?
>
> Best Regards,
>
> Kevin Flament
>
> Data & Systems Specialist
> PMI Science and Innovation
> Quai Jeanrenaud 3
> 2000 Neuchâtel
>


[R] how to align data

2018-08-02 Thread PIKAL Petr
Dear all

Before I start to reinvent the wheel, I would like to ask whether you have an
easy solution for aligning data.

I have something like this
x<-1:100
set.seed(42)
y1<-c(runif(20)+1, 1.2*x[1:80]+runif(80))
y2<-c(runif(40)+1, 1.2*x[1:60]+runif(60))

plot(x,y1)
points(x,y2, col=2)

with y increase starting at various x.

I would like to align the data so that the increase starts at the same x
point, something like

plot(x,y1)
points(x[-(1:20)]-20,y2[-(1:20)], col=2)

I consider using strucchange or segmented packages to find break(s) and "shift" 
x values according to this break. But maybe somebody already did similar task 
(aligning several vectors according to some common breakpoint) and could offer 
better or simpler solution.

Best regards.
Petr


Re: [R] CODE HELP

2018-08-02 Thread Saptorshee Kanto Chakraborty
Hello,

Thank you for replying. I am sorry the code was not attached; I did attach
it, but I think it was blocked by a filter. I am pasting the link to the code:
https://github.com/zhentaoshi/convex_prog_in_econometrics/tree/master/C-Lasso/PLS_static

The authors never replied; I have contacted them twice, and I think they are
very busy.

Any help will be useful.

On Thu, 2 Aug 2018 at 10:22, Eric Berger  wrote:

> Hi Saptorshee,
> Two comments:
> 1. no attachments made it through to the list. You probably need to
> include the code directly in your email, and send your email as plain text
> (otherwise information gets stripped)
> 2. for anyone interested in following up on Saptorshee's question, I
> searched for the paper "Two Examples ..." and found that it is available
> for download from https://arxiv.org/pdf/1806.10423.pdf. (It looks quite
> interesting with a lot of discussion regarding various optimization
> packages and their current status regarding availability from R.)
>
> Best,
> Eric
>
>
> On Thu, Aug 2, 2018 at 5:01 AM, Saptorshee Kanto Chakraborty <
> chk...@unife.it> wrote:
>
>> Hello,
>>
>> I am interested to apply an econometric technique of  Latent Variable
>> framework on Environmental Kuznets Curve for 164 countries for a span of
>> 25
>> years.
>>
>> The methodology and the code are from Simulation exercise from an
>> unpublished paper "Two Examples of Convex-Programming-Based
>> High-Dimensional Econometric Estimators" in R. Is it somehow possible to
>> apply it to my data.
>>
>>
>> I am attaching the codes
>>
>> Thanking You
>>
>> --
>> Saptorshee Chakraborty
>>
>> Personal Website: http://saptorshee.weebly.com/
>
>

-- 
Saptorshee Chakraborty

Personal Website: http://saptorshee.weebly.com/



Re: [R] Philip Morris International - Windows10 migration assessment

2018-08-02 Thread S Ellison
Suggest you take a look at the R website at www.r-project.org; the most 
important answers are evident there.

If you 'require' more authoritative answers within a particular timescale, I 
suggest you engage an R consultant and pay for them. This is a voluntary list.


S Ellison
 

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Flament,
> Kevin
> Sent: 02 August 2018 09:45
> To: r-help@R-project.org
> Subject: [R] Philip Morris International - Windows10 migration assessment
> 
> Dear R Project team,
> 
> I am representing the System Toxicology department of Philip Morris
> International in the scope of a Windows 10 migration project.
> This project is currently at the end of the assessment phase. We would
> require an answer to this email by the end of this week.
> 
> I would like to ask you some questions related to the following software we
> are using at PMI :
> 
> R for Windows 3.0.2
> R for Windows 3.1.2
> 
> Are all of those software compatible with Windows 10 Enterprise (version
> 10.0.16299 build 16299)?
> Are all of those software compatible with Windows 10 LTSC (version
> 14393.2399)?
> Are those software compatible with a 32 and/or 64 bit version of Windows
> 10?
> Are those application dependent on MS Office?
> Have those applications any pre-requisites? Such as Java, .Net, Oracle.
> 
> If the software are not compatible, do you have a new version of this
> software compatible with the mentioned version of Windows 10?
> What would be the cost of such migration (instrument replacement if
> necessary, licence cost, ...)?
> 
> If the software are not compatible and a new version is not yet available,
> could you give me an estimation for the future release availability?
> 
> Best Regards,
> 
> Kevin Flament
> 
> Data & Systems Specialist
> PMI Science and Innovation
> Quai Jeanrenaud 3
> 2000 Neuchâtel
> 


Re: [R] Combinations of true/false values where one pair is mutually exclusive

2018-08-02 Thread S Ellison
> On Thu, Aug 2, 2018 at 11:20 AM, R Stafford 
> wrote:
> > But I have the extra condition that if E is true, then F must be false, and
> > vice versa, 

Question: Does 'vice versa' mean 
a) "if E is False, F must be True"
or
b) "if F is True, E must be False"?
... which are not the same.

b) (and mutual exclusivity in general) does not rule out the condition "E 
False, F False", which would not be addressed by the 
pass/fail equivalent of F <- !E






[R] Philip Morris International - Windows10 migration assessment

2018-08-02 Thread Flament, Kevin
Dear R Project team,

I am representing the System Toxicology department of Philip Morris 
International in the scope of a Windows 10 migration project.
This project is currently at the end of the assessment phase. We would require 
an answer to this email by the end of this week.

I would like to ask you some questions related to the following software we are 
using at PMI :

R for Windows 3.0.2
R for Windows 3.1.2

Are all of those software compatible with Windows 10 Enterprise (version 
10.0.16299 build 16299)?
Are all of those software compatible with Windows 10 LTSC (version 14393.2399)?
Are those software compatible with a 32 and/or 64 bit version of Windows 10?
Are those application dependent on MS Office?
Have those applications any pre-requisites? Such as Java, .Net, Oracle.

If the software are not compatible, do you have a new version of this software 
compatible with the mentioned version of Windows 10?
What would be the cost of such migration (instrument replacement if necessary, 
licence cost, ...)?

If the software are not compatible and a new version is not yet available, 
could you give me an estimation for the future release availability?

Best Regards,

Kevin Flament

Data & Systems Specialist
PMI Science and Innovation
Quai Jeanrenaud 3
2000 Neuchâtel



[R] inconsistency in forecast package....

2018-08-02 Thread akshay kulkarni
Dear members,

I am using R to do my research for day trading in India. I have a list of 206
stocks to work with.
INDIA. I have a list of 206 stocks to work with.

I have extracted a parameter of a stock based on the OHLC data of the stock. It 
includes values both less than and greater than 1 ( It basically is a ratio). I 
am using forecast package to predict the value of the parameter for the next 
day.

However, the predicted value of the parameter is greater than 1 for all
stocks! Statistically, about half of the stocks should have a value less than
1 and half greater than 1.

Is this because of one odd day, or are there techniques for properly
handling the forecast package?

very many thanks for your time and effort...
yours sincerely,
AKSHAY M KULKARNI

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] F-test where the coefficients in the H_0 is nonzero

2018-08-02 Thread peter dalgaard
This should do it:

> x <- rnorm(10)
> y <- x+rnorm(10)
> fit1 <- lm(y~x)
> fit2 <- lm(y~-1 + offset(0 + 1 * x))
> anova(fit2, fit1)
Analysis of Variance Table

Model 1: y ~ -1 + offset(0 + 1 * x)
Model 2: y ~ x
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     10 10.6381                           
2      8  7.8096  2    2.8285 1.4487 0.2904
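A quick way to sanity-check the offset trick is to compute the same F statistic by hand from the two residual sums of squares; a minimal sketch (the seed and data below are illustrative, not from the thread):

```r
# Manual F statistic for H0: (beta_0, beta_1) = (0, 1),
# reproducing what anova(fit2, fit1) reports.
set.seed(42)                               # illustrative seed
x <- rnorm(10)
y <- x + rnorm(10)
fit1 <- lm(y ~ x)                          # unrestricted model: 2 free parameters
fit2 <- lm(y ~ -1 + offset(0 + 1 * x))     # fully restricted model: no free parameters
rss0 <- sum(residuals(fit2)^2)             # restricted RSS
rss1 <- sum(residuals(fit1)^2)             # unrestricted RSS
q    <- 2                                  # number of restrictions
Fst  <- ((rss0 - rss1) / q) / (rss1 / df.residual(fit1))
pval <- pf(Fst, q, df.residual(fit1), lower.tail = FALSE)
c(F = Fst, p = pval)
```

The numerator degrees of freedom equal the number of restrictions (two here), and the denominator degrees of freedom come from the unrestricted fit.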


> On 2 Aug 2018, at 10:30 , John  wrote:
> 
> Hi,
> 
>   I try to run the regression
>   y = beta_0 + beta_1 x
>   and test H_0: (beta_0, beta_1) =(0,1) against H_1: H_0 is false
>   I believe I can run the regression
> 	(y - x) = beta_0 + beta_1' x
> 	and do the regular F-test (using the lm function) where the hypothesized
> coefficients are all zero.
> 
>   Is there any function in R that deal with the case where the
> coefficients are nonzero?
> 
> John
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read txt file - date - no space

2018-08-02 Thread PIKAL Petr
Hi

Good that you have finally got desired result.
Regarding aggregate, you could consult help page

?aggregate

It has many good examples how to use it.

and for understanding factors

?factor is your friend and/or pages 16+ from R intro.

Cheers
Petr

From: Diego Avesani 
Sent: Thursday, August 2, 2018 10:53 AM
To: PIKAL Petr ; r-help mailing list 

Subject: Re: [R] read txt file - date - no space


Dear Petr,

I have read the file:
MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")

I have used POSIXct to convert the date properly:
MyData$date2<-as.POSIXct(MyData$date, format="%m/%d/%Y %H:%M")
creating a second field inside MyData.

I have converted the -999 to NA:
MyData[MyData== -999] <- NA

dim(MyData):
160008  5
And this is clear because I have 160008 hourly records and 5 fields:
date2,date,str1,str2,str3

I have checked the structure of my data:
str(MyData)

'data.frame':   160008 obs. of  5 variables:
 $ date : Factor w/ 160008 levels "10/10/1998 0:00",..: 913 914 925 930 931 932 
933 934 935 936 ...
 $ str1 : num  0.6 0.2 0.6 0 0 0 0 0.2 0.6 0.2 ...
 $ str2 : num  0 0.2 0.2 0 0 0 0 0 0.2 0.4 ...
 $ str3 : num  0 0.2 0.4 0.6 0 0 0 0 0 0.4 ...
 $ date2: POSIXct, format: "1998-10-01 00:00:00" "1998-10-01 01:00:00" 
"1998-10-01 02:00:00" "1998-10-01 03:00:00" ...

Almost everything is clear:
str1, str2, str3 are numbers,
date2 holds dates in the POSIXct format: Y-m-d h:m:s
date is a factor with 160008 levels, i.e. one level per distinct category.
I do not understand: are the values "913 914 925 930" indices into the levels?

I have no NA in date2:

which(MyData$date2 == NA)
integer(0)

as well in date.
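A side note on the check above: comparing with `== NA` always yields NA, so which() returns integer(0) even when missing values are present; is.na() is the reliable test. A minimal illustration:

```r
# x == NA is NA for every element, so which(x == NA) is always empty;
# use is.na() to locate missing values.
x <- c(1, NA, 3)
which(x == NA)    # integer(0), even though x contains an NA
which(is.na(x))   # 2
```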

At this point I have applied:

daily_mean1<-aggregate(MyData$str1, list(format(MyData$date, "%Y-%m-%d")), mean)

which seems to be correct:
I have

dim(daily_mean1):
6667    2
str(daily_mean1)
'data.frame':   6667 obs. of  2 variables:
 $ Group.1: chr  "1998-10-01" "1998-10-02" "1998-10-03" "1998-10-04" ...
 $ x  : num  0.1667 0.0583 0.0417 0.3417 0. ...
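Put together, the whole pipeline above (read the data, recode -999 to NA, parse the dates, aggregate to daily means) can be sketched on a toy frame; the column names and values are illustrative, mirroring the thread's obs_prec.csv layout:

```r
# Toy hourly data in the same layout as obs_prec.csv: date, str1, str2, str3.
MyData <- data.frame(
  date = sprintf("10/1/1998 %d:00", 0:7),
  str1 = c(0.6, 0.2, 0.6, 0, 0, 0, 0, 0.2),
  str2 = c(0, 0.2, 0.2, 0, 0, 0, 0, 0),
  str3 = c(0, 0.2, 0.4, 0.6, -999, 0, 0, 0),
  stringsAsFactors = FALSE
)
MyData[MyData == -999] <- NA                         # recode missing values first
MyData$date2 <- as.POSIXct(MyData$date,
                           format = "%m/%d/%Y %H:%M", tz = "UTC")
daily_mean <- aggregate(MyData[, c("str1", "str2", "str3")],
                        by = list(day = format(MyData$date2, "%Y-%m-%d")),
                        FUN = mean, na.rm = TRUE)
daily_mean                                           # one row per calendar day
```

Grouping by format(date2, "%Y-%m-%d") keeps year and month in the key, so each calendar day gets its own row.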

Really, really, thanks:
you not only taught me R but also how to approach learning.

Can I ask you another question about aggregate?

Again thanks

Diego

On 2 August 2018 at 10:10, PIKAL Petr 
mailto:petr.pi...@precheza.cz>> wrote:
Hi

From: Diego Avesani mailto:diego.aves...@gmail.com>>
Sent: Thursday, August 2, 2018 10:03 AM
To: PIKAL Petr mailto:petr.pi...@precheza.cz>>
Subject: Re: [R] read txt file - date - no space

Thanks,
I have just send you a e-mail, before reading this one.
Let's me read your last mail and go carefully through it.

Thanks again, really really,
I mean it

P.S.
Do you wand my *.csv file?

Not necessarily, you should better learn things yourself if you really want to 
use R. Only if after you tested all suggested ways and did not get desired 
result.

Cheers
Petr


Diego

On 2 August 2018 at 09:56, PIKAL Petr 
mailto:petr.pi...@precheza.cz>> wrote:
Well,

you followed my advice only partly. Did you get rid of your silly -999 values 
before averaging? Probably not.
Did you try aggregating with the slightly longer construction
aggregate(test[,-1], list(format(test$date, "%Y-%m-%d")), mean)
which keeps difference in month and year? Probably not.

We do not have your data and we do not know what exactly you want to do, so it
is really difficult to help you.

If I calculate correctly, there are 24 hours in one day and you have data for
18 years, which gives approximately 158000 distinct values.

I can get either 18 values (averaging years) or approximately 6600 values
(averaging days).

So my advice is:

Read your data to R
Change date column to POSIX but store it in different column
Change NA values from -999 to real NA values
Check dimension of your data ?dim
Check structure of your data ?str
Check if all dates are changed to POSIX correctly, are some of them NA?
Aggregate your values (not by lubridate function day) and store them in another 
object

Cheers
Petr


From: Diego Avesani mailto:diego.aves...@gmail.com>>
Sent: Thursday, August 2, 2018 9:31 AM
To: jim holtman mailto:jholt...@gmail.com>>; PIKAL Petr 
mailto:petr.pi...@precheza.cz>>
Cc: R mailing list mailto:r-help@r-project.org>>
Subject: Re: [R] read txt file - date - no space

Dear all,

I have found and error in the date conversion. Now it looks like:

MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
# change date to real
MyData$date<-as.POSIXct(MyData$date, format="%m/%d/%Y %H:%M")

After that I apply the PIKAL's suggestions:

aggregate(MyData[,-1], list(day(MyData$date)), mean)

And this is the final results:

   Group.1      str1      str2      str3
1        1 -82.43636 -46.12437 -319.2710
2        2 -82.06105 -45.74184 -319.2696
3        3 -82.05527 -45.52650 -319.2416
4        4 -82.03535 -47.59191 -319.2275
5        5 -77.44928 -50.05953 -320.5798
...
31      31 -86.10234 -47.06247 -340.0968

However, it is not correct.
This because I have not made myself clear about my purpose. As I told you some 
days ago, I have a *.csv file with hourly data from 10/21/1998 to 12/31/2016. I 
would lik

Re: [R] read txt file - date - no space

2018-08-02 Thread Diego Avesani
Dear Petr,

I have read the file:
MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")

I have used POSIXct to convert the date properly:
MyData$date2<-as.POSIXct(MyData$date, format="%m/%d/%Y %H:%M")
creating a second field inside MyData.

I have converted the -999 to NA:
MyData[MyData== -999] <- NA

dim(MyData):
160008  5
And this is clear because I have 160008 hourly records and 5 fields:
date2,date,str1,str2,str3

I have checked the structure of my data:
str(MyData)

'data.frame': 160008 obs. of  5 variables:
 $ date : Factor w/ 160008 levels "10/10/1998 0:00",..: 913 914 925 930 931
932 933 934 935 936 ...
 $ str1 : num  0.6 0.2 0.6 0 0 0 0 0.2 0.6 0.2 ...
 $ str2 : num  0 0.2 0.2 0 0 0 0 0 0.2 0.4 ...
 $ str3 : num  0 0.2 0.4 0.6 0 0 0 0 0 0.4 ...
 $ date2: POSIXct, format: "1998-10-01 00:00:00" "1998-10-01 01:00:00"
"1998-10-01 02:00:00" "1998-10-01 03:00:00" ...

Almost everything is clear:
str1, str2, str3 are numbers,
date2 holds dates in the POSIXct format: Y-m-d h:m:s
date is a factor with 160008 levels, i.e. one level per distinct category.
I do not understand: are the values "913 914 925 930" indices into the levels?

I have no NA in date2:

which(MyData$date2 == NA)
integer(0)

as well in date.

At this point I have applied:

daily_mean1<-aggregate(MyData$str1, list(format(MyData$date, "%Y-%m-%d")),
mean)

which seems to be correct:
I have

dim(daily_mean1):
6667    2
str(daily_mean1)
'data.frame': 6667 obs. of  2 variables:
 $ Group.1: chr  "1998-10-01" "1998-10-02" "1998-10-03" "1998-10-04" ...
 $ x  : num  0.1667 0.0583 0.0417 0.3417 0. ...

Really, really, thanks:
you not only taught me R but also how to approach learning.

Can I ask you another question about aggregate?

Again thanks

Diego


On 2 August 2018 at 10:10, PIKAL Petr  wrote:

> Hi
>
>
>
> *From:* Diego Avesani 
> *Sent:* Thursday, August 2, 2018 10:03 AM
> *To:* PIKAL Petr 
> *Subject:* Re: [R] read txt file - date - no space
>
>
>
> Thanks,
>
> I have just send you a e-mail, before reading this one.
>
> Let's me read your last mail and go carefully through it.
>
>
>
> Thanks again, really really,
>
> I mean it
>
>
>
> P.S.
>
> Do you wand my *.csv file?
>
>
>
> Not necessarily, you should better learn things yourself if you really
> want to use R. Only if after you tested all suggested ways and did not get
> desired result.
>
>
>
> Cheers
>
> Petr
>
>
>
>
> Diego
>
>
>
> On 2 August 2018 at 09:56, PIKAL Petr  wrote:
>
> Well,
>
>
>
> you followed my advice only partly. Did you get rid of your silly -999
> values before averaging? Probably not.
>
> Did you try aggregating with the slightly longer construction
>
> aggregate(test[,-1], list(format(test$date, "%Y-%m-%d")), mean)
>
> which keeps difference in month and year? Probably not.
>
>
>
> We do not have your data and we do not know what exactly you want to do, so
> it is really difficult to help you.
>
>
>
> If I calculate correctly, there are 24 hours in one day and you have data
> for 18 years, which gives approximately 158000 distinct values.
>
>
>
> I can get either 18 values (averaging years) or approximately 6600 values
> (averaging days).
>
>
>
> So my advice is:
>
>
>
> Read your data to R
>
> Change date column to POSIX but store it in different column
>
> Change NA values from -999 to real NA values
>
> Check dimension of your data ?dim
>
> Check structure of your data ?str
>
> Check if all dates are changed to POSIX correctly, are some of them NA?
>
> Aggregate your values (not by lubridate function day) and store them in
> another object
>
>
>
> Cheers
>
> Petr
>
>
>
>
>
> *From:* Diego Avesani 
> *Sent:* Thursday, August 2, 2018 9:31 AM
> *To:* jim holtman ; PIKAL Petr  >
> *Cc:* R mailing list 
> *Subject:* Re: [R] read txt file - date - no space
>
>
>
> Dear all,
>
>
>
> I have found and error in the date conversion. Now it looks like:
>
>
>
> MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
>
> # change date to real
>
> MyData$date<-as.POSIXct(MyData$date, format="%m/%d/%Y %H:%M")
>
>
>
> After that I apply the PIKAL's suggestions:
>
>
>
> aggregate(MyData[,-1], list(day(MyData$date)), mean)
>
>
>
> And this is the final results:
>
>
>
>    Group.1      str1      str2      str3
> 1        1 -82.43636 -46.12437 -319.2710
> 2        2 -82.06105 -45.74184 -319.2696
> 3        3 -82.05527 -45.52650 -319.2416
> 4        4 -82.03535 -47.59191 -319.2275
> 5        5 -77.44928 -50.05953 -320.5798
> ...
> 31      31 -86.10234 -47.06247 -340.0968
>
>
>
> However, it is not correct.
>
> This because I have not made myself clear about my purpose. As I told you
> some days ago, I have a *.csv file with hourly data from 10/21/1998
> to 12/31/2016. I would like to compute the daily means. Basically, I would
> like to have the mean of the hourly date for each day from 10/21/1998
> to 12/31/2016 and not 31 values.
>
>
>
> Really really thanks again,
>
> Diego
>
>
>
>
> Diego
>
>
>
> On 2 August 2018 at 08:55, Diego Avesani  wrote:
>
> Dear
>
>
>
> I have check the one of the line that gives

[R] F-test where the coefficients in the H_0 is nonzero

2018-08-02 Thread John
Hi,

   I try to run the regression
   y = beta_0 + beta_1 x
   and test H_0: (beta_0, beta_1) =(0,1) against H_1: H_0 is false
   I believe I can run the regression
   (y - x) = beta_0 + beta_1' x
   and do the regular F-test (using the lm function) where the hypothesized
coefficients are all zero.

   Is there any function in R that deal with the case where the
coefficients are nonzero?

John
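As also suggested in this thread, the restriction can be tested directly, without re-parameterizing, with linearHypothesis() from the car package; a minimal sketch (assumes car is installed, and the data are illustrative):

```r
# F-test of H0: intercept = 0 and slope = 1 on the original model.
library(car)                 # assumes install.packages("car") has been run
set.seed(1)                  # illustrative data
x <- rnorm(10)
y <- x + rnorm(10)
fit <- lm(y ~ x)
linearHypothesis(fit, c("(Intercept) = 0", "x = 1"))
```

Each restriction is written as a string involving the coefficient names of the fitted model, so hypothesized values need not be zero.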

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CODE HELP

2018-08-02 Thread Eric Berger
Hi Saptorshee,
Two comments:
1. no attachments made it through to the list. You probably need to include
the code directly in your email, and send your email as plain text
(otherwise information gets stripped)
2. for anyone interested in following up on Saptorshee's question, I
searched for the paper "Two Examples ..." and found that it is available
for download from https://arxiv.org/pdf/1806.10423.pdf. (It looks quite
interesting with a lot of discussion regarding various optimization
packages and their current status regarding availability from R.)

Best,
Eric


On Thu, Aug 2, 2018 at 5:01 AM, Saptorshee Kanto Chakraborty <
chk...@unife.it> wrote:

> Hello,
>
> I am interested in applying an econometric technique from the Latent Variable
> framework to the Environmental Kuznets Curve for 164 countries over a span of
> 25 years.
>
> The methodology and the code are from Simulation exercise from an
> unpublished paper "Two Examples of Convex-Programming-Based
> High-Dimensional Econometric Estimators" in R. Is it somehow possible to
> apply it to my data?
>
>
> I am attaching the codes
>
> Thanking You
>
> --
> Saptorshee Chakraborty
>
> Personal Website: http://saptorshee.weebly.com/
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read txt file - date - no space

2018-08-02 Thread Diego Avesani
Dear PIKAL, Dear all,

thanks again a lot.
I have finally understood what "in line" means.
I will definitely read some "R-intro", and at the moment I am reading an
R tutorial.
I will not post formatted messages.

I would ask if it is possible to have some final suggestions:
- how to compute daily means;
- how to deal with NA;

Indeed, after changing the date format I get

   Group.1      str1      str2      str3
1        1 -82.43636 -46.12437 -319.2710
2        2 -82.06105 -45.74184 -319.2696
3        3 -82.05527 -45.52650 -319.2416
4        4 -82.03535 -47.59191 -319.2275
...
31      31 -86.10234 -47.06247 -340.0968

As I said previously, this is not correct.
This because I have not made myself clear about my purpose. As I told you
some days ago, I have a *.csv file with hourly data from 10/21/1998 to
12/31/2016.
I would like to compute the daily means.
Basically, I would like to have the mean of the hourly date for each day
from 10/21/1998 to 12/31/2016 and not 31 values.

I really, really thank you, especially for your patience.
I am learning a lot.
Again thanks

Diego


On 2 August 2018 at 09:32, PIKAL Petr  wrote:

> Hi
>
>
>
> see in line (and please do not post HTML formated messages, it could be
> scrammbled)
>
>
>
> *From:* Diego Avesani 
> *Sent:* Thursday, August 2, 2018 8:56 AM
> *To:* jim holtman ; PIKAL Petr  >
> *Cc:* R mailing list 
> *Subject:* Re: [R] read txt file - date - no space
>
>
>
> Dear
>
>
>
> I have check the one of the line that gives me problem. I mean, which give
> NA after R processing. I think that is similar to the others:
>
>
>
> You should stop *thinking* and instead do a real inspection of the
> "offending" values.
>
>
>
> 10/12/1998 10:00,0,0,0
>
> 10/12/1998 11:00,0,0,0
>
> 10/12/1998 12:00,0,0,0
>
> 10/12/1998 13:00,0,0,0
>
> 10/12/1998 14:00,0,0,0
>
> 10/12/1998 15:00,0,0,0
>
> 10/12/1998 16:00,0,0,0
>
> 10/12/1998 17:00,0,0,0
>
>
>
> These lines do not pose any problem with formating.
>
>
>
> >  test<-read.table("clipboard", sep=",")
>
> > str(test)
>
> 'data.frame':   8 obs. of  4 variables:
>
> $ V1: Factor w/ 8 levels "10/12/1998 10:00",..: 1 2 3 4 5 6 7 8
>
> $ V2: int  0 0 0 0 0 0 0 0
>
> $ V3: int  0 0 0 0 0 0 0 0
>
> $ V4: int  0 0 0 0 0 0 0 0
>
> > as.POSIXct(test$V1, format="%d/%m/%Y %H:%M")
>
> [1] "1998-12-10 10:00:00 CET" "1998-12-10 11:00:00 CET"
>
> [3] "1998-12-10 12:00:00 CET" "1998-12-10 13:00:00 CET"
>
> [5] "1998-12-10 14:00:00 CET" "1998-12-10 15:00:00 CET"
>
> [7] "1998-12-10 16:00:00 CET" "1998-12-10 17:00:00 CET"
>
>
>
>
>
> @jim: it seems that your suggestion focuses on reading data from the
> terminal. Is it possible to apply it to a *.csv file?
>
> @Pikal: could it be that there is some date conversion error?
>
>
>
> Well, your str(MyData) result suggests that the conversion from character to
> POSIX was done correctly (at least partly).
>
>
>
> However, the NAs in the date column you posted in your second mail suggest
> that some values in the input are formatted differently and are changed
> to NA during the POSIX conversion.
>
>
>
> You could check which values are problematic if, instead of directly changing
> the date column to POSIX, you add a new column to your data with the
> converted POSIX values.
>
>
>
> So read your data from csv file and change date to POSIX but store it in
> different column of data frame.
>
>
>
> MyData$date2 <- as.POSIXct(MyData$date, format="%d/%m/%Y %H:%M")
>
>
>
> and check which values in your original file are formatted differently.
>
>
>
> something like
>
> MyData$date[is.na(MyData$date2)]
>
>
>
> However, your (very basic) questions suggest that you have only a minor
> understanding of what R objects are and how to check, inspect and manipulate
> them. You would do yourself a big favour by going through the basic
> documentation, as I suggested before.
>
>
>
> Cheers
>
> Petr
>
>
>
> Thanks again,
>
> Diego
>
>
>
>
> Diego
>
>
>
> On 1 August 2018 at 17:01, jim holtman  wrote:
>
>
> Try this:
>
>
>
> > library(lubridate)
>
> > library(tidyverse)
>
> > input <- read.csv(text = "date,str1,str2,str3
>
> + 10/1/1998 0:00,0.6,0,0
>
> +   10/1/1998 1:00,0.2,0.2,0.2
>
> +   10/1/1998 2:00,0.6,0.2,0.4
>
> +   10/1/1998 3:00,0,0,0.6
>
> +   10/1/1998 4:00,0,0,0
>
> +   10/1/1998 5:00,0,0,0
>
> +   10/1/1998 6:00,0,0,0
>
> +   10/1/1998 7:00,0.2,0,0", as.is = TRUE)
>
> > # convert the date and add the "day" so summarize
>
> > input <- input %>%
>
> +   mutate(date = mdy_hm(date),
>
> +  day = floor_date(date, unit = 'day')
>
> +   )
>
> >
>
> > by_day <- input %>%
>
> +   group_by(day) %>%
>
> +   summarise(m_s1 = mean(str1),
>
> + m_s2 = mean(str2),
>
> + m_s3 = mean(str3)
>
> +   )
>
> >
>
> > by_day
>
> # A tibble: 1 x 4
>
>   day                  m_s1   m_s2  m_s3
>   <dttm>              <dbl>  <dbl> <dbl>
> 1 1998-10-01 00:00:00 0.200 0.0500 0.150
>
>
> Jim Holtman
> *Data Munger Guru*
>
>
> *What is the problem that y

Re: [R] read txt file - date - no space

2018-08-02 Thread PIKAL Petr
Well,

you followed my advice only partly. Did you get rid of your silly -999 values 
before averaging? Probably not.
Did you try aggregating with the slightly longer construction
aggregate(test[,-1], list(format(test$date, "%Y-%m-%d")), mean)
which keeps difference in month and year? Probably not.

We do not have your data and we do not know what exactly you want to do, so it
is really difficult to help you.

If I calculate correctly, there are 24 hours in one day and you have data for
18 years, which gives approximately 158000 distinct values.

I can get either 18 values (averaging years) or approximately 6600 values
(averaging days).

So my advice is:

Read your data to R
Change date column to POSIX but store it in different column
Change NA values from -999 to real NA values
Check dimension of your data ?dim
Check structure of your data ?str
Check if all dates are changed to POSIX correctly, are some of them NA?
Aggregate your values (not by lubridate function day) and store them in another 
object

Cheers
Petr


From: Diego Avesani 
Sent: Thursday, August 2, 2018 9:31 AM
To: jim holtman ; PIKAL Petr 
Cc: R mailing list 
Subject: Re: [R] read txt file - date - no space

Dear all,

I have found and error in the date conversion. Now it looks like:

MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
# change date to real
MyData$date<-as.POSIXct(MyData$date, format="%m/%d/%Y %H:%M")

After that I apply the PIKAL's suggestions:

aggregate(MyData[,-1], list(day(MyData$date)), mean)

And this is the final results:

   Group.1      str1      str2      str3
1        1 -82.43636 -46.12437 -319.2710
2        2 -82.06105 -45.74184 -319.2696
3        3 -82.05527 -45.52650 -319.2416
4        4 -82.03535 -47.59191 -319.2275
5        5 -77.44928 -50.05953 -320.5798
...
31      31 -86.10234 -47.06247 -340.0968

However, it is not correct.
This because I have not made myself clear about my purpose. As I told you some 
days ago, I have a *.csv file with hourly data from 10/21/1998 to 12/31/2016. I 
would like to compute the daily means. Basically, I would like to have the mean 
of the hourly date for each day from 10/21/1998 to 12/31/2016 and not 31 values.

Really really thanks again,
Diego


Diego

On 2 August 2018 at 08:55, Diego Avesani 
mailto:diego.aves...@gmail.com>> wrote:
Dear

I have checked one of the lines that gives me problems, i.e. which gives NA
after R processing. I think it is similar to the others:

10/12/1998 10:00,0,0,0
10/12/1998 11:00,0,0,0
10/12/1998 12:00,0,0,0
10/12/1998 13:00,0,0,0
10/12/1998 14:00,0,0,0
10/12/1998 15:00,0,0,0
10/12/1998 16:00,0,0,0
10/12/1998 17:00,0,0,0

@jim: It seems that you suggestion is focus on reading data from the terminal. 
It is possible to apply it to a *.csv file?

@Pikal: Could it be that there are some date conversion error?

Thanks again,
Diego


Diego

On 1 August 2018 at 17:01, jim holtman 
mailto:jholt...@gmail.com>> wrote:

Try this:

> library(lubridate)
> library(tidyverse)
> input <- read.csv(text = "date,str1,str2,str3
+ 10/1/1998 0:00,0.6,0,0
+   10/1/1998 1:00,0.2,0.2,0.2
+   10/1/1998 2:00,0.6,0.2,0.4
+   10/1/1998 3:00,0,0,0.6
+   10/1/1998 4:00,0,0,0
+   10/1/1998 5:00,0,0,0
+   10/1/1998 6:00,0,0,0
+   10/1/1998 7:00,0.2,0,0", as.is = TRUE)
> # convert the date and add the "day" so summarize
> input <- input %>%
+   mutate(date = mdy_hm(date),
+  day = floor_date(date, unit = 'day')
+   )
>
> by_day <- input %>%
+   group_by(day) %>%
+   summarise(m_s1 = mean(str1),
+ m_s2 = mean(str2),
+ m_s3 = mean(str3)
+   )
>
> by_day
# A tibble: 1 x 4
  day                  m_s1   m_s2  m_s3
  <dttm>              <dbl>  <dbl> <dbl>
1 1998-10-01 00:00:00 0.200 0.0500 0.150

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Tue, Jul 31, 2018 at 11:54 PM Diego Avesani 
mailto:diego.aves...@gmail.com>> wrote:
Dear all,
I am sorry, I caused a lot of confusion. I have to relax and start
all again in order to understand.
If I could, I would like to start again, without mixing strategies, and wait
for your advice.

I really appreciate your help, really really.
Here is my new file, a *.csv file (by the way, is it possible to attach one to
the mailing list?)

date,str1,str2,str3
10/1/1998 0:00,0.6,0,0
10/1/1998 1:00,0.2,0.2,0.2
10/1/1998 2:00,0.6,0.2,0.4
10/1/1998 3:00,0,0,0.6
10/1/1998 4:00,0,0,0
10/1/1998 5:00,0,0,0
10/1/1998 6:00,0,0,0
10/1/1998 7:00,0.2,0,0


I read it as:
MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")

at this point I would like to have the daily mean.
What would you suggest?

Really Really thanks,
You are my lifesaver

Thanks



Diego


On 1 August 2018 at 01:01, Jeff Newmiller 
mailto:jdnew...@dcn.davis.ca.us>> wrote:

> ... and the most common source of NA values in time data is wrong
> timezones. You really need to make sure the timezon

Re: [R] read txt file - date - no space

2018-08-02 Thread Diego Avesani
Dear all,

I have found and error in the date conversion. Now it looks like:

MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
# change date to real
MyData$date<-as.POSIXct(MyData$date, format="%m/%d/%Y %H:%M")

After that I apply the PIKAL's suggestions:

aggregate(MyData[,-1], list(day(MyData$date)), mean)

And this is the final results:

   Group.1      str1      str2      str3
1        1 -82.43636 -46.12437 -319.2710
2        2 -82.06105 -45.74184 -319.2696
3        3 -82.05527 -45.52650 -319.2416
4        4 -82.03535 -47.59191 -319.2275
5        5 -77.44928 -50.05953 -320.5798
...
31      31 -86.10234 -47.06247 -340.0968

However, it is not correct.
This because I have not made myself clear about my purpose. As I told you
some days ago, I have a *.csv file with hourly data from 10/21/1998
to 12/31/2016. I would like to compute the daily means. Basically, I would
like to have the mean of the hourly date for each day from 10/21/1998
to 12/31/2016 and not 31 values.
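The 31 values come from lubridate's day(), which returns the day of the month, so e.g. Oct 1 and Nov 1 fall into the same group. A minimal base-R illustration of the difference (as.POSIXlt()$mday behaves like lubridate::day() here; the dates are illustrative):

```r
# Grouping key comparison: day-of-month vs full calendar date.
d <- as.POSIXct(c("1998-10-01 05:00", "1998-10-02 05:00", "1998-11-01 05:00"),
                tz = "UTC")
dom  <- as.POSIXlt(d)$mday        # 1 2 1 -- Oct 1 and Nov 1 collapse together
days <- format(d, "%Y-%m-%d")     # distinct keys: month and year are kept
dom
days
```

Using format(date, "%Y-%m-%d") as the grouping variable therefore yields one mean per calendar day rather than 31 day-of-month bins.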

Really really thanks again,
Diego


Diego


On 2 August 2018 at 08:55, Diego Avesani  wrote:

> Dear
>
> I have checked one of the lines that gives me problems, i.e. which gives
> NA after R processing. I think it is similar to the others:
>
> 10/12/1998 10:00,0,0,0
> 10/12/1998 11:00,0,0,0
> 10/12/1998 12:00,0,0,0
> 10/12/1998 13:00,0,0,0
> 10/12/1998 14:00,0,0,0
> 10/12/1998 15:00,0,0,0
> 10/12/1998 16:00,0,0,0
> 10/12/1998 17:00,0,0,0
>
> @jim: it seems that your suggestion focuses on reading data from the
> terminal. Is it possible to apply it to a *.csv file?
>
> @Pikal: could it be that there is some date conversion error?
>
> Thanks again,
> Diego
>
>
> Diego
>
>
> On 1 August 2018 at 17:01, jim holtman  wrote:
>
>>
>> Try this:
>>
>> > library(lubridate)
>> > library(tidyverse)
>> > input <- read.csv(text = "date,str1,str2,str3
>> + 10/1/1998 0:00,0.6,0,0
>> +   10/1/1998 1:00,0.2,0.2,0.2
>> +   10/1/1998 2:00,0.6,0.2,0.4
>> +   10/1/1998 3:00,0,0,0.6
>> +   10/1/1998 4:00,0,0,0
>> +   10/1/1998 5:00,0,0,0
>> +   10/1/1998 6:00,0,0,0
>> +   10/1/1998 7:00,0.2,0,0", as.is = TRUE)
>> > # convert the date and add the "day" so summarize
>> > input <- input %>%
>> +   mutate(date = mdy_hm(date),
>> +  day = floor_date(date, unit = 'day')
>> +   )
>> >
>> > by_day <- input %>%
>> +   group_by(day) %>%
>> +   summarise(m_s1 = mean(str1),
>> + m_s2 = mean(str2),
>> + m_s3 = mean(str3)
>> +   )
>> >
>> > by_day
>> # A tibble: 1 x 4
  day                  m_s1   m_s2  m_s3
  <dttm>              <dbl>  <dbl> <dbl>
1 1998-10-01 00:00:00 0.200 0.0500 0.150
>>
>> Jim Holtman
>> *Data Munger Guru*
>>
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
>>
>> On Tue, Jul 31, 2018 at 11:54 PM Diego Avesani 
>> wrote:
>>
>>> Dear all,
>>> I am sorry, I caused a lot of confusion. I have to relax and start
>>> all again in order to understand.
>>> If I could, I would like to start again, without mixing strategies, and
>>> wait for your advice.
>>>
>>> I really appreciate your help, really really.
>>> Here is my new file, a *.csv file (by the way, is it possible to attach
>>> one to the mailing list?)
>>>
>>> date,str1,str2,str3
>>> 10/1/1998 0:00,0.6,0,0
>>> 10/1/1998 1:00,0.2,0.2,0.2
>>> 10/1/1998 2:00,0.6,0.2,0.4
>>> 10/1/1998 3:00,0,0,0.6
>>> 10/1/1998 4:00,0,0,0
>>> 10/1/1998 5:00,0,0,0
>>> 10/1/1998 6:00,0,0,0
>>> 10/1/1998 7:00,0.2,0,0
>>>
>>>
>>> I read it as:
>>> MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
>>>
>>> at this point I would like to have the daily mean.
>>> What would you suggest?
>>>
>>> Really Really thanks,
>>> You are my lifesaver
>>>
>>> Thanks
>>>
>>>
>>>
>>> Diego
>>>
>>>
>>> On 1 August 2018 at 01:01, Jeff Newmiller 
>>> wrote:
>>>
>>> > ... and the most common source of NA values in time data is wrong
>>> > timezones. You really need to make sure the timezone that is assumed
>>> when
>>> > the character data are converted to POSIXt agrees with the data. In
>>> most
>>> > cases the easiest way to ensure this is to use
>>> >
>>> > Sys.setenv(TZ="US/Pacific")
>>> >
>>> > or whatever timezone from
>>> >
>>> > OlsonNames()
>>> >
>>> > corresponds with your data. Execute this setenv function before the
>>> > strptime or as.POSIXct() function call.
>>> >
>>> > You can use
>>> >
>>> > MyData[ is.na(MyData$datetime), ]
>>> >
>>> > to see which records are failing to convert time.
>>> >
>>> > [1] https://github.com/jdnewmil/eci298sp2016/blob/master/QuickHowto1
>>> >
>>> > On July 31, 2018 3:04:05 PM PDT, Jim Lemon 
>>> wrote:
>>> > >Hi Diego,
>>> > >I think the error is due to NA values in your data file. If I extend
>>> > >your example and run it, I get no errors:
>>> > >
>>> > >MyData<-read.table(text="103001930 103001580 103001530
>>> > >1998-10-01 00:00:00 0.6 0 0
>>> > >1998-10-01 01:00:00 0.2 0.2 0.2
>>> > >1998-10-01 02:00:00 0.6 0.

Re: [R] read txt file - date - no space

2018-08-02 Thread PIKAL Petr
Hi

see in line (and please do not post HTML formated messages, it could be 
scrammbled)

From: Diego Avesani 
Sent: Thursday, August 2, 2018 8:56 AM
To: jim holtman ; PIKAL Petr 
Cc: R mailing list 
Subject: Re: [R] read txt file - date - no space

Dear

I have checked one of the lines that gives me problems, i.e. which gives NA
after R processing. I think it is similar to the others:

You should stop *thinking* and instead do a real inspection of the
"offending" values.

10/12/1998 10:00,0,0,0
10/12/1998 11:00,0,0,0
10/12/1998 12:00,0,0,0
10/12/1998 13:00,0,0,0
10/12/1998 14:00,0,0,0
10/12/1998 15:00,0,0,0
10/12/1998 16:00,0,0,0
10/12/1998 17:00,0,0,0

These lines do not pose any problem with formating.

>  test<-read.table("clipboard", sep=",")
> str(test)
'data.frame':   8 obs. of  4 variables:
$ V1: Factor w/ 8 levels "10/12/1998 10:00",..: 1 2 3 4 5 6 7 8
$ V2: int  0 0 0 0 0 0 0 0
$ V3: int  0 0 0 0 0 0 0 0
$ V4: int  0 0 0 0 0 0 0 0
> as.POSIXct(test$V1, format="%d/%m/%Y %H:%M")
[1] "1998-12-10 10:00:00 CET" "1998-12-10 11:00:00 CET"
[3] "1998-12-10 12:00:00 CET" "1998-12-10 13:00:00 CET"
[5] "1998-12-10 14:00:00 CET" "1998-12-10 15:00:00 CET"
[7] "1998-12-10 16:00:00 CET" "1998-12-10 17:00:00 CET"


@jim: it seems that your suggestion focuses on reading data from the terminal.
Is it possible to apply it to a *.csv file?

@Pikal: could it be that there is some date conversion error?

Well, your str(MyData) result suggests that the conversion from character to
POSIX was done correctly (at least partly).

However, the NAs in the date column you posted in your second mail suggest that
some values in the input are formatted differently and are changed to NA
during the POSIX conversion.

You could check which values are problematic if, instead of directly changing
the date column to POSIX, you add a new column to your data with the converted
POSIX values.

So read your data from the csv file and convert date to POSIX, but store it in
a different column of the data frame.

MyData$date2 <- as.POSIXct(MyData$date, format="%d/%m/%Y %H:%M")

and check which values in your original file are formatted differently.

something like
MyData$date[is.na(MyData$date2)]
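On a toy frame with one deliberately malformed entry, the check above looks like this (the data are illustrative):

```r
# Locate date strings that fail to parse under the expected format.
MyData <- data.frame(date = c("10/12/1998 10:00", "1998-12-10 11:00"),
                     stringsAsFactors = FALSE)
MyData$date2 <- as.POSIXct(MyData$date, format = "%d/%m/%Y %H:%M", tz = "UTC")
bad <- MyData$date[is.na(MyData$date2)]   # entries in a different format
bad                                       # "1998-12-10 11:00"
```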

However, your (very basic) questions suggest that you have only a minor
understanding of what R objects are and how to check, inspect and manipulate
them. You would do yourself a big favour by going through the basic
documentation, as I suggested before.

Cheers
Petr

Thanks again,
Diego


Diego

On 1 August 2018 at 17:01, jim holtman 
mailto:jholt...@gmail.com>> wrote:

Try this:

> library(lubridate)
> library(tidyverse)
> input <- read.csv(text = "date,str1,str2,str3
+ 10/1/1998 0:00,0.6,0,0
+   10/1/1998 1:00,0.2,0.2,0.2
+   10/1/1998 2:00,0.6,0.2,0.4
+   10/1/1998 3:00,0,0,0.6
+   10/1/1998 4:00,0,0,0
+   10/1/1998 5:00,0,0,0
+   10/1/1998 6:00,0,0,0
+   10/1/1998 7:00,0.2,0,0", as.is = TRUE)
> # convert the date and add the "day" so summarize
> input <- input %>%
+   mutate(date = mdy_hm(date),
+  day = floor_date(date, unit = 'day')
+   )
>
> by_day <- input %>%
+   group_by(day) %>%
+   summarise(m_s1 = mean(str1),
+ m_s2 = mean(str2),
+ m_s3 = mean(str3)
+   )
>
> by_day
# A tibble: 1 x 4
  day                  m_s1   m_s2  m_s3
  <dttm>              <dbl>  <dbl> <dbl>
1 1998-10-01 00:00:00 0.200 0.0500 0.150
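The same daily mean can also be computed in base R, without lubridate or tidyverse; this is a sketch using `aggregate()` and `as.Date()` on the same sample data (note the `%d/%m/%Y` format, assuming day-first dates as in Petr's earlier example):

```r
input <- read.csv(text = "date,str1,str2,str3
10/1/1998 0:00,0.6,0,0
10/1/1998 1:00,0.2,0.2,0.2
10/1/1998 2:00,0.6,0.2,0.4
10/1/1998 3:00,0,0,0.6
10/1/1998 4:00,0,0,0
10/1/1998 5:00,0,0,0
10/1/1998 6:00,0,0,0
10/1/1998 7:00,0.2,0,0", stringsAsFactors = FALSE)

# Convert the character timestamps, then truncate to the calendar day
input$date <- as.POSIXct(input$date, format = "%d/%m/%Y %H:%M", tz = "UTC")
input$day  <- as.Date(input$date)

# One row per day, with the mean of each station column
by_day <- aggregate(cbind(str1, str2, str3) ~ day, data = input, FUN = mean)
by_day
```

This produces the same means as the tidyverse pipeline above (0.2, 0.05 and 0.15 for the single day in the sample).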

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Tue, Jul 31, 2018 at 11:54 PM Diego Avesani <diego.aves...@gmail.com> wrote:
Dear all,
I am sorry, I caused a lot of confusion. I need to relax and start all
over again in order to understand.
If I could, I would like to start again, without mixing strategies, and wait
for your advice.

I really appreciate your help, really really.
Here is my new file, a *.csv file (by the way, is it possible to attach files to
the mailing list?)

date,str1,str2,str3
10/1/1998 0:00,0.6,0,0
10/1/1998 1:00,0.2,0.2,0.2
10/1/1998 2:00,0.6,0.2,0.4
10/1/1998 3:00,0,0,0.6
10/1/1998 4:00,0,0,0
10/1/1998 5:00,0,0,0
10/1/1998 6:00,0,0,0
10/1/1998 7:00,0.2,0,0


I read it as:
MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")

At this point I would like to have the daily mean.
What would you suggest?

Really, really, thanks.
You are my lifesaver

Thanks



Diego


On 1 August 2018 at 01:01, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:

> ... and the most common source of NA values in time data is wrong
> timezones. You really need to make sure the timezone that is assumed when
> the character data are converted to POSIXt agrees with the data. In most
> cases the easiest way to insure this is to use
>
> Sys.setenv(TZ="US/Pacific")
>
> or whatever timezone from
>
> OlsonNames()
>
> corresponds with your data. Execute this setenv function before the
> strptime or as.POSIXct() function call.
>
> You can u
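Jeff's timezone advice can be sketched as follows; the zone "US/Pacific" and the sample timestamp are assumptions for illustration (substitute whichever entry of OlsonNames() matches your data):

```r
# Set the session timezone BEFORE converting character timestamps,
# so the assumed local time matches how the data were recorded.
Sys.setenv(TZ = "US/Pacific")

# With "%d/%m/%Y" this parses as 10 January 1998, midnight Pacific time
x <- as.POSIXct("10/1/1998 0:00", format = "%d/%m/%Y %H:%M")
format(x, "%Y-%m-%d %H:%M %Z")
```

Because PST is UTC-8, the same instant rendered in UTC is 08:00, which is an easy sanity check that the zone was applied.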