Re: [R] propensity scores & imputation

2017-03-16 Thread David Paul
Hi Mr. Gunter,

Will do.  Thanks, I've not visited stats.stackexchange before.


Kind Regards,

David

-Original Message-
From: Bert Gunter [mailto:bgunter.4...@gmail.com] 
Sent: Thursday, March 16, 2017 7:51 PM
To: david.p...@statmetrics.biz
Cc: R-help 
Subject: Re: [R] propensity scores & imputation

Way out of bounds for this list (see the posting guide). Try posting on 
stats.stackexchange.com instead.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] propensity scores & imputation

2017-03-16 Thread Bert Gunter
Way out of bounds for this list (see the posting guide). Try posting
on stats.stackexchange.com instead.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




Re: [R] [FORGED] standard error for regression coefficients corresponding to factor levels

2017-03-16 Thread Rolf Turner


You have been posting to the R-help list long enough so that you should 
have learned by now *not* to post in html.  Your code is mangled so as 
to be unreadable.


A few comments:

(1) Your data frame "data1" seems to have a mysterious (and irrelevant?) 
column named "data1" as well.


(2) The covariance matrix of your coefficient estimates is indeed (as 
you hint) a constant multiple of (X^T X)^{-1}.  So do:


X <- model.matrix(~region*week,data=data1)
S <- solve(t(X)%*%X)
print(S)

and you will see the same pattern of constancy that your results exhibit.

(3) You could get the results you want much more easily, without all the
fooling around buried in your (illegible) code, by doing:

mod <- lm(response ~ (region - 1)/week,data=data1)
summary(mod)
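
As a rough illustration of points (2) and (3), here is a minimal sketch on simulated data. The data frame `d`, its values, and the balanced layout are assumptions made for the example, and the formula `0 + region + region:week` is an equivalent way of asking for one intercept and one slope per region; it is not the code from the original post.

```r
set.seed(42)
# Hypothetical balanced data: 5 regions, each observed at weeks 3, 6 and 9,
# three times over (45 rows, like the layout of data1).
d <- expand.grid(week = c(3, 6, 9), rep = 1:3, region = letters[1:5])
d$response <- 0.15 - 0.02 * d$week + rnorm(nrow(d), sd = 0.1)

# One intercept and one slope per region, parameterised directly, so the
# standard errors can be read off without any post-processing.
fit <- lm(response ~ 0 + region + region:week, data = d)
est <- coef(summary(fit))
se_intercept <- est[1:5, "Std. Error"]   # rows 1-5: per-region intercepts
se_slope     <- est[6:10, "Std. Error"]  # rows 6-10: per-region slopes

# The diagonal of (X^T X)^{-1} repeats across regions because every region
# has the same set of week values.
X <- model.matrix(fit)
diag(solve(crossprod(X)))
```

With a balanced design like this one, the intercept standard errors agree across regions, and so do the slope standard errors.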

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] Transposing forecasts results from nnetar function and turn them into a data frame

2017-03-16 Thread Jim Lemon
Hi Paul,
It looks like the information that is printed is in:

TSmodelForecast$mean

If str(TSmodelForecast$mean) returns something like a list with two
components, you can probably use something like this:

paste(format(TSmodelForecast$mean$Date,"%b-%Y"),
 TSmodelForecast$mean$Forecast,sep="-",collapse="\n")

It also might be in TSmodelForecast$fitted
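
If the forecasts turn out to be stored as a monthly "ts" object rather than a list, a sketch along the following lines may help. The series `fc_mean` is a made-up stand-in for TSmodelForecast$mean (the Nov and Dec values are invented), and only base R functions are used.

```r
# Hypothetical monthly series standing in for TSmodelForecast$mean.
fc_mean <- ts(c(10, 20, 15, 40, 9, 8, 21, 21, 19, 18, 34, 15),
              start = c(2017, 1), frequency = 12)

# Calendar year and month of each point; the small offset guards against
# floating-point error in the fractional times returned by time().
yr <- floor(as.numeric(time(fc_mean)) + 1e-8)
mo <- month.abb[as.integer(cycle(fc_mean))]

# One row per forecast, in the "Mon-YYYY" layout asked for.
fc_df <- data.frame(Date = paste(mo, yr, sep = "-"),
                    Forecast = as.numeric(fc_mean))
head(fc_df)
```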

Jim



Re: [R] Display data by condition

2017-03-16 Thread Juan Ceccarelli Arias
Thanks, but I already solved it as you wrote it.
It was a missing comma.



[R] propensity scores & imputation

2017-03-16 Thread David Paul
Hi,

Many thanks in advance for whatever advice / input I may receive.

I have a propensity score matching / data imputation question.  The purpose
of the propensity score modeling is to put subjects from two different
clinical trials on a similar footing so that a key clinical measurement from
one study can be attributed / imputed to the other study.  The goal is NOT
to directly compare the two studies, so this is a very atypical kind of
propensity score usage.

I am using lrm( ) to obtain estimated propensity scores, and my question to
this List is rather more philosophical than R-syntax.

Here is the data setup:

a.frame
-------
1. Represents data from clinical trial A
2. Two arms, 'ACTIVE' and 'PLACEBO'
3. The active drug is the same as with Study B
4. The trial design is very similar to Study B
5. One measurement is a clinical continuous measure obtained via
   laboratory assay
6. Number of randomized subjects = 500
7. A subset of the baseline covariates (call it a.subset.frame) has 100%
   commonality with b.subset.frame

b.frame
-------
1. Represents data from clinical trial B
2. Two arms, 'ACTIVE' and 'PLACEBO'
3. The active drug is the same as with Study A
4. The trial design is very similar to Study A
5. Does NOT have the clinical continuous measure that is available in
   Study A
6. Number of randomized subjects = 5,000
7. A subset of the baseline covariates (call it b.subset.frame) has 100%
   commonality with a.subset.frame

8. Primary endpoint is time-to-event

Here is the analysis setup:

I have separately split a.frame and b.frame into 'ACTIVE' and 'PLACEBO'
subjects.

For the 'PLACEBO' subjects I have entered the a.subset.frame =
b.subset.frame baseline covariates into lrm( ).  The outcome variable is a
factor variable representing Study A = 'Y', so the estimated propensity
scores are the estimated probabilities that a 'PLACEBO' subject is from
Study A.  I then, finally, used the %GREEDY algorithm (posted on Mayo Clinic
website) in SAS to match 1-to-many where the Study A subjects are thought of
as 'case' subjects and the Study B subjects are thought of as 'control'
subjects. [I know the matching can be done in R, I'm working on that now.]
The average number of Study B subjects matched to a single Study A subject
is approximately 5.

 

I have done a similar analysis for the 'ACTIVE' subjects.
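
To make the setup above concrete, here is a minimal sketch in R on simulated data. Everything in it is an assumption for illustration: the covariates (age, bmi), the sample sizes, the use of glm() in place of lrm(), and the hand-written `greedy_match()` helper, which is a simplified nearest-neighbour matcher and not the %GREEDY macro's algorithm.

```r
set.seed(1)
# Hypothetical stand-in for the pooled 'PLACEBO' subjects; the covariates
# and sample sizes are invented.
n_a <- 50
n_b <- 500
placebo <- data.frame(
  study = rep(c("A", "B"), times = c(n_a, n_b)),
  age   = c(rnorm(n_a, 60, 8), rnorm(n_b, 62, 9)),
  bmi   = c(rnorm(n_a, 27, 4), rnorm(n_b, 28, 4))
)
placebo$from_a <- as.integer(placebo$study == "A")

# Propensity score: estimated P(subject is from Study A | covariates).
# glm() is used instead of rms::lrm() only to avoid a package dependency.
ps_fit <- glm(from_a ~ age + bmi, data = placebo, family = binomial)
placebo$ps <- fitted(ps_fit)

# Simplified greedy matching without replacement: each Study A subject takes
# up to 'ratio' of the closest still-unmatched Study B subjects that lie
# within 'caliper' on the propensity score.
greedy_match <- function(ps_case, ps_ctrl, ratio = 5, caliper = 0.05) {
  set_id <- rep(NA_integer_, length(ps_ctrl))
  for (i in seq_along(ps_case)) {
    d <- abs(ps_ctrl - ps_case[i])
    d[!is.na(set_id)] <- Inf          # already-matched controls are unavailable
    pick <- head(order(d), ratio)
    pick <- pick[d[pick] <= caliper]  # drop candidates outside the caliper
    set_id[pick] <- i
  }
  set_id
}

is_a <- placebo$from_a == 1
set_id <- greedy_match(placebo$ps[is_a], placebo$ps[!is_a])
table(table(set_id))  # distribution of matched-set sizes
```

Here `set_id` records, for each Study B subject, which Study A subject it was matched to (NA if unmatched).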

 

 

 

Here is my question:

At the end, I will combine the Study B matched 'PLACEBO' and 'ACTIVE'
subjects and perform a Cox PH regression to compare 'PLACEBO' and 'ACTIVE' -
there will be no Study A subjects in this analysis.  I want to incorporate
the clinical continuous measurement "borrowed" from Study A as a covariate.
When doing this, how should I best take into account the 1-to-many matching?
Do I need to weight the Study B subjects, or can I simply enter the matched
Study B subjects into a Cox PH regression and ignore the 1-to-many issue?
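
The thread itself does not answer this question. Purely to show what the two options named above look like in code, here is a hedged sketch using coxph() from the survival package on simulated data. All variable names and values are invented, and the weighting scheme (1 / matched-set size, with a variance clustered on the matched set) is one possible choice, not a recommendation from the list.

```r
library(survival)

set.seed(2)
# Hypothetical matched Study B subjects; every name and value is invented.
# Sets 1-20 come from the 'PLACEBO' matching, sets 21-40 from 'ACTIVE'.
n <- 200
matched_b <- data.frame(set_id = sample(1:40, n, replace = TRUE))
matched_b$arm <- factor(ifelse(matched_b$set_id <= 20, "PLACEBO", "ACTIVE"),
                        levels = c("PLACEBO", "ACTIVE"))
matched_b$time   <- rexp(n, rate = 0.1)
matched_b$status <- rbinom(n, 1, 0.7)

# The "borrowed" Study A measurement: one value per matched set.
borrowed_by_set <- rnorm(40)
matched_b$borrowed <- borrowed_by_set[matched_b$set_id]

# Weight = 1 / (number of Study B subjects in the set), so each set sums to 1.
set_size <- ave(rep(1, n), matched_b$set_id, FUN = sum)
matched_b$w <- 1 / set_size

# Option 1: weighted fit with a robust variance clustered on matched set.
fit_weighted <- coxph(Surv(time, status) ~ arm + borrowed + cluster(set_id),
                      data = matched_b, weights = w)
# Option 2: unweighted fit that ignores the 1-to-many structure.
fit_naive <- coxph(Surv(time, status) ~ arm + borrowed, data = matched_b)

rbind(weighted = coef(fit_weighted), naive = coef(fit_naive))
```

Comparing the two rows of coefficients (and the corresponding standard errors from summary()) shows how much the choice matters for a given data set.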

 

 

Kind Regards,

 

 David

 




Re: [R] Display data by condition

2017-03-16 Thread Juan Ceccarelli Arias
Thank you both.
The issue was I didn't declare the database as a data frame and I also
forgot the comma...

ene=as.data.frame(data)
attach(ene)
View(ene[ene$fact>5000,])

The code listed did the trick I desired.
Again, thanks. I can say the problem is solved.





Re: [R] coeftest with covariance matrix

2017-03-16 Thread Achim Zeileis


On Thu, 16 Mar 2017, alfonso.carf...@uniparthenope.it wrote:


> Hi all,
>
> I want to ask you what the difference is between specifying and not
> specifying the covariance matrix of the estimated coefficients when
> performing the coeftest command.


coeftest(object, ...) computes Wald statistics for all coefficients. Hence 
coef(object) is used to extract the coefficients and then, by default, 
vcov(object) is used to extract the variance-covariance matrix. For lm() 
models this computes the "usual" covariance matrix estimate assuming 
homoskedastic and uncorrelated errors.


When you supply coeftest(object, vcov = vcovHC) then a 
heteroscedasticity-consistent covariance matrix estimate is used (HC3 by 
default).


See vignette("sandwich", package = "sandwich") for more details.
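
A small sketch of the difference on simulated data follows. The data and the plain lm() fit are assumptions chosen for illustration (the original question concerns a VECM); coeftest() comes from the lmtest package and vcovHC() from sandwich.

```r
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()

set.seed(3)
# Hypothetical data whose error variance grows with x.
x <- runif(200, 1, 10)
y <- 1 + 0.5 * x + rnorm(200, sd = 0.3 * x)
fit <- lm(y ~ x)

ct_default <- coeftest(fit)                  # uses vcov(fit)
ct_hc      <- coeftest(fit, vcov. = vcovHC)  # uses vcovHC(fit), HC3 by default

# Same estimates, different standard errors (and hence t and p values).
cbind(default = ct_default[, "Std. Error"], HC3 = ct_hc[, "Std. Error"])
```

The coefficient estimates are identical in both tables; only the standard errors, test statistics and p-values change.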

> I'm estimating a VECM model and I want to test the significance of the
> short-run causal effects of the explanatory variables:
>
> mod<-cajorls(ca.jo(data[,4:6], ecdet = "const", type="eigen", K=2,
> spec="longrun"))$rlm
>
> The command:
>
> coeftest(mod)
>
> gives me different results with respect to this one:
>
> V<-vcovHC(mod)
> coeftest(mod,V)






Re: [R] How to get the transpose of R´s function forecast output and turn it into a data frame

2017-03-16 Thread Jim Lemon
Hi Paul,
As Peter noted, without knowing the structure of the object, only
a guess can be made. Mine is:

fdf<-data.frame(Date=names(forecast),forecast=forecast)

You may want to apply as.numeric to the names.
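
For instance, assuming the forecasts are held in a named numeric vector (the object `fc` below is invented for illustration), the guess above could be fleshed out like this:

```r
# Hypothetical named vector standing in for the "horizontal" forecast output.
fc <- c("2017" = 12, "2018" = 15, "2019" = 35, "2020" = 75)

# Names become the Date column; unname() keeps them out of the row names.
fdf <- data.frame(Date = as.numeric(names(fc)), forecast = unname(fc))
fdf
```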

Jim



On Thu, Mar 16, 2017 at 11:43 PM, Paul Bernal  wrote:
> Dear all,
>
> Hope you are doing great. Some R time series functions generate the
> forecasts in a horizontal way, for example:
>
>            2017  2018  2019  2020
> forecast     12    15    35    75
>
> but I´d like to have the output as follows:
>
>
> Date  forecast
> 2017   12
> 2018   15
> 2019   35
> 2020   75
>
> I tried using the t() function to get the transpose, but after taking the
> transpose I was not able to turn it into a data frame.
>
> Any help will be greatly appreciated,
>
> Cheers,
>
> Paul
>

[R] coeftest with covariance matrix

2017-03-16 Thread alfonso . carfora

Hi all,


I want to ask you what the difference is between specifying and not
specifying the covariance matrix of the estimated coefficients when
performing the coeftest command.


I'm estimating a VECM model and I want to test the significance of the
short-run causal effects of the explanatory variables:


mod<-cajorls(ca.jo(data[,4:6], ecdet = "const", type="eigen", K=2,
spec="longrun"))$rlm


The command:

coeftest(mod)

gives me different results with respect to this one:

V<-vcovHC(mod)
coeftest(mod,V)



Re: [R] Display data by condition

2017-03-16 Thread jim holtman
you are probably missing a comma:

View(data[data$fact > 5000, ])


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Thu, Mar 16, 2017 at 11:16 AM, Juan Ceccarelli Arias 
wrote:

> Hello,
> I need to show the observations of a data set only if they earn more than
> $5000 (fact is its name in the data set). I use this:
>
> View(data[data$fact>5000])
>
> The code above shows nothing. No error or message at all.
> What am I doing wrong?
> Thanks for your help and time.
>


[R] Transposing forecasts results from nnetar function and turn them into a data frame

2017-03-16 Thread Paul Bernal
Dear friends,

I am currently using R version 3.3.3 (64-bit) and used the following code
to generate forecasts:

> library(forecast)
>
> library(tseries)

‘tseries’ version: 0.10-35

‘tseries’ is a package for time series analysis and computational
finance.

See ‘library(help="tseries")’ for details.


> DAT<-read.csv("TrainingData.csv")
>
> TSdata<-ts(DAT[,1], start=c(1994,10), frequency=12)
>
> TSmodel<-nnetar(TSdata)
>
> TSmodelForecast<-forecast(TSmodel, h=24)
>
> TSmodelForecast

The problem is that the output comes in this fashion:

      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct
2017   10   20   15   40    9    8   21   21   19   18
2018   34  15   76  10  11

The format I would like to have is the following:

Date       Forecast
Jan-2017   10
Feb-2017   20
Mar-2017   15
Apr-2017   40
May-2017   9
Jun-2017   8
Jul-2017   21
Aug-2017   21
Sep-2017   19
etc        etc

Is there a way to make the results look like this?

Attached is a dataset as a reference.

Best regards,

Paul

Re: [R] Display data by condition

2017-03-16 Thread Jeff Newmiller
Presuming "data" is a data frame because you have not provided a minimal 
reproducible example as requested in the Posting Guide... note also that "data" 
is the name of a function in base R, so that is a potentially troublesome 
variable name. 

 A data frame is a list of vectors. It can be indexed either as a 
one-dimensional object of length equal to the number of columns, or as a 
two-dimensional object. You are doing the former but giving a logical index 
appropriate for the number of rows in your data frame. Go re-read the 
Introduction to R document section on indexing to figure out where the comma 
goes.
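
The two kinds of indexing can be seen side by side in a small sketch. The data frame `earnings` and its values are made up for the example; only the column name `fact` comes from the question.

```r
# Hypothetical data frame; 'earnings' plays the role of the poster's data set.
earnings <- data.frame(id = 1:4, fact = c(3000, 5200, 4800, 9100))
keep <- earnings$fact > 5000   # one logical value per ROW

# Two-dimensional indexing: the comma makes 'keep' select rows.
rows <- earnings[keep, ]

# One-dimensional (list-style) indexing: 'keep' is read as a COLUMN selector.
# With these values it asks for columns 2 and 4, and column 4 does not
# exist, so R signals an error, which is captured here as text.
cols <- tryCatch(earnings[keep], error = function(e) conditionMessage(e))

rows
cols
```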
-- 
Sent from my phone. Please excuse my brevity.

On March 16, 2017 8:16:29 AM PDT, Juan Ceccarelli Arias  
wrote:
>Hello,
>I need to show the observations of a data set only if they earn more
>than
>$5000 (fact is its name in the data set). I use this:
>
>View(data[data$fact>5000])
>
>The code above shows nothing. No error or message at all.
>What am I doing wrong?
>Thanks for your help and time.
>


[R] standard error for regression coefficients corresponding to factor levels

2017-03-16 Thread li li
Hi all,
  I have the following data called "data1". After fitting the ANCOVA model
with different slopes and intercepts for each region, I calculated the
regression coefficients and the corresponding standard errors. The standard
errors (for intercept or for slope) are all the same for different regions.
Is there something wrong?
  I know the SE is related to (X^T X)^-1, where X is the design matrix. So does
this happen whenever each factor level has the same set of values for
"week"?
 Thanks.
 Hanna



> mod <- lm(response ~ region*week, data1)
> tmp <- coef(summary(mod))
> res <- matrix(NA, 5,4)
> res[1,1:2] <- tmp[1,1:2]
> res[2:5,1] <- tmp[1,1]+tmp[2:5,1]
> res[2:5,2] <- sqrt(tmp[2:5,2]^2-tmp[1,2]^2)
> res[1,3:4] <- tmp[6,1:2]
> res[2:5,3] <- tmp[6,1]+tmp[7:10,1]
> res[2:5,4] <- sqrt(tmp[7:10,2]^2-tmp[6,2]^2)
> colnames(res) <- c("intercept", "intercept SE", "slope", "slope SE")
> rownames(res) <- letters[1:5]
> res
   intercept intercept SE        slope   slope SE
a 0.18404464   0.08976301 -0.018629310 0.01385073
b 0.17605666   0.08976301 -0.022393789 0.01385073
c 0.16754130   0.08976301 -0.022367770 0.01385073
d 0.12554452   0.08976301 -0.017464385 0.01385073
e 0.06153256   0.08976301  0.007714685 0.01385073

> data1
   week region response
5  3  c  0.057325067
6  6  c  0.066723632
7  9  c -0.025317808
12 3  d  0.024692613
13 6  d  0.021761492
14 9  d -0.099820335
19 3  c  0.119559235
20 6  c -0.054456186
21 9  c  0.078811180
26 3  d  0.091667189
27 6  d -0.053400777
28 9  d  0.090754363
33 3  c  0.163818085
34 6  c  0.008959741
35 9  c -0.115410852
40 3  d  0.193920693
41 6  d -0.087738914
42 9  d  0.004987542
47 3  a  0.121332285
48 6  a -0.020202707
49 9  a  0.037295785
54 3  b  0.214304603
55 6  b -0.052346480
56 9  b  0.082501222
61 3  a  0.053540767
62 6  a -0.019182819
63 9  a -0.057629113
68 3  b  0.068592791
69 6  b -0.123298216
70 9  b -0.230671818
75 3  a  0.330741562
76 6  a  0.013902905
77 9  a  0.190620360
82 3  b  0.151002874
83 6  b  0.086177696
84 9  b  0.178982656
89 3  e  0.062974799
90 6  e  0.062035391
91 9  e  0.206200831
96 3  e  0.123102197
97 6  e  0.040181790
98 9  e  0.121332285
103 3  e  0.147557564
104 6  e  0.062035391
105 9  e  0.144965770



Re: [R] ggplot2: Adjusting title and labels

2017-03-16 Thread Ulrik Stervbo
Hi Georg,

If you remove the coord_polar, you'll see that the optimal y-value for the
labels is between the upper and lower bound of the stacked bar-element.

I am not sure it is the most elegant solution, but you can calculate them
like this:

library(ggplot2)

df <- data.frame(group = c("Male", "Female", "Child"),
                 value = c(25, 25, 50))

# Order the data.frame to match that of the final plot
df <- df[order(df$group, decreasing = TRUE), ]
# Get the upper bound of the stacked bar element
df$upper <- cumsum(df$value)
# And the lower (the previous upper bound, starting at 0)
df$lower <- c(0, head(df$upper, -1))

# Now calculate the position
df$label_pos <- (df$upper - df$lower)/2 + df$lower

# And plot
blank_theme <- theme_minimal() + theme(
  axis.title.x = element_blank(),
  axis.title.y = element_blank(),
  axis.text.x = element_blank(),
  panel.border = element_blank(),
  panel.grid = element_blank(),
  axis.ticks = element_blank(),
  plot.title = element_text(size = 4, face = "bold"))

ggplot(df, aes(x = "", y = value, fill = group)) +
  geom_bar(
    width = 1,
    stat = "identity") +
  # coord_polar("y", start = 0) +
  scale_fill_brewer(
    name = "Gruppe",
    palette = "Blues") +
  blank_theme +
  geom_text(
    aes(
      y = label_pos,
      label = scales::percent(value/100)),
    size = 5) +
  labs(title = "Pie Title")
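
For what it is worth, the same midpoints can be computed in one step as
cumsum(value) - value/2. The sketch below is only an illustration and not
part of the original reply: the helper name label_midpoints is made up here,
and the optional plotting part assumes ggplot2 2.2.0 or later, where
position_stack(vjust = 0.5) and geom_col() are available. It also centres
the title with hjust = 0.5, which is one possible answer to the second
question.

```r
# Hypothetical helper: midpoint of each stacked segment, i.e. the upper
# bound of the segment minus half of its height.
label_midpoints <- function(value) cumsum(value) - value / 2

df <- data.frame(group = c("Male", "Female", "Child"),
                 value = c(25, 25, 50))
# Same ordering as above (reverse alphabetical: Male, Female, Child)
df <- df[order(df$group, decreasing = TRUE), ]
df$label_pos <- label_midpoints(df$value)
df$label_pos

# Optional: let ggplot2 place the labels itself. Runs only if ggplot2 is
# installed; assumes version 2.2.0 or later.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = "", y = value, fill = group)) +
    geom_col(width = 1) +
    geom_text(aes(label = paste0(value, "%")),
              position = position_stack(vjust = 0.5), size = 5) +
    coord_polar("y", start = 0) +
    labs(title = "Pie Title") +
    theme_void() +
    # hjust = 0.5 centres the title; size is in points
    theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"))
  # print(p) draws the chart
}
```

With position_stack() there is no need to sort the data or to keep a
separate label_pos column; the manual midpoints are mainly useful for
checking what the plot does.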

HTH
Ulrik


On Thu, 16 Mar 2017 at 17:24  wrote:

> Hi All,
>
> I have a question about ggplot2. My code is the following:
>
> -- cut --
>
> library(ggplot2)
> library(scales)
>
> df <-
>   data.frame(group = c("Male", "Female", "Child"),
>  value = c(25, 25, 50))
>
> blank_theme <- theme_minimal() + theme(
>   axis.title.x = element_blank(),
>   axis.title.y = element_blank(),
>   axis.text.x = element_blank(),
>   panel.border = element_blank(),
>   panel.grid = element_blank(),
>   axis.ticks = element_blank(),
>   plot.title = element_text(size = 4, face = "bold"))
>
> ggplot(df, aes(x = "", y = value, fill = group)) +
>   geom_bar(
>     width = 1,
>     stat = "identity") +
>   coord_polar("y", start = 0) +
>   scale_fill_brewer(
>     name = "Gruppe",
>     palette = "Blues") +
>   blank_theme +
>   geom_text(
>     aes(
>       y = c(10, 40, 75),
>       label = scales::percent(value/100)),
>     size = 5) +
>   labs(title = "Pie Title")
>
> -- cut --
>
> Is there a way to compute the label positions for the slices of the pie
> in a general way instead of finding the values iteratively by
> trial and error?
>
> How can I adjust the title of the graph with respect to font size and
> position (e.g. centered)?
>
> Kind regards
>
> Georg
>



Re: [R-es] Pregunta (debate) sobre licencia R

2017-03-16 Thread javier.ruben.marcuzzi
Francisco

I remember a case in which I knew several of the participants. Basically, a
company was set up in Argentina with funding, a large part of it contributed
by Italy. That partnership has 40 programmers. I met with the managers and saw
the premises with the staff and the computers. They told me that, to protect
themselves, they have contracts and bought a system that stores everything
each programmer writes; everything is logged.

Later another engineer told me (while we were jogging in the park) that part
of the funds came from the World Bank and that he was the overseer, but that
there was another contract under which all the intellectual property passed to
the bank, which could then replicate the experience. It turned out that the
development was duplicated in Uruguay and other countries, so the overseer
told me he would not work with them any more: together they had racked their
brains solving the problems with the funds and resources they had (a lot of
ingenuity for lack of tools), and once the mechanism was well oiled and
running there was a "copy and paste".

In short, all the security they had imposed on the employees and on themselves
so that nothing would leak and they would have no legal problems was "broken"
by the World Bank. The copy and paste could have been done by the person who
mops the floor without any fuss, or rather, in that case there would have been
one.

The result is that the people who designed it left. There are systems that
record every key that is pressed, whoever presses it (buy a big hard disk),
but as Carlos says, the legislation here is the same: if the company pays you
a salary, the code is theirs. One more legal point that cannot be generalized:
my legislation professor at university is a judge, and he told us that an
employee earns a salary while a professional earns fees, so along those lines
there might be a somewhat convoluted legal argument that does not carry over
elsewhere.

I am not sure whether it was the World Bank in my story, but if it was not, it
was the International Monetary Fund, the United Nations or the Inter-American
Development Bank; one of these kept the code.

Javier Rubén Marcuzzi

From: Francisco Rodríguez
Sent: Thursday, 16 March 2017 14:41
To: Carlos J. Gil Bellosta ; r-help-es
Subject: Re: [R-es] Pregunta (debate) sobre licencia R

Hello Carlos, good afternoon,


Let's take a simple case: a program written entirely in R that I develop for
a client to solve a business problem


I am referring precisely to one of the following situations:


1 Someone copies it without my permission, uses it and even publishes it


2 Could someone outside my company and/or client have the right to demand
that I hand over that code?


Mind you, this has not happened to me and I am not going to sue anyone; it is
just curiosity



From: gilbello...@gmail.com  on behalf of Carlos J. Gil Bellosta
Sent: Thursday, 16 March 2017 16:59
To: Francisco Rodríguez; r-help-es
Subject: Re: [R-es] Pregunta (debate) sobre licencia R

Hello, how are you?

What does "based on R" mean? Modifying R's source code? You don't mean
"a program written in R", do you?

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



On 16 March 2017, 17:43, Francisco Rodríguez wrote:
Good morning. I have a question about R licensing that worries me a little.
Let's see whether someone can answer it clearly enough or point me to
somewhere I can read up on it, because at the moment I am a bit confused.


Let's imagine the following example. A company creates software based
entirely on R in order to sell it, or someone carries out a consulting
project (as a freelancer) for that purpose, ... (or an example along those
lines)


Question 1: If I find out about that development, would I have the right to
demand that the source code of everything produced be handed over to me or
made available for download?


Question 2: If someone (surely unethically and illegally, I don't know
whether both or just one of them) posted the source code somewhere and
someone else downloaded it and resold it, would the creator be entitled to
compensation for plagiarism?


Regards, and many thanks for any answers you can give me

[[alternative HTML version deleted]]


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es



Re: [R] Display data by condition

2017-03-16 Thread Rui Barradas

Hello,

Maybe you're missing a comma. (I'm assuming your dataset is a data.frame 
or a matrix.)

Try

View(data[data$fact>5000, ])

To give you a better answer you need to show us the output of

str(data)

And don't name your data 'data', it already is the name of an R function.
And post in plain text, not HTML.
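
To make the role of the comma concrete, here is a small sketch on made-up
data (the data frame 'earnings' and its values are hypothetical; only the
column name 'fact' comes from the question). It also shows one thing to watch
for: if 'fact' contains NA, plain logical indexing returns an all-NA row,
which which() or subset() avoid. The result can then be passed to View().

```r
# Hypothetical example data; 'fact' holds the earnings
earnings <- data.frame(id = 1:5, fact = c(1200, 5600, 4999, 7300, NA))

# Row filter: the condition goes before the comma, the column part is empty
high <- earnings[earnings$fact > 5000, ]
high    # includes an extra all-NA row caused by the NA in 'fact'

# which() drops comparisons that evaluate to NA; subset() does the same
high_which  <- earnings[which(earnings$fact > 5000), ]
high_subset <- subset(earnings, fact > 5000)
high_which
```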

Hope this helps,

Rui Barradas

On 16-03-2017 15:16, Juan Ceccarelli Arias wrote:

Hello,
I need to show the observations of a data set only if they earn more than
$5000 (fact is its name in the data set). I use this:

View(data[data$fact>5000])

The code above shows nothing. No error or message at all.
What am I doing wrong?
Thanks for your help and time.



Re: [R] HELP ME: Fill NA Values from the previous Non-NA Values

2017-03-16 Thread Allan Tanaka
Amazing! Thanks for your kindness. 

On Thursday, 16 March 2017, 20:38, PIKAL Petr  
wrote:
 

 Hi

You can achieve exactly what you want by using ave/na.locf twice

library(zoo)  # provides na.locf()
dat<-data.frame(ie=rep(letters[1:3], each=3), iw=rnorm(9))
dat[c(3, 4),2]<-NA
> dat
  ie          iw
1  a  1.07254438
2  a  0.53067188
3  a          NA
4  b          NA
5  b -0.09767088
6  b -1.02719060
7  c  2.35787246
8  c -0.07513048
9  c -0.17164728

dat$iw <- ave(dat$iw, dat$ie, FUN=function(x) na.locf(x, na.rm=FALSE))
dat$iw <- ave(dat$iw, dat$ie, FUN=function(x) na.locf(x, na.rm=FALSE, 
fromLast=TRUE))

> dat
  ie          iw
1  a  1.07254438
2  a  0.53067188
3  a  0.53067188
4  b -0.09767088
5  b -0.09767088
6  b -1.02719060
7  c  2.35787246
8  c -0.07513048
9  c -0.17164728
>
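
If installing zoo is not an option, the same two-pass fill can be sketched in
base R. This is only an illustration under assumptions: the helper name
fill_both is made up here, the data are invented, and rows are assumed to be
already in the desired order within each group.

```r
# Hypothetical helper: fill NAs with the last previous non-NA value and,
# for NAs at the start of the vector, with the first following non-NA value.
fill_both <- function(x) {
  good <- !is.na(x)
  if (!any(good)) return(x)      # nothing to fill from
  idx <- cumsum(good)            # position of the last good value so far
  idx[idx == 0] <- 1             # leading NAs take the first good value
  x[good][idx]
}

dat <- data.frame(ie = rep(c("a", "b", "c"), each = 3),
                  iw = c(1.1, 0.5, NA, NA, -0.1, -1.0, 2.4, -0.1, -0.2))

# ave() applies the helper separately within each level of 'ie'
dat$iw <- ave(dat$iw, dat$ie, FUN = fill_both)
dat
```

Because the fill is done per group, the NA at the end of group "a" is taken
from group "a" and the NA at the start of group "b" from group "b", which is
the behaviour asked for in the quoted message below.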

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Allan
> Tanaka
> Sent: Thursday, March 16, 2017 3:16 AM
> To: William Dunlap 
> Cc: r-help@r-project.org
> Subject: Re: [R] HELP ME: Fill NA Values from the previous Non-NA Values
>
> Hi. Thanks for the function. My bad: after looking at the csv file, it seems
> that NA values should be filled not only from the previous non-NA values but
> also from the next non-NA values. Example:
> | NCQ05 | 11.395 |
> | NCQ05 | 11.395 |
> | NCQ05 |  |
> | NCQ06 |  |
> | NCQ06 | 13 |
> | NCQ06 | 13 |
>
>
> If i use the function, then the blank row would be filled with 11.395, instead
> of filling with 11.395 and 13.
> Does it mean that the function can be modified like this?
> locf2 <- function(x, initial=NA, IS_BAD = is.na) {
>    # Replace 'bad' values in 'x' with last previous non-bad value.
>    # If no previous non-bad value, replace with 'initial'.
>    stopifnot(is.function(IS_BAD))
>    good <- !IS_BAD(x)
>    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
>    i <- cumsum(good)
>    x <- x[c(1,which(good))][i+1]
>    x <- x[c(1,which(good))][i+2]
>    x[i==0] <- initial
>    x
> }
>
> On Thursday, 16 March 2017, 1:17, William Dunlap 
> wrote:
>
>
>  You could use the following function
>
> locf2 <- function(x, initial=NA, IS_BAD = is.na) {
>    # Replace 'bad' values in 'x' with last previous non-bad value.
>    # If no previous non-bad value, replace with 'initial'.
>    stopifnot(is.function(IS_BAD))
>    good <- !IS_BAD(x)
>    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
>    i <- cumsum(good)
>    x <- x[c(1,which(good))][i+1]
>    x[i==0] <- initial
>    x
> }
>
> as in
>
> > locf2(c("", "A", "B", "", "", "C", ""), IS_BAD=function(x)x=="",
> > initial="---")
> [1] "---" "A"  "B"  "B"  "B"  "C"  "C"
> > locf2(factor(c(NA,"Small","Medium",NA,"Large",NA,NA,NA,"Small")))
> [1] <NA>   Small  Medium Medium Large  Large  Large  Large  Small
> Levels: Large Medium Small
> > locf2(c(12, NA, 10, 11, NA, NA))
> [1] 12 12 10 11 11 11
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Wed, Mar 15, 2017 at 4:08 AM, Allan Tanaka 
> wrote:
> > The following is an example:
> >
> > | Item_Identifier | Item_Weight |
> > | FDP10 | 19 |
> > | FDP10 |  |
> > | DRI11 | 8.26 |
> > | DRI11 |  |
> > | FDW12 | 8.315 |
> > | FDW12 |  |
> >
> >
> > The following is what I want, that is, NA values filled from the previous
> > non-NA values.
> > | Item_Identifier | Item_Weight |
> > | FDP10 | 19 |
> > | FDP10 | 19 |
> > | DRI11 | 8.26 |
> > | DRI11 | 8.26 |
> > | FDW12 | 8.315 |
> > | FDW12 | 8.315 |
> >
> >
> > My current code data frame: train <- read.csv("Train.csv",
> > header=T,sep = ",",na.strings = c(""," ",NA))
> >
> >
> > Some people suggest using the na.locf function, but in my case I don't have
> > unique numeric values in my Item_Identifier column; they are characters
> > instead. I am not sure how to solve this problem.
> >
> >



[R] Display data by condition

2017-03-16 Thread Juan Ceccarelli Arias
Hello,
I need to show the observations of a data set only if they earn more than
$5000 (fact is its name in the data set). I use this:

View(data[data$fact>5000])

The code above shows nothing. No error or message at all.
What am I doing wrong?
Thanks for your help and time.



[R] Error in roc.default(response, predictor, auc = TRUE, ...) : No valid data provided.

2017-03-16 Thread Allan Tanaka
Check my fitted dimension:
str(predict(mod, Test1))
 Named num [1:2131] 402 2346 1995 2205 2895 ...
 - attr(*, "names")= chr [1:2131] "1" "2" "4" "6" ...

So I want to see the AUC score for my model applied to the Test1 data, after
splitting the total data (Train) into Train1 and Test1, but I get the
following error:
Error in roc.default(response, predictor, auc = TRUE, ...) :
  No valid data provided.

Even trying this code also gives an error:
error<-sqrt((sum((Test1$Item_Outlet_Sales-preds)^2))/nrow(Test1))
Error in Test1$Item_Outlet_Sales - preds :
  non-numeric argument to binary operator

Here is the code:
set.seed(1234)
split <- sample(1:nrow(Train),size=floor((nrow(Train)/4)*3))
Train1 <- Train[(split),]
Test1 <- Train[-split,]
outcomeName='Item_Outlet_Sales'
predictorNames <- setdiff(names(Train1), outcomeName)
mod <- lm(Item_Outlet_Sales ~ ., data=Train1)
preds <- predict(mod, Test1[,predictorNames], se.fit = TRUE)
print(auc(Test1[,outcomeName], preds$mod))
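
One possible reading of both errors, sketched on the built-in mtcars data
rather than on the poster's Train set (the variable names below are
illustrative assumptions): with se.fit = TRUE, predict() on an lm fit returns
a list, so preds itself is not numeric and preds$mod does not exist. The
predictions are in preds$fit. Note also that AUC is defined for a two-class
response, so for a continuous outcome such as Item_Outlet_Sales an error
measure like RMSE may be the more natural choice.

```r
set.seed(1234)
idx    <- sample(seq_len(nrow(mtcars)), size = floor(nrow(mtcars) * 3 / 4))
train1 <- mtcars[idx, ]
test1  <- mtcars[-idx, ]

mod   <- lm(mpg ~ wt + hp, data = train1)
preds <- predict(mod, newdata = test1, se.fit = TRUE)

# With se.fit = TRUE the result is a list, not a numeric vector
names(preds)
is.null(preds$mod)    # TRUE: the list has no element called 'mod'

# The fitted values live in preds$fit, so the RMSE can be computed as
rmse <- sqrt(mean((test1$mpg - preds$fit)^2))
rmse
```

Dropping se.fit = TRUE would make predict() return a plain numeric vector, in
which case the original RMSE line should work as written.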

Re: [R] Quantiles with ordered categories

2017-03-16 Thread Martin Maechler
>   
> on Tue, 14 Mar 2017 21:54:42 +0100 writes:

> I found it:
> quantile(ordered(1:10), probs=0.5, type=1) 

> works, because type=1 seems to round up or down, whatever. The default
> option for type is 7, which wants to interpolate, and then produces the error.

> Two options come to my mind:

> - The error message could be improved.
> - The default type could be 1 if the data is from ordered categories.
> - Or both.

Well, it is remarkable that nobody looks at the help page (or
the source code) of quantile() to be informed.

In 'Details' it has contained

Types 1 and 3 can be used for class "Date" and for ordered factors.

since Oct 15, 2009 ...

But I agree that the error message can be improved and have done
so now, so that instead of

"factors are not allowed"

you now get

"'type' must be 1 or 3 for ordered factors"
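
A small sketch of what this means in practice (the example vector 'sizes' is
made up for illustration): types 1 and 3 return one of the observed levels,
while the default type 7 stops with an error because it would need to
interpolate between categories. The exact wording of the error depends on the
R version, so the sketch only checks that an error is raised.

```r
# Hypothetical ordered factor with levels S < M < L
sizes <- factor(c("S", "M", "L", "M", "S", "L", "M"),
                levels = c("S", "M", "L"), ordered = TRUE)

# Types 1 and 3 pick an observed level, so no interpolation is needed
q1 <- quantile(sizes, probs = 0.5, type = 1)
q3 <- quantile(sizes, probs = 0.5, type = 3)
q1
q3

# The default type (7) interpolates and therefore signals an error
res <- tryCatch(quantile(sizes, probs = 0.5), error = function(e) e)
inherits(res, "error")
```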


> It is probably a little thing to fix, but I lack the skills to do this
> myself.

(Really? -- After seeing the change you will agree it was easy .. ?)


Thank you for the suggestion.

Best regards,

Martin Maechler
ETH Zurich


> Best wishes,
> Matthias


> From: Bert Gunter
> Sent: Tuesday, 14 March 2017 21:34
> To: matthias-gon...@gmx.de
> Cc: r-help@r-project.org
> Subject: Re: [R] Quantiles with ordered categories

> Inline.
> Bert Gunter

> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


> On Tue, Mar 14, 2017 at 12:36 PM,   wrote:
>> Dear R users,
>> 
>> This works:
>> 
>> quantile(1:10, probs=0.5)
>> 
>> This fails (obviously):
>> 
>> quantile(factor(1:10), probs=0.5)
>> 
>> But why do quantiles for ordered factors not work either?
>> 
>> quantile(ordered(1:10), probs=0.5)
>> 
>> Is it because interpolation (see the optional type argument) is not
>> defined?
> Yes.


>> Is there an elegant workaround?
> No. How can there be? By definition, all that is assumed by an ordered
> factor is an ordering of the categories. How can you "interpolate" in
> ordered(letters[1:3]) . AFAIK there is no "a.5"  .

> -- Bert



>> 
>> Thank you.
>> 
>> Best wishes,
>> 
>> Matthias
>> 


Re: [R-es] Pregunta (debate) sobre licencia R

2017-03-16 Thread Carlos J. Gil Bellosta
Hello, how are you?

Well, that depends on what you have signed. Often, the code belongs to the
organization that pays for its development. If you are an employee (with an
employment contract), the code almost certainly belongs to your employer.

If you are an outside party, it depends. But normally:

1) If the client is in the business of selling code, they will have had you
sign that the code belongs to them.
2) If the client only wants to use the code, they almost certainly do not
mind you using it elsewhere. The restriction may come from somewhere else
(for example, if the code implements an algorithm or method proprietary to
that company).

I have seen it all. Including people flouting all of those rules without
anything at all happening (in Spain).

Regards,

Carlos J. Gil Bellosta
http://www.datatanalytics.com



On 16 March 2017, 18:04, Francisco Rodríguez 
wrote:

> Hello Carlos, good afternoon,
>
>
> Let's take a simple case: a program written entirely in R that I develop
> for a client to solve a business problem
>
>
> I am referring precisely to one of the following situations:
>
>
> 1 Someone copies it without my permission, uses it and even publishes it
>
>
> 2 Could someone outside my company and/or client have the right to
> demand that I hand over that code?
>
>
> Mind you, this has not happened to me and I am not going to sue anyone; it
> is just curiosity
>
>
> --
> *From:* gilbello...@gmail.com  on behalf of Carlos
> J. Gil Bellosta 
> *Sent:* Thursday, 16 March 2017 16:59
> *To:* Francisco Rodríguez; r-help-es
> *Subject:* Re: [R-es] Pregunta (debate) sobre licencia R
>
> Hello, how are you?
>
> What does "based on R" mean? Modifying R's source code? You don't mean
> "a program written in R", do you?
>
> Regards,
>
> Carlos J. Gil Bellosta
> http://www.datanalytics.com
>
>
> On 16 March 2017, 17:43, Francisco Rodríguez 
> wrote:
>
>> Good morning. I have a question about R licensing that worries me a
>> little. Let's see whether someone can answer it clearly enough or point
>> me to somewhere I can read up on it, because at the moment I am a bit
>> confused.
>>
>>
>> Let's imagine the following example. A company creates software based
>> entirely on R in order to sell it, or someone carries out a consulting
>> project (as a freelancer) for that purpose, ... (or an example along
>> those lines)
>>
>>
>> Question 1: If I find out about that development, would I have the right
>> to demand that the source code of everything produced be handed over to
>> me or made available for download?
>>
>>
>> Question 2: If someone (surely unethically and illegally, I don't know
>> whether both or just one of them) posted the source code somewhere and
>> someone else downloaded it and resold it, would the creator be entitled
>> to compensation for plagiarism?
>>
>>
>> Regards, and many thanks for any answers you can give me
>>


Re: [R-es] Pregunta (debate) sobre licencia R

2017-03-16 Thread javier.ruben.marcuzzi
Dear Francisco Rodríguez

There are different things here: one is using R under a "free" license and
another is using it under a "paid" license.

Then there is the code you write. It is always the intellectual property of
whoever writes it. However, if the author publishes it openly on the internet,
they would be giving up that right unless something is written about it and
it is well documented. That document is a legal instrument of intellectual
property and depends on each country. You would have to consult a lawyer and
look at the specific case under the legislation in force. One thing holds
worldwide: once a patent is filed or granted in your own country, you have
one year to go through the remaining countries and protect it.

Javier Rubén Marcuzzi

From: Francisco Rodríguez
Sent: Thursday, 16 March 2017 13:44
To: r-help-es@r-project.org
Subject: [R-es] Pregunta (debate) sobre licencia R

Good morning. I have a question about R licensing that worries me a little.
Let's see whether someone can answer it clearly enough or point me to
somewhere I can read up on it, because at the moment I am a bit confused.


Let's imagine the following example. A company creates software based
entirely on R in order to sell it, or someone carries out a consulting
project (as a freelancer) for that purpose, ... (or an example along those
lines)


Question 1: If I find out about that development, would I have the right to
demand that the source code of everything produced be handed over to me or
made available for download?


Question 2: If someone (surely unethically and illegally, I don't know
whether both or just one of them) posted the source code somewhere and
someone else downloaded it and resold it, would the creator be entitled to
compensation for plagiarism?


Regards, and many thanks for any answers you can give me


[R-es] Pregunta (debate) sobre licencia R

2017-03-16 Thread Francisco Rodríguez
Good morning. I have a question about R licensing that worries me a little.
Let's see whether someone can answer it clearly enough or point me to
somewhere I can read up on it, because at the moment I am a bit confused.


Let's imagine the following example. A company creates software based
entirely on R in order to sell it, or someone carries out a consulting
project (as a freelancer) for that purpose, ... (or an example along those
lines)


Question 1: If I find out about that development, would I have the right to
demand that the source code of everything produced be handed over to me or
made available for download?


Question 2: If someone (surely unethically and illegally, I don't know
whether both or just one of them) posted the source code somewhere and
someone else downloaded it and resold it, would the creator be entitled to
compensation for plagiarism?


Regards, and many thanks for any answers you can give me


Re: [R] Test individual slope for each factor level in ANCOVA

2017-03-16 Thread li li
Hi John. Thanks much for your help. It is great to know this.
  Hanna
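
As a complement to the linear-hypothesis approach in the message quoted
below, the per-level slopes can also be read off directly by refitting the
model in a nested parametrisation (one intercept and one slope per factor
level). This is only a sketch on the built-in iris data, used here as a
stand-in because the original data frame is not reproduced in full; the
object names are illustrative.

```r
# Usual parametrisation: slope for the reference level plus slope
# differences for the other levels
full <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)

# Nested parametrisation: one intercept and one slope per level, so that
# summary() reports an estimate, standard error, t value and p-value for
# each individual slope
per_level <- lm(Sepal.Length ~ 0 + Species + Species:Petal.Length,
                data = iris)
tab    <- coef(summary(per_level))
slopes <- tab[grep(":Petal.Length", rownames(tab), fixed = TRUE), ]
slopes

# Both fits describe the same model: a per-level slope equals the reference
# slope plus the corresponding interaction term
coef(full)["Petal.Length"] + coef(full)["Petal.Length:Speciesversicolor"]
coef(per_level)["Speciesversicolor:Petal.Length"]
```

Only the coefficient table should be taken from the nested fit; because the
common intercept is removed, the R-squared reported by summary() is computed
differently and is not comparable with that of the usual fit.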

2017-03-16 8:02 GMT-04:00 Fox, John :

> Dear Hanna,
>
> You can test the slope in each non-reference group as a linear hypothesis.
> You didn’t make the data available for your example, so here’s an example
> using the linearHypothesis() function in the car package with the Moore
> data set in the same package:
>
> - - - snip - - -
>
> > library(car)
> > mod <- lm(conformity ~ fscore*partner.status, data=Moore)
> > summary(mod)
>
> Call:
> lm(formula = conformity ~ fscore * partner.status, data = Moore)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -7.5296 -2.5984 -0.4473  2.0994 12.4704
>
> Coefficients:
>                           Estimate Std. Error t value Pr(>|t|)
> (Intercept)               20.79348    3.26273   6.373 1.27e-07 ***
> fscore                    -0.15110    0.07171  -2.107  0.04127 *
> partner.statuslow        -15.53408    4.40045  -3.530  0.00104 **
> fscore:partner.statuslow   0.26110    0.09700   2.692  0.01024 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 4.562 on 41 degrees of freedom
> Multiple R-squared:  0.2942,    Adjusted R-squared:  0.2426
> F-statistic: 5.698 on 3 and 41 DF,  p-value: 0.002347
>
> > linearHypothesis(mod, "fscore + fscore:partner.statuslow")
> Linear hypothesis test
>
> Hypothesis:
> fscore  + fscore:partner.statuslow = 0
>
> Model 1: restricted model
> Model 2: conformity ~ fscore * partner.status
>
>   Res.Df    RSS Df Sum of Sq      F  Pr(>F)
> 1     42 912.45
> 2     41 853.42  1    59.037 2.8363 0.09976 .
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> - - - snip - - -
>
> In this case, there are just two levels for partner.status, but for a
> multi-level factor you can simply perform more than one test.
>
>
> I hope this helps,
>
>  John
>
> -
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox/
>
>
>
>
> On 2017-03-15, 9:43 PM, "R-help on behalf of li li"
>  wrote:
>
> >Hi all,
> >   Consider the data set where there are a continuous response variable, a
> >continuous predictor "weeks" and a categorical variable "region" with five
> >levels "a", "b", "c",
> >"d", "e".
> >  I fit the ANCOVA model as follows. Here the reference level is region
> >"a"
> >and there are 4 dummy variables. The interaction terms (in red below)
> >represent the slope
> >difference between each region and  the baseline region "a" and the
> >corresponding p-value is for testing whether this slope difference is
> >zero.
> >Is there a way to directly test whether the slope corresponding to each
> >individual factor level is 0 or not, instead of testing the slope
> >difference from the baseline level?
> >  Thanks very much.
> >  Hanna
> >
> >
> >
> >
> >
> >
> >> mod <- lm(response ~ weeks*region,data)
> >> summary(mod)
> >Call:
> >lm(formula = response ~ weeks * region, data = data)
> >
> >Residuals:
> >     Min       1Q   Median       3Q      Max
> >-0.19228 -0.07433 -0.01283  0.04439  0.24544
> >
> >Coefficients:
> >                Estimate Std. Error t value Pr(>|t|)
> >(Intercept)    1.2105556  0.0954567  12.682  1.2e-14 ***
> >weeks         -0.021      0.0147293  -1.448    0.156
> >regionb       -0.0257778  0.1349962  -0.191    0.850
> >regionc       -0.034      0.1349962  -0.255    0.800
> >regiond       -0.075      0.1349962  -0.559    0.580
> >regione       -0.148      0.1349962  -1.098    0.280
> >weeks:regionb -0.0007222  0.0208304  -0.035    0.973
> >weeks:regionc -0.0017778  0.0208304  -0.085    0.932
> >weeks:regiond  0.003      0.0208304   0.144    0.886
> >weeks:regione  0.0301667  0.0208304   1.448    0.156
> >---
> >Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> >Residual standard error: 0.1082 on 35 degrees of freedom
> >Multiple R-squared:  0.2678,   Adjusted R-squared:  0.07946
> >F-statistic: 1.422 on 9 and 35 DF,  p-value: 0.2165
> >

[R] ggplot2: Adjusting title and labels

2017-03-16 Thread G . Maubach
Hi All,

I have a question about ggplot2. My code is the following:

-- cut --

library(ggplot2)
library(scales)

df <-
  data.frame(group = c("Male", "Female", "Child"),
 value = c(25, 25, 50))

blank_theme <- theme_minimal() + theme(
  axis.title.x = element_blank(),
  axis.title.y = element_blank(),
  axis.text.x = element_blank(),
  panel.border = element_blank(),
  panel.grid = element_blank(),
  axis.ticks = element_blank(),
  plot.title = element_text(size = 4, face = "bold"))

ggplot(df, aes(x = "", y = value, fill = group)) +
  geom_bar(
    width = 1,
    stat = "identity") +
  coord_polar("y", start = 0) +
  scale_fill_brewer(
    name = "Gruppe",
    palette = "Blues") +
  blank_theme +
  geom_text(
    aes(
      y = c(10, 40, 75),
      label = scales::percent(value/100)),
    size = 5) +
  labs(title = "Pie Title")

-- cut --

Is there a way to compute the label positions for the slices of the pie
in a general way instead of finding the values iteratively by
trial and error?

How can I adjust the title of the graph with respect to font size and
position (e.g. centered)?

Kind regards

Georg




Re: [R] screen

2017-03-16 Thread David L Carlson
Sorry. I focused on the table and not the record selection. Given the table 
this seems to be what you are looking for, but there may be an easier way:

> keep <- which(t(apply(DF2.tbl, 1, cumsum)) > .01, arr.ind=TRUE)
> keep <- keep[order(keep[, 1], keep[, 2]), ]
> keep # These are the records you want to keep
  family time
A      1    2
A      1    3
A      1    4
B      2    1
B      2    2
B      2    3
B      2    4
C      3    4
# Now turn keep into a data.frame with factors: family and time
# so it matches DF2
> rownames(keep) <- NULL
> keep <- data.frame(keep)
> keep$family <- factor(keep$family, labels=levels(DF2$family))
> keep$time <- factor(keep$time, labels=levels(DF2$time))
> keep
  family  time
1  A WEEK2
2  A WEEK3
3  A WEEK4
4  B WEEK1
5  B WEEK2
6  B WEEK3
7  B WEEK4
8  C WEEK4
> DF2.new <- merge(DF2, keep)
> DF2.new
   family  time obs
1   A WEEK2   0
2   A WEEK2   1
3   A WEEK3   1
4   A WEEK3   0
5   B WEEK1   0
6   B WEEK1   1
7   B WEEK1   1
8   B WEEK2   0
9   B WEEK2   0
10  B WEEK3   1
11  B WEEK3   0
12  C WEEK4   1
13  C WEEK4   1
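
For comparison, a shorter base-R route to the same 13 records, given only as a sketch. It assumes the WEEK labels sort in chronological order (true for WEEK1 to WEEK4, not for labels such as WEEK10), and the names wk and first_ok are introduced here for illustration.

```r
DF2 <- read.table(header = TRUE, text = "family time obs
A  WEEK1 0
A  WEEK1 0
A  WEEK1 0
A  WEEK2 1
A  WEEK2 0
A  WEEK3 1
A  WEEK3 0
B  WEEK1 1
B  WEEK1 0
B  WEEK1 1
B  WEEK2 0
B  WEEK2 0
B  WEEK3 1
B  WEEK3 0
C  WEEK3 0
C  WEEK3 0
C  WEEK4 1
C  WEEK4 1")

# Week number for every row (assumes labels sort in time order)
wk <- as.integer(factor(DF2$time))

# First week in which each family has a proper (1) record;
# Inf for a family that never records properly
first_ok <- ave(ifelse(DF2$obs == 1, wk, Inf), DF2$family, FUN = min)

# Keep everything from that week onward
DF2.new <- DF2[wk >= first_ok, ]
DF2.new
```

This keeps the original row order and avoids the intermediate table, at the cost of relying on the week labels being sortable.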

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


Re: [R] add median value and standard deviation bar to lattice plot

2017-03-16 Thread Bert Gunter
Just add whatever further code to decorate the groups as you like
within the panel.groups function. I believe I have given you
sufficient information in my code for you to do that if you study the
code carefully. Depending on what you decide to do -- which is
statistical and OT here (and not something I would offer specific
advice on remotely anyway) -- you may also have to pass down
additional arguments based on computations that you do with *all* the
data from *all* groups together.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
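
A minimal sketch of drawing a bar from a lower to an upper limit for each point inside a lattice panel function. The data frame d is made up for illustration and is not the poster's data; ll and ul are passed to the panel function as extra arguments and indexed with subscripts. Jittering is switched off so that bars and points line up; whether such bars are statistically sensible is a separate question.

```r
library(lattice)

# Hypothetical readings, not the poster's data
d <- data.frame(
  type    = factor(rep(c("blank", "negative", "positive"), each = 3)),
  average = c(1.0, 1.2, 0.9, 2.1, 2.4, 1.8, 5.0, 5.5, 4.6),
  stdev   = c(0.2, 0.3, 0.1, 0.4, 0.2, 0.3, 0.5, 0.6, 0.4)
)
d$ll <- d$average - d$stdev
d$ul <- d$average + d$stdev

p <- stripplot(
  average ~ type, data = d,
  ll = d$ll, ul = d$ul,                  # handed on to the panel function
  ylim = extendrange(c(d$ll, d$ul)),     # make room for the bars
  jitter.data = FALSE,
  panel = function(x, y, ll, ul, subscripts, ...) {
    xx <- as.numeric(x)
    # double-headed flat arrows from lower to upper limit
    panel.arrows(x0 = xx, y0 = ll[subscripts],
                 x1 = xx, y1 = ul[subscripts],
                 angle = 90, code = 3, length = 0.05, col = "grey70")
    panel.stripplot(x, y, ...)
  }
)
print(p)
```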


On Thu, Mar 16, 2017 at 1:38 AM, Luigi Marongiu
 wrote:
> dear Bert,
> thank you for the solution, it worked perfectly. However I still would
> like to know how reliable are the dots that are plotted, that is why i
> would like to have individual bars on each dot (if possible). the
> standard deviation maybe is not the right tool and the confidence
> interval is perhaps better, but the procedure should be the same: draw
> an arrow from the lower to the upper limit. is that possible?
> regards,
> luigi
>
> PS sorry for the formatting, usually plain text is my default; it
> should have switched to html when i replied to a previous email but
> the difference does not show up when i type...
>
> On Wed, Mar 15, 2017 at 4:28 PM, Bert Gunter  wrote:
>> There may be a specific function that handles this for you, but to
>> roll your own, you need a custom panel.groups function, not the
>> default. You need to modify the panel function (which is
>> panel.superpose by default) to pass down the "col" argument to the
>> panel.segments call in the panel.groups function.
>>
>> This should get you started:
>>
>> useOuterStrips(
>>strip = strip.custom(par.strip.text = list(cex = 0.75)),
>>strip.left = strip.custom(par.strip.text = list(cex = 0.75)),
>>stripplot(
>>   average ~ type|target+cluster,
>>   panel = function(x,y,col,...)
>>  panel.superpose(x,y,col=col,...),
>>   panel.groups = function(x,y,col,...){
>>  panel.stripplot(x,y,col=col,...)
>>  m <- median(y)
>>  panel.segments(x0 = x[1] -.5, y0 = m,
>> x1 = x[1] +.5, y1 = m,
>> col=col, lwd=2
>> )
>>   },
>>   my.data,
>>   groups = type,
>>   pch=1,
>>   jitter.data = TRUE,
>>   main = "Group-wise",
>>   xlab = expression(bold("Target")), ylab = expression(bold("Reading")),
>>   col = c("grey", "green", "red"),
>>   par.settings = list(strip.background = list(col=c("paleturquoise",
>> "grey"))),
>>   scales = list(alternating = FALSE, x=list(draw=FALSE)),
>>   key = list(
>>  space = "top",
>>  columns = 3,
>>  text = list(c("Blank", "Negative", "Positive"), col="black"),
>>  rectangles = list(col=c("grey", "green", "red"))
>>   )
>>)
>> )
>>
>> FWIW, I think adding 1 sd bars is a bad idea statistically.
>>
>> And though it made no difference here, please post in plain text, not HTML.
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Mar 15, 2017 at 2:22 AM, Luigi Marongiu
>>  wrote:
>>> Dear all,
>>> I am analyzing some multivariate data that is organized like this:
>>> 1st variable = cluster (A or B)
>>> 2nd variable = target (a, b, c, d, e)
>>> 3rd variable = type (blank, negative, positive)
>>> 4th variable = sample (the actual name of the sample)
>>> 5th variable = average (the actual reading -- please note that this is the
>>> mean of different measures with an assumed normal distribution, but the
>>> assumption might not always be true)
>>> 6th variable = stdev (the standard deviation associated with each reading)
>>> 7th variable = ll (lower limit, that is average - stdev)
>>> 8th variable = ul (upper limit, that is average + stdev)
>>>
>>> I am plotting the data using lattice's stripplot and I would need to add:
>>> 1. an error bar for each measurement. the bar should be possibly coloured
>>> in light grey and semitransparent to reduce the noise of the plot.
>>> 2. a type-based median bar to show differences in measurements between
>>> blanks, negative and positive samples within each panel.
>>>
>>> How would I do that?
>>> Many thanks,
>>> Luigi
>>>
>>
>>> cluster <- c(rep("A", 90), rep("B", 100))
>>> sample <- c(
>>>   rep(c("cow-01", "cow-02", "cow-03", "cow-04", "cow-05", "cow-06",
>>> "cow-07", "cow-08", "cow-09", "cow-10", "cow-11",
>>> "cow-12", "cow-13", "cow-14", "cow-15", "cow-16", "cow-17",
>>> "blank"), 5),
>>>   rep(c("cow-26", "cow-35", "cow-36", "cow-37", 

Re: [R] force axis to extend

2017-03-16 Thread David L Carlson
Use

plot(NA, xlim=c(0, 5), ylim=c(-35, 35), type="n", axes=FALSE, ann=FALSE)

to set up the length of the y axis instead of your first plot command.

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
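
A variation on the same idea, as a sketch only: barplot() accepts ylim itself, so the first barplot() call can set up the -35 to 35 region and no separate empty plot() is needed. The vectors age, females, males and ticks are names introduced here; the numbers are those from the original post.

```r
age     <- c("18-29", "30-44", "45-59", "60+")
females <- c(23.221039, 16.665565, 7.173238, 4.275979)
males   <- c(-22.008875, -15.592936, -7.312195, -3.750173)

# The first barplot() call fixes the plotting region, so ylim goes here
barplot(females, ylim = c(-35, 35), axes = FALSE, col = "#b498ec")
usr <- par("usr")   # user coordinates of the region just set up

# The second call is drawn into the same region
barplot(males, add = TRUE, axes = FALSE, col = "#f8bb85", names.arg = age)

# Axis labelled with absolute values on both sides of zero
ticks <- seq(-35, 35, by = 5)
axis(side = 2, at = ticks, labels = abs(ticks), cex.axis = 0.7)
```

Using the same ylim in every plot keeps all mirrored bar plots on a common scale.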



Re: [R] screen

2017-03-16 Thread David L Carlson
Something like this?

> DF2.agg <- aggregate(DF2$obs, DF2[, c("family", "time")], mean)
> DF2.tbl <- xtabs(x~family+time, DF2.agg)
> DF2.tbl
      time
family WEEK1 WEEK2 WEEK3 WEEK4
     A  0.00  0.50  0.50  0.00
     B  0.67  0.00  0.50  0.00
     C  0.00  0.00  0.00  1.00

You can get closer to the output in your example with this

> suppressWarnings(as.table(formatC(DF2.tbl, digits=2, width=4,
+     zero.print=".")))
      time
family WEEK1 WEEK2 WEEK3 WEEK4
     A .     0.5   0.5   .
     B 0.67  .     0.5   .
     C .     .     .     1

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Val
Sent: Wednesday, March 15, 2017 5:41 PM
To: r-help@R-project.org (r-help@r-project.org) 
Subject: [R] screen

HI all,

I have some data to be screened based on the recording flag (obs).
Some families recorded properly (1) and others did not (0), i.e. 0 = improper
and 1 = proper.

The recording period starts in week1.  Not all families start recording
observations properly in the same week.

  DF2 <- read.table(header=TRUE, text='family time obs
A  WEEK1 0
A  WEEK1 0
A  WEEK1 0
A  WEEK2 1
A  WEEK2 0
A  WEEK3 1
A  WEEK3 0
B  WEEK1 1
B  WEEK1 0
B  WEEK1 1
B  WEEK2 0
B  WEEK2 0
B  WEEK3 1
B  WEEK3 0
C  WEEK3 0
C  WEEK3 0
C  WEEK4 1
C  WEEK4 1')

Example, in week1  all records of family "A" are 0 (improper), but
starting the week2 they start recording proper (1) records as well.
Then I create a table that shows me the ratio of proper records to the
total records for each family within week. If the ratio is zero and
there is no prior proper recordings for that family then I want to
delete those records.

However, once a family has started showing proper records ("1"), I want to
keep its records even if the ratio in a subsequent week is 0. Example: the
records of week2 for family B.

Here is the summary table

   WEEK1  WEEK2  WEEK3  WEEK4
A  0      0.5    0.5    .
B  0.33   0      0.5    .
C  .      .      0      1

From the above table
For A-  I want to exclude all records of week1 and keep the rest, because
they were not recording properly.
For B-  Keep all records, as they started recording properly from the beginning.
For C-  Keep only the week4 records because all records are 1's.

Final and desired  result will be

A WEEK2 1
A WEEK2 0
A WEEK3 1
A WEEK3 0
B WEEK1 1
B WEEK1 0
B WEEK1 1
B WEEK2 0
B WEEK2 0
B WEEK3 1
B WEEK3 0
C WEEK4 1
C WEEK4 1


and the summary table looks as follows

   WEEK1  WEEK2  WEEK3  WEEK4
A  .      0.5    0.5    .
B  0.33   0      0.5    .
C  .      .      .      1

Thank you in advance



Re: [R] force axis to extend

2017-03-16 Thread Jen
Yay!  That worked!  Thanks so much!

Jen

On Thu, Mar 16, 2017, 9:28 AM David L Carlson  wrote:

> Use
>
> plot(NA, xlim=c(0, 5), ylim=c(-35, 35), type="n", axes=FALSE, ann=FALSE)
>
> to set up the length of the y axis instead of your first plot command.


Re: [R] force axis to extend

2017-03-16 Thread S Ellison
> Unfortunately, that doesn't work.   The axis automatically scales to (-30,
> 25, by 5).

Your data do not extend to ±35, so the axis limits you have asked axis() for are
outside the plot region. It is plot() that is (correctly) defaulting to the
data range. If you want to override that to get a larger plot range, specify
the y limits for the plot as part of the plot command:

plot(c(0,5),range(df$Percent),type = "n", axes=FALSE, ann=F, ylim=c(-35, 35))

barplot(height = df$Percent[df$Sex == "Females"], add = TRUE,axes = FALSE,
col="#b498ec", ylab="")

barplot(height = df$Percent[df$Sex == "Males"], add = TRUE, axes = F, 
col="#f8bb85", ylab="",
names.arg=c("18-29", "30-44", "45-59", "60+"))

axis(side=2, at = seq(-35,35,by=5),
 labels=format(abs(seq(-35,35,by=5)), scientific=F),
 cex.axis=0.7)
#Axes now run to ±35

S Ellison


Re: [R] HELP ME: Fill NA Values from the previous Non-NA Values

2017-03-16 Thread PIKAL Petr
Hi

You can achieve exactly what you want by using ave/na.locf twice

library(zoo)  # na.locf() comes from the zoo package
dat<-data.frame(ie=rep(letters[1:3], each=3), iw=rnorm(9))
dat[c(3, 4),2]<-NA
> dat
  ie  iw
1  a  1.07254438
2  a  0.53067188
3  a  NA
4  b  NA
5  b -0.09767088
6  b -1.02719060
7  c  2.35787246
8  c -0.07513048
9  c -0.17164728

dat$iw <- ave(dat$iw, dat$ie, FUN=function(x) na.locf(x, na.rm=FALSE))
dat$iw <- ave(dat$iw, dat$ie, FUN=function(x) na.locf(x, na.rm=FALSE, 
fromLast=TRUE))

> dat
  ie  iw
1  a  1.07254438
2  a  0.53067188
3  a  0.53067188
4  b -0.09767088
5  b -0.09767088
6  b -1.02719060
7  c  2.35787246
8  c -0.07513048
9  c -0.17164728
>

Cheers
Petr
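
If installing zoo is not an option, the same forward-then-backward fill within groups can be sketched in base R. The helper fill_both is a hypothetical name introduced here, and the data reuse the NCQ05/NCQ06 values from the question.

```r
# Fill NAs with the last previous non-NA value; leading NAs take the
# first non-NA value that follows them.
fill_both <- function(x) {
  ok <- !is.na(x)
  if (!any(ok)) return(x)      # nothing to fill from
  idx <- cumsum(ok)            # position of the last non-NA seen so far
  idx[idx == 0] <- 1           # leading NAs borrow the first non-NA value
  x[ok][idx]
}

dat <- data.frame(ie = rep(c("NCQ05", "NCQ06"), each = 3),
                  iw = c(11.395, 11.395, NA, NA, 13, 13))

# Apply the fill separately within each identifier
dat$iw <- ave(dat$iw, dat$ie, FUN = fill_both)
dat
```

Because ave() works group by group, a value is never carried across from one identifier to the next.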


Re: [R] force axis to extend

2017-03-16 Thread Jen
Hi Jim,

Thanks for replying.

Unfortunately, that doesn't work.   The axis automatically scales to (-30,
25, by 5).

Jen



On Wed, Mar 15, 2017, 10:09 PM Jim Lemon  wrote:

> Hi Jen,
> It seems way too simple, but does this work?
>
> axis(side=2,at=seq(-35,35,by=5),cex.axis=0.7)
>
> You may want to consider using a pyramid plot for this.
>
> Jim
>
>
> On Thu, Mar 16, 2017 at 11:45 AM, Jen 
> wrote:
> > Hi,  I'm creating a couple of mirrored bar plots.  Below is data and code
> > for one.
> >
> > My problem is that I need the axis to go from -35 to 35 by 5.  I can't get
> > that to happen with the code below.  I need it so all my plots are on the
> > same scale.
> >
> > How can I do that using barplot? For reasons, I can't use ggplot or
> > lattice.
> >
> > Thanks,
> >
> > Jen
> >
> >
> >
> > df <- data.frame(matrix(c(
> > '18-29','Females',  23.221039,
> > '30-44','Females',  16.665565,
> > '45-59','Females',  7.173238,
> > '60+',  'Females',  4.275979,
> > '18-29','Males',-22.008875,
> > '30-44','Males',-15.592936,
> > '45-59','Males',-7.312195,
> > '60+',  'Males',-3.750173),
> > nrow=8, ncol=3, byrow=T,
> > dimnames=list(NULL, c("Age", "Sex", "Percent"))))
> >
> > df$Percent <- as.numeric(as.character(df$Percent))
> >
> > midf <- barplot(height = df$Percent[df$Sex == "Females"])
> >
> > # distribution of men and women with solid fill
> >
> > plot(c(0,5),range(df$Percent),type = "n", axes=FALSE, ann=F)
> >
> > barplot(height = df$Percent[df$Sex == "Females"], add = TRUE, axes = FALSE,
> > col="#b498ec", ylab="")
> >
> > barplot(height = df$Percent[df$Sex == "Males"], add = TRUE, axes = F,
> > col="#f8bb85", ylab="",
> > names.arg=c("18-29", "30-44", "45-59", "60+"))
> >
> > axis(side=2, at = seq(-35,35,by=5),
> >  labels=format(abs(seq(-35,35,by=5)), scientific=F),
> >  cex.axis=0.7)
> >


Re: [R] How to get the transpose of R´s function forecast output and turn it into a data frame

2017-03-16 Thread peter dalgaard
You're not giving us much to play with here. Reproducible example, please.

(Remember to send it to the list, not me.)

My immediate guess was cbind(), but without knowing the data structure, I can't 
tell for sure.

-pd
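
Without a reproducible example the structure of the forecast object is unknown, so the following is only a sketch under an assumption: that the forecasts sit in a one-row matrix whose column names are the years. The object fc is made up here to mimic the printed output in the question.

```r
# Hypothetical stand-in for the forecast output: a 1 x 4 matrix
fc <- matrix(c(12, 15, 35, 75), nrow = 1,
             dimnames = list("forecast", as.character(2017:2020)))

# Build the long layout directly from the column names and the values
out <- data.frame(Date = as.integer(colnames(fc)),
                  forecast = as.vector(fc))
out
```

If the real object is a time series rather than a matrix, the years would come from time() instead of colnames(), which is why the actual data structure matters.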

> On 16 Mar 2017, at 13:43 , Paul Bernal  wrote:
> 
> Dear all,
> 
> Hope you are doing great. Some R time series functions generate the
> forecasts in an horizontal way, for example:
> 
>            2017 2018 2019 2020
> forecast     12   15   35   75
> 
> but I´d like to have the output as follows:
> 
> 
> Date  forecast
> 2017   12
> 2018   15
> 2019   35
> 2020   75
> 
> I tried using the t() function to get the transpose, but after taking the
> transpose I was not able to turn it into a data frame.
> 
> Any help will be greatly appreciated,
> 
> Cheers,
> 
> Paul
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] HELP ME: Fill NA Values from the previous Non-NA Values

2017-03-16 Thread Allan Tanaka
Hi. Thanks for the function. My bad: after looking at the csv file, it seems
that NA values need to be filled not only from the previous non-NA values but
also from the next non-NA values. Example:
| NCQ05 | 11.395 |
| NCQ05 | 11.395 |
| NCQ05 |  |
| NCQ06 |  |
| NCQ06 | 13 |
| NCQ06 | 13 |


If I use the function, then both blank rows would be filled with 11.395, instead
of one being filled with 11.395 and the other with 13.
Does it mean that the function can be modified like this?
locf2 <- function(x, initial=NA, IS_BAD = is.na) {
    # Replace 'bad' values in 'x' with last previous non-bad value.
    # If no previous non-bad value, replace with 'initial'.
    stopifnot(is.function(IS_BAD))
    good <- !IS_BAD(x)
    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
    i <- cumsum(good)
    x <- x[c(1,which(good))][i+1]
    x <- x[c(1,which(good))][i+2]
    x[i==0] <- initial
    x
}

On Thursday, 16 March 2017, 1:17, William Dunlap  wrote:
 

 You could use the following function

locf2 <- function(x, initial=NA, IS_BAD = is.na) {
    # Replace 'bad' values in 'x' with last previous non-bad value.
    # If no previous non-bad value, replace with 'initial'.
    stopifnot(is.function(IS_BAD))
    good <- !IS_BAD(x)
    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
    i <- cumsum(good)
    x <- x[c(1,which(good))][i+1]
    x[i==0] <- initial
    x
}

as in

> locf2(c("", "A", "B", "", "", "C", ""), IS_BAD=function(x)x=="", initial="---")
[1] "---" "A"   "B"   "B"   "B"   "C"   "C"
> locf2(factor(c(NA,"Small","Medium",NA,"Large",NA,NA,NA,"Small")))
[1] <NA>   Small  Medium Medium Large  Large  Large  Large  Small
Levels: Large Medium Small
> locf2(c(12, NA, 10, 11, NA, NA))
[1] 12 12 10 11 11 11
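
For the follow-up case at the top of this message (a blank that should take the next value rather than the previous one), one possible approach is to fill forward and then backward within each identifier. This is only a sketch: `fill_both` and the toy `train` data are made up for illustration, and it assumes a missing weight should only ever be taken from a row with the same Item_Identifier.

```r
# locf2() as posted above (forward fill of 'bad' values).
locf2 <- function(x, initial = NA, IS_BAD = is.na) {
    stopifnot(is.function(IS_BAD))
    good <- !IS_BAD(x)
    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
    i <- cumsum(good)
    x <- x[c(1, which(good))][i + 1]
    x[i == 0] <- initial
    x
}

# Hypothetical helper: fill forward, then fill whatever is still missing
# backward by running locf2() on the reversed vector.
fill_both <- function(x) {
    fwd <- locf2(x)
    rev(locf2(rev(fwd)))
}

# Toy data mirroring the NCQ05 / NCQ06 example.
train <- data.frame(
    Item_Identifier = c("NCQ05", "NCQ05", "NCQ05", "NCQ06", "NCQ06", "NCQ06"),
    Item_Weight     = c(11.395, 11.395, NA, NA, 13, 13)
)

# ave() applies fill_both() separately within each identifier.
train$Item_Weight <- ave(train$Item_Weight, train$Item_Identifier,
                         FUN = fill_both)
```

Grouping with ave() keeps a value from leaking from one identifier into the next, which a plain fill over the whole column would not guarantee.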

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Mar 15, 2017 at 4:08 AM, Allan Tanaka  wrote:
> The following is an example:
>
> | Item_Identifier | Item_Weight |
> | FDP10 | 19 |
> | FDP10 |  |
> | DRI11 | 8.26 |
> | DRI11 |  |
> | FDW12 | 8.315 |
> | FDW12 |  |
>
>
> The following is what I want, that is, NA values filled from the previous 
> non-NA values.
> | Item_Identifier | Item_Weight |
> | FDP10 | 19 |
> | FDP10 | 19 |
> | DRI11 | 8.26 |
> | DRI11 | 8.26 |
> | FDW12 | 8.315 |
> | FDW12 | 8.315 |
>
>
> My current code data frame: train <- read.csv("Train.csv", header=T,sep = 
> ",",na.strings = c(""," ",NA))
>
>
> Some people suggest using the na.locf function, but in my case the values 
> in my Item_Identifier column are not unique numeric values but character 
> strings. I am not sure how to solve this problem.
>
>


   

[R] How to get the transpose of R's function forecast output and turn it into a data frame

2017-03-16 Thread Paul Bernal
Dear all,

Hope you are doing great. Some R time series functions print their
forecasts horizontally, for example:

          2017 2018 2019 2020
forecast    12   15   35   75

but I'd like to have the output as follows:


Date  forecast
2017   12
2018   15
2019   35
2020   75

I tried using the t() function to get the transpose, but after taking the
transpose I was not able to turn it into a data frame.
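
A minimal sketch of one way to get the long layout, under two guessed input shapes (the message does not say which object holds the forecasts): a one-row matrix with the years as column names, or a "ts" object. The object names `fc`, `fc_df`, `fc_ts` and `fc_ts_df` are made up for illustration.

```r
# Assumption 1: the forecasts sit in a one-row matrix whose column names
# are the years.
fc <- matrix(c(12, 15, 35, 75), nrow = 1,
             dimnames = list("forecast", c("2017", "2018", "2019", "2020")))

# Build the data frame directly from the column names and the values;
# no t() is needed.
fc_df <- data.frame(Date     = as.integer(colnames(fc)),
                    forecast = as.vector(fc))

# Assumption 2: the forecasts are a "ts" object; time() supplies the dates.
fc_ts <- ts(c(12, 15, 35, 75), start = 2017, frequency = 1)
fc_ts_df <- data.frame(Date     = as.numeric(time(fc_ts)),
                       forecast = as.numeric(fc_ts))
```

If t() has already been applied, the result is still a matrix; as.data.frame() on it would give a one-column data frame with the years as row names, which can then be moved into a column.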

Any help will be greatly appreciated,

Cheers,

Paul


Re: [R] Simulate data from Structural Equation Model

2017-03-16 Thread Troels Ring

Hi - searching rseek.org
for
simulate structural model data
seems to give some useful hits?
BW
Troels


On 16-03-2017 at 12:34, Michael Haenlein wrote:

Dear all,

I am looking for an R package or code that allows me to simulate data
consistent with a given structural equation model. Essentially my idea is
to define (a) the number of endogenous and exogenous latent variables, (b)
the strength of relationship between them and (c) the way of measurement
(number of indicators, distribution of indicators) and to obtain simulated
data consistent with this specification.

I know there is some literature on this topic (e.g., Mattson, S. (1997).
How to generate non-normal data for simulation of structural equation
models. Multivariate Behavioral Research, 32(4), 355 – 373), but I do not
know whether some of these approaches have already been implemented in R
and/or whether better methods exist.

Any help would be very much appreciated,

Thanks,

Michael


Michael Haenlein
Professor of Marketing
ESCP Europe


Re: [R] Test individual slope for each factor level in ANCOVA

2017-03-16 Thread Fox, John
Dear Hanna,

You can test the slope in each non-reference group as a linear hypothesis.
You didn’t make the data available for your example, so here’s an example
using the linearHypothesis() function in the car package with the Moore
data set in the same package:

- - - snip - - -

> library(car)
> mod <- lm(conformity ~ fscore*partner.status, data=Moore)
> summary(mod)

Call:
lm(formula = conformity ~ fscore * partner.status, data = Moore)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5296 -2.5984 -0.4473  2.0994 12.4704

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)               20.79348    3.26273   6.373 1.27e-07 ***
fscore                    -0.15110    0.07171  -2.107  0.04127 *
partner.statuslow        -15.53408    4.40045  -3.530  0.00104 **
fscore:partner.statuslow   0.26110    0.09700   2.692  0.01024 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.562 on 41 degrees of freedom
Multiple R-squared:  0.2942,    Adjusted R-squared:  0.2426
F-statistic: 5.698 on 3 and 41 DF,  p-value: 0.002347

> linearHypothesis(mod, "fscore + fscore:partner.statuslow")
Linear hypothesis test

Hypothesis:
fscore  + fscore:partner.statuslow = 0

Model 1: restricted model
Model 2: conformity ~ fscore * partner.status

  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1     42 912.45
2     41 853.42  1    59.037 2.8363 0.09976 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

- - - snip - - -

In this case, there are just two levels for partner.status, but for a
multi-level factor you can simply perform more than one test.
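
For a factor with several levels, the same tests can also be sketched in base R without the car package. The code below uses simulated data (the variable names follow Hanna's model, but the numbers are invented) and a made-up helper `slope_test`. It tests each level's slope as the linear combination "weeks + weeks:region<level>" of the coefficients, with the standard error taken from vcov().

```r
set.seed(1)

# Simulated stand-in for the poster's data: three regions, 20 weeks each.
region   <- factor(rep(c("a", "b", "c"), each = 20))
weeks    <- rep(1:20, times = 3)
slope    <- c(0, 0.05, -0.02)[as.integer(region)]
response <- 1 + slope * weeks + rnorm(60, sd = 0.1)
dat      <- data.frame(response, weeks, region)

mod <- lm(response ~ weeks * region, data = dat)

# Hypothetical helper: t test of H0 "slope in this level = 0".
slope_test <- function(mod, level) {
    b <- coef(mod)
    V <- vcov(mod)
    # Contrast vector: 1 for 'weeks', plus 1 for the interaction term of
    # this level (absent for the reference level).
    cvec <- setNames(numeric(length(b)), names(b))
    cvec["weeks"] <- 1
    int_name <- paste0("weeks:region", level)
    if (int_name %in% names(b)) cvec[int_name] <- 1
    est  <- sum(cvec * b)
    se   <- sqrt(drop(t(cvec) %*% V %*% cvec))
    tval <- est / se
    p    <- 2 * pt(abs(tval), df = df.residual(mod), lower.tail = FALSE)
    c(estimate = est, se = se, t = tval, p = p)
}

# One row per region: slope estimate, standard error, t value, p value.
slopes <- t(sapply(levels(dat$region), slope_test, mod = mod))
```

An equivalent route is to refit with a formula such as response ~ region + region:weeks, which reports one slope coefficient per level directly in summary().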


I hope this helps,

 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/




On 2017-03-15, 9:43 PM, "R-help on behalf of li li"
 wrote:

>Hi all,
>   Consider the data set where there are a continuous response variable, a
>continuous predictor "weeks" and a categorical variable "region" with five
>levels "a", "b", "c",
>"d", "e".
>  I fit the ANCOVA model as follows. Here the reference level is region
>"a"
>and there are 4 dummy variables. The interaction terms (in red below)
>represent the slope
>difference between each region and  the baseline region "a" and the
>corresponding p-value is for testing whether this slope difference is
>zero.
>Is there a way to directly test whether the slope corresponding to each
>individual factor level is 0 or not, instead of testing the slope
>difference from the baseline level?
>  Thanks very much.
>  Hanna
>
>
>
>
>
>
>> mod <- lm(response ~ weeks*region,data)
>> summary(mod)
>Call:
>lm(formula = response ~ weeks * region, data = data)
>
>Residuals:
> Min   1Q   Median   3Q  Max
>-0.19228 -0.07433 -0.01283  0.04439  0.24544
>
>Coefficients:
>                Estimate  Std. Error t value Pr(>|t|)
>(Intercept)     1.2105556  0.0954567  12.682  1.2e-14 ***
>weeks          -0.021      0.0147293  -1.448    0.156
>regionb        -0.0257778  0.1349962  -0.191    0.850
>regionc        -0.034      0.1349962  -0.255    0.800
>regiond        -0.075      0.1349962  -0.559    0.580
>regione        -0.148      0.1349962  -1.098    0.280
>weeks:regionb  -0.0007222  0.0208304  -0.035    0.973
>weeks:regionc  -0.0017778  0.0208304  -0.085    0.932
>weeks:regiond   0.003      0.0208304   0.144    0.886
>weeks:regione   0.0301667  0.0208304   1.448    0.156
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.1082 on 35 degrees of freedom
>Multiple R-squared:  0.2678,   Adjusted R-squared:  0.07946
>F-statistic: 1.422 on 9 and 35 DF,  p-value: 0.2165
>

[R] Simulate data from Structural Equation Model

2017-03-16 Thread Michael Haenlein
Dear all,

I am looking for an R package or code that allows me to simulate data
consistent with a given structural equation model. Essentially my idea is
to define (a) the number of endogenous and exogenous latent variables, (b)
the strength of relationship between them and (c) the way of measurement
(number of indicators, distribution of indicators) and to obtain simulated
data consistent with this specification.
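
A minimal base-R sketch of the idea for the simplest normal case: one exogenous and one endogenous latent variable with three indicators each. All numbers (path coefficient 0.6, loadings 0.8, n = 500) and the helper `make_indicators` are arbitrary choices for illustration, not a recommended design. The lavaan package also provides a simulateData() function that generates data from a model specification, which may be worth checking for more general models.

```r
set.seed(42)
n <- 500

# (a), (b) Structural part: exogenous latent 'xi' affects endogenous
# latent 'eta' with path coefficient 'beta'; both have unit variance.
beta <- 0.6
xi   <- rnorm(n)
eta  <- beta * xi + rnorm(n, sd = sqrt(1 - beta^2))

# (c) Measurement part: k normally distributed indicators per latent
# variable, each with loading 'lambda' and unit variance.
make_indicators <- function(latent, lambda, k, prefix) {
    out <- sapply(seq_len(k), function(j) {
        lambda * latent + rnorm(length(latent), sd = sqrt(1 - lambda^2))
    })
    colnames(out) <- paste0(prefix, seq_len(k))
    out
}

lambda <- 0.8
sim <- data.frame(make_indicators(xi,  lambda, 3, "x"),
                  make_indicators(eta, lambda, 3, "y"))
```

Non-normal indicators, as in the cited literature, would need the rnorm() draws replaced by a suitable generator; this sketch does not attempt that.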

I know there is some literature on this topic (e.g., Mattson, S. (1997).
How to generate non-normal data for simulation of structural equation
models. Multivariate Behavioral Research, 32(4), 355 – 373), but I do not
know whether some of these approaches have already been implemented in R
and/or whether better methods exist.

Any help would be very much appreciated,

Thanks,

Michael


Michael Haenlein
Professor of Marketing
ESCP Europe


Re: [R-es] R <- POO

2017-03-16 Thread javier.ruben.marcuzzi
Dear Miguel,

I don't think it is worth arguing about whether or not R is object-oriented; 
there are other questions:

Can it support OOP? Yes.

Can it do functional programming? Yes; yesterday I sent a link to Hadley 
Wickham's book, which covers it.

Now take other languages, C#, C++, Java: can you program in them without using 
objects? Without objects it gets complicated, which is not the case with R; 
otherwise Microsoft would not have developed F#.

R has a bit of each. I learned R with what I knew of Fortran, and only learned 
about objects much later. R is not a “classic” language, although it is 
becoming the “classic of statistics”.


Javier Rubén Marcuzzi

From: miguel.angel.rodriguez.mui...@sergas.es
Sent: Thursday, 16 March 2017 7:33
To: javier.ruben.marcu...@gmail.com; c...@datanalytics.com
CC: r-help-es@r-project.org
Subject: Re: [R-es] R <- POO

I still disagree with you, Javier.
R is clearly an object-oriented programming language (POO, OOP, or whatever 
you want to call it). It handles classes (with their properties, methods, 
states, ...), inheritance, polymorphism, ...
Whether you want to use R in another way (structured, modular, functional 
programming, ...) is another matter.
Regards,
Miguel.


On 15/03/2017 at 16:32, javier.ruben.marcu...@gmail.com wrote:
Dear all,

I was referring to this:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/class.html

As I read it, the page says something like: it is a function with a simple, 
generic mechanism that can be used for an object-oriented style of programming.

Without going into very technical details:

In an object-oriented language such as C#, I create a class with attributes, 
and then I can create a list of them, something like List lista_de_autos.

Another way would be something like List<T>, where I do not name the object 
created by the autos class; T would work it out.

As I understand it, R has a mechanism similar to the latter. If I run 
str(algo), it returns, so to speak, the attributes of an already defined 
object, but altura <- 1.85 would be a variable, which R allows to be treated 
as an object.

R has parts written in C/C++, which have objects, but Fortran does not use 
objects. I never had to instantiate a matrix in Fortran, and in R I never had 
to instantiate a matrix either, nor declare public or private parts, nor set 
up inheritance between two data.frames; the languages differ in these respects.

Comparing with C# again:
Var altura = 1.85 is a variable.
Persona.altura = 1.85 is a value inside the object.

In R, altura <- 1.85 can be treated as a variable or as an object (because 
there is a simple, generic function that can be used for object-oriented 
programming, via library/base/class as I understand it).

That is why I wrote that R is not object-oriented but that there are ways of 
working with it as objects. Then there are differences such as function versus 
method, but the effect is the same and the differences do not complicate our 
lives, unless what I called boxes gets confused with creating classes to 
define objects; in R we do not need to define a class to fill a data.frame. 
Nevertheless, R lets you create objects and work in an OOP style; in R several 
alternatives can be used.
https://www.r-bloggers.com/oo-in-r/

R allows more than one programming style.
 
Javier Rubén Marcuzzi
 
From: Carlos J. Gil Bellosta 
Sent: Wednesday, 15 March 2017 11:14
To: miguel angel rodriguez muinos
CC: Javier Marcuzzi; Mauricio Monsalvo; r-help-es
Subject: Re: [R-es] R <- POO

In fact, two of R's design principles are:

1) Everything that exists is an object.
2) Everything that happens is a function call.

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On 15 March 2017 at 14:22, Miguel wrote:
I disagree, Javier.

R is indeed an object-oriented language.

:-)


Regards,
Miguel.



On 15/03/2017 at 13:43, javier.ruben.marcu...@gmail.com wrote:

R is a somewhat complicated language. It is not object-oriented, although 
there are ways of working with objects; on the other hand, you can define a 
function or use 5 packages to do the same thing in two lines of code.




Re: [R-es] R <- POO

2017-03-16 Thread miguel.angel.rodriguez.muinos
I still disagree with you, Javier.

R is clearly an object-oriented programming language (POO, OOP, or whatever 
you want to call it). It handles classes (with their properties, methods, 
states, ...), inheritance, polymorphism, ...

Whether you want to use R in another way (structured, modular, functional 
programming, ...) is another matter.

Regards,
Miguel.


On 15/03/2017 at 16:32, javier.ruben.marcu...@gmail.com wrote:
Dear all,

I was referring to this:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/class.html

As I read it, the page says something like: it is a function with a simple, 
generic mechanism that can be used for an object-oriented style of programming.

Without going into very technical details:

In an object-oriented language such as C#, I create a class with attributes, 
and then I can create a list of them, something like List lista_de_autos.

Another way would be something like List<T>, where I do not name the object 
created by the autos class; T would work it out.

As I understand it, R has a mechanism similar to the latter. If I run 
str(algo), it returns, so to speak, the attributes of an already defined 
object, but altura <- 1.85 would be a variable, which R allows to be treated 
as an object.

R has parts written in C/C++, which have objects, but Fortran does not use 
objects. I never had to instantiate a matrix in Fortran, and in R I never had 
to instantiate a matrix either, nor declare public or private parts, nor set 
up inheritance between two data.frames; the languages differ in these respects.

Comparing with C# again:
Var altura = 1.85 is a variable.
Persona.altura = 1.85 is a value inside the object.

In R, altura <- 1.85 can be treated as a variable or as an object (because 
there is a simple, generic function that can be used for object-oriented 
programming, via library/base/class as I understand it).

That is why I wrote that R is not object-oriented but that there are ways of 
working with it as objects. Then there are differences such as function versus 
method, but the effect is the same and the differences do not complicate our 
lives, unless what I called boxes gets confused with creating classes to 
define objects; in R we do not need to define a class to fill a data.frame. 
Nevertheless, R lets you create objects and work in an OOP style; in R several 
alternatives can be used.
https://www.r-bloggers.com/oo-in-r/

R allows more than one programming style.

Javier Rubén Marcuzzi
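
The mechanism on the class() help page linked in this message is R's S3 system. As a rough sketch of how it supports classes, methods and inheritance, here is a toy example; the class names "persona" and "estudiante" and the constructor functions are invented for illustration, echoing the `altura` example in the message.

```r
# A plain value is already an object with an implicit class.
altura <- 1.85
class(altura)            # "numeric"

# Hypothetical S3 class "persona": a list carrying a class attribute.
new_persona <- function(nombre, altura) {
    structure(list(nombre = nombre, altura = altura), class = "persona")
}

# A method for the generic print(); dispatch is on class(x).
print.persona <- function(x, ...) {
    cat("persona:", x$nombre, "altura:", x$altura, "\n")
    invisible(x)
}

# Inheritance: "estudiante" extends "persona" by prepending a class name,
# so methods written for "persona" still apply.
new_estudiante <- function(nombre, altura, curso) {
    obj <- new_persona(nombre, altura)
    obj$curso <- curso
    class(obj) <- c("estudiante", class(obj))
    obj
}

p <- new_persona("Ana", 1.85)
e <- new_estudiante("Luis", 1.70, "R")
```

Printing `e` uses print.persona() because no print method exists for "estudiante", which is the S3 form of inheritance discussed in the thread.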

From: Carlos J. Gil Bellosta 
Sent: Wednesday, 15 March 2017 11:14
To: miguel angel rodriguez muinos
CC: Javier Marcuzzi; Mauricio Monsalvo; r-help-es
Subject: Re: [R-es] R <- POO

In fact, two of R's design principles are:

1) Everything that exists is an object.
2) Everything that happens is a function call.

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On 15 March 2017 at 14:22, Miguel wrote:
I disagree, Javier.

R is indeed an object-oriented language.

:-)


Regards,
Miguel.



On 15/03/2017 at 13:43, javier.ruben.marcu...@gmail.com wrote:

R is a somewhat complicated language. It is not object-oriented, although 
there are ways of working with objects; on the other hand, you can define a 
function or use 5 packages to do the same thing in two lines of code.






___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] mood.test/mood.medtest

2017-03-16 Thread peter dalgaard

> On 15 Mar 2017, at 16:32 , Leemann, Lucas T  wrote:
> 
> Hello,
> 
> I was trying to test whether two medians are identical or not and used the 
> function “mood.test” from the “stats" package. My co-author, a medical 
> doctor, was trying to do the same in SPSS and had different results.

stats::mood.test() is a test of scale, not medians, according to its 
documentation. 

mood.medtest() is a test for a common median, basically looking at a crosstable 
of observations above and below the joint median:

> M <- table(indicator, x > median(x))
> chisq.test(M)

Pearson's Chi-squared test with Yates' continuity correction

data:  M
X-squared = 4.125, df = 1, p-value = 0.04225
> mood.medtest(x ~ indicator)

Mood's median test

data:  x by indicator
X-squared = 4.125, df = 1, p-value = 0.04225

This might differ from SPSS output (which you do not cite) in details like 
Yates correction, use of exact test, etc.
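
To see how much those details can matter, the crosstable approach can be repeated by hand with and without the continuity correction, and with an exact test. This is only a sketch using the simulated data from the original post; which variant SPSS reports is not known from this thread.

```r
# Simulated data from the original post.
set.seed(123)
a <- runif(100)
b <- runif(120, 0.2, 1.1)
indicator <- c(rep(0, 100), rep(1, 120))
x <- c(a, b)

# 2 x 2 table: group membership versus "above the joint median".
M <- table(indicator, above = x > median(x))

# Three variants of the median test on the same table.
with_yates    <- chisq.test(M, correct = TRUE)   # R's default for 2 x 2
without_yates <- chisq.test(M, correct = FALSE)
exact         <- fisher.test(M)

c(yates   = with_yates$p.value,
  pearson = without_yates$p.value,
  fisher  = exact$p.value)
```

The three p-values will generally differ somewhat, so a mismatch with another package does not by itself indicate an error in either.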

-pd



> 
> I wanted to see whether there was a problem on my end and also used the 
> function “mood.medtest” from the “RVAideMemoire” package. I find different 
> results while I am under the impression that both functions claim to carry 
> out the same test and have the same defaults. While my actual data is 
> sensitive medical information, I provide simple code below for a reproducible 
> example. 
> 
> library(RVAideMemoire)
> set.seed(123)
> a <- runif(100)
> b <- runif(120,0.2,1.1)
> indicator <- c(rep(0,100),rep(1,120))
> x <- c(a,b)
> mood.test(x ~ indicator)
> mood.medtest(x ~ indicator)
> 
> Has anybody encountered this problem before, or would anybody be able to 
> provide any insights?
> 
> Best wishes,
> Lucas
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] HELP ME: Fill NA Values from the previous Non-NA Values

2017-03-16 Thread PIKAL Petr
Hi

Why does it matter that the values in the identifier column are non-numeric?

Some toy data
dat<-data.frame(ie=rep(letters[1:3], each=3), iw=rnorm(9))
dat[c(2, 4,6, 9),2]<-NA
library(zoo)

ave(dat$iw, dat$ie, FUN=function(x) na.locf(x, na.rm=FALSE))

You can use ave() together with na.locf() to carry values forward only within 
each identifier, whether or not the identifiers are numeric.

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Allan
> Tanaka
> Sent: Wednesday, March 15, 2017 12:09 PM
> To: r-help@r-project.org
> Subject: [R] HELP ME: Fill NA Values from the previous Non-NA Values
>
> The following is an example:
>
> | Item_Identifier | Item_Weight |
> | FDP10 | 19 |
> | FDP10 |  |
> | DRI11 | 8.26 |
> | DRI11 |  |
> | FDW12 | 8.315 |
> | FDW12 |  |
>
>
> The following is what I want, that is, NA values filled from the previous
> non-NA values.
> | Item_Identifier | Item_Weight |
> | FDP10 | 19 |
> | FDP10 | 19 |
> | DRI11 | 8.26 |
> | DRI11 | 8.26 |
> | FDW12 | 8.315 |
> | FDW12 | 8.315 |
>
>
> My current code data frame: train <- read.csv("Train.csv", header=T,sep =
> ",",na.strings = c(""," ",NA))
>
>
> Some people suggest using the na.locf function, but in my case the values
> in my Item_Identifier column are not unique numeric values but character
> strings. I am not sure how to solve this problem.
>
>



Re: [R] add median value and standard deviation bar to lattice plot

2017-03-16 Thread Luigi Marongiu
Dear Bert,
thank you for the solution, it worked perfectly. However, I would still
like to know how reliable the plotted dots are, which is why I
would like to have an individual bar on each dot (if possible). The
standard deviation may not be the right tool and the confidence
interval is perhaps better, but the procedure should be the same: draw
an arrow from the lower to the upper limit. Is that possible?
Regards,
Luigi

PS: sorry for the formatting; plain text is usually my default. It
must have switched to HTML when I replied to a previous email, but
the difference does not show up when I type...
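
A rough sketch of the per-point bars asked about above, on a small made-up data set with the same column names (average, stdev, ll, ul, type). It assumes no jittering, so that each bar sits exactly under its dot, and it relies on lattice passing `subscripts` to a panel function that declares that argument; the extra `ll` and `ul` arguments are simply handed through to the panel.

```r
library(lattice)

# Toy data with the same columns as in the original post.
set.seed(1)
my.data <- data.frame(
    type    = factor(rep(c("blank", "negative", "positive"), each = 5)),
    average = c(rnorm(5, 1), rnorm(5, 2), rnorm(5, 5))
)
my.data$stdev <- runif(nrow(my.data), 0.1, 0.5)
my.data$ll    <- my.data$average - my.data$stdev
my.data$ul    <- my.data$average + my.data$stdev

p <- stripplot(
    average ~ type, data = my.data,
    ll = my.data$ll, ul = my.data$ul,
    # Widen the y axis so the bars are not clipped.
    ylim = range(my.data$ll, my.data$ul) + c(-0.2, 0.2),
    panel = function(x, y, subscripts, ll, ul, ...) {
        xn <- as.numeric(x)
        # 'subscripts' picks the rows of my.data shown in this panel.
        panel.arrows(xn, ll[subscripts], xn, ul[subscripts],
                     angle = 90, code = 3, length = 0.05, col = "grey70")
        # Draw the dots on top of the bars.
        panel.stripplot(x, y, ...)
    }
)
```

Calling print(p) draws the plot. With jitter.data = TRUE the dots would move sideways while the bars stay put, so the jittered x positions would have to be computed once and reused for both.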

On Wed, Mar 15, 2017 at 4:28 PM, Bert Gunter  wrote:
> There may be a specific function that handles this for you, but to
> roll your own, you need a custom panel.groups function, not the
> default. You need to modify the panel function (which is
> panel.superpose by default) to pass down the "col" argument to the
> panel.segments call in the panel.groups function.
>
> This should get you started:
>
> useOuterStrips(
>strip = strip.custom(par.strip.text = list(cex = 0.75)),
>strip.left = strip.custom(par.strip.text = list(cex = 0.75)),
>stripplot(
>   average ~ type|target+cluster,
>   panel = function(x,y,col,...)
>  panel.superpose(x,y,col=col,...),
>   panel.groups = function(x,y,col,...){
>  panel.stripplot(x,y,col=col,...)
>  m <- median(y)
>  panel.segments(x0 = x[1] -.5, y0 = m,
> x1 = x[1] +.5, y1 = m,
> col=col, lwd=2
> )
>   },
>   my.data,
>   groups = type,
>   pch=1,
>   jitter.data = TRUE,
>   main = "Group-wise",
>   xlab = expression(bold("Target")), ylab = expression(bold("Reading")),
>   col = c("grey", "green", "red"),
>   par.settings = list(strip.background = list(col=c("paleturquoise",
> "grey"))),
>   scales = list(alternating = FALSE, x=list(draw=FALSE)),
>   key = list(
>  space = "top",
>  columns = 3,
>  text = list(c("Blank", "Negative", "Positive"), col="black"),
>  rectangles = list(col=c("grey", "green", "red"))
>   )
>)
> )
>
> FWIW, I think adding 1 sd bars is a bad idea statistically.
>
> And though it made no difference here, please post in plain text, not HTML.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Mar 15, 2017 at 2:22 AM, Luigi Marongiu
>  wrote:
>> Dear all,
>> I am analyzing some multivariate data that is organized like this:
>> 1st variable = cluster (A or B)
>> 2nd variable = target (a, b, c, d, e)
>> 3rd variable = type (blank, negative, positive)
>> 4th variable = sample (the actual name of the sample)
>> 5th variable = average (the actual reading -- please note that this is the
>> mean of different measures with an assumed normal distribution, but the
>> assumption might not always be true)
>> 6th variable = stdev (the standard deviation associated with each reading)
>> 7th variable = ll (lower limit, that is average - stdev)
>> 8th variable = ul (upper limit, that is average + stdev)
>>
>> I am plotting the data using lattice's stripplot and I would need to add:
>> 1. an error bar for each measurement. the bar should be possibly coloured
>> in light grey and semitransparent to reduce the noise of the plot.
>> 2. a type-based median bar to show differences in measurements between
>> blanks, negative and positive samples within each panel.
>>
>> How would I do that?
>> Many thanks,
>> Luigi
>>
>
>> cluster <- c(rep("A", 90), rep("B", 100))
>> sample <- c(
>>   rep(c("cow-01", "cow-02", "cow-03", "cow-04", "cow-05", "cow-06",
>> "cow-07", "cow-08", "cow-09", "cow-10", "cow-11",
>> "cow-12", "cow-13", "cow-14", "cow-15", "cow-16", "cow-17",
>> "blank"), 5),
>>   rep(c("cow-26", "cow-35", "cow-36", "cow-37", "cow-38", "cow-39",
>> "cow-40", "cow-41", "cow-42", "cow-43", "cow-44", "cow-45",
>> "cow-46", "cow-47", "cow-48", "cow-49", "cow-50", "cow-51",
>> "cow-59", "blank"), 5)
>> )
>> type <- c(
>>   rep(c("negative", "negative", "negative", "negative", "negative",
>> "negative", "negative", "negative", "positive", "positive",
>> "positive", "positive", "positive", "positive", "positive",
>> "positive", "positive", "blank"), 5),
>>   rep(c("negative", "positive", "negative", "negative", "negative",
>> "negative", "negative", "negative", "positive", "positive",
>> "positive", "positive", "positive", "positive", "positive",
>> "positive", "positive", "positive", "positive", "blank"), 5)
>> )
>> target <- c(
>> c(rep("a", 18), rep("b", 18), rep("c", 18), rep("d", 18), rep("e", 18)),
>> c(rep("a", 20), rep("b", 20), rep("c", 20), rep("d", 20), rep("e", 20))
>>