Re: [R] Regression with factor having 1 level

2016-03-10 Thread David Winsemius

> On Mar 10, 2016, at 5:45 PM, Nordlund, Dan (DSHS/RDA)  
> wrote:
> 
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David
>> Winsemius
>> Sent: Thursday, March 10, 2016 4:39 PM
>> To: Robert McGehee
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Regression with factor having1 level
>> 
>> 
>>> On Mar 10, 2016, at 2:00 PM, Robert McGehee 
>> wrote:
>>> 
>>> Hello R-helpers,
>>> I'd like a function that given an arbitrary formula and a data frame
>>> returns the residual of the dependent variable,and maintains all NA values.
>> 
>> What does "maintains all NA values" actually mean?
>>> 
>>> Here's an example that will give me what I want if my formula is
>>> y~x1+x2+x3 and my data frame is df:
>>> 
>>> resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
>>> 
>>> Here's the catch, I do not want my function to ever fail due to a
>>> factor with only one level. A one-level factor may appear because 1)
>>> the user passed it in, or 2) (more common) only one factor in a term
>>> is left after na.exclude removes the other NA values.
>>> 
>>> Here is the error I would get
>> 
>> From what code?
>> 
>> 
>>> above if one of the terms was a factor with one level:
>>> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>> contrasts can be applied only to factors with 2 or more levels
>> 
>> Unable to create that error with the actions you describe but do not actually
>> offer in coded form:
>> 
>> 
>>> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=TRUE, x3=rnorm(10))
>>> lm(y~x1+x2+x3, dfrm)
>> 
>> Call:
>> lm(formula = y ~ x1 + x2 + x3, data = dfrm)
>> 
>> Coefficients:
>> (Intercept)           x1       x2TRUE           x3
>>    -0.16274     -0.30032           NA     -0.09093
>>
>>> resid(lm(y~x1+x2+x3, data=dfrm, na.action=na.exclude))
>>           1           2           3           4           5           6
>> -0.16097245  0.65408508 -0.70098223 -0.15360434  1.26027872  0.55752239
>>           7           8           9          10
>> -0.05965653 -2.17480605  1.42917190 -0.65103650
>> 
>>> 
>> 
>> 
>>> Instead of giving me an error, I'd like the function to do just what
>>> lm() normally does when it sees a variable with no variance, ignore
>>> the variable (coefficient is NA) and continue to regress out all the other
>> variables.
>>> Thus if 'x2' is a factor with one variable in the above example, I'd
>>> like the function to return the result of:
>>> resid(lm(y~x1+x3, data=df, na.action=na.exclude)) Can anyone provide
>>> me a straight forward recommendation for how to do this?
>>> I feel like it should be easy, but I'm honestly stuck, and my Google
>>> searching for this hasn't gotten anywhere. The key is that I'd like
>>> the solution to be generic enough to work with an arbitrary linear
>>> formula, and not substantially kludgy (like trying ever combination of
>>> regressions terms until one works) as I'll be running this a lot on
>>> big data sets and don't want my computation time swamped by running
>>> unnecessary regressions or checking for number of factors after removing
>> NAs.
>>> 
>>> Thanks in advance!
>>> --Robert
>>> 
>>> 
>>> PS. The Google search feature in the R-help archives appears to be down:
>>> http://tolstoy.newcastle.edu.au/R/
>> 
>> It's working for me.
>> 
>>> 
>>> [[alternative HTML version deleted]]
>>> 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
> 
> I agree that what is wanted is not clear.  However, if dfrm is created with 
> x2 as a factor, then you get the error message that the OP mentions when you 
> run the regression.
> 
>> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=as.factor(TRUE), 
>> x3=rnorm(10))
>> lm(y~x1+x2+x3, dfrm, na.action=na.exclude)
> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
>  contrasts can be applied only to factors with 2 or more levels

Yes, and the error appears to come from `model.matrix`:

> model.matrix(y~x1+factor(x2)+x3, dfrm)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

> model.matrix(y~x1+x2+x3, dfrm)
   (Intercept)          x1 x2TRUE         x3
1            1  0.04887847      1 -0.4199628
2            1 -1.04786688      1  1.3947923
3            1 -0.34896007      1 -2.1873666
4            1 -0.08866061      1  0.1204129
5            1 -0.4366          1 -1.6631057
6            1 -0.83449110      1  1.1631801
7            1 -0.67887823      1  0.3207544
8            1 -1.12206068      1  0.6012040
9            1  0.05116683      1  0.3598696
10           1  1.74413583      1  0.3608478
attr(,"assign")
[1] 0 1 2 3
attr(,"contrasts")
attr(,"contrasts")$x2
[1] "contr.treatment"
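
To separate the two steps that lm() runs internally, here is a small sketch. The data frame and the tryCatch() wrapper are illustrative and not taken from the posts above. It assumes, as the output above suggests, that model.frame() accepts a one-level factor and that the failure only happens when model.matrix() tries to assign contrasts:

```r
set.seed(42)
dfrm <- data.frame(y = rnorm(10), x1 = rnorm(10),
                   x2 = factor("a"),   # factor with a single level
                   x3 = rnorm(10))

# Step 1: building the model frame works; x2 stays a one-level factor
mf <- model.frame(y ~ x1 + x2 + x3, dfrm)
n_levels_x2 <- nlevels(mf$x2)

# Step 2: building the design matrix is where contrasts get assigned,
# so this is the call that is expected to signal the error
msg <- tryCatch({
  model.matrix(y ~ x1 + x2 + x3, mf)
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

If that holds, any screening for one-level factors can be done on the model frame (or on the raw columns) before the design matrix is built.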

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

Re: [R] Regression with factor having 1 level

2016-03-10 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David
> Winsemius
> Sent: Thursday, March 10, 2016 4:39 PM
> To: Robert McGehee
> Cc: r-help@r-project.org
> Subject: Re: [R] Regression with factor having1 level
> 
> 
> > On Mar 10, 2016, at 2:00 PM, Robert McGehee 
> wrote:
> >
> > Hello R-helpers,
> > I'd like a function that given an arbitrary formula and a data frame
> > returns the residual of the dependent variable,and maintains all NA values.
> 
> What does "maintains all NA values" actually mean?
> >
> > Here's an example that will give me what I want if my formula is
> > y~x1+x2+x3 and my data frame is df:
> >
> > resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
> >
> > Here's the catch, I do not want my function to ever fail due to a
> > factor with only one level. A one-level factor may appear because 1)
> > the user passed it in, or 2) (more common) only one factor in a term
> > is left after na.exclude removes the other NA values.
> >
> > Here is the error I would get
> 
> From what code?
> 
> 
> > above if one of the terms was a factor with one level:
> > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
> >  contrasts can be applied only to factors with 2 or more levels
> 
> Unable to create that error with the actions you describe but do not actually
> offer in coded form:
> 
> 
> > dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=TRUE, x3=rnorm(10))
> > lm(y~x1+x2+x3, dfrm)
> 
> Call:
> lm(formula = y ~ x1 + x2 + x3, data = dfrm)
> 
> Coefficients:
> (Intercept)           x1       x2TRUE           x3
>    -0.16274     -0.30032           NA     -0.09093
>
> > resid(lm(y~x1+x2+x3, data=dfrm, na.action=na.exclude))
>           1           2           3           4           5           6
> -0.16097245  0.65408508 -0.70098223 -0.15360434  1.26027872  0.55752239
>           7           8           9          10
> -0.05965653 -2.17480605  1.42917190 -0.65103650
> 
> >
> 
> 
> > Instead of giving me an error, I'd like the function to do just what
> > lm() normally does when it sees a variable with no variance, ignore
> > the variable (coefficient is NA) and continue to regress out all the other
> variables.
> > Thus if 'x2' is a factor with one variable in the above example, I'd
> > like the function to return the result of:
> > resid(lm(y~x1+x3, data=df, na.action=na.exclude)) Can anyone provide
> > me a straight forward recommendation for how to do this?
> > I feel like it should be easy, but I'm honestly stuck, and my Google
> > searching for this hasn't gotten anywhere. The key is that I'd like
> > the solution to be generic enough to work with an arbitrary linear
> > formula, and not substantially kludgy (like trying ever combination of
> > regressions terms until one works) as I'll be running this a lot on
> > big data sets and don't want my computation time swamped by running
> > unnecessary regressions or checking for number of factors after removing
> NAs.
> >
> > Thanks in advance!
> > --Robert
> >
> >
> > PS. The Google search feature in the R-help archives appears to be down:
> > http://tolstoy.newcastle.edu.au/R/
> 
> It's working for me.
> 
> >
> 
> David Winsemius
> Alameda, CA, USA
> 

I agree that what is wanted is not clear.  However, if dfrm is created with x2 
as a factor, then you get the error message that the OP mentions when you run 
the regression.

> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=as.factor(TRUE), 
> x3=rnorm(10))
> lm(y~x1+x2+x3, dfrm, na.action=na.exclude)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels


Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services



Re: [R] Regression with factor having 1 level

2016-03-10 Thread Robert McGehee
Here's an example for clarity:

> df <- data.frame(y=c(0,2,4,6,8), x1=c(1,1,2,2,NA),
x2=factor(c("A","A","A","A","B")))
> resid(lm(y~x1+x2, data=df, na.action=na.exclude))
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

Note that the x2 factor variable contains two levels, but the "B" level is
excluded in the regression due to the NA value in x1. Hence the error.

Instead of the above error, I would like a function that returns the
residual of the regression without the offending term, which in this case
would be equivalent to:
> resid(lm(y~x1, data=df, na.action=na.exclude))
 1  2  3  4  5
-1  1 -1  1 NA

Note the 5th term returns an NA as there is an NA in the x1 independent
variable, which was what I had meant by maintain NAs.

I'm currently leaning towards rewriting model.matrix.default so that it
removes offending terms rather than give an error, but if someone has done
this already (or something more elegant), that would of course be preferred
:)
--Robert

On Thu, Mar 10, 2016 at 7:39 PM, David Winsemius 
wrote:

>
> > On Mar 10, 2016, at 2:00 PM, Robert McGehee  wrote:
> >
> > Hello R-helpers,
> > I'd like a function that given an arbitrary formula and a data frame
> > returns the residual of the dependent variable,and maintains all NA
> values.
>
> What does "maintains all NA values" actually mean?
> >
> > Here's an example that will give me what I want if my formula is
> y~x1+x2+x3
> > and my data frame is df:
> >
> > resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
> >
> > Here's the catch, I do not want my function to ever fail due to a factor
> > with only one level. A one-level factor may appear because 1) the user
> > passed it in, or 2) (more common) only one factor in a term is left after
> > na.exclude removes the other NA values.
> >
> > Here is the error I would get
>
> From what code?
>
>
> > above if one of the terms was a factor with
> > one level:
> > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
> >  contrasts can be applied only to factors with 2 or more levels
>
> Unable to create that error with the actions you describe but do not
> actually offer in coded form:
>
>
> > dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=TRUE, x3=rnorm(10))
> > lm(y~x1+x2+x3, dfrm)
>
> Call:
> lm(formula = y ~ x1 + x2 + x3, data = dfrm)
>
> Coefficients:
> (Intercept)           x1       x2TRUE           x3
>    -0.16274     -0.30032           NA     -0.09093
>
> > resid(lm(y~x1+x2+x3, data=dfrm, na.action=na.exclude))
>           1           2           3           4           5           6
> -0.16097245  0.65408508 -0.70098223 -0.15360434  1.26027872  0.55752239
>           7           8           9          10
> -0.05965653 -2.17480605  1.42917190 -0.65103650
>
> >
>
>
> > Instead of giving me an error, I'd like the function to do just what lm()
> > normally does when it sees a variable with no variance, ignore the
> variable
> > (coefficient is NA) and continue to regress out all the other variables.
> > Thus if 'x2' is a factor with one variable in the above example, I'd like
> > the function to return the result of:
> > resid(lm(y~x1+x3, data=df, na.action=na.exclude))
> > Can anyone provide me a straight forward recommendation for how to do
> this?
> > I feel like it should be easy, but I'm honestly stuck, and my Google
> > searching for this hasn't gotten anywhere. The key is that I'd like the
> > solution to be generic enough to work with an arbitrary linear formula,
> and
> > not substantially kludgy (like trying ever combination of regressions
> terms
> > until one works) as I'll be running this a lot on big data sets and don't
> > want my computation time swamped by running unnecessary regressions or
> > checking for number of factors after removing NAs.
> >
> > Thanks in advance!
> > --Robert
> >
> >
> > PS. The Google search feature in the R-help archives appears to be down:
> > http://tolstoy.newcastle.edu.au/R/
>
> It's working for me.
>
> >
>
> David Winsemius
> Alameda, CA, USA
>
>



Re: [R] Regression with factor having 1 level

2016-03-10 Thread David Winsemius

> On Mar 10, 2016, at 2:00 PM, Robert McGehee  wrote:
> 
> Hello R-helpers,
> I'd like a function that given an arbitrary formula and a data frame
> returns the residual of the dependent variable,and maintains all NA values.

What does "maintains all NA values" actually mean?
> 
> Here's an example that will give me what I want if my formula is y~x1+x2+x3
> and my data frame is df:
> 
> resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
> 
> Here's the catch, I do not want my function to ever fail due to a factor
> with only one level. A one-level factor may appear because 1) the user
> passed it in, or 2) (more common) only one factor in a term is left after
> na.exclude removes the other NA values.
> 
> Here is the error I would get

From what code?


> above if one of the terms was a factor with
> one level:
> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>  contrasts can be applied only to factors with 2 or more levels

Unable to create that error with the actions you describe but do not actually
offer in coded form:


> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=TRUE, x3=rnorm(10))
> lm(y~x1+x2+x3, dfrm)

Call:
lm(formula = y ~ x1 + x2 + x3, data = dfrm)

Coefficients:
(Intercept)           x1       x2TRUE           x3
   -0.16274     -0.30032           NA     -0.09093

> resid(lm(y~x1+x2+x3, data=dfrm, na.action=na.exclude))
          1           2           3           4           5           6
-0.16097245  0.65408508 -0.70098223 -0.15360434  1.26027872  0.55752239
          7           8           9          10
-0.05965653 -2.17480605  1.42917190 -0.65103650

> 


> Instead of giving me an error, I'd like the function to do just what lm()
> normally does when it sees a variable with no variance, ignore the variable
> (coefficient is NA) and continue to regress out all the other variables.
> Thus if 'x2' is a factor with one variable in the above example, I'd like
> the function to return the result of:
> resid(lm(y~x1+x3, data=df, na.action=na.exclude))
> Can anyone provide me a straight forward recommendation for how to do this?
> I feel like it should be easy, but I'm honestly stuck, and my Google
> searching for this hasn't gotten anywhere. The key is that I'd like the
> solution to be generic enough to work with an arbitrary linear formula, and
> not substantially kludgy (like trying ever combination of regressions terms
> until one works) as I'll be running this a lot on big data sets and don't
> want my computation time swamped by running unnecessary regressions or
> checking for number of factors after removing NAs.
> 
> Thanks in advance!
> --Robert
> 
> 
> PS. The Google search feature in the R-help archives appears to be down:
> http://tolstoy.newcastle.edu.au/R/

It's working for me.

> 

David Winsemius
Alameda, CA, USA



Re: [R] Error in make.names(col.names, unique = TRUE) : invalid multibyte string at '14 <4a>ULY 2012'

2016-03-10 Thread Dalthorp, Daniel
Hi Ken,
Without seeing your .csv file or how you are trying to read it, it's tough
to diagnose the trouble. I inserted commas between the columns in your data
snippet, pasted into Excel, saved as .csv file called "datesfile.csv" in
the R working directory. Then, the following worked fine for me:

junk<-read.csv("datesfile.csv", header = TRUE)
junk # is a dataframe with headers Gender, DOB, etc.

# Age at screening (in days):
as.Date(junk$Screen.Date, format="%d %B %Y") -
  as.Date(junk$DOB, format="%d %B %Y")

# Age at screening (in years):
as.numeric(as.Date(junk$Screen.Date, format="%d %B %Y") -
  as.Date(junk$DOB, format="%d %B %Y"))/365.2425
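
For anyone who wants to try the same calculation without creating a file, here is a self-contained sketch. It makes a few assumptions that are not in Ken's post: the rows are passed inline through the `text` argument of read.csv(), only three of the rows are used, and the LC_TIME locale is set to "C" so that the English month names are recognised by `%B`:

```r
# Month names are parsed according to the LC_TIME locale; "C" gives English names
invisible(Sys.setlocale("LC_TIME", "C"))

# A few rows of the posted data, with commas inserted between the columns
csv_text <- "Gender,DOB,Diagnosis,Screen Date
Male,14 JULY 2012,No,05 OCTOBER 2015
Female,31 OCTOBER 2009,No,30 NOVEMBER 2015
Male,04 JUNE 2011,NA,11 JANUARY 2016"

# read.csv() turns the header "Screen Date" into the column name Screen.Date
junk <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

dob    <- as.Date(junk$DOB,         format = "%d %B %Y")
screen <- as.Date(junk$Screen.Date, format = "%d %B %Y")

age_days  <- as.numeric(screen - dob)   # difference of two Dates is in days
age_years <- age_days / 365.2425
```

If any element of `dob` or `screen` comes back as NA, the month name was probably not matched in the current locale, or the text contains stray bytes.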

I hope this helps.

-Dan


On Thu, Mar 10, 2016 at 11:34 AM, KMNanus  wrote:

> I’m trying to read in the data below from an Excel file (as a .csv file)
> in  order to create an age (in years.%years) but am getting the error
> message in the subject line.
>
> I’ve tried saving the dates as dates in Excel and tried saving the dates
> as text, both give me the same error message.  Can someone pls tell me what
> I’m doing wrong?
>
> Gender   DOB               Diagnosis   Screen Date
> Male     14 JULY 2012      No          05 OCTOBER 2015
> Female   31 OCTOBER 2009   No          30 NOVEMBER 2015
> Female   08 JULY 2009      No          06 DECEMBER 2015
> Male     04 JUNE 2011      NA          11 JANUARY 2016
> Female   21 AUGUST 2009    Yes         01 FEBRUARY 2016
> Male     05 NOVEMBER 2007  No          16 FEBRUARY 2016
> Male     01 JUNE 2009      NA          29 FEBRUARY 2016
>
>
>
> Ken
> kmna...@gmail.com
> 914-450-0816 (tel)
> 347-730-4813 (fax)
>
>
>




-- 
Dan Dalthorp, PhD
USGS Forest and Rangeland Ecosystem Science Center
Forest Sciences Lab, Rm 189
3200 SW Jefferson Way
Corvallis, OR 97331
ph: 541-750-0953
ddalth...@usgs.gov


Re: [R] Regression with factor having 1 level

2016-03-10 Thread Ben Bolker
Robert McGehee writes:

> 
> Hello R-helpers,
> I'd like a function that given an arbitrary formula and a data frame
> returns the residual of the dependent variable, and maintains all
>  NA values.
> 
> Here's an example that will give me what I want if my formula is y~x1+x2+x3
> and my data frame is df:
> 
> resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
> 
> Here's the catch, I do not want my function to ever fail due to a factor
> with only one level. A one-level factor may appear because 1) the user
> passed it in, or 2) (more common) only one factor in a term is left after
> na.exclude removes the other NA values.
> 

 [snip to try to make Gmane happy]
> 
> Can anyone provide me a straight forward recommendation for how 
> to do this?

  The only approach I can think of is to screen for single-level factors
yourself and remove these factors from the formula. It's a little tricky:
you can't call model.matrix() with a single-level factor (that's where the
error comes from), and you have to strip out NA values yourself so you can
see which factors end up with only a single level after NA removal.
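
A minimal sketch of that screening idea follows. The helper name resid_skip_single_level() is made up for illustration and is not an existing function. It assumes every variable in the formula is a column of the data frame (so no `.` in the formula and no variables picked up from the environment), and it treats character columns like factors:

```r
# Hypothetical helper (not from the thread): residuals of lm() after removing
# terms that involve a factor left with fewer than two levels.
resid_skip_single_level <- function(formula, data) {
  vars <- all.vars(formula)

  # Rows lm() would keep: complete cases over the variables in the formula
  ok <- complete.cases(data[, vars, drop = FALSE])

  # Factor (or character) variables with < 2 distinct values in those rows
  bad <- vars[vapply(vars, function(v) {
    x <- data[[v]]
    (is.factor(x) || is.character(x)) && length(unique(x[ok])) < 2
  }, logical(1))]

  # Drop every term that mentions an offending variable
  labels <- attr(terms(formula), "term.labels")
  drop <- labels[vapply(labels, function(l) {
    any(all.vars(parse(text = l)) %in% bad)
  }, logical(1))]
  if (length(drop) > 0) {
    formula <- update(formula, paste(". ~ . -", paste(drop, collapse = " - ")))
  }

  resid(lm(formula, data = data, na.action = na.exclude))
}

# Robert's example: level "B" of x2 only occurs in the row where x1 is NA
df <- data.frame(y  = c(0, 2, 4, 6, 8),
                 x1 = c(1, 1, 2, 2, NA),
                 x2 = factor(c("A", "A", "A", "A", "B")))
r <- resid_skip_single_level(y ~ x1 + x2, df)
r
```

On Robert's example this should reduce the model to y ~ x1 and return the residuals he asked for, with NA in the fifth position. The check costs one pass over the columns rather than any extra regressions. One caveat: once a term is dropped, rows that were incomplete only because of that term are used in the fit again.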



Re: [R] Prediction from a rank deficient fit may be misleading

2016-03-10 Thread David Winsemius

> On Mar 10, 2016, at 2:21 PM, Michael Artz  wrote:
> 
> Here is the results of the logistic regression model.  Is it because of the
> NA values?

It's unclear. The InternetServiceNo (and other "No") values could well be the
cause. Questionnaires often get encoded in a manner that causes complete
collinearity; the glm function then "aliases" those levels and displays an
NA result for the coefficients. I don't remember the predict function then
emitting that warning, but it seems possible that including column names for
aliased factors would be well-mannered behavior for software. At any rate, I
don't see the absurd sorts of coefficients (such as 10 or 20) that I associate
with severe numerical pathology.
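
To illustrate the aliasing described above, here is a toy sketch. The variables `internet` and `backup` are simulated stand-ins for columns like InternetService and OnlineBackup; they are not the poster's churn_training data. It assumes the same kind of coding, where a "No internet service" answer occurs exactly when the internet variable is "No":

```r
set.seed(1)
n <- 200
internet <- factor(sample(c("DSL", "Fiber optic", "No"), n, replace = TRUE))
backup <- factor(ifelse(internet == "No", "No internet service",
                        sample(c("No", "Yes"), n, replace = TRUE)))
y <- rbinom(n, 1, 0.4)

# Crosstab: the "No internet service" level occurs only when internet == "No"
xt <- table(internet, backup)

fit <- glm(y ~ internet + backup, family = binomial(link = "logit"))

# Coefficients reported as NA are the aliased ones
aliased <- names(coef(fit))[is.na(coef(fit))]

# Rank deficiency of the design matrix: number of columns minus numerical rank
X <- model.matrix(fit)
rank_deficiency <- ncol(X) - qr(X)$rank
```

Here one dummy column duplicates another, so one coefficient comes back as NA. The crosstab shows which pair of levels is responsible, which is usually quicker to read than a condition number.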


> 
> Call:
> glm(formula = TARGET_A ~ Contract + Dependents + DeviceProtection +
>     gender + InternetService + MonthlyCharges + MultipleLines +
>     OnlineBackup + OnlineSecurity + PaperlessBilling + Partner +
>     PaymentMethod + PhoneService + SeniorCitizen + StreamingMovies +
>     StreamingTV + TechSupport + tenure + TotalCharges,
>     family = binomial(link = "logit"), data = churn_training)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -1.8943  -0.6867  -0.2863   0.7378   3.4259
>
> Coefficients: (7 not defined because of singularities)
>                                        Estimate Std. Error z value Pr(>|z|)
> (Intercept)                           1.0664928  1.7195494   0.620   0.5351
> ContractOne year                     -0.6874005  0.1314227  -5.230 1.69e-07 ***
> ContractTwo year                     -1.2775385  0.2101193  -6.080 1.20e-09 ***
> DependentsYes                        -0.1485301  0.1095348  -1.356   0.1751
> DeviceProtectionNo internet service  -1.5547306  0.9661837  -1.609   0.1076
> DeviceProtectionYes                   0.0459115  0.2114253   0.217   0.8281
> genderMale                           -0.0350970  0.0776896  -0.452   0.6514
> InternetServiceFiber optic            1.4800374  0.9545398   1.551   0.1210
> InternetServiceNo                            NA         NA      NA       NA
> MonthlyCharges                       -0.0324614  0.0379646  -0.855   0.3925
> MultipleLinesNo phone service         0.0808745  0.7736359   0.105   0.9167
> MultipleLinesYes                      0.3990450  0.2131343   1.872   0.0612 .
> OnlineBackupNo internet service              NA         NA      NA       NA
> OnlineBackupYes                      -0.0328892  0.2081145  -0.158   0.8744
> OnlineSecurityNo internet service            NA         NA      NA       NA
> OnlineSecurityYes                    -0.2760602  0.2132917  -1.294   0.1956
> PaperlessBillingYes                   0.3509944  0.0890884   3.940 8.15e-05 ***
> PartnerYes                            0.0306815  0.0940650   0.326   0.7443
> PaymentMethodCredit card (automatic) -0.0710923  0.1377252  -0.516   0.6057
> PaymentMethodElectronic check         0.3074078  0.1137939   2.701   0.0069 **
> PaymentMethodMailed check            -0.0201076  0.1377539  -0.146   0.8839
> PhoneServiceYes                              NA         NA      NA       NA
> SeniorCitizen                         0.1856454  0.1023527   1.814   0.0697 .
> StreamingMoviesNo internet service           NA         NA      NA       NA
> StreamingMoviesYes                    0.5260087  0.3899615   1.349   0.1774
> StreamingTVNo internet service               NA         NA      NA       NA
> StreamingTVYes                        0.4781321  0.3905777   1.224   0.2209
> TechSupportNo internet service               NA         NA      NA       NA
> TechSupportYes                       -0.2511197  0.2181612  -1.151   0.2497
> tenure                               -0.0702813  0.0077113  -9.114  < 2e-16 ***
> TotalCharges                          0.0004276  0.0000874   4.892 9.97e-07 ***
> 
> On Thu, Mar 10, 2016 at 4:05 PM, David Winsemius 
> wrote:
> 
>> 
>>> On Mar 10, 2016, at 8:08 AM, Michael Artz 
>> wrote:
>>> 
>>> HI all,
>>> I have the following error -
>>>> resultVector <- predict(logitregressmodel, dataset1, type='response')
>>> Warning message:
>>> In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type == :
>>> prediction from a rank-deficient fit may be misleading
>> 
>> It wasn't an R error. It was an R warning. Was the `summary` output on
>> logitregressmodel informative? Does the resultVector look sensible given
>> its inputs?
>> 
>> 
>>> I have seen on internet that there may be some collinearity in the data
>> and
>>> this is causing that.  How can I be sure?
>> 
>> Do some diagnostics. After looking carefully at the output of
>> summary(logitregressmodel)  and perhaps summary(dataset1) if it was the
>> original input to the modeling functions, and then you could move on to
>> looking at cross-correlations on things you think are continuous 

Re: [R] Prediction from a rank deficient fit may be misleading

2016-03-10 Thread Michael Artz
Here is the results of the logistic regression model.  Is it because of the
NA values?

Call:
glm(formula = TARGET_A ~ Contract + Dependents + DeviceProtection +
    gender + InternetService + MonthlyCharges + MultipleLines +
    OnlineBackup + OnlineSecurity + PaperlessBilling + Partner +
    PaymentMethod + PhoneService + SeniorCitizen + StreamingMovies +
    StreamingTV + TechSupport + tenure + TotalCharges,
    family = binomial(link = "logit"), data = churn_training)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.8943  -0.6867  -0.2863   0.7378   3.4259

Coefficients: (7 not defined because of singularities)
                                       Estimate Std. Error z value Pr(>|z|)
(Intercept)                           1.0664928  1.7195494   0.620   0.5351
ContractOne year                     -0.6874005  0.1314227  -5.230 1.69e-07 ***
ContractTwo year                     -1.2775385  0.2101193  -6.080 1.20e-09 ***
DependentsYes                        -0.1485301  0.1095348  -1.356   0.1751
DeviceProtectionNo internet service  -1.5547306  0.9661837  -1.609   0.1076
DeviceProtectionYes                   0.0459115  0.2114253   0.217   0.8281
genderMale                           -0.0350970  0.0776896  -0.452   0.6514
InternetServiceFiber optic            1.4800374  0.9545398   1.551   0.1210
InternetServiceNo                            NA         NA      NA       NA
MonthlyCharges                       -0.0324614  0.0379646  -0.855   0.3925
MultipleLinesNo phone service         0.0808745  0.7736359   0.105   0.9167
MultipleLinesYes                      0.3990450  0.2131343   1.872   0.0612 .
OnlineBackupNo internet service              NA         NA      NA       NA
OnlineBackupYes                      -0.0328892  0.2081145  -0.158   0.8744
OnlineSecurityNo internet service            NA         NA      NA       NA
OnlineSecurityYes                    -0.2760602  0.2132917  -1.294   0.1956
PaperlessBillingYes                   0.3509944  0.0890884   3.940 8.15e-05 ***
PartnerYes                            0.0306815  0.0940650   0.326   0.7443
PaymentMethodCredit card (automatic) -0.0710923  0.1377252  -0.516   0.6057
PaymentMethodElectronic check         0.3074078  0.1137939   2.701   0.0069 **
PaymentMethodMailed check            -0.0201076  0.1377539  -0.146   0.8839
PhoneServiceYes                              NA         NA      NA       NA
SeniorCitizen                         0.1856454  0.1023527   1.814   0.0697 .
StreamingMoviesNo internet service           NA         NA      NA       NA
StreamingMoviesYes                    0.5260087  0.3899615   1.349   0.1774
StreamingTVNo internet service               NA         NA      NA       NA
StreamingTVYes                        0.4781321  0.3905777   1.224   0.2209
TechSupportNo internet service               NA         NA      NA       NA
TechSupportYes                       -0.2511197  0.2181612  -1.151   0.2497
tenure                               -0.0702813  0.0077113  -9.114  < 2e-16 ***
TotalCharges                          0.0004276  0.0000874   4.892 9.97e-07 ***

On Thu, Mar 10, 2016 at 4:05 PM, David Winsemius 
wrote:

>
> > On Mar 10, 2016, at 8:08 AM, Michael Artz 
> wrote:
> >
> > HI all,
> > I have the following error -
> >> resultVector <- predict(logitregressmodel, dataset1, type='response')
> > Warning message:
> > In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type == :
> >  prediction from a rank-deficient fit may be misleading
>
> It wasn't an R error. It was an R warning. Was the `summary` output on
> logitregressmodel informative? Does the resultVector look sensible given
> its inputs?
>
>
> > I have seen on internet that there may be some collinearity in the data
> and
> > this is causing that.  How can I be sure?
>
> Do some diagnostics. After looking carefully at the output of
> summary(logitregressmodel)  and perhaps summary(dataset1) if it was the
> original input to the modeling functions, and then you could move on to
> looking at cross-correlations on things you think are continuous and
> crosstabs on factor variables and the condition number on the full data
> matrix.
>
> Lots of stuff turns up on search for "detecting collinearity condition
> number in r"
>
> >
> > Thanks
> >
>
> David Winsemius
> Alameda, CA, USA
>
>


Re: [R] extracting months from a data

2016-03-10 Thread KMNanus
Thanks for this.  I wasn’t familiar with paste0, so I’ll call that and see if 
it works.

Ken
kmna...@gmail.com
914-450-0816 (tel)
347-730-4813 (fax)



> On Mar 9, 2016, at 7:15 PM, Dalthorp, Daniel  wrote:
> 
> Or: 
> 
> x <- c( "3-Oct", "10-Nov" )
> format(as.Date(paste0(x,rep("-1970",length(x))),format='%d-%b-%Y'),'%b')
> 
> # the 'paste0' appends a year to the text vector
> # the 'as.Date' interprets the strings as dates with format  10-Jun-2016 
> (e.g.)
> # the 'format' returns a string with date in format '%b' (which is just the 
> name of the month)
> 
> On Wed, Mar 9, 2016 at 3:52 PM, Jeff Newmiller  > wrote:
> Your dates are incomplete (no year) so I suggest staying away from the date 
> functions for this. Read ?regex and ?sub.
> 
> x <- c( "3-Oct", "10-Nov" )
> m <- sub( "^\\d+-([A-Za-z]{3})$", "\\1", x )
> 
> --
> Sent from my phone. Please excuse my brevity.
> 
> On March 9, 2016 10:14:25 AM PST, KMNanus  > wrote:
> >I have a series of dates in  format 3-Oct, 10-Oct, 20-Oct, etc.
> >
> >I want to create a variable of just the month.  If I convert the date
> >to a character string, substr is ineffective because some of the dates
> >have 5 characters (3-Oct) and some have 6 (10-Oct).
> >
> >Is there a date function that accomplishes this easily?
> >
> >Ken
> >kmna...@gmail.com 
> >914-450-0816 (tel)
> >347-730-4813 (fax)
> >
> >
> >
> 
> 
> 
> 
> -- 
> Dan Dalthorp, PhD
> USGS Forest and Rangeland Ecosystem Science Center
> Forest Sciences Lab, Rm 189
> 3200 SW Jefferson Way 
> Corvallis, OR 97331 
> ph: 541-750-0953
> ddalth...@usgs.gov 
> 


[R] Regression with factor having 1 level

2016-03-10 Thread Robert McGehee
Hello R-helpers,
I'd like a function that given an arbitrary formula and a data frame
returns the residual of the dependent variable, and maintains all NA values.

Here's an example that will give me what I want if my formula is y~x1+x2+x3
and my data frame is df:

resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))

Here's the catch, I do not want my function to ever fail due to a factor
with only one level. A one-level factor may appear because 1) the user
passed it in, or 2) (more common) only one factor in a term is left after
na.exclude removes the other NA values.

Here is the error I would get above if one of the terms was a factor with
one level:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

Instead of giving me an error, I'd like the function to do just what lm()
normally does when it sees a variable with no variance: ignore the variable
(coefficient is NA) and continue to regress out all the other variables.
Thus if 'x2' is a factor with one level in the above example, I'd like
the function to return the result of:
resid(lm(y~x1+x3, data=df, na.action=na.exclude))

Can anyone provide me a straightforward recommendation for how to do this?
I feel like it should be easy, but I'm honestly stuck, and my Google
searching for this hasn't gotten anywhere. The key is that I'd like the
solution to be generic enough to work with an arbitrary linear formula, and
not substantially kludgy (like trying every combination of regression terms
until one works) as I'll be running this a lot on big data sets and don't
want my computation time swamped by running unnecessary regressions or
checking for the number of factor levels after removing NAs.
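
One possible shape for such a function is sketched below. It is only a sketch under assumptions: the function name is made up, the formula is assumed to use plain column names (no "." and no factor(x) wrappers), and it does make one pass over the complete cases to count distinct values, which the post hoped to avoid.

```r
# Hypothetical helper: drop terms that involve a one-level factor, then fit.
resid_skip_constant_factors <- function(formula, data) {
  vars <- all.vars(formula)
  # Rows lm() would keep once NAs in any model variable are dropped.
  keep <- stats::complete.cases(data[, vars, drop = FALSE])
  used <- data[keep, vars, drop = FALSE]

  # Factor or character variables with fewer than two distinct values left.
  constant <- vapply(used, function(col) {
    (is.factor(col) || is.character(col)) && length(unique(col)) < 2L
  }, logical(1))
  bad <- names(used)[constant]

  if (length(bad) > 0) {
    fac <- attr(stats::terms(formula, data = data), "factors")
    if (is.matrix(fac)) {
      # Terms (columns) that use any of the constant variables (rows).
      hit <- colSums(fac[rownames(fac) %in% bad, , drop = FALSE]) > 0
      if (any(hit)) {
        drop <- paste(colnames(fac)[hit], collapse = " - ")
        formula <- stats::update(formula, paste(". ~ . -", drop))
      }
    }
  }
  stats::resid(stats::lm(formula, data = data, na.action = stats::na.exclude))
}

# Made-up data: x2 keeps a single level once the NA row is excluded.
set.seed(1)
df <- data.frame(y = rnorm(10), x1 = rnorm(10),
                 x2 = factor(c(rep("a", 9), NA)), x3 = rnorm(10))
df$y[3] <- NA
r <- resid_skip_constant_factors(y ~ x1 + x2 + x3, df)
expected <- resid(lm(y ~ x1 + x3, data = df, na.action = na.exclude))
```

In this sketch the dropped variable no longer contributes NA rows, so the result matches the y~x1+x3 fit requested above rather than a fit on the smaller complete-case set.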

Thanks in advance!
--Robert


PS. The Google search feature in the R-help archives appears to be down:
http://tolstoy.newcastle.edu.au/R/



Re: [R] R related issue

2016-03-10 Thread Nordlund, Dan (DSHS/RDA)
You haven't provided sufficient information for people to help you.   Please 
read the posting guide linked to at the bottom of this email.  We need a 
reproducible example.  You say you tried to discretize but it didn't work.  
What did you try (actual code please), and what error messages did you receive?
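
As a rough illustration of what the error message is asking for: every column has to be a factor (or logical) before the data frame is handed on. The data frame below is an invented stand-in for rs2, and the cut points are arbitrary assumptions.

```r
# Invented stand-in for the result of dbGetQuery(); real columns are unknown.
rs2 <- data.frame(
  basket_size = c(1, 4, 7, 2, 9, 5),
  store = c("north", "south", "north", "east", "south", "east"),
  stringsAsFactors = FALSE
)

rs2_disc <- rs2
# Numeric column: bin into labelled intervals (breaks chosen arbitrarily).
rs2_disc$basket_size <- cut(rs2$basket_size,
                            breaks = c(0, 3, 6, Inf),
                            labels = c("small", "medium", "large"))
# Character column: convert to a factor directly.
rs2_disc$store <- factor(rs2$store)

str(rs2_disc)
# With the arules package attached, the next step would presumably be:
#   trans <- as(rs2_disc, "transactions")
#   rules <- apriori(trans)
```

Whether binning or a plain factor() conversion is right depends on what the columns mean, which is why the actual code and data are needed.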

Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Santanu
> Mukherjee
> Sent: Thursday, March 10, 2016 8:54 AM
> To: r-help@r-project.org
> Subject: [R] R related issue
> 
> Hi,
> I have R and MySQL at the backend. I have used dbGetQuery to get the rows
> I want and put it in a data.frame rs2.
> Now I want to use that data.frame to do market basket analysis using
> apriori, but it is giving me this error:
> Error in asMethod(object) : column(s) 1, 2 not logical or a factor.
> Discretize the columns first
> 
> I tried to discretize, but it did not work.
> 



[R] Error in make.names(col.names, unique = TRUE) : invalid multibyte string at '14 <4a>ULY 2012'

2016-03-10 Thread KMNanus
I’m trying to read in the data below from an Excel file (saved as a .csv file)
in order to compute an age (in fractional years), but I am getting the error
message in the subject line.

I’ve tried saving the dates as dates in Excel and tried saving the dates as 
text; both give me the same error message.  Can someone please tell me what I’m 
doing wrong?

Gender  DOB               Diagnosis  Screen Date
Male    14 JULY 2012      No         05 OCTOBER 2015
Female  31 OCTOBER 2009   No         30 NOVEMBER 2015
Female  08 JULY 2009      No         06 DECEMBER 2015
Male    04 JUNE 2011      NA         11 JANUARY 2016
Female  21 AUGUST 2009    Yes        01 FEBRUARY 2016
Male    05 NOVEMBER 2007  No         16 FEBRUARY 2016
Male    01 JUNE 2009      NA         29 FEBRUARY 2016
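
An "invalid multibyte string" error usually means the bytes in the file do not match the encoding R expects, so one thing to try is naming the encoding explicitly. The sketch below is a guess at the workflow: the temp file stands in for the real export, "latin1" is an assumed encoding, and parse_dmy() is a made-up helper that avoids locale-dependent month parsing.

```r
# Stand-in for the exported file; the real file's encoding is unknown.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Gender,DOB,Diagnosis,Screen Date",
  "Male,14 JULY 2012,No,05 OCTOBER 2015",
  "Female,31 OCTOBER 2009,No,30 NOVEMBER 2015",
  "Male,04 JUNE 2011,NA,11 JANUARY 2016"
), path)

# fileEncoding tells read.csv how to decode the bytes; "latin1" is a guess.
dat <- read.csv(path, fileEncoding = "latin1", stringsAsFactors = FALSE)

# Hypothetical helper: "14 JULY 2012" -> Date, matching English month names.
parse_dmy <- function(x) {
  parts <- strsplit(trimws(x), "\\s+")
  day  <- vapply(parts, `[`, character(1), 1)
  mon  <- match(tolower(vapply(parts, `[`, character(1), 2)),
                tolower(month.name))
  year <- vapply(parts, `[`, character(1), 3)
  as.Date(sprintf("%s-%02d-%s", year, mon, day), format = "%Y-%m-%d")
}

dob    <- parse_dmy(dat$DOB)
screen <- parse_dmy(dat$Screen.Date)   # read.csv turns "Screen Date" into Screen.Date
dat$age_years <- as.numeric(screen - dob) / 365.25
print(dat)
```

If "latin1" does not help, other candidates worth trying are "UTF-8-BOM" or "UTF-16", depending on how Excel wrote the file.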


 
Ken
kmna...@gmail.com
914-450-0816 (tel)
347-730-4813 (fax)




Re: [R] Prediction from a rank deficient fit may be misleading

2016-03-10 Thread David Winsemius

> On Mar 10, 2016, at 8:08 AM, Michael Artz  wrote:
> 
> HI all,
> I have the following error -
>> resultVector <- predict(logitregressmodel, dataset1, type='response')
> Warning message:
> In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type ==  :
>  prediction from a rank-deficient fit may be misleading

It wasn't an R error. It was an R warning. Was the `summary` output on 
logitregressmodel informative? Does the resultVector look sensible given its 
inputs?


> I have seen on internet that there may be some collinearity in the data and
> this is causing that.  How can I be sure?

Do some diagnostics. Start by looking carefully at the output of 
summary(logitregressmodel), and perhaps summary(dataset1) if it was the 
original input to the modeling function. Then you could move on to looking 
at cross-correlations of the variables you think are continuous, crosstabs of 
the factor variables, and the condition number of the full data matrix.

Lots of stuff turns up on search for "detecting collinearity condition number 
in r"
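
A minimal sketch of those checks, on simulated data in which one predictor is an exact sum of two others. The object names are borrowed from the question, but the data and model here are invented.

```r
set.seed(42)
n <- 200
dataset1 <- data.frame(a = rnorm(n), b = rnorm(n))
dataset1$c <- dataset1$a + dataset1$b            # exact linear combination
dataset1$y <- rbinom(n, 1, plogis(0.5 * dataset1$a))

logitregressmodel <- glm(y ~ a + b + c, data = dataset1, family = binomial)

# 1. Coefficients reported as NA were dropped as aliased (redundant).
aliased <- names(which(is.na(coef(logitregressmodel))))

# 2. Rank of the design matrix compared with its number of columns.
X <- model.matrix(logitregressmodel)
rank_X <- qr(X)$rank

# 3. Condition number; very large values indicate near-collinearity.
cond <- kappa(X, exact = TRUE)

# 4. Pairwise correlations among the continuous predictors.
cors <- cor(dataset1[, c("a", "b", "c")])

print(aliased); print(c(rank = rank_X, columns = ncol(X))); print(cond)
```

NA coefficients in the summary are the most direct sign; the condition number is more useful when the collinearity is only approximate.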

> 
> Thanks
> 

David Winsemius
Alameda, CA, USA



Re: [R] Conversion problem with write.csv and as.character()

2016-03-10 Thread Jim Lemon
Hi Peter,
Have you tried:

a_fetchdata<-format(a_fetchdata,"%Y-%m-%d %H:%M:%S")

before writing the data?

Jim
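
The general idea, formatting the timestamps as text and writing them as an ordinary column, can be sketched in base R as below. The data are invented. For an xts object the timestamps and values would presumably come from zoo's index() and coredata() instead, which is an assumption about the poster's setup.

```r
# Invented timestamps and values standing in for the xts index and core data.
times <- as.POSIXct(c("2016-02-09 07:30:00", "2016-02-09 07:31:00"), tz = "UTC")
values <- matrix(c(1.5, 2.5, 3.5, 4.5), ncol = 2,
                 dimnames = list(NULL, c("open", "close")))

# Convert the date-times to character before they reach write.csv().
out <- data.frame(time = format(times, "%Y-%m-%d %H:%M:%S"), values)

path <- tempfile(fileext = ".csv")
write.csv(out, path, row.names = FALSE)

# Read back to confirm the first column holds the formatted date-times.
back <- read.csv(path, stringsAsFactors = FALSE)
print(back)
```

Writing row.names = FALSE avoids the 1 to x integer sequence, which is most likely what ended up in the first CSV column.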

On Thu, Mar 10, 2016 at 10:14 PM, Peter Neumaier
 wrote:
> Hi all, sorry for double/cross posting, I have sent an initial, similar
> question
> accidentally to r-sig-finance.
>
> I am writing a matrix (typeof = double) into a CSV file with write.csv.
>
> My first column of the matrix is a date in the form yyyy-mm-dd hh:mm:ss:
>
>> a_fetchdata[1,0]
>
> 2016-02-09 07:30:00
>> typeof(a_fetchdata[1,0])
> [1] "double"
>
> My CSV file contains a sequence of integers (from 1 to x) instead of the
> expected date.
>
> I tried to convert, but ran into "Error in dimnames":
>
>> as.character(first_fetchdata[1,0])
> Error in dimnames(cd) <- list(as.character(index(x)), colnames(x)) :
>   'dimnames' applied to non-array
> Called from: as.matrix.xts(x)
> Browse[1]> c
>>
>
> a) How can I prevent the conversion into integers to happen when writing
> into CSV?
> b) if a) is not do-able: how can I convert the date in double format to
> chars (i.e. with as.character) ?
>
> Thanks in advance,
> Peter
>



[R-es] R, CERN, C++: alternative and integration

2016-03-10 Thread Javier Marcuzzi
Dear all, I came across this by chance (I had no idea it existed). It is a
little off-topic for the list, but it may be useful for advanced users; the
organization behind it has a budget in both money and brains.

https://root.cern.ch/

C++ but integrated with other languages such as Python and R.

Javier Rubén Marcuzzi


[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] truncpareto() - doesn't like my data and odd error message

2016-03-10 Thread peter dalgaard
Actually, the issue is that the left hand side of the model formula should be 
the name of the variable under consideration. If you write y ~ 1 there had 
better be a "y", but without the renaming you could also have used 
H_to_fit.Height ~ 1.

-pd
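
To make the two requirements concrete (the left-hand side must name a column of the data, and the bounds must sit strictly outside the observed range), here is a sketch on simulated heights. The simulation parameters are invented, and the fit itself only runs if VGAM happens to be installed.

```r
set.seed(1)
# Simulated stand-in for H_to_fit$Height: truncated Pareto on [2000, 4800],
# shape 2, drawn by inverting the CDF.
L <- 2000; U <- 4800; k <- 2
u <- runif(500)
heights <- L * (1 - u * (1 - (L / U)^k))^(-1 / k)
pdataH <- data.frame(H_to_fit.Height = heights)

# Bounds must satisfy 0 < lower < min(y) and max(y) < upper, strictly.
lower <- min(pdataH$H_to_fit.Height) - 1
upper <- max(pdataH$H_to_fit.Height) + 1

if (requireNamespace("VGAM", quietly = TRUE)) {
  # The response name matches the column name, so no renaming to 'y' is needed.
  fit <- VGAM::vglm(H_to_fit.Height ~ 1, VGAM::truncpareto(lower, upper),
                    data = pdataH)
  print(VGAM::Coef(fit))
}
```

The offset of 1 is arbitrary; any value that keeps the inequalities strict should do.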


> On 10 Mar 2016, at 19:03 , John Hillier  wrote:
> 
> Dear Peter,
> 
> Thank you. Apologies for not looking closer.  It is the end of a long day. 
> Fixed now, and I have learnt more about correctly interpreting R's manual 
> pages.
> 
> For the record  
> 
> Summary: If input to truncpareto() is not explicitly called 'y' it can 
> produce error messages about the values 'lower', which might be confusing.  
> So, ensure input is called 'y', and that 'lower' and 'upper' are just outside 
> the range of y.
> 
> John
> 
> -
> Dr John Hillier
> Senior Lecturer - Physical Geography
> Loughborough University
> 01509 223727
> 
> 
> From: peter dalgaard 
> Sent: 10 March 2016 17:49
> To: John Hillier
> Cc: r-help@r-project.org
> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
> 
> Look closer
> 
> -pd
> 
> 
>> On 10 Mar 2016, at 18:41 , John Hillier  wrote:
>> 
>> Thank you Peter,
>> 
>> Yes, it seems to do the same even if I simultaneously make that change.  
>> Output below.
>> 
>>> pdataH <- data.frame(y = H_to_fit$Height)
>>> summary(pdataH)
>>  y
>> Min.   :2000
>> 1st Qu.:2281
>> Median :2666
>> Mean   :2825
>> 3rd Qu.:3212
>> Max.   :4794
>>> fit3 <- vglm(y ~ 1, truncpareto(1999, 4794), data = pdataH, trace = TRUE)
>> Error in eval(expr, envir, enclos) :
>> the value of argument 'upper' is too low (requires 'max(y) < upper')
>> 
>> -
>> Dr John Hillier
>> Senior Lecturer - Physical Geography
>> Loughborough University
>> 01509 223727
>> 
>> 
>> From: peter dalgaard 
>> Sent: 10 March 2016 09:36
>> To: John Hillier
>> Cc: r-help@r-project.org
>> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
>> 
>> Also if you simultaneously change the 2000 to say 1999?
>> 
>> -p
>> 
>> On 10 Mar 2016, at 09:22 , John Hillier  wrote:
>> 
>>> Thank you Peter,
>>> 
>>> I believe this might be the way the error message is hard coded (i.e. it's 
>>> always y to describe the input).  Anyway, I changed the first line to
>>>> pdataH <- data.frame(y = H_to_fit$Height)
>>> This makes the input 'y' instead of 'H_to_fit.Height', but makes no 
>>> difference to the outcome/error message.
>>> 
>>> John
>>> 
>>> -
>>> Dr John Hillier
>>> Senior Lecturer - Physical Geography
>>> Loughborough University
>>> 01509 223727
>>> 
>>> 
>>> From: peter dalgaard 
>>> Sent: 09 March 2016 19:58
>>> To: John Hillier
>>> Cc: r-help@r-project.org
>>> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
>>> 
>>>> On 09 Mar 2016, at 18:52 , John Hillier  wrote:
>>>> 
>>>> Dear All,
>>>> 
>>>> 
>>>> I am attempting to describe a distribution of height data.  It appears 
>>>> roughly linear on a log-log plot, so Pareto seems sensible.  However, the 
>>>> data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I 
>>>> would like to fit a Pareto distribution to the reliable (i.e. truncated) 
>>>> section of the data.
>>>> 
>>>> 
>>>> I found truncpareto(), and implemented one of its example uses 
>>>> successfully.  Specifically, the third one at 
>>>> http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).
>>>> 
>>>> 
>>>> When I try to run my data, I get the output below. Inputs shown with 
>>>> chevrons.
>>>> 
>>>> 
>>>>> pdataH <- data.frame(H_to_fit$Height)
>>>>> summary(pdataH)
>>>> H_to_fit.Height
>>>> Min.   :2000
>>>> 1st Qu.:2281
>>>> 
>>>> Median :2666
>>>> Mean   :2825
>>>> 3rd Qu.:3212
>>>> Max.   :4794
>>>>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
>>>> Error in eval(expr, envir, enclos) :
>>>> the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>>>> 
>>>> 
>>>> This is odd as the usage format is - truncpareto(lower, upper), and 
>>>> varying 2000 to 1900 and 2100 makes no difference. Neither do smaller or 
>>>> larger variations. From the summary I think that my lowest input is 2000, 
>>>> which I am taking as min(y). I have also played with the upper limit.  
>>>> pdataH has 2117 observations in it.
>>>> 
>>>> 
>>>> Is this a data format thing? i.e. of pdataH (a tried a few things, but to 
>>>> no avail)
>>>> 
>>> 
>>> Umm, it doesn't seem to have a column called "y"?
>>> 
>>> --
>>> Peter Dalgaard, Professor,
>>> Center for Statistics, Copenhagen Business School
>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>> Phone: (+45)38153501
>>> Office: A 4.23
>>> Email: pd@cbs.dk  Priv: 

Re: [R] How to avoid endless loop in shiny

2016-03-10 Thread Michael Peng
Hi Greg,

Isolate may not solve the problem. Take the following code: it runs only
1-2 times and then stops, even without isolate, because updating a slider
with the value it already has does not send a new message.

library("shiny")

ui <- fluidPage(

  titlePanel("Slider Test"),

  sidebarLayout(
    sidebarPanel(
      sliderInput("sliderA", "A:",
                  min = 2, max = 10, value = 10),

      sliderInput("sliderB", "B:",
                  min = 1, max = 5, value = 5)
    ),

    mainPanel(
      plotOutput("plot")
    )
  )
)

server <- function(input, output, clientData, session) {

  # When A changes, set B to half of A (truncated to an integer).
  observeEvent(input$sliderA, {
    updateSliderInput(session, "sliderB", value = as.integer(input$sliderA / 2))
  })

  # When B changes, set A to twice B.
  observeEvent(input$sliderB, {
    updateSliderInput(session, "sliderA", value = isolate(input$sliderB * 2))
  })
}


shinyApp(server = server, ui = ui)

For example:

set B: 4 -> update A: 8 -> update B: 4 (same, no message, end)

Changing A is trickier.
If the original A is 10 and B is 5:

set A: 9 -> update B: 4 (not same, continue) -> update A: 8 (not same,
continue) -> update B: 4 (same, no message, end)


If the original A is 8 and B is 4:
set A: 9 -> update B: 4 (same, no message, end)
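
The chains above can be replayed outside Shiny with a small simulation. This is only a model of the message passing under the assumption that an update to an unchanged value sends nothing; simulate_updates() is a made-up helper, not part of shiny.

```r
# Hypothetical helper: replay the A <-> B update chain until a value repeats.
simulate_updates <- function(a, b, changed = c("A", "B"), value) {
  changed <- match.arg(changed)
  a <- as.integer(a); b <- as.integer(b); value <- as.integer(value)
  steps <- character(0)
  if (changed == "A") a <- value else b <- value
  repeat {
    if (changed == "A") {
      new_b <- as.integer(a / 2)        # same rule as the sliderA observer
      if (new_b == b) break             # same value: no message, chain ends
      b <- new_b
      steps <- c(steps, sprintf("B=%d", b))
      changed <- "B"
    } else {
      new_a <- as.integer(b * 2)        # same rule as the sliderB observer
      if (new_a == a) break
      a <- new_a
      steps <- c(steps, sprintf("A=%d", a))
      changed <- "A"
    }
  }
  list(a = a, b = b, steps = steps)
}

from_10_5 <- simulate_updates(10, 5, "A", 9)   # user drags A to 9
from_8_4  <- simulate_updates(8, 4, "A", 9)    # same drag, other start
print(from_10_5$steps)
print(c(a = from_8_4$a, b = from_8_4$b))
```

In the first case A ends at 8 although the user chose 9; in the second it stays at 9. That overwrite of the user's own choice is what the guard flag in the next version is meant to prevent.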

isolate doesn't work in this case.

Instead, use the following code:

library("shiny")

ui <- fluidPage(

  titlePanel("Slider Test"),

  sidebarLayout(
    sidebarPanel(
      sliderInput("sliderA", "A:",
                  min = 2, max = 10, value = 10),

      sliderInput("sliderB", "B:",
                  min = 1, max = 5, value = 5)
    ),

    mainPanel(
      plotOutput("plot")
    )
  )
)

server <- function(input, output, clientData, session) {

  # Name of the slider whose next change was caused by our own update
  # and should therefore be ignored.
  ignoreNext <- ""

  observeEvent(input$sliderA, {
    if (ignoreNext == "A") {
      ignoreNext <<- ""
    }
    else {
      valB <- as.integer(input$sliderA / 2)
      if (valB != input$sliderB) {
        ignoreNext <<- "B"
        updateSliderInput(session, "sliderB", value = valB)
      }
    }
  })

  observeEvent(input$sliderB, {
    if (ignoreNext == "B") {
      ignoreNext <<- ""
    }
    else {
      # A is derived from B here (the original line used sliderA by mistake).
      valA <- as.integer(input$sliderB * 2)
      if (valA != input$sliderA) {
        ignoreNext <<- "A"
        updateSliderInput(session, "sliderA", value = valA)
      }
    }
  })
}


shinyApp(server = server, ui = ui)

2016-03-08 18:00 GMT-05:00 Greg Snow <538...@gmail.com>:

> You need to use `isolate` on one of the assignments so that it does
> not register as an update.  Here are a few lines of code from the
> server.R file for an example that I use that has a slider for r
> (correlation) and another slider for r^2 and whenever one is changed,
> I want the other to update:
>
>   observe({
> updateSliderInput(session, 'r',
> value=isolate(ifelse(input$r<0,-1,1))*sqrt(input$r2))
>   })
>
>   observe({
> updateSliderInput(session, 'r2', value=input$r^2)
>   })
>
>
> I did end up in a loop once when I happened to choose just the wrong
> value and the rounding caused a jumping back and forth, but all the
> other times this has worked perfectly without the endless loop.
>
>
> On Tue, Mar 8, 2016 at 12:35 PM, Michael Peng
>  wrote:
> > Hi,
> >
> > I added two sliderInput into the app with package "shiny": sliderA and
> > sliderB. The values in the two sliders are correlated. If I change
> sliderA,
> > I used updateSliderInput to update the value in sliderB. And also If I
> > change sliderB, I used  updateSliderInput to update the value in slideA.
> >
> > The problem is it is an endless loop. How can I use updateSliderInput
> > without sending message to update the other slider.
> >
> > Thank.
> >
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>



Re: [R] CANNOT RUN ctree

2016-03-10 Thread Sarah Goslee
In general, you should probably update your R installation.

In specific, you should install the missing package Formula, as the
error message is telling you.
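
A quick way to see which packages are missing before calling ctree() is sketched below. The package list is an assumption based on the call in the original post (Surv() comes from survival), and the install line is left commented out.

```r
# Packages the posted call appears to need (assumed from the error and the code).
needed <- c("partykit", "Formula", "survival")

# requireNamespace() returns FALSE instead of stopping when a package is absent.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)   # uncomment to install from CRAN
} else {
  message("All required packages are available.")
}
```

The "compiled with version R 3.1.3" line is only a warning; the missing Formula package is what stops the call.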

On Thu, Mar 10, 2016 at 2:51 PM, CHIRIBOGA Xavier
 wrote:
> Dear all,
> Sorry to ask again, but I am getting desperate.
> I have already installed "partykit", but there is still a message (see below):
> the package "partykit" has been compiled with the version R3.1.3.
> What should I do now?
>
> When I run my function, this appears:
>
> Error in loadNamespace(name) : there is no package called 'Formula'
>
> Thank you for your assistance,
>
> Xavier
>
>
> > install.packages("partykit")
> Installing package into 'C:/Users/chiribogax/Documents/R/win-library/3.1'
> (as 'lib' is unspecified)
> trying URL 'http://cran.rstudio.com/bin/windows/contrib/3.1/partykit_1.0-5.zip'
> Content type 'application/zip' length 1215105 bytes (1.2 Mb)
> opened URL
> downloaded 1.2 Mb
>
> package 'partykit' successfully unpacked and MD5 sums checked
>
> The downloaded binary packages are in
>         C:\Users\chiribogax\AppData\Local\Temp\RtmpyugH55\downloaded_packages
> > library("partykit")
> Loading required package: grid
> Warning message:
> package 'partykit' was built under R version 3.1.3
> > plot(ctree(Surv(hours,state)~soil+volatile, data=data))
> Error in loadNamespace(name) : there is no package called 'Formula'
>


[R] CANNOT RUN ctree

2016-03-10 Thread CHIRIBOGA Xavier
Dear all,
Sorry to ask again, but I am getting desperate.
I have already installed "partykit", but there is still a message (see below):
the package "partykit" has been compiled with the version R3.1.3.
What should I do now?

When I run my function, this appears:

Error in loadNamespace(name) : there is no package called 'Formula'

Thank you for your assistance,

Xavier


> install.packages("partykit")
Installing package into 'C:/Users/chiribogax/Documents/R/win-library/3.1'
(as 'lib' is unspecified)
trying URL 'http://cran.rstudio.com/bin/windows/contrib/3.1/partykit_1.0-5.zip'
Content type 'application/zip' length 1215105 bytes (1.2 Mb)
opened URL
downloaded 1.2 Mb

package 'partykit' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\chiribogax\AppData\Local\Temp\RtmpyugH55\downloaded_packages
> library("partykit")
Loading required package: grid
Warning message:
package 'partykit' was built under R version 3.1.3
> plot(ctree(Surv(hours,state)~soil+volatile, data=data))
Error in loadNamespace(name) : there is no package called 'Formula'



[R] R related issue

2016-03-10 Thread Santanu Mukherjee
Hi,
I have R and MySQL at the backend. I have used dbGetQuery to get the rows I
want and put it in a data.frame rs2.
Now I want to use that data.frame to do market basket analysis using apriori,
but it is giving me this error:
Error in asMethod(object) : column(s) 1, 2 not logical or a factor.
Discretize the columns first

I tried to discretize, but it did not work.



[R] Prediction from a rank deficient fit may be misleading

2016-03-10 Thread Michael Artz
HI all,
I have the following error -
  >  resultVector <- predict(logitregressmodel, dataset1, type='response')
Warning message:
In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type ==  :
  prediction from a rank-deficient fit may be misleading

I have seen on internet that there may be some collinearity in the data and
this is causing that.  How can I be sure?

Thanks



Re: [R] truncpareto() - doesn't like my data and odd error message

2016-03-10 Thread John Hillier
Dear Peter,

Thank you. Apologies for not looking closer.  It is the end of a long day. Fixed 
now, and I have learnt more about correctly interpreting R's manual pages.

For the record  

Summary: If input to truncpareto() is not explicitly called 'y' it can produce 
error messages about the values 'lower', which might be confusing.  So, ensure 
input is called 'y', and that 'lower' and 'upper' are just outside the range of 
y.

John

-
Dr John Hillier
Senior Lecturer - Physical Geography
Loughborough University
01509 223727


From: peter dalgaard 
Sent: 10 March 2016 17:49
To: John Hillier
Cc: r-help@r-project.org
Subject: Re: [R] truncpareto() - doesn't like my data and odd error message

Look closer

-pd


> On 10 Mar 2016, at 18:41 , John Hillier  wrote:
>
> Thank you Peter,
>
> Yes, it seems to do the same even if I simultaneously make that change.  
> Output below.
>
>> pdataH <- data.frame(y = H_to_fit$Height)
>> summary(pdataH)
>   y
> Min.   :2000
> 1st Qu.:2281
> Median :2666
> Mean   :2825
> 3rd Qu.:3212
> Max.   :4794
>> fit3 <- vglm(y ~ 1, truncpareto(1999, 4794), data = pdataH, trace = TRUE)
> Error in eval(expr, envir, enclos) :
>  the value of argument 'upper' is too low (requires 'max(y) < upper')
>
> -
> Dr John Hillier
> Senior Lecturer - Physical Geography
> Loughborough University
> 01509 223727
>
> 
> From: peter dalgaard 
> Sent: 10 March 2016 09:36
> To: John Hillier
> Cc: r-help@r-project.org
> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
>
> Also if you simultaneously change the 2000 to say 1999?
>
> -p
>
> On 10 Mar 2016, at 09:22 , John Hillier  wrote:
>
>> Thank you Peter,
>>
>> I believe this might be the way the error message is hard coded (i.e. it's 
>> always y to describe the input).  Anyway, I changed the first line to
>>> pdataH <- data.frame(y = H_to_fit$Height)
>> This makes the input 'y' instead of 'H_to_fit.Height', but makes no 
>> difference to the outcome/error message.
>>
>> John
>>
>> -
>> Dr John Hillier
>> Senior Lecturer - Physical Geography
>> Loughborough University
>> 01509 223727
>>
>> 
>> From: peter dalgaard 
>> Sent: 09 March 2016 19:58
>> To: John Hillier
>> Cc: r-help@r-project.org
>> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
>>
>>> On 09 Mar 2016, at 18:52 , John Hillier  wrote:
>>>
>>> Dear All,
>>>
>>>
>>> I am attempting to describe a distribution of height data.  It appears 
>>> roughly linear on a log-log plot, so Pareto seems sensible.  However, the 
>>> data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I 
>>> would like to fit a Pareto distribution to the reliable (i.e. truncated) 
>>> section of the data.
>>>
>>>
>>> I found truncpareto(), and implemented one of its example uses 
>>> successfully.  Specifically, the third one at 
>>> http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).
>>>
>>>
>>> When I try to run my data, I get the output below. Inputs shown with 
>>> chevrons.
>>>
>>>
>>>> pdataH <- data.frame(H_to_fit$Height)
>>>> summary(pdataH)
>>> H_to_fit.Height
>>> Min.   :2000
>>> 1st Qu.:2281
>>>
>>> Median :2666
>>> Mean   :2825
>>> 3rd Qu.:3212
>>> Max.   :4794
>>>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
>>> Error in eval(expr, envir, enclos) :
>>> the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>>>
>>>
>>> This is odd as the usage format is - truncpareto(lower, upper), and varying 
>>> 2000 to 1900 and 2100 makes no difference. Neither do smaller or larger 
>>> variations. From the summary I think that my lowest input is 2000, which I 
>>> am taking as min(y). I have also played with the upper limit.  pdataH has 
>>> 2117 observations in it.
>>>
>>>
>>> Is this a data format thing? i.e. of pdataH (a tried a few things, but to 
>>> no avail)
>>>
>>
>> Umm, it doesn't seem to have a column called "y"?
>>
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__

Re: [R] truncpareto() - doesn't like my data and odd error message

2016-03-10 Thread peter dalgaard
Look closer

-pd


> On 10 Mar 2016, at 18:41 , John Hillier  wrote:
> 
> Thank you Peter,
> 
> Yes, it seems to do the same even if I simultaneously make that change.  
> Output below.
> 
>> pdataH <- data.frame(y = H_to_fit$Height)
>> summary(pdataH)
>   y   
> Min.   :2000  
> 1st Qu.:2281  
> Median :2666  
> Mean   :2825  
> 3rd Qu.:3212  
> Max.   :4794  
>> fit3 <- vglm(y ~ 1, truncpareto(1999, 4794), data = pdataH, trace = TRUE)
> Error in eval(expr, envir, enclos) : 
>  the value of argument 'upper' is too low (requires 'max(y) < upper')
> 
> -
> Dr John Hillier
> Senior Lecturer - Physical Geography
> Loughborough University
> 01509 223727
> 
> 
> From: peter dalgaard 
> Sent: 10 March 2016 09:36
> To: John Hillier
> Cc: r-help@r-project.org
> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
> 
> Also if you simultaneously change the 2000 to say 1999?
> 
> -p
> 
> On 10 Mar 2016, at 09:22 , John Hillier  wrote:
> 
>> Thank you Peter,
>> 
>> I believe this might be the way the error message is hard coded (i.e. it's 
>> always y to describe the input).  Anyway, I changed the first line to
>>> pdataH <- data.frame(y = H_to_fit$Height)
>> This makes the input 'y' instead of 'H_to_fit.Height', but makes no 
>> difference to the outcome/error message.
>> 
>> John
>> 
>> -
>> Dr John Hillier
>> Senior Lecturer - Physical Geography
>> Loughborough University
>> 01509 223727
>> 
>> 
>> From: peter dalgaard 
>> Sent: 09 March 2016 19:58
>> To: John Hillier
>> Cc: r-help@r-project.org
>> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
>> 
>>> On 09 Mar 2016, at 18:52 , John Hillier  wrote:
>>> 
>>> Dear All,
>>> 
>>> 
>>> I am attempting to describe a distribution of height data.  It appears 
>>> roughly linear on a log-log plot, so Pareto seems sensible.  However, the 
>>> data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I 
>>> would like to fit a Pareto distribution to the reliable (i.e. truncated) 
>>> section of the data.
>>> 
>>> 
>>> I found truncpareto(), and implemented one of its example uses 
>>> successfully.  Specifically, the third one at 
>>> http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).
>>> 
>>> 
>>> When I try to run my data, I get the output below. Inputs shown with 
>>> chevrons.
>>> 
>>> 
>>>> pdataH <- data.frame(H_to_fit$Height)
>>>> summary(pdataH)
>>> H_to_fit.Height
>>> Min.   :2000
>>> 1st Qu.:2281
>>> 
>>> Median :2666
>>> Mean   :2825
>>> 3rd Qu.:3212
>>> Max.   :4794
>>>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
>>> Error in eval(expr, envir, enclos) :
>>> the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>>> 
>>> 
>>> This is odd as the usage format is - truncpareto(lower, upper), and varying 
>>> 2000 to 1900 and 2100 makes no difference. Neither do smaller or larger 
>>> variations. From the summary I think that my lowest input is 2000, which I 
>>> am taking as min(y). I have also played with the upper limit.  pdataH has 
>>> 2117 observations in it.
>>> 
>>> 
>>> Is this a data format thing? i.e. of pdataH (a tried a few things, but to 
>>> no avail)
>>> 
>> 
>> Umm, it doesn't seem to have a column called "y"?
>> 
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] truncpareto() - doesn't like my data and odd error message

2016-03-10 Thread John Hillier
Thank you Peter,

Yes, it seems to do the same even if I simultaneously make that change.  Output 
below.

> pdataH <- data.frame(y = H_to_fit$Height)
> summary(pdataH)
   y   
 Min.   :2000  
 1st Qu.:2281  
 Median :2666  
 Mean   :2825  
 3rd Qu.:3212  
 Max.   :4794  
> fit3 <- vglm(y ~ 1, truncpareto(1999, 4794), data = pdataH, trace = TRUE)
Error in eval(expr, envir, enclos) : 
  the value of argument 'upper' is too low (requires 'max(y) < upper')
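
Read together, the two error messages suggest that both bounds are strict: 'lower' has to sit below min(y) and 'upper' above max(y), so passing the observed minimum and maximum themselves (2000 and 4794) cannot satisfy both. A minimal base-R sketch of choosing such bounds follows. The simulated heights and the margin of 1 are assumptions standing in for H_to_fit$Height, which is not posted in the thread.

```r
# Simulated stand-in for H_to_fit$Height (assumption: the real data are not posted)
set.seed(42)
pdataH <- data.frame(y = runif(2117, min = 2000, max = 4794))

# Assumed margin; any positive value keeps the bounds strictly outside the data
margin <- 1
lower <- min(pdataH$y) - margin
upper <- max(pdataH$y) + margin

# The two conditions quoted in the error messages
bounds_ok <- (0 < lower) && (lower < min(pdataH$y)) && (max(pdataH$y) < upper)
print(bounds_ok)

# With VGAM attached, the call from the thread would then become:
# fit3 <- vglm(y ~ 1, truncpareto(lower, upper), data = pdataH, trace = TRUE)
```

Whether bounds just outside the observed range are appropriate for the analysis is a separate modelling question; the sketch only addresses the argument checks.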

-
Dr John Hillier
Senior Lecturer - Physical Geography
Loughborough University
01509 223727


From: peter dalgaard 
Sent: 10 March 2016 09:36
To: John Hillier
Cc: r-help@r-project.org
Subject: Re: [R] truncpareto() - doesn't like my data and odd error message

Also if you simultaneously change the 2000 to say 1999?

-p

On 10 Mar 2016, at 09:22 , John Hillier  wrote:

> Thank you Peter,
>
> I believe this might be the way the error message is hard coded (i.e. it's 
> always y to describe the input).  Anyway, I changed the first line to
>> pdataH <- data.frame(y = H_to_fit$Height)
> This makes the input 'y' instead of 'H_to_fit.Height', but makes no 
> difference to the outcome/error message.
>
> John
>
> -
> Dr John Hillier
> Senior Lecturer - Physical Geography
> Loughborough University
> 01509 223727
>
> 
> From: peter dalgaard 
> Sent: 09 March 2016 19:58
> To: John Hillier
> Cc: r-help@r-project.org
> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
>
>> On 09 Mar 2016, at 18:52 , John Hillier  wrote:
>>
>> Dear All,
>>
>>
>> I am attempting to describe a distribution of height data.  It appears 
>> roughly linear on a log-log plot, so Pareto seems sensible.  However, the 
>> data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I would 
>> like to fit a Pareto distribution to the reliable (i.e. truncated) section 
>> of the data.
>>
>>
>> I found truncpareto(), and implemented one of its example uses successfully. 
>>  Specifically, the third one at 
>> http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).
>>
>>
>> When I try to run my data, I get the output below. Inputs shown with 
>> chevrons.
>>
>>
>>> pdataH <- data.frame(H_to_fit$Height)
>>> summary(pdataH)
>>  H_to_fit.Height
>>  Min.   :2000
>>  1st Qu.:2281
>>
>>  Median :2666
>>  Mean   :2825
>>  3rd Qu.:3212
>>  Max.   :4794
>>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
>> Error in eval(expr, envir, enclos) :
>> the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>>
>>
>> This is odd as the usage format is - truncpareto(lower, upper), and varying 
>> 2000 to 1900 and 2100 makes no difference. Neither do smaller or larger 
>> variations. From the summary I think that my lowest input is 2000, which I 
>> am taking as min(y). I have also played with the upper limit.  pdataH has 
>> 2117 observations in it.
>>
>>
>> Is this a data format thing? i.e. of pdataH (I tried a few things, but to no 
>> avail)
>>
>
> Umm, it doesn't seem to have a column called "y"?
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] FUNCTION ctree

2016-03-10 Thread Achim Zeileis
Thanks to Erich and Sarah for the clarifications. For those wondering 
whether ctree() from "party" or from "partykit" should be used: the latter 
is the newer and improved implementation.


On Thu, 10 Mar 2016, Erich Neuwirth wrote:


If you do
??ctree
and the package partykit is installed, you will see that this function is 
defined in this package.
So, you should run
library(partykit)
before running your function call

If partykit is not installed, you need to install it.






On Mar 10, 2016, at 15:58, CHIRIBOGA Xavier  wrote:

Dear all,


I am using Rstudio. What to do when you get this message?

Error in plot(ctree(Surv(hours, state) ~ soil + volatile, data = data)) :
 could not find function "ctree"

Thank you,


Xavier



Re: [R] Navigation keys not working

2016-03-10 Thread Loris Bennett
Hi,

Fazulur Rehaman  writes:

> Dear Sir/Madam,
>
> Navigation keys are not working in R installed on Linux (Linux version
> 3.10.0-327.10.1.el7.x86_64). I am using R version 3.0.1. When I press
> the up arrow, it prints "^[[A".  Could you please suggest how to
> overcome this problem?
>
> Here is my R session Info
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>> ^[[A^[[A^[[A^[[A^[[A^[[A
>
> Thanks in Advance.
> Rehaman

It looks as if your version of R may not be linked to the readline
library.  You can check the shared libraries linked to the binary with
'ldd'. e.g.

$ ldd /usr/lib/R/bin/exec/R
linux-vdso.so.1 (0x7fff235ed000)
libR.so => /usr/lib/libR.so (0x7f3bfa692000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x7f3bfa47c000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f3bfa25f000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f3bf9eb4000)
libblas.so.3 => /usr/lib/libblas.so.3 (0x7f3bf9c34000)
libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x7f3bf9916000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7f3bf9615000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x7f3bf93d8000)
libreadline.so.6 => /lib/x86_64-linux-gnu/libreadline.so.6 (0x7f3bf918e000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x7f3bf8f2)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x7f3bf8cfd000)
libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x7f3bf8aed000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x7f3bf88d2000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x7f3bf86ca000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x7f3bf84c6000)
/lib64/ld-linux-x86-64.so.2 (0x7f3bfabb8000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x7f3bf82b)
libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x7f3bf8086000)

For the arrow keys to allow you to traverse the command history, you
will need an entry containing 'libreadline.so'.

If it is missing, you will have to rebuild R, making sure that the
'configure' option '--with-readline=yes' is set.  This is actually the
default, but you are using an old version of R, so it may not have been
the default back then.  Also, your readline library might be in an
unusual place where 'configure' did not find it.  In that case, you
would have to ensure that its directory is listed in the variable
$LD_LIBRARY_PATH.
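
A quick check can also be made from inside R. The sketch below uses capabilities("cledit"), which reports whether command-line editing is available in the running session. It is an assumption that this reflects the readline linkage discussed above, and the value is FALSE in non-interactive sessions regardless of how R was built.

```r
# TRUE when the current session offers command-line editing (history, arrow keys)
has_cledit <- unname(capabilities("cledit"))

if (interactive() && !has_cledit) {
  message("No command-line editing: R may have been built without readline")
} else if (!interactive()) {
  message("Non-interactive session: cledit is reported as FALSE here")
}
```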

Cheers,

Loris

-- 
This signature is currently under construction.



Re: [R] Conversion problem with write.csv and as.character()

2016-03-10 Thread Joshua Ulrich
I already answered all your questions on R-SIG-Finance.

On Thu, Mar 10, 2016 at 5:14 AM, Peter Neumaier
 wrote:
> Hi all, sorry for double/cross posting, I have sent an initial, similar
> question
> accidentally to r-sig-finance.
>
> I am writing a matrix (typeof = double) into a CSV file with write.csv.
>
> My first column of the matrix is a date in the form yyyy-mm-dd hh:mm:ss:
>
>> a_fetchdata[1,0]
>
> 2016-02-09 07:30:00
>> typeof(a_fetchdata[1,0])
> [1] "double"
>
> My CSV file contains a sequence of integers (from 1 to x) instead of the
> expected date.
>
> I tried to convert, but ran into "Error in dimnames":
>
>> as.character(first_fetchdata[1,0])
> Error in dimnames(cd) <- list(as.character(index(x)), colnames(x)) :
>   'dimnames' applied to non-array
> Called from: as.matrix.xts(x)
> Browse[1]> c
>>
>
> a) How can I prevent the conversion into integers to happen when writing
> into CSV?
> b) if a) is not do-able: how can I convert the date in double format to
> chars (i.e. with as.character) ?
>
> Thanks in advance,
> Peter
>



-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com
R/Finance 2016 | www.rinfinance.com



Re: [R] Conversion problem with write.csv and as.character()

2016-03-10 Thread Sarah Goslee
Hi Peter,

We really need a reproducible example to solve this kind of question.
Please use dput(head(yourdata)) to provide a sample of data (or make
up fake data that shows the same problems), and provide the code
you're using.
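
As an illustration of what such a sample could look like, here is a small sketch with made-up data (the column names and values are assumptions, not Peter's data). dput() prints R code that recreates the object, so a reader can paste it straight into a session.

```r
# Fake data with a timestamp column, standing in for the real object
fake <- data.frame(
  time  = as.POSIXct("2016-02-09 07:30:00", tz = "UTC") + 3600 * 0:9,
  value = seq(1, 10)
)

# Print a pasteable description of the first three rows
dput(head(fake, 3))

# Round trip: the printed text can be evaluated to rebuild the sample
txt <- paste(capture.output(dput(head(fake, 3))), collapse = "\n")
restored <- eval(parse(text = txt))
str(restored)
```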

Sarah

On Thu, Mar 10, 2016 at 6:14 AM, Peter Neumaier
 wrote:
> Hi all, sorry for double/cross posting, I have sent an initial, similar
> question
> accidentally to r-sig-finance.
>
> I am writing a matrix (typeof = double) into a CSV file with write.csv.
>
> My first column of the matrix is a date in the form yyyy-mm-dd hh:mm:ss:
>
>> a_fetchdata[1,0]
>
> 2016-02-09 07:30:00
>> typeof(a_fetchdata[1,0])
> [1] "double"
>
> My CSV file contains a sequence of integers (from 1 to x) instead of the
> expected date.
>
> I tried to convert, but ran into "Error in dimnames":
>
>> as.character(first_fetchdata[1,0])
> Error in dimnames(cd) <- list(as.character(index(x)), colnames(x)) :
>   'dimnames' applied to non-array
> Called from: as.matrix.xts(x)
> Browse[1]> c
>>
>
> a) How can I prevent the conversion into integers to happen when writing
> into CSV?
> b) if a) is not do-able: how can I convert the date in double format to
> chars (i.e. with as.character) ?
>
> Thanks in advance,
> Peter
>



[R] Conversion problem with write.csv and as.character()

2016-03-10 Thread Peter Neumaier
Hi all, sorry for double/cross posting, I have sent an initial, similar
question
accidentally to r-sig-finance.

I am writing a matrix (typeof = double) into a CSV file with write.csv.

My first column of the matrix is a date in the form yyyy-mm-dd hh:mm:ss:

> a_fetchdata[1,0]

2016-02-09 07:30:00
> typeof(a_fetchdata[1,0])
[1] "double"

My CSV file contains a sequence of integers (from 1 to x) instead of the
expected date.

I tried to convert, but ran into "Error in dimnames":

> as.character(first_fetchdata[1,0])
Error in dimnames(cd) <- list(as.character(index(x)), colnames(x)) :
  'dimnames' applied to non-array
Called from: as.matrix.xts(x)
Browse[1]> c
>

a) How can I prevent the conversion into integers to happen when writing
into CSV?
b) if a) is not do-able: how can I convert the date in double format to
chars (i.e. with as.character) ?

Thanks in advance,
Peter
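
For question b), one possible base-R approach is sketched below. It is not the answer given on R-SIG-Finance, which is not reproduced in this thread. It assumes the timestamps are available as a POSIXct vector next to a numeric matrix (the names times and vals are made up), formats them as text, and writes a data frame so that the dates appear as a regular column in the CSV file.

```r
# Made-up timestamps and values (assumption: not the poster's data)
times <- as.POSIXct("2016-02-09 07:30:00", tz = "UTC") + 60 * 0:2
vals  <- matrix(c(1.5, 2.5, 3.5), ncol = 1, dimnames = list(NULL, "price"))

# Turn the timestamps into text and bind them to the numeric columns
out <- data.frame(time = format(times, "%Y-%m-%d %H:%M:%S"), vals,
                  stringsAsFactors = FALSE)

# Write without row names so no 1..n sequence is added to the file
csv_file <- tempfile(fileext = ".csv")
write.csv(out, csv_file, row.names = FALSE)

# Read the file back to inspect what was written
back <- read.csv(csv_file, stringsAsFactors = FALSE)
print(back)
```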



[R] Navigation keys not working

2016-03-10 Thread Fazulur Rehaman
Dear Sir/Madam,

Navigation keys are not working in R installed on Linux (Linux version 
3.10.0-327.10.1.el7.x86_64). I am using R version 3.0.1. When I press the up 
arrow, it prints "^[[A".  Could you please suggest how to overcome this problem?

Here is my R session Info

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
> ^[[A^[[A^[[A^[[A^[[A^[[A

Thanks in Advance.
Rehaman




[R] [R-pkgs] rZeppelin: An R notebook that makes Spark easy to use

2016-03-10 Thread Amos B. Elberg
rZeppelin is an R interpreter for Apache (incubating) Zeppelin.  Zeppelin
is a notebook, sort of like iPython, built on top of Apache Spark.

rZeppelin makes it possible, for the first time, to create a single data/ML
pipeline that mixes R, scala, and Python code, seamlessly, from a single
interface.  (Without breaking lazy evaluation!)

For R-using data scientists, this means that you can access the full power
of Spark — including ultra-fast distributed implementations of popular
algorithms — using R, without having to learn scala, without a dedicated
administrator to manage a Spark or Hadoop cluster, and without spending
more than minimal time to review the SparkR api.

You can load text data using R, quickly create an LDA model using Spark’s
distributed LDA package, tag the text using gensim from Python, and then
visualize and take further steps from R, from a single session using a
single interface.

The full range of Spark packages, including MLLIB and GraphX, which used to
require scala development, can be used in the same pipeline with R.
(Except Spark Streaming, which Zeppelin doesn’t yet support.)

Beyond Spark, R data can be visualized using Zeppelin’s built-in
interactive visualizations.  rZeppelin also leverages knitr to make
available most R visualization and interactive visualization packages.

Many data types are also easily moved between R, scala and Python:  the
languages share a ZeppelinContext, where variables can be added and
extracted with .z.put() and .z.get().

rZeppelin is intended to make Spark part of the R data scientist’s daily
toolbox.

rZeppelin is available here:  https://github.com/elbamos/Zeppelin-With-R

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

Re: [R] FUNCTION ctree

2016-03-10 Thread Erich Neuwirth
If you do
??ctree
and the package partykit is installed, you will see that this function is 
defined in this package.
So, you should run
library(partykit)
before running your function call

If partykit is not installed, you need to install it.
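
The steps above can be sketched as follows. The helper function_visible() is a made-up name for illustration. Note also that Surv() in the original call comes from the survival package, so that package would need to be attached as well.

```r
# Hypothetical helper: TRUE if a function of that name is visible on the search path
function_visible <- function(fname) exists(fname, mode = "function")

function_visible("median")  # a stats function, attached by default
function_visible("ctree")   # FALSE until party or partykit is attached

# Attach partykit only if it is installed; otherwise say what to do
if (requireNamespace("partykit", quietly = TRUE)) {
  library(partykit)
} else {
  message('partykit is not installed; run install.packages("partykit") first')
}
```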





> On Mar 10, 2016, at 15:58, CHIRIBOGA Xavier  wrote:
> 
> Dear all,
> 
> 
> I am using Rstudio. What to do when you get this message?
> 
> Error in plot(ctree(Surv(hours, state) ~ soil + volatile, data = data)) :
>  could not find function "ctree"
> 
> Thank you,
> 
> 
> Xavier
> 

Re: [R] FUNCTION ctree

2016-03-10 Thread Sarah Goslee
Probably load the package that ctree() comes from, possibly party.

library(party)

The code you're looking at should tell you.

On Thu, Mar 10, 2016 at 9:58 AM, CHIRIBOGA Xavier
 wrote:
> Dear all,
>
>
> I am using Rstudio. What to do when you get this message?
>
> Error in plot(ctree(Surv(hours, state) ~ soil + volatile, data = data)) :
>   could not find function "ctree"
>
> Thank you,
>

-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] FUNCTION ctree

2016-03-10 Thread CHIRIBOGA Xavier
Dear all,


I am using Rstudio. What to do when you get this message?

Error in plot(ctree(Surv(hours, state) ~ soil + volatile, data = data)) :
  could not find function "ctree"

Thank you,


Xavier



[R] Map of Europe at NUTS 2 Level

2016-03-10 Thread Miluji Sb
Dear all.

I would like to draw a map of France, Italy, Spain, and Portugal at NUTS 2
level. I used the following code:

library("rgdal")
library("RColorBrewer")
library("classInt")
#library("SmarterPoland")
library(fields)

# Download Administrative Level data from EuroStat
temp <- tempfile(fileext = ".zip")
download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip",
              temp)
unzip(temp)

# Read data
EU_NUTS <- readOGR(dsn = "./NUTS_2010_60M_SH/data", layer = "NUTS_RG_60M_2010")

# Subset NUTS 2 level data
map_nuts2 <- subset(EU_NUTS, STAT_LEVL_ == 2)

# Draw basic plot
plot(map_nuts2)

This does produce a plot, but it's rather 'ugly'. Is there any way I can
subset the data further and draw a map for France, Italy, Spain, and
Portugal only? Thank you very much!
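
One possible approach, sketched here on a plain data frame, is to filter on the two-letter country prefix of the NUTS code. It is an assumption that the shapefile's attribute table has a column named NUTS_ID holding those codes; the rows below are made up for illustration. The same subset() condition could then be tried on EU_NUTS itself.

```r
# Stand-in for the attribute table of EU_NUTS (assumed columns, made-up rows)
regions <- data.frame(
  NUTS_ID    = c("FR10", "ITC1", "ES11", "PT11", "DE11", "FR", "UKI1"),
  STAT_LEVL_ = c(2, 2, 2, 2, 2, 0, 2),
  stringsAsFactors = FALSE
)

# Country codes for France, Italy, Spain and Portugal
keep <- c("FR", "IT", "ES", "PT")

# NUTS 2 regions whose code starts with one of the wanted country prefixes
sel <- subset(regions, STAT_LEVL_ == 2 & substr(NUTS_ID, 1, 2) %in% keep)
print(sel$NUTS_ID)

# On the spatial object this would read (untested here):
# map_sub <- subset(EU_NUTS, STAT_LEVL_ == 2 & substr(NUTS_ID, 1, 2) %in% keep)
# plot(map_sub)
```

Overseas regions of these countries carry the same prefixes and may still stretch the plot, so the plotting window might need to be restricted as well.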

Sincerely,

Milu


[R] R 3.2.4 is released

2016-03-10 Thread Peter Dalgaard
The build system rolled up  R-3.2.4.tar.gz (codename "Very Secure Dishes") this 
morning.

The list below details the changes in this release.

You can get the source code from

http://cran.r-project.org/src/base/R-3/R-3.2.4.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you.

Binaries for various platforms will appear in due course.


For the R Core Team,

Peter Dalgaard


These are the md5sums for the freshly created files, in case you wish
to check that they are uncorrupted:


MD5 (AUTHORS) = eb97a5cd38acb1cfc6408988bffef765
MD5 (COPYING) = eb723b61539feef013de476e68b5c50a
MD5 (COPYING.LIB) = a6f89e2100d9b6cdffcea4f398e37343
MD5 (FAQ) = d2e93152b963acbb53027c355dda539a
MD5 (INSTALL) = 3964b9119adeaab9ceb633773fc94aac
MD5 (NEWS) = 087f64ddfe922d2a565ff64bb8543039
MD5 (NEWS.0) = bfcd7c147251b5474d96848c6f57e5a8
MD5 (NEWS.1) = eb78c4d053ec9c32b815cf0c2ebea801
MD5 (NEWS.2) = 8e2f4d1d5228663ae598a09bf1e2bc6b
MD5 (R-latest.tar.gz) = 5953104583ed93dc2085a6c80e884e4a
MD5 (README) = aece1dfbd18c1760128c3787f5456af6
MD5 (RESOURCES) = 529223fd3ffef95731d0a87353108435
MD5 (THANKS) = ba00f6cc68a823e1741cfa6011f40ccb
MD5 (VERSION-INFO.dcf) = 71584fcd5c399b40750fcd7113521636
MD5 (R-3/R-3.2.4.tar.gz) = 5953104583ed93dc2085a6c80e884e4a
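
A checksum from the list above can be compared from within R using tools::md5sum(). The sketch below is only an illustration: verify_md5() is a made-up helper name, and the demonstration runs on an empty temporary file rather than on the real tarball.

```r
# Hypothetical helper: TRUE if the file's MD5 sum matches the expected hex string
verify_md5 <- function(path, expected) {
  identical(unname(tools::md5sum(path)), tolower(expected))
}

# Demonstration on an empty temporary file, whose MD5 sum is well known
empty_file <- tempfile()
file.create(empty_file)
empty_ok <- verify_md5(empty_file, "d41d8cd98f00b204e9800998ecf8427e")
print(empty_ok)

# For the downloaded source, using the value listed above:
# verify_md5("R-3.2.4.tar.gz", "5953104583ed93dc2085a6c80e884e4a")
```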


This is the relevant part of the NEWS file

CHANGES IN R 3.2.4:

  NEW FEATURES:

* install.packages() and related functions now give a more
  informative warning when an attempt is made to install a base
  package.

* summary(x) now prints with less rounding when x contains infinite
  values. (Request of PR#16620.)

* provideDimnames() gets an optional unique argument.

* shQuote() gains type = "cmd2" for quoting in cmd.exe in Windows.
  (Response to PR#16636.)

* The data.frame method of rbind() gains an optional argument
  stringsAsFactors (instead of only depending on
  getOption("stringsAsFactors")).

* smooth(x, *) now also works for long vectors.

* tools::texi2dvi() has a workaround for problems with the texi2dvi
  script supplied by texinfo 6.1.

  It extracts more error messages from the LaTeX logs when in
  emulation mode.

  UTILITIES:

* R CMD check will leave a log file build_vignettes.log from the
  re-building of vignettes in the .Rcheck directory if there is a
  problem, and always if environment variable
  _R_CHECK_ALWAYS_LOG_VIGNETTE_OUTPUT_ is set to a true value.

  DEPRECATED AND DEFUNCT:

* Use of SUPPORT_OPENMP from header Rconfig.h is deprecated in
  favour of the standard OpenMP define _OPENMP.

  (This has been the recommendation in the manual for a while now.)

* The make macro AWK which is long unused by R itself but recorded
  in file etc/Makeconf is deprecated and will be removed in R
  3.3.0.

* The C header file S.h is no longer documented: its use should be
  replaced by R.h.

  BUG FIXES:

* kmeans(x, centers = <1-row>) now works. (PR#16623)

* Vectorize() now checks for clashes in argument names.  (PR#16577)

* file.copy(overwrite = FALSE) would signal a successful copy when
  none had taken place.  (PR#16576)

* ngettext() now uses the same default domain as gettext().
  (PR#14605)

* array(.., dimnames = *) now warns about non-list dimnames and,
  from R 3.3.0, will signal the same error for invalid dimnames as
  matrix() has always done.

* addmargins() now adds dimnames for the extended margins in all
  cases, as always documented.

* heatmap() evaluated its add.expr argument in the wrong
  environment.  (PR#16583)

* require() etc now give the correct entry of lib.loc in the
  warning about an old version of a package masking a newer
  required one.

* The internal deparser did not add parentheses when necessary,
  e.g. before [] or [[]].  (Reported by Lukas Stadler; additional
  fixes included as well).

* as.data.frame.vector(*, row.names=*) no longer produces
  'corrupted' data frames from row names of incorrect length, but
  rather warns about them.  This will become an error.

* url connections with method = "libcurl" are destroyed properly.
  (PR#16681)

* withCallingHandlers() now (again) handles warnings even during S4
  generic's argument evaluation.  (PR#16111)

* deparse(..., control = "quoteExpressions") incorrectly quoted
  empty expressions.  (PR#16686)

* format()ting datetime objects ("POSIX[cl]?t") could segfault or
  recycle wrongly.  (PR#16685)

* plot.ts(<matrix>, las = 1) now does use las.

* saveRDS(*, compress = "gzip") now works as documented.
  (PR#16653)

* (Windows only) The Rgui front end did not always initialize the
  console properly, and could cause R to crash.  (PR#16998)

* dummy.coef.lm() now works in more cases, thanks to a proposal by
  Werner Stahel (PR#16665).  In addition, it now works for
  multivariate linear models ("mlm", manova) thanks 

Re: [R] truncpareto() - doesn't like my data and odd error message

2016-03-10 Thread peter dalgaard
Also if you simultaneously change the 2000 to say 1999?

-p

On 10 Mar 2016, at 09:22 , John Hillier  wrote:

> Thank you Peter,
> 
> I believe this might be the way the error message is hard coded (i.e. it's 
> always y to describe the input).  Anyway, I changed the first line to 
>> pdataH <- data.frame(y = H_to_fit$Height)
> This makes the input 'y' instead of 'H_to_fit.Height', but makes no 
> difference to the outcome/error message.
> 
> John
> 
> -
> Dr John Hillier
> Senior Lecturer - Physical Geography
> Loughborough University
> 01509 223727
> 
> 
> From: peter dalgaard 
> Sent: 09 March 2016 19:58
> To: John Hillier
> Cc: r-help@r-project.org
> Subject: Re: [R] truncpareto() - doesn't like my data and odd error message
> 
>> On 09 Mar 2016, at 18:52 , John Hillier  wrote:
>> 
>> Dear All,
>> 
>> 
>> I am attempting to describe a distribution of height data.  It appears 
>> roughly linear on a log-log plot, so Pareto seems sensible.  However, the 
>> data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I would 
>> like to fit a Pareto distribution to the reliable (i.e. truncated) section 
>> of the data.
>> 
>> 
>> I found truncpareto(), and implemented one of its example uses successfully. 
>>  Specifically, the third one at 
>> http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).
>> 
>> 
>> When I try to run my data, I get the output below. Inputs shown with 
>> chevrons.
>> 
>> 
>>> pdataH <- data.frame(H_to_fit$Height)
>>> summary(pdataH)
>>  H_to_fit.Height
>>  Min.   :2000
>>  1st Qu.:2281
>> 
>>  Median :2666
>>  Mean   :2825
>>  3rd Qu.:3212
>>  Max.   :4794
>>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
>> Error in eval(expr, envir, enclos) :
>> the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>> 
>> 
>> This is odd as the usage format is - truncpareto(lower, upper), and varying 
>> 2000 to 1900 and 2100 makes no difference. Neither do smaller or larger 
>> variations. From the summary I think that my lowest input is 2000, which I 
>> am taking as min(y). I have also played with the upper limit.  pdataH has 
>> 2117 observations in it.
>> 
>> 
>> Is this a data format thing? i.e. of pdataH (I tried a few things, but to no 
>> avail)
>> 
> 
> Umm, it doesn't seem to have a column called "y"?
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] .Call works in R 2 not in R 3

2016-03-10 Thread Sebastien Moretti

Hi

Using  cutree(tree$merge, k)  fixed the problem.
In fact had to use  cutree(tree["merge"], k)

Will compare results in R 2 and R 3 now.

Thanks for your help.
Sébastien


cutree is a function available in stats.
So it might be worth a try to just replace

.Call("R_cutree", tree$merge, k, PACKAGE = "stats")

by

cutree(tree$merge,k)

and see what happens.

checking the source of cutree shows the following call

 ans <- .Call(C_cutree, tree$merge, k)

so replacing R_cutree by C_cutree also might be an option.
But, of course as Uwe recommended,
using a plain R call and not using .Call is the preferred solution.
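
For reference, a minimal sketch of the plain R call on a standard hclust object is shown below. The built-in USArrests data and k = 3 are arbitrary choices for illustration and have nothing to do with the inherited package.

```r
# Hierarchical clustering of a small built-in data set
hc <- hclust(dist(USArrests[1:10, ]))

# Public API: cut the tree into k groups, no .Call() into stats needed
groups <- cutree(hc, k = 3)
print(table(groups))
```

Going through the exported function avoids depending on the name of an internal native symbol, which is what appears to have changed between R 2 and R 3 in this thread.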



On Mar 8, 2016, at 14:55, Sebastien Moretti  wrote:


Hi

I inherited an R package written in 2004 that works perfectly in R 2.15.1
and before, but not in R 3.2.2 ( >= 3).

I have already fixed issues with namespace for functions in R 3 but
maybe not all of them.

Here is the error message:
Error in .Call("R_cutree", tree$merge, k, PACKAGE = "stats")
"R_cutree" not available for .Call() for package "stats"


Why do you .Call() into another package? Rather use the API.


Let's say that I am far from a R master.
I never use .Call() myself.

I want the code to work again in R >= 3 because support for R 2 will soon be 
stopped at my institute.
Once the code works again, I can change the internals by comparing results 
between R 2 and R 3.


Best,
Uwe Ligges



Thanks for your help


--
Sébastien Moretti


Re: [R] Help -- feed data frames, returned from function into list --

2016-03-10 Thread Berend Hasselman

> On 10 Mar 2016, at 08:56, Sunny Singha  
> wrote:
> 
> Hi,
> I got a problem. Using loop, I'm trying to feed data frames, returned from
> the function, into the list but not all data frames are getting captured.
> 
> I have vector with some string values which I want to pass to the function.
> 
> *groups* <- c('cocacola', 'youtube','facebook')
> 

You posted in HTML. In a plain text mailing list we see *groups* which is 
nonsense. You boldified?
Please do not do that and do not post in HTML.

> for(i in 1:length(groups)){
> g <- list()
> g[[i]] <- searchGroups_mod(*groups[i]*, token=fb_oauth, 10)
> }
> 
> The result list stores data frame only for the last string in the 'groups'
> vector.
> Why the List is getting reassigned for each iteration ? Please guide.
> 

Because you are creating the list g inside the loop.
Put the g <- list() before the loop.
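
A minimal sketch of the corrected loop follows. Since searchGroups_mod() and the token are not available here, fake_search() is a made-up stand-in that simply returns a one-row data frame.

```r
groups <- c("cocacola", "youtube", "facebook")

# Hypothetical stand-in for searchGroups_mod(); returns a small data frame
fake_search <- function(name) {
  data.frame(group = name, n_chars = nchar(name), stringsAsFactors = FALSE)
}

# Create the list once, before the loop, so earlier results are kept
g <- vector("list", length(groups))
for (i in seq_along(groups)) {
  g[[i]] <- fake_search(groups[i])
}
names(g) <- groups

# Equivalent without an explicit loop
g2 <- lapply(groups, fake_search)
```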

> Regards,
> Sunny Singha.
> 
>   [[alternative HTML version deleted]]__

This is a plain text mailing list. Do not post in HTML.

Berend



Re: [R] truncpareto() - doesn't like my data and odd error message

2016-03-10 Thread John Hillier
Thank you Peter,

I believe this might be the way the error message is hard coded (i.e. it's 
always y to describe the input).  Anyway, I changed the first line to 
> pdataH <- data.frame(y = H_to_fit$Height)
This makes the input 'y' instead of 'H_to_fit.Height', but makes no difference 
to the outcome/error message.

John

-
Dr John Hillier
Senior Lecturer - Physical Geography
Loughborough University
01509 223727


From: peter dalgaard 
Sent: 09 March 2016 19:58
To: John Hillier
Cc: r-help@r-project.org
Subject: Re: [R] truncpareto() - doesn't like my data and odd error message

> On 09 Mar 2016, at 18:52 , John Hillier  wrote:
>
> Dear All,
>
>
> I am attempting to describe a distribution of height data.  It appears 
> roughly linear on a log-log plot, so Pareto seems sensible.  However, the 
> data are only reliable in a limited range (e.g. 2000 to 4800 m). So, I would 
> like to fit a Pareto distribution to the reliable (i.e. truncated) section of 
> the data.
>
>
> I found truncpareto(), and implemented one of its example uses successfully.  
> Specifically, the third one at 
> http://www.inside-r.org/packages/cran/vgam/docs/paretoff (also see p.s.).
>
>
> When I try to run my data, I get the output below. Inputs shown with chevrons.
>
>
>> pdataH <- data.frame(H_to_fit$Height)
>> summary(pdataH)
>   H_to_fit.Height
>   Min.   :2000
>   1st Qu.:2281
>   Median :2666
>   Mean   :2825
>   3rd Qu.:3212
>   Max.   :4794
>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
> Error in eval(expr, envir, enclos) :
>  the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>
>
> This is odd as the usage format is - truncpareto(lower, upper), and varying 
> 2000 to 1900 and 2100 makes no difference. Neither do smaller or larger 
> variations. From the summary I think that my lowest input is 2000, which I am 
> taking as min(y). I have also played with the upper limit.  pdataH has 2117 
> observations in it.
>
>
> Is this a data format thing, i.e. something about pdataH? (I tried a few
> things, but to no avail.)
>

Umm, it doesn't seem to have a column called "y"?
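
A small base R sketch of the naming issue, using a hypothetical stand-in for
H_to_fit (only a few heights taken from the summary above). It also checks the
condition quoted in the error text, 0 < lower < min(y), which is read here as
a strict inequality; that reading comes from the message itself, not from the
VGAM documentation.

```r
# Hypothetical stand-in for the poster's H_to_fit data.
H_to_fit <- data.frame(Height = c(2000, 2281, 2666, 3212, 4794))

# Unnamed argument: the column name is derived from the expression.
pdataH <- data.frame(H_to_fit$Height)
names(pdataH)            # "H_to_fit.Height"
"y" %in% names(pdataH)   # FALSE

# Named argument: the column is called y, as the formula y ~ 1 expects.
pdataH <- data.frame(y = H_to_fit$Height)
names(pdataH)            # "y"

# The error text asks for lower < min(y), not lower <= min(y).
lower <- 2000
lower < min(pdataH$y)    # FALSE when the smallest height equals lower
```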

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


[R] Survival analysis with interval censored data

2016-03-10 Thread Audrey Bologna
Hi!

Does anyone know about survival analysis using interval censored data and
the intcox package?

I fed different ant colonies different foods (the treatment) and recorded
ant deaths every 2 days. I would like to see whether the treatment explains
differences in mortality.
Since I checked for deaths only every 2 days, I assume that my data are
interval censored.
Here is part of my dataset, an example for one colony (1A):

date  date2  fate  colony  condition
1     2      3     1A      before
1     4      3     1A      before
1     4      3     1A      before
1     8      3     1A      before
1     14     3     1A      before
1     16     3     1A      before
1     20     0     1A      before
1     20     0     1A      before
1     20     0     1A      before
1     20     0     1A      before
1     20     0     1A      before

date corresponds to the beginning of my experiment, the 21st of January,
which I called 1.
date2 corresponds to the date of death: 2 = 23rd of January, 3 = 25th
of January, and so on.
fate: 3 for a death event (interval censored), 0 when the individual is
still alive.

I created a survival object this way:

surv.object <- Surv(test$date1, test$date, test$fate, type="interval")

and here is part of what I obtain as a survival object:


[1901] [1,  4] [1,  6] [1,  6] [1,  8] [1,  8] [1,  8] [1,  8] [1, 10] [1, 10] [1, 10]
[1911] [1, 10] [1, 12] [1, 16] [1, 16] [1, 16] [1, 18] [1, 20] 1+      1+      1+
[1921] 1+      1+      1+      1+      1+      1+      1+      1+      1+      1+
[1931] 1+      1+      1+      1+      1+      1+      1+      1+      1+      1+
[1941] 1+      1+      1+      1+      1+      1+      1+      1+      1+      1+
[1951] 1+      1+      1+      1+      1+      1+      1+      1+      1+      1+
[1961] 1+      1+      1+      1+      1+      1+      1+      1+      1+      1+
[1971] 1+      1+      1+      1+      1+      1+      1+      1+      1+      1+
[1981] 1+      1+      1+      1+
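
A hedged sketch of how the 1A excerpt maps onto Surv() with type="interval".
In the survival package's coding for this type, status 3 is an event between
time and time2, and status 0 is right censored at time (time2 is then
ignored). Read that way, the "1+" entries above would mean "censored at day 1".
The sketch assumes, without confirmation from the post, that ants still alive
should instead be censored at their last check (date2). It uses the column
names from the table (date, date2), not the date1/date names in the Surv()
call above.

```r
library(survival)

# Rows copied from the colony 1A excerpt in the post.
test <- data.frame(
  date  = rep(1, 11),
  date2 = c(2, 4, 4, 8, 14, 16, 20, 20, 20, 20, 20),
  fate  = c(3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0)
)

# Assumption: for ants still alive (fate == 0) the censoring time is the
# last check, so 'time' must hold date2 rather than the start date.
time1 <- ifelse(test$fate == 0, test$date2, test$date)

surv.object <- Surv(time1, test$date2, test$fate, type = "interval")
print(surv.object)
```

Whether this change is enough for intcox() to run is not something the sketch
can show; it only addresses how the censored observations are encoded.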


However, when I try to apply the intcox function, it returns an error and a
warning:

 ex<-intcox(surv.object~test$colony, data=test)
Error in if (any(derivs.wert$g1 <= 0)) { :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In coxph(formula, data) :
  X matrix deemed to be singular; variable 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 22


Can anybody help me?

Thanks a lot


Audrey Bologna - PhD Student
Unit of Social Ecology
Université libre de Bruxelles, CP 231
Boulevard du Triomphe
B-1050 Brussels
Belgium

 http://www.ulb.ac.be/sciences/use/bologna.html

[[alternative HTML version deleted]]
