date:20160722

Re: [R] Improving performance by rewriting for loops into apply functions

2016-07-22 Thread Jeff Newmiller

If you complain to the doctor that it hurts when you ram your head into the 
wall, (s)he is going to tell you to not do that. What do you expect us to say? 

You seem full of misinformation. The apply family functions do not necessarily 
speed anything up... they are just more compact than for loops.

And yes, their arguments are immutable... as are pretty much all arguments to 
functions other than environments. So your screwdriver is not going to work on 
that nail.

Also, working with N x 1 matrices is silly... this is not Matlab.

In many cases you can restructure the problem so that you don't need 
incremental calculation... but I am not really sure what you wanted in this 
case. If this were a for loop the code you have looks to me like it would 
produce

path1 <- c(1,2,3,4,5)
path2 <- c(1,2,3,4,5,6)
path1 * path2[ -1 ]
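
A small sketch to check this, assuming the for loop below is a fair reading of what the posted apply() call was meant to do (the names p2, loop_result and vec_result are made up for illustration). Each pass reads element x+1 and writes that same element, so no pass ever sees a value written by an earlier one, and the loop collapses to one vectorized expression:

```r
path1 <- c(1, 2, 3, 4, 5)
path2 <- c(1, 2, 3, 4, 5, 6)

# Explicit loop: update a working copy of path2 as we go.
p2 <- path2
loop_result <- numeric(length(path1))
for (x in path1) {
  tmp <- x * p2[x + 1]   # reads an element no earlier pass has touched
  p2[x + 1] <- tmp       # the write is never read again by a later pass
  loop_result[x] <- tmp
}

# Vectorized equivalent: drop the first element of path2 and multiply.
vec_result <- path1 * path2[-1]

print(loop_result)                      # 2 6 12 20 30
print(identical(loop_result, vec_result))
```

If a later pass really did depend on an earlier result, this shortcut would not apply and the loop (or something like cumprod() or Reduce()) would be needed instead.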

Maybe it is time for you to read "The R Inferno"? 

Also please read the Posting Guide, which warns you to post plain text (so your 
email does not get corrupted when it is stripped by the mailing list).
-- 
Sent from my phone. Please excuse my brevity.

On July 22, 2016 4:16:18 PM PDT, "Aleš Grm"  wrote:
>Hello,
>
>I have a slight performance issue that I'd like to solve by rewriting a
>short bit of code that uses for loops so that it would use apply in order
>to get some performance gains. My problem is that I can't modify the
>variables that are passed to the apply function during its execution and
>use their latest values. My first thought was to use apply functions, but
>if this isn't possible I'm open to other suggestions.
>
>For example:
>path1 = matrix(c(1,2,3,4,5), ncol=1);
>path2 = matrix(c(1,2,3,4,5,6), ncol=1);
>apply(path1, 2, function(x, path2){
>  tmp = x*path2[x+1];
>  path2[x+1] = tmp;
>  return(tmp);
>}, path2)
>
>In the code above, path2 should have its elements updated in the course of
>the apply function and its value should be used in the next iteration while
>executing the apply function, but that doesn't happen. It seems that when a
>variable is passed into the apply function (path2) it is immutable.
>
>BR Aleš
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Improving performance by rewriting for loops into apply functions

2016-07-22 Thread Aleš Grm

Hello,

I have a slight performance issue that I'd like to solve by rewriting a
short bit of code that uses for loops so that it would use apply in order
to get some performance gains. My problem is that I can't modify the
variables that are passed to the apply function during its execution and
use their latest values. My first thought was to use apply functions, but
if this isn't possible I'm open to other suggestions.

For example:
path1 = matrix(c(1,2,3,4,5), ncol=1);
path2 = matrix(c(1,2,3,4,5,6), ncol=1);
apply(path1, 2, function(x, path2){
  tmp = x*path2[x+1];
  path2[x+1] = tmp;
  return(tmp);
}, path2)

In the code above, path2 should have its elements updated in the course of
the apply function and its value should be used in the next iteration while
executing the apply function, but that doesn't happen. It seems that when a
variable is passed into the apply function (path2) it is immutable.

BR Aleš


Re: [R] Using C library in R

2016-07-22 Thread Jeff Newmiller

You are entitled to your opinion, but apparently you have not read the Posting 
Guide either.
-- 
Sent from my phone. Please excuse my brevity.

On July 22, 2016 6:00:19 PM PDT, Dirk Eddelbuettel  wrote:
>Jeff Newmiller  dcn.davis.ca.us> writes:
>> 2) Interfacing R with other languages is off-topic on this list. There are
>> other lists where such issues are on-topic. Your post is a bit like walking
>> into a bowling alley and asking if anyone there can solve your chess
>> problem... someone might be able to, but it isn't very efficient and
>> disturbs the bowlers unnecessarily.
>
>Not sure I agree. It is an advanced topic, but it is not off-list.
>
>Dirk
>

Re: [R] Using C library in R

2016-07-22 Thread Dirk Eddelbuettel

Jeff Newmiller  dcn.davis.ca.us> writes:
> 2) Interfacing R with other languages is off-topic on this list. There are
> other lists where such issues are on-topic. Your post is a bit like walking
> into a bowling alley and asking if anyone there can solve your chess
> problem... someone might be able to, but it isn't very efficient and
> disturbs the bowlers unnecessarily.

Not sure I agree. It is an advanced topic, but it is not off-list.

Dirk


Re: [R] Using C library in R

2016-07-22 Thread Jeff Newmiller

Read the Posting Guide. This will tell you at least two important things:

1) Post using plain text. HTML mangles code.

2) Interfacing R with other languages is off-topic on this list. There are 
other lists where such issues are on-topic. Your post is a bit like walking 
into a bowling alley and asking if anyone there can solve your chess problem... 
someone might be able to, but it isn't very efficient and disturbs the bowlers 
unnecessarily.
-- 
Sent from my phone. Please excuse my brevity.

On July 22, 2016 11:12:57 AM PDT, "Ganz, Carl"  wrote:
>Hello everyone,
>I am attempting to link to a C library named libxlsxwriter
>(http://libxlsxwriter.github.io/) that creates and styles XLSX files,
>but after several days of repeatedly reading "Writing R Extensions"
>I am stuck and hoping someone can help me.
>The C library is easy to use and works with C++ as you would expect.
>For example, here is test.cpp:
>#include 
>int main() {
>  lxw_workbook  *workbook  = workbook_new("myexcel.xlsx");
>  lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
>  int row = 0;
>  int col = 0;
>  worksheet_write_string(worksheet, row, col, "Hello me!", NULL);
>  return workbook_close(workbook);
>}
>To compile this I need to run the makefile in the libxlsxwriter folder,
>which compiles the libxlsxwriter library, and then call:
>cc test.cpp -o test -Ipath.to.xlsxwriter.h path.to.libxlsxwriter.a -lz
>This generates an executable that creates an Excel document. I
>understand this command says to compile test.cpp and specifies where to
>search for headers and what libraries to link to. So to get this to
>work with R, all I should need to do is make the library, point to the
>header, and link to the .a file.
>To get this to work with R, I created a libxlsxwriter folder in my /src
>folder with the libxlsxwriter library in it, and added this makevars to
>my /src:
>PKG_CFLAGS=
># specify header location
>PKG_CPPFLAGS=-Ilibxlsxwriter/include
># specify libs to link to
>PKG_LIBS=liblsxwriter/lib/libxlsxwriter.a -lz
># make libxlsxwriter
>libxlsxwriter/lib/libxlsxwriter.a:
>cd libxlsxwriter;$(MAKE)
>
>When I build the package I can see that this runs the makevars, and
>generates the .a and .dll files in libxlsxwriter. I get no errors
>associated with headers or the libraries so I thought I would be good
>to go, but I can't seem to get anything to run.
>Here is my test.cpp code I am trying to run in R with Rcpp:
>#include 
>#include 
>using namespace Rcpp;
>//' @useDynLib libxlsxwriter
>//' @export
>//' @import Rcpp
>// [[Rcpp::export]]
>void test() {
>  lxw_workbook  *workbook  = workbook_new("myexcel.xlsx");
>  lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
>  int row = 0;
>  int col = 0;
>  worksheet_write_string(worksheet, row, col, "Hello me!", NULL);
>  workbook_close(workbook);
>}
>I am fairly certain I am doing something wrong with my makevars that is
>preventing test.cpp from being compiled. I tried running the makevars
>for libxlsxwriter from the command line and then building using just
>this makevars:
>PKG_CFLAGS=
># specify header location
>PKG_CPPFLAGS=-Ilibxlsxwriter/include
># specify libs to link to
>PKG_LIBS=liblsxwriter/lib/libxlsxwriter.a -lz
>I see the .o files for test.cpp being built, which is a good sign, and
>indicates there was a problem with my previous makevars, but I get
>errors saying undefined reference for all the functions from
>libxlsxwriter, so clearly things aren't linking like I need them to. I
>am out of ideas at this point, so any guidance would be greatly
>appreciated.
>A github repo with my code is here:
>https://github.com/carlganz/lwritexlsx
>I was using this webpage as a guide for how to include a C library in
>an R package: http://mazamascience.com/WorkingWithData/?p=1151
>Kind Regards,
>Carl Ganz
>

Re: [R] Turn character /string as variable/column name in summarize in dplyr

2016-07-22 Thread John Kane

It really might help to have a minimal working example.
Have a look at 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
and/or 
http://adv-r.had.co.nz/Reproducibility.html
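
For illustration only, a reproducible example for this question might look like the sketch below. The data frame contents, the column name shockAfit and the helper weighted_mean_by_group are all invented here, since the original post did not include data. It uses base R's [[ ]] operator to look up a column whose name is held in a string, which is one way to avoid the name being treated as a literal string:

```r
# Made-up data standing in for the poster's plotdata.
plotdata <- data.frame(
  grp       = c("a", "a", "b", "b"),
  weight    = c(1, 3, 2, 2),
  response  = c(10, 20, 30, 50),
  shockAfit = c(1, 2, 3, 4)
)
shocksName <- c("shockA")
shock_col <- paste0(shocksName[1], "fit")   # "shockAfit", a character string

# Weighted mean of `column` within each level of grp.
# d[[column]] fetches the column named by the string.
weighted_mean_by_group <- function(data, column) {
  sapply(split(data, data$grp), function(d) {
    sum(d$weight * d[[column]]) / sum(d$weight)
  })
}

actuals <- weighted_mean_by_group(plotdata, "response")
shock   <- weighted_mean_by_group(plotdata, shock_col)
print(rbind(actuals, shock))
```

An example of this size, with the data included, is usually enough for list readers to run the code and see the problem for themselves.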

John Kane
Kingston ON Canada


> -Original Message-
> From: chenme...@hotmail.com
> Sent: Fri, 22 Jul 2016 14:08:45 +
> To: r-help@r-project.org
> Subject: [R] Turn character /string as variable/column name in summarize
> in dplyr
> 
> Hi all,
> 
> 
> I am trying to turn a string into a variable in dplyr, but R interprets it
> as a string rather than a column name in the data.
> 
> Any ideas?
> 
> 
> shock5 =paste0(shocksName[5],"fit")
>   print(shock5)
> 
>   x<-group_by(plotdata,grp) %>% summarize(
> Actuals=sum(weight*response/sum(weight)),
> ...
> ...
> #
> assign(shocksName[4],sum(weight*as.name(paste(shocksName[4],"fit"))/sum(weight))),
>  assign(shocksName[5],sum(weight*(as.environment(shock5))
> /sum(weight)))
>   )
> 
> 
> Sent from Outlook
> 

Re: [R] Aggregate data to lower resolution

2016-07-22 Thread Miluji Sb

Dear Jean,

Thank you so much for your reply and the solution. This does work. I was
wondering, is this similar to 'rasterFromXYZ'? Thanks again!

Sincerely,

Milu

On Fri, Jul 22, 2016 at 3:06 PM, Adams, Jean  wrote:

> Milu,
>
> Perhaps an approach like this would work.  In the example below, I
> calculate the mean GDP for each 1 degree by 1 degree.
>
> temp$long1 <- floor(temp$longitude)
> temp$lat1 <- floor(temp$latitude)
> temp1 <- aggregate(GDP ~ long1 + lat1, temp, mean)
>
>   long1 lat1        GDP
> 1   -69  -55 0.90268640
> 2   -68  -55 0.09831317
> 3   -72  -54 0.22379000
> 4   -71  -54 0.14067290
> 5   -70  -54 0.00300380
> 6   -69  -54 0.00574220
>
> Jean
>
> On Thu, Jul 21, 2016 at 3:57 PM, Miluji Sb  wrote:
>
>> Dear all,
>>
>> I have the following GDP data by latitude and longitude at 0.5 degree by
>> 0.5 degree.
>>
>> temp <- dput(head(ptsDF,10))
>> structure(list(longitude = c(-68.25, -67.75, -67.25, -68.25,
>> -67.75, -67.25, -71.25, -70.75, -69.25, -68.75), latitude = c(-54.75,
>> -54.75, -54.75, -54.25, -54.25, -54.25, -53.75, -53.75, -53.75,
>> -53.75), GDP = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
>> 0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)), .Names =
>> c("longitude",
>> "latitude", "GDP"), row.names = c(4L, 17L, 30L, 43L, 56L, 69L,
>> 82L, 95L, 108L, 121L), class = "data.frame")
>>
>> I would like to aggregate the data 1 degree by 1 degree. I understand that
>> the first step is to convert to raster. I have tried:
>>
>> rasterDF <- rasterFromXYZ(temp)
>> r <- aggregate(rasterDF,fact=2, fun=sum)
>>
>> But this does not seem to work. Could anyone help me out please? Thank you
>> in advance.
>>
>> Sincerely,
>>
>> Milu
>>

[R] Using C library in R

2016-07-22 Thread Ganz, Carl

Hello everyone,
I am attempting to link to a C library named libxlsxwriter 
(http://libxlsxwriter.github.io/) that creates and styles XLSX files, but after 
several days of repeatedly reading "Writing R Extensions" I am stuck and 
hoping someone can help me.
The C library is easy to use and works with C++ as you would expect.
For example, here is test.cpp:
#include 
int main() {
  lxw_workbook  *workbook  = workbook_new("myexcel.xlsx");
  lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
  int row = 0;
  int col = 0;
  worksheet_write_string(worksheet, row, col, "Hello me!", NULL);
  return workbook_close(workbook);
}
To compile this I need to run the makefile in the libxlsxwriter folder, which 
compiles the libxlsxwriter library, and then call:
cc test.cpp -o test -Ipath.to.xlsxwriter.h path.to.libxlsxwriter.a -lz
This generates an executable that creates an Excel document. I understand this 
command says to compile test.cpp and specifies where to search for headers and 
what libraries to link to. So to get this to work with R, all I should need to do 
is make the library, point to the header, and link to the .a file.
To get this to work with R, I created a libxlsxwriter folder in my /src folder 
with the libxlsxwriter library in it, and added this makevars to my /src:
PKG_CFLAGS=
# specify header location
PKG_CPPFLAGS=-Ilibxlsxwriter/include
# specify libs to link to
PKG_LIBS=liblsxwriter/lib/libxlsxwriter.a -lz
# make libxlsxwriter
libxlsxwriter/lib/libxlsxwriter.a:
cd libxlsxwriter;$(MAKE)

When I build the package I can see that this runs the makevars, and generates 
the .a and .dll files in libxlsxwriter. I get no errors associated with headers 
or the libraries so I thought I would be good to go, but I can't seem to get 
anything to run.
Here is my test.cpp code I am trying to run in R with Rcpp:
#include 
#include 
using namespace Rcpp;
//' @useDynLib libxlsxwriter
//' @export
//' @import Rcpp
// [[Rcpp::export]]
void test() {
  lxw_workbook  *workbook  = workbook_new("myexcel.xlsx");
  lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
  int row = 0;
  int col = 0;
  worksheet_write_string(worksheet, row, col, "Hello me!", NULL);
  workbook_close(workbook);
}
I am fairly certain I am doing something wrong with my makevars that is 
preventing test.cpp from being compiled. I tried running the makevars for 
libxlsxwriter from the command line and then building using just this makevars:
PKG_CFLAGS=
# specify header location
PKG_CPPFLAGS=-Ilibxlsxwriter/include
# specify libs to link to
PKG_LIBS=liblsxwriter/lib/libxlsxwriter.a -lz
I see the .o files for test.cpp being built, which is a good sign, and 
indicates there was a problem with my previous makevars, but I get errors 
saying undefined reference for all the functions from libxlsxwriter, so clearly 
things aren't linking like I need them to. I am out of ideas at this point, so 
any guidance would be greatly appreciated.
A github repo with my code is here: https://github.com/carlganz/lwritexlsx
I was using this webpage as a guide for how to include a C library in an R 
package: http://mazamascience.com/WorkingWithData/?p=1151
Kind Regards,
Carl Ganz


Re: [R] Why the order of parameters in a logistic regression affects results significantly?

2016-07-22 Thread David Winsemius

> On Jul 21, 2016, at 3:04 PM, Qinghua He via R-help  
> wrote:
> 
> Using the same data, if I ran
> fit2 <- glm(formula=AR~Age+LumA+LumB+HER2+Basal+Normal, family=binomial, data=RacComp1)
> summary(fit2)
> exp(coef(fit2))
> I obtained:
>
> (Intercept)         Age        LumA        LumB        HER2       Basal      Normal
>  0.24866935  1.00433781  0.10639937  0.31614001  0.08220685 20.25180956          NA

> while if I ran
>
> fit2 <- glm(formula=AR~Age+LumA+LumB+Basal+Normal+HER2, family=binomial, data=RacComp1)
> summary(fit2)
> exp(coef(fit2))
> I obtained:
>
>  (Intercept)          Age         LumA         LumB        Basal       Normal         HER2
>   0.02044232   1.00433781   1.29428846   3.84566516 246.35185956  12.16443690           NA

> 
> Essentially they're the same model - I just moved HER2 to the last. But the 
> OR changed significantly. Can someone explain?

You have collinearity and one of your variables will be dropped as redundant. 
Which one is dropped is determined by the order of the variable names in the 
model formula.
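
A minimal sketch of this behaviour on simulated data (nothing here comes from the RacComp1 dataset; the three-subtype setup and the names sim, fit_a and fit_b are assumptions made for illustration). Every subject belongs to exactly one subtype, so the three indicators sum to the intercept column, and glm() reports NA for whichever indicator comes last:

```r
set.seed(42)
n <- 300
# Each simulated subject gets exactly one subtype.
subtype <- sample(c("LumA", "LumB", "HER2"), n, replace = TRUE)
sim <- data.frame(
  AR   = rbinom(n, 1, 0.4),
  LumA = as.integer(subtype == "LumA"),
  LumB = as.integer(subtype == "LumB"),
  HER2 = as.integer(subtype == "HER2")
)

# Same model, two term orders.
fit_a <- glm(AR ~ LumA + LumB + HER2, family = binomial, data = sim)
fit_b <- glm(AR ~ LumA + HER2 + LumB, family = binomial, data = sim)

print(coef(fit_a))   # HER2 is NA: it became the reference category
print(coef(fit_b))   # LumB is NA: now it is the reference category

# The coefficients differ, but the fitted probabilities agree.
same_fit <- isTRUE(all.equal(fitted(fit_a), fitted(fit_b), tolerance = 1e-6))
print(same_fit)
```

The two fits describe the same model; only the reference category, and therefore the meaning of each odds ratio, changes with the order.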

> For the latter result, I don't even know how to interpret it, as all factors have 
> OR>1 (except Intercept). How could that be possible? Can I eliminate the effect 
> of intercept?

In the first model (with the default of treatment contrasts) the Intercept is 
actually an estimate for cases with LumA, LumB, Basal, HER2 all at their lowest 
level, and this not coincidentally also precisely defines your Normal variable. 
They all (excepting Normal) have an adverse impact in your study of AR, whatever it 
might be. If these various categories (which I suspect are breast cancer risk 
predictors) are all distinct with no overlaps, then use this:

fit2 <- glm(formula=AR~Age+Normal+LumA+LumB+HER2+Basal+0, family=binomial, data=RacComp1)

The results will probably be the same as your first model except that 
Intercept's parameter will now be the parameter for Normal.

> Also, I cannot obtain OR for the last factor due to collinearity. However, I 
> know others obtained OR for all factors for the same dataset. Can someone 
> tell me how to obtain OR for all factors? All factors are categorical 
> variables (i.e., 0 or 1).
> Thanks!
> Peter

David Winsemius
Alameda, CA, USA


Re: [R] Why the order of parameters in a logistic regression affects results significantly?

2016-07-22 Thread Michael Dewey


Dear Peter

Have you tried removing the intercept? Just put -1 at the end of your 
formula.
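
A rough sketch of what that does, on simulated data (the data and the names sim and fit0 are invented for illustration and are not the poster's dataset). With mutually exclusive 0/1 indicators that cover every subject, dropping the intercept lets glm() estimate one coefficient per category, with none reported as NA:

```r
set.seed(42)
n <- 300
subtype <- sample(c("LumA", "LumB", "HER2"), n, replace = TRUE)
sim <- data.frame(
  AR   = rbinom(n, 1, 0.4),
  LumA = as.integer(subtype == "LumA"),
  LumB = as.integer(subtype == "LumB"),
  HER2 = as.integer(subtype == "HER2")
)

# "- 1" removes the intercept, so no indicator is redundant any more.
fit0 <- glm(AR ~ LumA + LumB + HER2 - 1, family = binomial, data = sim)
print(coef(fit0))

# Each coefficient is now the log-odds of AR within that category,
# so plogis() of it should match the observed proportion in the category.
est_prop   <- plogis(coef(fit0))
group_prop <- sapply(split(sim$AR, subtype), mean)
print(round(cbind(estimated = est_prop,
                  observed  = group_prop[names(est_prop)]), 4))
```

Note that exp(coef(fit0)) then gives the odds within each category rather than odds ratios against a reference category, so the numbers are not directly comparable with those from a model that keeps the intercept.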


On 21/07/2016 23:04, Qinghua He via R-help wrote:

Using the same data, if I ran
fit2 <- glm(formula=AR~Age+LumA+LumB+HER2+Basal+Normal, family=binomial, data=RacComp1)
summary(fit2)
exp(coef(fit2))
I obtained:

(Intercept)         Age        LumA        LumB        HER2       Basal      Normal
 0.24866935  1.00433781  0.10639937  0.31614001  0.08220685 20.25180956          NA

while if I ran

fit2 <- glm(formula=AR~Age+LumA+LumB+Basal+Normal+HER2, family=binomial, data=RacComp1)
summary(fit2)
exp(coef(fit2))
I obtained:

 (Intercept)          Age         LumA         LumB        Basal       Normal         HER2
  0.02044232   1.00433781   1.29428846   3.84566516 246.35185956  12.16443690           NA


Essentially they're the same model - I just moved HER2 to the last. But the OR 
changed significantly. Can someone explain?
For the latter result, I don't even know how to interpret it, as all factors have 
OR>1 (except Intercept). How could that be possible? Can I eliminate the effect of 
intercept?
Also, I cannot obtain OR for the last factor due to collinearity. However, I 
know others obtained OR for all factors for the same dataset. Can someone tell 
me how to obtain OR for all factors? All factors are categorical variables 
(i.e., 0 or 1).
Thanks!
Peter



--
Michael
http://www.dewey.myzen.co.uk/home.html


Re: [R] about interpolating data in r

2016-07-22 Thread William Dunlap via R-help

approx() has a 'rule' argument that controls how it deals with
extrapolation.  Run help(approx) and read about the details.
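
A short sketch of the difference, using made-up numbers (only 5.4 and 6.1 echo the values in the thread; the day indices and the middle value are invented). With rule = 1, the default, points outside the range of the data come back as NA; with rule = 2 the value at the nearest end of the data is carried outward:

```r
day   <- c(5, 10, 15)          # days on which C was observed
value <- c(5.4, 5.8, 6.1)      # observed values of C
query <- c(1, 5, 7.5, 15, 20)  # days we want values for

# rule = 1: no extrapolation, NA outside [5, 15]
inside_only <- approx(day, value, xout = query, rule = 1)$y

# rule = 2: hold the end values flat outside [5, 15]
held_flat <- approx(day, value, xout = query, rule = 2)$y

print(inside_only)
print(held_flat)
```

Holding the end values flat is still a guess about data that were never observed, so whether it is appropriate depends on the series.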

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jul 22, 2016 at 8:29 AM, lily li  wrote:

> Thanks, Ismail.
> For the gaps before 2009-01-05 and after 2009-11-20, I use the year 2010 to
> fill in the missing values for column C. There is no relationship between
> columns A, B, and C.
> For the missing values between 2009-01-05 and 2009-11-20, if there are any,
> I found this approach very helpful.
> with(df, approx(x=time, y=C, xout=seq(min(time), max(time), by="days")))
>
>
>
> On Thu, Jul 21, 2016 at 5:14 PM, Ismail SEZEN 
> wrote:
>
> >
> > > On 22 Jul 2016, at 01:34, lily li  wrote:
> > >
> > > I have a question about interpolating missing values in a dataframe.
> >
> > First of all, filling in missing values must be approached very
> > carefully. You need to know the nature of the data being filled, and
> > most of the time leaving them as NA is the most appropriate action.
> >
> > > The
> > > dataframe is as follows. Column C has no data before 2009-01-05 and
> > > after 2009-12-31. How do I interpolate data for the blanks?
> >
> > Why a dataframe? Is there any relationship between columns A, B and C? If
> > there is, then you might want to consider filling missing values by a
> > linear model approach instead of interpolation. You said that there is no
> > data before 2009-01-05 and after 2009-12-31, but according to the dataframe
> > there is no data after 2009-11-20?
> >
> > > That is to say,
> > > interpolate linearly between these two gaps using 5.4 and 6.1? Thanks.
> >
> > Also, you mention interpolating blanks, but you want interpolation between
> > two gaps? Do you want to fill missing values before 2009-01-05 and after
> > 2009-11-20 or do you want to find intermediate values between 2009-01-05
> > and 2009-11-20? This is a bit unclear.
> >
> > >
> > >
> > > df
> > > time         A     B     C
> > > 2009-01-01   3     4.5
> > > 2009-01-02   4     5
> > > 2009-01-03   3.3   6
> > > 2009-01-04   4.1   7
> > > 2009-01-05   4.4   6.2   5.4
> > > ...
> > >
> > > 2009-11-20   5.1   5.5   6.1
> > > 2009-11-21   5.4   4
> > > ...
> > > 2009-12-31   4.5   6
> >
> >
> > If you want to fill missing values at the end-points for column C (before
> > 2009-01-05 and after 2009-11-20), and all the data you have is between
> > 2009-01-05 and 2009-11-20, this means that you want extrapolation (guessing
> > unknown values that lie outside the range of known values). So, you can use
> > only the values in column C to guess the missing end-point values. You can
> > use the splinefun (or spline) functions for this purpose. But let me note
> > that this kind of approach might help you only for a few missing values
> > close to the end-points. Otherwise, you might find yourself making a huge
> > mistake.
> >
> > As I mentioned in my first sentence, if you have a relationship between
> > all columns or you have data for column C for other years (for instance,
> > assume that you have data for column C for 2007, 2008, and 2010 but not
> > 2009) you may want to try a statistical approach to fill the missing values.
> >
> >
> >
>

Re: [R] about interpolating data in r

2016-07-22 Thread lily li

Thanks, Ismail.
For the gaps before 2009-01-05 and after 2009-11-20, I use the year 2010 to
fill in the missing values for column C. There is no relationship between
columns A, B, and C.
For the missing values between 2009-01-05 and 2009-11-20, if there are any,
I found this approach very helpful.
with(df, approx(x=time, y=C, xout=seq(min(time), max(time), by="days")))
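
For anyone who wants to try this, here is a small self-contained sketch of the same idea on invented data (the six dates and the values of C are made up, and C_filled is a hypothetical column name). approx() ignores the rows where C is NA and interpolates linearly between the remaining points:

```r
# Made-up daily series with gaps inside the observed range.
df <- data.frame(
  time = as.Date("2009-01-05") + 0:5,
  C    = c(5.4, NA, NA, 6.0, NA, 6.1)
)

# Work on the numeric day count so the arithmetic is explicit.
days   <- as.numeric(df$time)
filled <- approx(x = days, y = df$C, xout = days)

df$C_filled <- filled$y
print(df)
```

The interior gaps are filled along straight lines between neighbouring observations, and the values that were already present are returned unchanged.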



On Thu, Jul 21, 2016 at 5:14 PM, Ismail SEZEN  wrote:

>
> > On 22 Jul 2016, at 01:34, lily li  wrote:
> >
> > I have a question about interpolating missing values in a dataframe.
>
> First of all, filling in missing values must be approached very
> carefully. You need to know the nature of the data being filled, and
> most of the time leaving them as NA is the most appropriate action.
>
> > The
> > dataframe is as follows. Column C has no data before 2009-01-05 and
> > after 2009-12-31. How do I interpolate data for the blanks?
>
> Why a dataframe? Is there any relationship between columns A, B and C? If
> there is, then you might want to consider filling missing values by a
> linear model approach instead of interpolation. You said that there is no
> data before 2009-01-05 and after 2009-12-31, but according to the dataframe
> there is no data after 2009-11-20?
>
> > That is to say,
> > interpolate linearly between these two gaps using 5.4 and 6.1? Thanks.
>
> Also, you mention interpolating blanks, but you want interpolation between
> two gaps? Do you want to fill missing values before 2009-01-05 and after
> 2009-11-20 or do you want to find intermediate values between 2009-01-05
> and 2009-11-20? This is a bit unclear.
>
> >
> >
> > df
> > time         A     B     C
> > 2009-01-01   3     4.5
> > 2009-01-02   4     5
> > 2009-01-03   3.3   6
> > 2009-01-04   4.1   7
> > 2009-01-05   4.4   6.2   5.4
> > ...
> >
> > 2009-11-20   5.1   5.5   6.1
> > 2009-11-21   5.4   4
> > ...
> > 2009-12-31   4.5   6
>
>
> If you want to fill missing values at the end-points for column C (before
> 2009-01-05 and after 2009-11-20), and all the data you have is between
> 2009-01-05 and 2009-11-20, this means that you want extrapolation (guessing
> unknown values that lie outside the range of known values). So, you can use
> only the values in column C to guess the missing end-point values. You can
> use the splinefun (or spline) functions for this purpose. But let me note
> that this kind of approach might help you only for a few missing values
> close to the end-points. Otherwise, you might find yourself making a huge
> mistake.
>
> As I mentioned in my first sentence, if you have a relationship between
> all columns or you have data for column C for other years (for instance,
> assume that you have data for column C for 2007, 2008, and 2010 but not
> 2009) you may want to try a statistical approach to fill the missing values.
>
>
>


Re: [R] PDF extraction with tm package

2016-07-22 Thread Jeff Newmiller

This is neither the Xpdf support forum nor the Windows Setup Program 
Reinvention support group... and you really need to read and follow the Posting 
Guide for the R mailing lists.

FWIW I would guess that you need to learn about environment variables and in 
particular about the PATH variable. There are subtleties about when and how 
they get defined that are OS-specific and certainly off topic here that may 
trip you up along the way. Alternatively, you may read the Xpdf documentation 
or a how-to blog about Xpdf that gives you a recipe, but again that is not 
about R. Once you can start a CMD shell and run the command directly then you 
are most of the way to getting R to invoke it.
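
As a rough sketch of how to inspect this from within R (the directory below is simply the one mentioned in the question and may not exist on another machine; the variable names are made up for illustration). Note that PATH entries are directories that get searched for executables, not paths to the executables themselves:

```r
tools_needed <- c("pdfinfo", "pdftotext")

# Where, if anywhere, does R find these programs? "" means not found.
located <- Sys.which(tools_needed)
print(located)

# The directories currently searched, one per element.
path_dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
print(head(path_dirs))

# Assumed install location, taken from the question.
xpdf_dir <- "C:/Program Files/Xpdf/bin64"
if (dir.exists(xpdf_dir) && !all(nzchar(located))) {
  # Prepend the directory for this R session only.
  Sys.setenv(PATH = paste(xpdf_dir, Sys.getenv("PATH"),
                          sep = .Platform$path.sep))
  print(Sys.which(tools_needed))
}
```

A change made with Sys.setenv() lasts only for the current R session; a permanent change has to be made in the operating system's own settings.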
-- 
Sent from my phone. Please excuse my brevity.

On July 21, 2016 5:26:26 PM PDT, Steven Kang  wrote:
>Hi R users,
>
>I’m having some issues trying to extract texts from PDF file using tm
>package.
>
>Here are the steps that were carried out:
>
>1. Downloaded and installed the following programs:
>
>- Xpdf (Copied the ‘bin32’, ‘bin64’, ‘doc’ folders into ‘C:\Program
>Files\Xpdf’ directory; also added C:\Program Files\Xpdf\bin64\pdfinfo.exe &
>C:\Program Files\Xpdf\bin64\pdftotext.exe to the existing PATH)
>
>- Tesseract
>
>- Imagemagick
>
>2. Used the following scripts and the corresponding error messages:
>
># Directory where PDF files are stored
>
>>cname <- getwd()
>
>>Corpus(DirSource(cname), readerControl=list(reader = readPDF))
>
>Error in system2("pdftotext", c(control$text, shQuote(x), "-"), stdout = TRUE) :
>'"pdftotext"' not found
>
> In addition: Warning message:
>
>running command '"pdfinfo" "C:\Users\R_Files\XXX.pdf"' had status 127
>
>>file.exists(Sys.which(c("pdfinfo","pdftpotext")))
>[1] FALSE FALSE
>
>It seems like R can’t find the pdfinfo & pdftotext exe files, but I am not sure
>why this would be the case despite the Xpdf files being copied into
>‘C:\Program Files’ (I’m using Windows 7 64-bit).
>
>I’m aware that the ‘pdf_text’ function from the pdftools package can extract
>text from a PDF file and output it as a string. But I was after something
>which is able to convert a PDF (i.e. transaction data) into a dataframe without
>regular expressions. Is the tm package capable of doing this conversion? Are
>there any other alternatives to these methods?
>
>Your expertise in resolving this problem would be highly appreciated.
>
>
>Steve
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Why the order of parameters in a logistic regression affects results significantly?

2016-07-22 Thread Greg Snow

Please post in plain text; the message is very hard to read with the
reformatting that was done.

Did you receive any warnings when you fit your models?

The fact that the last coefficient is NA in both outputs suggests that
there was some co-linearity in your predictor variables and R chose to
drop one of the offending variables from the model (the last one in
each case).  Depending on the nature of the co-linearity, the
interpretation (and therefore the estimates) can change.

For example, let's say that you have 3 predictors, red, green, and blue
that are indicator variables (0/1) and that every subject has a 1 in
exactly one of those variables (so they are co-linear with the
intercept).  If you put the 3 variables into a model with the
intercept in the above order, then R will drop the blue variable and
the interpretation of the coefficients is that the intercept is the
average for blue subjects and the other coefficients are the
differences between red/green and blue on average.  If you refit the
model with the order blue, green, red, then R will drop red from the
model and now the interpretation is that the intercept is the mean for
red subjects and the others are the differences from red on average, a
very different interpretation and therefore different estimates.

I expect something along those lines is going on here.
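
A small simulated illustration of the red/green/blue example may help. This is a sketch under stated assumptions: the data are made up, and it uses lm() rather than a binomial glm() so that the intercept can be read directly as a group mean; the same dropping of the last aliased variable happens in glm().

```r
# Made-up data: every subject has a 1 in exactly one of red/green/blue,
# so the three indicators sum to the intercept column.
set.seed(1)
colour <- rep(c("red", "green", "blue"), each = 20)
d <- data.frame(
  red   = as.integer(colour == "red"),
  green = as.integer(colour == "green"),
  blue  = as.integer(colour == "blue")
)
d$y <- 1 + 2 * d$red + 5 * d$green + rnorm(nrow(d), sd = 0.1)

# Same model, different term order: R drops the last aliased variable.
fit_rgb <- lm(y ~ red + green + blue, data = d)  # blue is reported as NA
fit_bgr <- lm(y ~ blue + green + red, data = d)  # red is reported as NA

coef(fit_rgb)  # intercept is the mean of the blue subjects
coef(fit_bgr)  # intercept is the mean of the red subjects
```

The coefficients differ between the two fits, but the fitted values are identical; only the reference group changes.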

On Thu, Jul 21, 2016 at 4:04 PM, Qinghua He via R-help
 wrote:
> Using the same data, if I ran
>
> fit2 <- glm(formula=AR~Age+LumA+LumB+HER2+Basal+Normal, family=binomial, data=RacComp1)
> summary(fit2)
> exp(coef(fit2))
>
> I obtained:
>> exp(coef(fit2))
>> (Intercept)         Age        LumA        LumB        HER2       Basal      Normal
>>  0.24866935  1.00433781  0.10639937  0.31614001  0.08220685 20.25180956          NA
> while if I ran
>
> fit2 <- glm(formula=AR~Age+LumA+LumB+Basal+Normal+HER2, family=binomial, data=RacComp1)
> summary(fit2)
> exp(coef(fit2))
>
> I obtained:
>> exp(coef(fit2))
>>  (Intercept)          Age         LumA         LumB        Basal       Normal         HER2
>>   0.02044232   1.00433781   1.29428846   3.84566516 246.35185956  12.16443690           NA
>
> Essentially they're the same model - I just moved HER2 to the last. But the 
> OR changed significantly. Can someone explain?
> For the latter result, I don't even know how to interpret it, as all factors
> have OR>1 (except Intercept). How could that be possible? Can I eliminate the
> effect of the intercept?
> Also, I cannot obtain OR for the last factor due to collinearity. However, I 
> know others obtained OR for all factors for the same dataset. Can someone 
> tell me how to obtain OR for all factors? All factors are categorical 
> variables (i.e., 0 or 1).
> Thanks!
> Peter

-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

Re: [R] Aggregate data to lower resolution

2016-07-22 Thread Adams, Jean

Milu,

Perhaps an approach like this would work.  In the example below, I
calculate the mean GDP for each 1 degree by 1 degree cell.

temp$long1 <- floor(temp$longitude)
temp$lat1 <- floor(temp$latitude)
temp1 <- aggregate(GDP ~ long1 + lat1, temp, mean)

  long1 lat1        GDP
1   -69  -55 0.90268640
2   -68  -55 0.09831317
3   -72  -54 0.22379000
4   -71  -54 0.14067290
5   -70  -54 0.00300380
6   -69  -54 0.00574220
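
For anyone who wants to paste this straight into R, here is a self-contained sketch that rebuilds the ten rows from the dput() output in the question and runs the same steps. The sum variant is an addition of this sketch: the question used fun=sum, and whether mean or sum is appropriate depends on whether GDP is a per-cell total or a density, which the thread does not say.

```r
# The ten rows from the question's dput() output.
temp <- data.frame(
  longitude = c(-68.25, -67.75, -67.25, -68.25, -67.75, -67.25,
                -71.25, -70.75, -69.25, -68.75),
  latitude  = c(-54.75, -54.75, -54.75, -54.25, -54.25, -54.25,
                -53.75, -53.75, -53.75, -53.75),
  GDP       = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
                0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)
)

# Label each 0.5-degree point with the 1-degree cell it falls in.
temp$long1 <- floor(temp$longitude)
temp$lat1  <- floor(temp$latitude)

# One row per 1-degree cell: mean GDP, or alternatively total GDP.
gdp_mean <- aggregate(GDP ~ long1 + lat1, data = temp, FUN = mean)
gdp_sum  <- aggregate(GDP ~ long1 + lat1, data = temp, FUN = sum)
gdp_mean
```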

Jean

On Thu, Jul 21, 2016 at 3:57 PM, Miluji Sb  wrote:

> Dear all,
>
> I have the following GDP data by latitude and longitude at 0.5 degree by
> 0.5 degree.
>
> temp <- dput(head(ptsDF,10))
> structure(list(longitude = c(-68.25, -67.75, -67.25, -68.25,
> -67.75, -67.25, -71.25, -70.75, -69.25, -68.75), latitude = c(-54.75,
> -54.75, -54.75, -54.25, -54.25, -54.25, -53.75, -53.75, -53.75,
> -53.75), GDP = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
> 0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)), .Names =
> c("longitude",
> "latitude", "GDP"), row.names = c(4L, 17L, 30L, 43L, 56L, 69L,
> 82L, 95L, 108L, 121L), class = "data.frame")
>
> I would like to aggregate the data 1 degree by 1 degree. I understand that
> the first step is to convert to raster. I have tried:
>
> rasterDF <- rasterFromXYZ(temp)
> r <- aggregate(rasterDF,fact=2, fun=sum)
>
> But this does not seem to work. Could anyone help me out please? Thank you
> in advance.
>
> Sincerely,
>
> Milu

Re: [R] about interpolating data in r

2016-07-22 Thread Jim Lemon

Hi lili,
The problem may lie in the fact that I think you are using
"interpolate" when you mean "extrapolate". In that case, the best you
can do is spread values beyond the points that you have. Find the
slope of the line, put a point at each end of your time data
(2009-01-01 and 2009-12-31) and use "approx" on all three gaps. Note
that this slope is a slippery one indeed and few will accept that the
values so generated mean anything.
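
A rough sketch of that recipe, assuming the example data posted later in this thread (column A with NAs at both ends). The variable names are made up for illustration, and the extrapolated end values deserve all the suspicion mentioned above.

```r
# Example data from the thread: A is missing at the start and at the end.
df <- data.frame(
  A = c(NA, NA, 10, 11, 12, NA),
  time = as.Date(c("1990-01-01", "1990-02-07", "1990-02-14",
                   "1990-02-28", "1990-03-01", "1990-03-20"))
)
t_num <- as.numeric(df$time)
known <- !is.na(df$A)

# Find the slope of the line through the known values.
known_df <- data.frame(A = df$A[known], t_num = t_num[known])
line <- lm(A ~ t_num, data = known_df)

# Put a point at each end of the time range, taken from that line.
ends <- range(t_num)
end_vals <- unname(predict(line, newdata = data.frame(t_num = ends)))

# Interpolate across all gaps. This assumes the end dates are not
# themselves known points; otherwise approx() would see tied x values.
x <- c(ends[1], t_num[known], ends[2])
y <- c(end_vals[1], df$A[known], end_vals[2])
filled_linear <- approx(x, y, xout = t_num)$y

# Alternative: rule = 2 just carries the nearest known value outwards.
filled_const <- approx(t_num[known], df$A[known], xout = t_num, rule = 2)$y
```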

Jim

On Fri, Jul 22, 2016 at 9:38 AM, Ismail SEZEN  wrote:
>
>> On 22 Jul 2016, at 01:54, lily li  wrote:
>>
>> Thanks, I meant if there are missing data at the beginning and end of a
>> dataframe, how to interpolate according to available data?
>>
>> For example, the A column has missing values at the beginning and end, how
>> to interpolate linearly between 10 and 12 for the missing values?
>>
>> df <- data.frame(A=c(NA, NA,10,11,12, NA),B=c(5,5,4,3,4,5),C=c(3.3,4,3,1.5,2.2,4),
>>   time=as.Date(c("1990-01-01","1990-02-07","1990-02-14","1990-02-28","1990-03-01","1990-03-20")))
>>
>
> As William answered,
>
> with(df, approx(x=time, y=A, xout=seq(min(time, na.rm = T), max(time, na.rm = T), by="days")))
>
> will help you interpolate linearly between known values even if the column has NAs.
>
>
>>
>> On Thu, Jul 21, 2016 at 4:48 PM, William Dunlap  wrote:
>>
>>> Try approx(), as in:
>>>
>>> df <-
>>> data.frame(A=c(10,11,12),B=c(5,5,4),C=c(3.3,4,3),time=as.Date(c("1990-01-01","1990-02-07","1990-02-14")))
>>> with(df, approx(x=time, y=C, xout=seq(min(time), max(time), by="days")))
>>>
>>> Do you notice how one can copy and paste that example out of the
>>> mail and into R to see how it works?  It would help if your questions
>>> had that same property - show how the example data could be created.
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Thu, Jul 21, 2016 at 3:34 PM, lily li  wrote:
>>>
>>>> I have a question about interpolating missing values in a dataframe. The
>>>> dataframe is in the following, Column C has no data before 2009-01-05 and
>>>> after 2009-12-31, how to interpolate data for the blanks? That is to say,
>>>> interpolate linearly between these two gaps using 5.4 and 6.1? Thanks.
>>>>
>>>>
>>>> df
>>>> time        A     B     C
>>>> 2009-01-01  3     4.5
>>>> 2009-01-02  4     5
>>>> 2009-01-03  3.3   6
>>>> 2009-01-04  4.1   7
>>>> 2009-01-05  4.4   6.2   5.4
>>>> ...
>>>>
>>>> 2009-11-20  5.1   5.5   6.1
>>>> 2009-11-21  5.4   4
>>>> ...
>>>> 2009-12-31  4.5   6

>>>
>>>
>>

Re: [R] installing the lubricate package

2016-07-22 Thread Hervé Pagès


also it's lubridate, not lubricate :-/

On 07/21/2016 05:44 PM, Ismail SEZEN wrote:

You don't have to download and install from GitHub. You can easily install
the lubridate package from the CRAN repository. If you really intend to
install from GitHub, I advise you to install the devtools package first and
use the install_github function.

http://www.inside-r.org/packages/cran/devtools/docs/install_github
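
In R this amounts to something like the sketch below. install_if_missing is a hypothetical convenience wrapper, not a function from any package; the only real calls are requireNamespace(), install.packages() and, in the comment, devtools::install_github(). The GitHub repository path is taken from the URL in the question.

```r
# Hypothetical wrapper: install a package from CRAN only if it is not
# already available. The installer argument exists so the behaviour can be
# checked without touching the network.
install_if_missing <- function(pkg, installer = utils::install.packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usual route (CRAN), works the same on Mac and Windows:
# install_if_missing("lubridate")

# Development version from GitHub, only if really needed:
# install_if_missing("devtools")
# devtools::install_github("hadley/lubridate")
```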

On Fri, Jul 22, 2016, 03:36 lily li  wrote:


Hi R users,

I'm trying to download lubricate from this website, and then install it on
my mac.
https://github.com/hadley/lubridate

but it says windows version does not apply to mac. How to install the
package for mac? Thanks.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

Re: [R] Changing Index Labelling on x axis

2016-07-22 Thread Ivan Calandra


Hi Nick,

If I understand you correctly, that should do it:
plot(x=seq_along(data)+2, y=data)
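
Since the question mentions the axis function, an alternative sketch is to leave the points where they are and only relabel the ticks. This assumes the goal is purely cosmetic; x_labels is a name introduced here for illustration.

```r
data <- c(2, 3, 4, 5, 4, 3, 2)
x_labels <- seq_along(data) + 2L   # labels shifted up by 2

# Draw the plot without its default x axis, then add a relabelled one.
plot(data, xaxt = "n", xlab = "Index")
axis(side = 1, at = seq_along(data), labels = x_labels)
```

The difference from the one-liner above is that here the points stay at positions 1 to 7 and only the printed tick labels change.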

HTH,
Ivan

--
Ivan Calandra, PhD
Scientific Mediator
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calan...@univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 22/07/2016 at 10:47, WRAY NICHOLAS wrote:

Hi  I have a vector of data (for example c(2,3,4,5,4,3,2))

data<-c(2,3,4,5,4,3,2)
plot(data)

I simply want to up the index values along the x axis by 2, so that instead of
1, I have 3, instead of 2 I have 4 etc etc.  Despite ages playing around with
the axis function I can't get it to work, and keep getting weird error messages.
  This is likely to be laughably simple but if someone can help me out I'd be
grateful

Thanks Nick W

[R] Changing Index Labelling on x axis

2016-07-22 Thread WRAY NICHOLAS

Hi  I have a vector of data (for example c(2,3,4,5,4,3,2))

data<-c(2,3,4,5,4,3,2)
plot(data)

I simply want to up the index values along the x axis by 2, so that instead of
1, I have 3, instead of 2 I have 4 etc etc.  Despite ages playing around with
the axis function I can't get it to work, and keep getting weird error messages.
 This is likely to be laughably simple but if someone can help me out I'd be
grateful

Thanks Nick W

Re: [R] How to plot marginal effects (MEM) in R?

2016-07-22 Thread David Winsemius


> On Jul 21, 2016, at 11:44 PM, Faradj Koliev  wrote:
> 
> Dear David Winsemius, 
> 
> Thank you!  
> 
> The sample make no sense, I know. The real data is too big. So, I only want 
> to understand how to plot marginal effects, to visualize them in a proper 
> way. 
> 

Before you start plotting, you first need to understand that with
interactions in the model there are N x M effects, where N and M are the
numbers of levels in the two covariates. The whole point of models is to boil
down large amounts of "real data", but not so much boiling that you get a
congealed, burned molasses.

The package you chose to examine the interactions appears ill-equipped to
provide you with the necessary analysis.
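
One way to see the N x M situations is to predict on a grid of X1 and X4 values and plot those predictions. What follows is a sketch only: the data are simulated (the coefficients are invented, not estimates from your data), X1, X2 and X4 are assumed to be 0/1 variables as the mfx output suggests, and X2 and X3 are held at arbitrarily chosen reference values.

```r
# Simulated stand-in data; the coefficients below are invented.
set.seed(42)
n <- 200
d <- data.frame(
  X1 = rbinom(n, 1, 0.5),
  X2 = rbinom(n, 1, 0.5),
  X3 = rnorm(n),
  X4 = rbinom(n, 1, 0.5)
)
eta <- -0.5 + 1.0 * d$X1 + 0.2 * d$X2 + 0.3 * d$X3 - 0.4 * d$X4 +
  0.8 * d$X1 * d$X4
d$Y <- rbinom(n, 1, plogis(eta))

fit <- glm(Y ~ X1 + X2 + X3 + X4 + X1:X4, data = d, family = binomial)

# Every combination of X1 and X4, with the other covariates held fixed.
grid <- expand.grid(X1 = 0:1, X4 = 0:1)
grid$X2 <- 0
grid$X3 <- mean(d$X3)
grid$prob <- predict(fit, newdata = grid, type = "response")

# The "effect of X1" is a different number at each level of X4.
x1_effect_at_x4_0 <- with(grid, prob[X1 == 1 & X4 == 0] - prob[X1 == 0 & X4 == 0])
x1_effect_at_x4_1 <- with(grid, prob[X1 == 1 & X4 == 1] - prob[X1 == 0 & X4 == 1])

# One line per level of X4; non-parallel lines show the interaction.
interaction.plot(x.factor = grid$X1, trace.factor = grid$X4,
                 response = grid$prob, xlab = "X1", trace.label = "X4",
                 ylab = "Predicted probability")
```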

-- 
David.
> Best,
> 
> 
>> On 22 July 2016 at 08:35, David Winsemius wrote:
>> 
>>> 
>>> On Jul 21, 2016, at 2:22 PM, Faradj Koliev  wrote:
>>> 
>>> Dear all, 
>>> 
>>> I have two logistic regression models:
>>> 
>>> 
>>>  • model <- glm(Y ~ X1+X2+X3+X4, data = data, family = "binomial")
>>> 
>>> 
>>> 
>>>  • modelInteraction <- glm(Y ~ X1+X2+X3+X4+X1*X4, data = data, family = 
>>> "binomial")
>>> 
>>> To calculate the marginal effects (MEM approach) for these models, I used 
>>> the `mfx` package:
>>> 
>>> 
>>>  • a<- logitmfx(model, data=data, atmean=TRUE)
>>> 
>>> 
>>> 
>>>   •b<- logitmfx(modelInteraction, data=data, atmean=TRUE)
>>> 
>>> 
>>> What I want to do now is 1) plot all the results for "model" and 2) show 
>>> the result just for two variables: X1 and X2. 
>>> 3) I also want to plot the interaction term in ”modelInteraction”.
>> 
>> There is no longer a single "effect" for X1 in modelInteraction in contrast 
>> to the manner as there might be an "effect" for X2. There can only be 
>> predictions for combined situations with particular combinations of values 
>> for X1 and X4.
>> 
>>> model
>> 
>> Call:  glm(formula = Y ~ X1 + X2 + X3 + X4, family = "binomial", data = data)
>> 
>> Coefficients:
>> (Intercept)           X1           X2           X3           X4  
>>     -0.3601       1.3353       0.1056       0.2898      -0.3705  
>> 
>> Degrees of Freedom: 68 Total (i.e. Null);  64 Residual
>> Null Deviance:   66.78 
>> Residual Deviance: 62.27 AIC: 72.27
>> 
>> 
>>> modelInteraction
>> 
>> Call:  glm(formula = Y ~ X1 + X2 + X3 + X4 + X1 * X4, family = "binomial", 
>>data = data)
>> 
>> Coefficients:
>> (Intercept)           X1           X2           X3           X4        X1:X4  
>>     90.0158     -90.0747       0.1183       0.3064     -15.3688      15.1593  
>> 
>> Degrees of Freedom: 68 Total (i.e. Null);  63 Residual
>> Null Deviance:   66.78 
>> Residual Deviance: 61.49 AIC: 73.49
>> 
>> Notice that a naive attempt to plot an X1  "effect" in modelInteraction 
>> might pick the -90.07 value which would then ignore both the much larger 
>> Intercept value and also ignore the fact that the interaction term has now 
>> split the X4 (and X1) "effects" into multiple pieces.
>> 
>> You need to interpret the effects of X1 in the context of a specification of 
>> a particular X4 value and not forget that the Intercept should not be 
>> ignored. It appears to me that the estimates of the mfx package are 
>> essentially meaningless with the problem you have thrown at it.
>> 
>>> a
>> Call:
>> logitmfx(formula = model, data = data, atmean = TRUE)
>> 
>> Marginal Effects:
>>   dF/dx Std. Err.   z   P>|z|  
>> X1  0.147532  0.087865  1.6791 0.09314 .
>> X2  0.015085  0.193888  0.0778 0.93798  
>> X3  0.040309  0.063324  0.6366 0.52441  
>> X4 -0.050393  0.092947 -0.5422 0.58770  
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> 
>> dF/dx is for discrete change for the following variables:
>> 
>> [1] "X1" "X2" "X4"
>>> b
>> Call:
>> logitmfx(formula = modelInteraction, data = data, atmean = TRUE)
>> 
>> Marginal Effects:
>>             dF/dx   Std. Err.         z  P>|z|
>> X1       -1.e+00  1.2121e-07 -8.25e+06 <2e-16 ***
>> X2    6.5595e-03  8.1616e-01  8.00e-03 0.9936
>> X3    1.6312e-02  2.0326e+00  8.00e-03 0.9936
>> X4   -9.6831e-01  1.5806e+01 -6.13e-02 0.9511
>> X1:X4 8.0703e-01  1.4572e+01  5.54e-02 0.9558
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> 
>> dF/dx is for discrete change for the following variables:
>> 
>> [1] "X1" "X2" "X4"
>> 
>> I see no sensible interpretation of the phrase "X1 effect" in the comparison 
>> tables above. The "p-value" in the second table appears to be nonsense 
>> induced by throwing a model formulation that was not anticipated. There is a 
>> negligible improvement in the glm fits:
>> 
>>> anova(model,modelInteraction)
>> Analysis of Deviance Table
>> 
>> Model 1: Y ~ X1 + X2 + X3 + X4
>> Model 2: Y ~ X1 + X2 + X3 + X4 + X1 * X4
>>   Resid. Df Resid. Dev Df Deviance
>> 1        64     62.274
>> 2        63     61.495  1  0.77908
>> 
>> 
>> So the notion that the "X1 effect" is now "highly significant" where it was 
>> before not e

Re: [R] readline issue with 3.3.1

2016-07-22 Thread Ralf Goertz

On Thu, 21 Jul 2016 18:07:43 +0200
Martin Maechler wrote:

> Ralf Goertz  on Wed, 20 Jul 2016 16:37:53 +0200
> writes:

>> I installed readline version 6.3 and the issue is gone. So probably
>> some of the recent changes in R's readline code are incompatible with
>> readline version 6.2.
> 
> Yes, it seems so, unfortunately.
> 
> Thank you for reporting !

It would be great if – while fixing this – you also took care of the
SIGWINCH problem described in bug report
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16604 

Thanks, Ralf
