Re: [R] Logistic regression for large data

2022-11-14 Thread Bill Dunlap
summary(Base)

would show whether one of the columns of Base was read as character data
instead of the expected numeric.  That could cause an explosion in the number
of dummy variables, and hence a huge design matrix.
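
A minimal sketch of that check, on a small made-up file rather than the real
Base.csv (which was not available on the list). It assumes, purely for
illustration, that the file uses decimal commas, which is one common way a
semicolon-separated CSV ends up with character columns. The names tmp, toy,
bad and good are hypothetical.

```r
# Hypothetical stand-in for Base.csv: semicolon-separated, decimal commas.
set.seed(42)
n <- 200
toy <- data.frame(
  V1 = rbinom(n, 1, 0.5),
  V2 = rbinom(n, 1, 0.5),
  V3 = gsub(".", ",", sprintf("%.4f", rnorm(n)), fixed = TRUE)
)
tmp <- tempfile(fileext = ".csv")
write.table(toy, tmp, sep = ";", row.names = FALSE,
            col.names = FALSE, quote = FALSE)

# Read the way the original post does: dec defaults to ".", so V3
# cannot be parsed as a number.
bad <- read.csv(tmp, header = FALSE, sep = ";")
print(sapply(bad, class))  # V3 is character (factor in R < 4.0)
print(summary(bad))

# A character column in a formula is treated as a factor, giving one
# design-matrix column per distinct value (with the intercept).
mm_bad <- model.matrix(~ V3, data = bad)
print(ncol(mm_bad))

# Telling read.csv about the decimal comma gives a numeric column and
# a two-column design matrix (intercept plus slope).
good <- read.csv(tmp, header = FALSE, sep = ";", dec = ",")
print(sapply(good, class))
mm_good <- model.matrix(~ V3, data = good)
print(ncol(mm_good))

unlink(tmp)
```

If the real file shows a character column in the same way, rereading it with
the right dec (or colClasses) setting would be the first thing to try.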

-Bill


On Fri, Nov 11, 2022 at 11:30 PM George Brida wrote:

> Dear R users,
>
> I have a data set called Base.csv (attached to this email) with 13 columns
> and 8257 rows, whose first 8 columns are dummy variables that take the
> value 1 or 0. The problem is that when I run the following instructions to
> fit a logistic regression, R runs for hours and hours without producing
> any output:
>
> Base=read.csv("C:\\Users\\HP\\Desktop\\New\\Base.csv",header=FALSE,sep=";")
>
> fit_1=glm(Base[,2]~Base[,1]+Base[,10]+Base[,11]+Base[,12]+Base[,13],family=binomial(link="logit"))
>
> Apparently, there is not enough memory to produce the requested output. Is
> there another function for logistic regression that handles large data and
> returns output in reasonable time?
>
> Many thanks
>
> Kind regards
>
> George
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Logistic regression for large data

2022-11-12 Thread Ebert,Timothy Aaron
Hi George,
   I did not get an attachment.
   My first step would be to try simplifying things. Do all of these work?

fit_1=glm(Base[,2]~Base[,1],family=binomial(link="logit"))
fit_1=glm(Base[,2]~Base[,10],family=binomial(link="logit"))
fit_1=glm(Base[,2]~Base[,11],family=binomial(link="logit"))
fit_1=glm(Base[,2]~Base[,12],family=binomial(link="logit"))
fit_1=glm(Base[,2]~Base[,13],family=binomial(link="logit"))

This is not a large dataset. That said, if your computer is nearly out of
memory, even a small dataset might be too much. The machine might have plenty
of physical memory but also lots of other things (open files, cookies,
applications, and so on) that eat into it.
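
As a rough sketch of how one might check this from within R, the code below
builds a simulated data frame with the same shape as the one described
(8257 rows, 13 columns) and reports how much memory it occupies. Base_sim and
its contents are made up for illustration; with the real data one would call
object.size(Base) instead.

```r
# Simulated stand-in with the dimensions given in the question:
# 8 columns of 0/1 dummies and 5 numeric columns.
set.seed(1)
n <- 8257
Base_sim <- data.frame(
  matrix(rbinom(n * 8, 1, 0.5), ncol = 8),
  matrix(rnorm(n * 5), ncol = 5)
)
names(Base_sim) <- paste0("V", 1:13)

# Memory taken by the data frame itself.
size_bytes <- as.numeric(object.size(Base_sim))
print(object.size(Base_sim), units = "auto")

# Memory currently used by the R session as a whole.
print(gc())
```

A numeric data frame of this shape takes well under a few megabytes, so a
much larger figure for the real Base would itself be a hint that some columns
were not read as numbers.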

Regards,
Tim






Re: [R] Logistic regression for large data

2022-11-11 Thread David Winsemius
That’s not a large data set. Something other than memory limits is going on.
You should post the output of summary(Base).
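
To illustrate the point about size, here is a sketch that fits the same kind
of model to simulated data with 8257 rows, one dummy predictor and four
numeric predictors. The data frame sim, the variable names and the
coefficients are all made up; timings depend on the machine, so none are
quoted here.

```r
# Simulated data of the same size as in the question.
set.seed(123)
n <- 8257
sim <- data.frame(
  x1 = rbinom(n, 1, 0.5),
  x2 = rnorm(n),
  x3 = rnorm(n),
  x4 = rnorm(n),
  x5 = rnorm(n)
)

# Binary response generated from a logistic model (arbitrary coefficients).
eta <- -0.5 + 0.8 * sim$x1 + 0.3 * sim$x2 - 0.2 * sim$x3
sim$y <- rbinom(n, 1, plogis(eta))

# Fit with a formula and a data argument, and time the call.
timing <- system.time(
  fit <- glm(y ~ x1 + x2 + x3 + x4 + x5, data = sim,
             family = binomial(link = "logit"))
)
print(timing)
print(coef(summary(fit)))
```

If a fit like this finishes quickly while the fit on Base does not, the
difference most likely lies in how the columns of Base were read, which is
what summary(Base) would reveal.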

— 
David




[R] Logistic regression for large data

2022-11-11 Thread George Brida
Dear R users,

I have a data set called Base.csv (attached to this email) with 13 columns
and 8257 rows, whose first 8 columns are dummy variables that take the
value 1 or 0. The problem is that when I run the following instructions to
fit a logistic regression, R runs for hours and hours without producing
any output:

Base=read.csv("C:\\Users\\HP\\Desktop\\New\\Base.csv",header=FALSE,sep=";")
fit_1=glm(Base[,2]~Base[,1]+Base[,10]+Base[,11]+Base[,12]+Base[,13],family=binomial(link="logit"))

Apparently, there is not enough memory to produce the requested output. Is
there another function for logistic regression that handles large data and
returns output in reasonable time?

Many thanks

Kind regards

George