I sympathize with your trouble bringing in data, but you need to catch your breath and figure out what you really have. I think when you get a bit more R practice, you will be able to manage what you bring in without going back to that editor so much.
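
One quick way to see what you really have is to ask R for the type of every column right after the import. This is only a sketch: the data frame below is a made-up stand-in for your imported file (the name myData and the column names come from your message; the values are invented).

```r
# Hypothetical stand-in for the imported data; the real values are not
# shown in this thread, so these are invented for illustration.
myData <- data.frame(
  statusrec = c("case", "control", "case", "control"),
  mma       = c("1.2", "0.8", "2.5", "1.9"),
  stringsAsFactors = FALSE
)

str(myData)                           # one line per column, with its type
col_classes <- sapply(myData, class)  # class of each column, by name
col_classes
is.factor(myData$statusrec)           # is the response really a factor?
```

If statusrec is reported as "character" rather than "factor", that would be consistent with the error you are seeing.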

I feel certain your data is not what you think it is. Here's an example where a factor DOES work on the left-hand side of a glm formula:

> y <- factor(c("S","N","S","N","S","N","S","N"))
> x <- rnorm(8)
> glm(y~x,family=binomial(link=logit))

Look here: the system knows y is a factor:
> attributes(y)
$levels
[1] "N" "S"

$class
[1] "factor"

My guess is that your variables are not really factors, but rather character vectors. You have to convert them into factors.
Watch: with a character vector, I get the same error that you got.


> y <- c("S","N","S","N","S","N","S","N")
> glm(y~x,family=binomial(link=logit))
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
invalid variable type


Note the system doesn't know y is "supposed" to be a factor. It just sees characters.

> y
[1] "S" "N" "S" "N" "S" "N" "S" "N"
> levels(y)
NULL
> attributes(y)
NULL

But look what happens when y is converted to a factor inside the formula:
> glm(as.factor(y)~x,family=binomial(link=logit))
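
The same idea can be applied to the columns of a data frame before fitting. What follows is a rough sketch under stated assumptions, not a fix tested on your file: the data frame dat and its values are invented, and treating the stray "0" as a missing value is my assumption about what you want (you may instead know the right status for that participant).

```r
# Hypothetical data; column names follow the thread, values are invented.
dat <- data.frame(
  status = c("case", "control", "0", "case", "control", "case", "control", "control"),
  mma    = c("1.2", "0.8", "2.5", "1.9", "0.4", "3.1", "0.7", "2.2"),
  stringsAsFactors = FALSE
)

# Assumption: the mistyped "0" is treated as missing.
dat$status[dat$status == "0"] <- NA
# Explicit levels: with a binomial family the first level counts as
# "failure", so this models the probability of "case".
dat$status <- factor(dat$status, levels = c("control", "case"))

# Going through as.character() converts the labels; calling as.numeric()
# directly on a factor would return its internal integer codes instead.
dat$mma <- as.numeric(as.character(dat$mma))

# Rows with a missing status are dropped by the default na.action.
fit <- glm(status ~ mma, family = binomial(link = logit), data = dat)
```

After this, levels(dat$status) should show only "control" and "case", and is.numeric(dat$mma) should be TRUE.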



[EMAIL PROTECTED] wrote:

Hi All:
I came across the following problem while working with a dataset, and wondered whether I could find a solution here.


My dataset consists of information on 402 individuals with the following five variables (age, sex, status = a binary variable with levels "case" or "control", mma, dma).
During a data check, I found that in the raw data, the data entry operator had mistakenly put a "0" for one participant, so now the levels show


levels(status)

[1] "0" "control" "case"
The variables mma, and dma are actually numerical variables but in the dataframe, they are represented as "characters". I tried to change the type of the variables (from character to numeric) using the edit function (and bringing up the data grid where then I made changes), but the changes were not saved. I tried
mma1 <- as.numeric(mma)
but I was not successful in converting mma from a character variable to a numeric variable.
So, to edit and "clean" the data, I exported the dataset as a text file to Epi Info 2002 (version 2, Windows). I used the following code:
mysubset <- subset(workingdat, select = c(age,sex,status, mma, dma))
write.table(mysubset, file="mysubset.txt", sep="\t", col.names=NA)
After I made changes in the variables using Epi Info (I created a new variable called "statusrec" containing values "case" and "control"), I exported the file as a ".rec" file (filename "mydata.rec"). I used the following code to read the file in R:
require(foreign)
myData <- read.epiinfo("mydata.rec", read.deleted=NA)
Now, the problem is this: when I try to run a logistic regression, R returns the following error message:


glm(statusrec~mma, family=binomial(link=logit))

Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
invalid variable type


I cannot figure out the solution. I want to run a logistic regression now with the variable statusrec (which is a binary variable containing values "case" and "control"), and another variable (say mma, which is now a numeric variable). What does the above error message mean and what could be a possible solution?
Would greatly appreciate your insights and wisdom.
-Arin Basu


______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help



--
Paul E. Johnson                       email: [EMAIL PROTECTED]
Dept. of Political Science            http://lark.cc.ukans.edu/~pauljohn
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66045                FAX: (785) 864-5700
