Dear all,

I will try to be as clear as I can.
Let's say I have a dataset with 10 variables. Four of them represent a
phenomenon that I call Y; the other six represent another phenomenon that
I call X.

Each of the 10 variables has 37 observations, one per respondent to my
survey. Since all the questions use a Likert scale, they are qualitative
(ordinal) variables. The scale runs from 0 to 7 for all of them, but the
values "-1" and "-2" mark missing answers, so the values in the data
actually range from -2 to 7.
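
A minimal sketch of one way to handle the "-1" and "-2" codes, assuming
they should simply be treated as missing: recode them to NA so that R's
own missing-value handling applies. The data frame Y_demo and its columns
q1 and q2 are made up for illustration and are not the survey data.

```r
# Hypothetical toy data: 3 respondents, 2 Likert items (not the real survey)
Y_demo <- data.frame(q1 = c(5, -1, 0), q2 = c(7, 3, -2))

# Treat the missing-answer codes -1 and -2 as NA; a valid 0 is kept
Y_demo[Y_demo < 0] <- NA

print(Y_demo)
print(colSums(is.na(Y_demo)))  # number of missing answers per item
```

After this step, functions such as mean() or rowMeans() can skip the
missing answers through their na.rm argument.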

What I want to do is regress my Y (4 variables, with 37 answers each) on
my X (6 variables, with the same 37 respondents). I know that for
qualitative data I should use ANOVA rather than regression, although I
have read somewhere that regression is also possible.

So far I have tried the following:
__________________________________________________________________________________________________________
> # calculate the average per row (respondent), ignoring the negative values
> apply(Y, 1, function(Y) mean(Y[Y>0]))

> # create the vector Y.reg, so that it is 1 variable with 37 numbers
> Y.reg<- c(apply(Y, 1, function(Y) mean(Y[Y>0])))

> apply(X, 1, function(X) mean(X[X>0]))

> # create the vector X.reg, so that it is 1 variable with 37 numbers
> X.reg<- c(apply(X, 1, function(X) mean(X[X>0])))

> reg1<- lm(Y.reg~ X.reg) #make the first regression
> summary(reg1) #see the results

Call:
lm(formula = Y.reg ~ X.reg)

Residuals:
     Min       1Q   Median       3Q      Max
-2.26183 -0.49434 -0.02658  0.37260  2.08899

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.2577     0.4986   8.539 4.46e-10 ***
X.reg         0.1008     0.1282   0.786    0.437
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7827 on 35 degrees of freedom
Multiple R-squared: 0.01736,    Adjusted R-squared: -0.01072
F-statistic: 0.6182 on 1 and 35 DF,  p-value: 0.437

> layout(matrix(1:4,2,2)) #graphical approach
> plot(reg1)

Please see the attached plots (saved with the pdf() function).
________________________________________________________________________________________________________
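
A small sketch of the row-averaging step above, on made-up numbers, to
show how the filter Y>0 behaves. It assumes that 0 is a valid answer on
the 0 to 7 scale; under that assumption Y>0 drops zeros as well as the
missing codes, and a respondent with no positive answers gets NaN. The
matrix Y_demo is hypothetical.

```r
# Hypothetical toy matrix: 3 respondents, 3 items; -1/-2 mean "missing"
Y_demo <- matrix(c( 4, 0, -1,
                    6, 2,  7,
                   -2, 0,  0), nrow = 3, byrow = TRUE)

# Approach used above: keeps only strictly positive answers
mean_pos <- apply(Y_demo, 1, function(r) mean(r[r > 0]))

# Alternative: recode negatives to NA, then average the remaining answers
Y_na    <- replace(Y_demo, Y_demo < 0, NA)
mean_na <- rowMeans(Y_na, na.rm = TRUE)

print(mean_pos)  # rows 1 and 3 differ from mean_na
print(mean_na)
```

Whether zeros should count toward the average depends on what 0 means in
the questionnaire, so this is only an illustration of the difference.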

But as you can see, even though I collapse Y and X into single averaged
variables (instead of using the original 4 and 6 variables) and ignore
the negative values, I get a very low R^2.
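
As a quick arithmetic check, the figures in the summary output can be
reproduced from each other with the standard identities for a regression
with one predictor. The numbers below are copied from the output above;
the variable names are only illustrative.

```r
# Values copied from the summary(reg1) output
t_slope <- 0.786    # t value of X.reg
f_stat  <- 0.6182   # F-statistic
df_res  <- 35       # residual degrees of freedom
n       <- 37       # respondents

# With one predictor, F equals the squared t value of the slope
print(t_slope^2)

# R^2 = F / (F + residual df)
r2 <- f_stat / (f_stat + df_res)
print(r2)

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - 2)
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_res
print(adj_r2)

# Absolute correlation between Y.reg and X.reg
print(sqrt(r2))
```

So the R^2 of 0.01736 corresponds to a correlation of roughly 0.13
between the two averaged scores, and the negative adjusted R^2 follows
directly from the small R^2 and n = 37.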

If I try ANOVA instead, I run into this problem:
________________________________________________________________________________________________________
> Ymatrix<- as.matrix(Y)
> Xmatrix<- as.matrix(X)
> # here both Y and X are in their original form, i.e. made up of several
> # variables (4 and 6) and still containing the negative values

Error in UseMethod("anova") :
  no applicable method for 'anova' applied to an object of class "c('matrix', 'integer', 'numeric')"
________________________________________________________________________________________________________
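
The error message indicates that anova() was given a plain matrix. A
minimal sketch of the usual pattern, on made-up data: anova() is a
generic that expects a fitted model object (for example from lm()), not
the raw data. The data frame demo and its columns x and y are
hypothetical and stand in for the averaged scores.

```r
set.seed(1)  # made-up data, only so that the example runs
demo <- data.frame(x = rnorm(37), y = rnorm(37))

fit <- lm(y ~ x, data = demo)  # fit a model first
tab <- anova(fit)              # anova() dispatches on the "lm" object
print(tab)

# A bare matrix has no anova() method, which reproduces the error above
res <- try(anova(as.matrix(demo)), silent = TRUE)
print(inherits(res, "try-error"))
```

With the objects from this post, the analogous call would be anova(reg1),
assuming reg1 is the fitted lm object shown earlier.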

To be honest, a few days ago I managed to run anova, but unfortunately I
do not remember how, and I did not save the command anywhere.

What I would like to know is:

- First of all, is my approach to the problem wrong?
- What do you think of the regression output?
- Finally, how can I run the ANOVA, if I need it at all?

I really hope I have been clear. Thank you all for any kind of help.

Best,

Andrea

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.