Hi all,

Sorry this is a bit long, but the explanation of what I want to do needs to be clear to avoid the problem Karl Popper described: "It is impossible to speak in such a way that you cannot be misunderstood."
I am running linear regression models, but I am getting unexpected results, and I wonder what else I might try in order to derive estimated values of bat echolocation parameters from forearm measurements. It is known that the size of a bat is negatively related to the characteristic frequency (Fc) of its echolocation calls (this comes from decades of my field work). So in general larger guys have lower frequency calls and smaller guys have higher frequency calls.

I have valid ranges for the known species: FA (forearm measurements), Fc (minimum) and Fc (maximum). I have run regressions on the valid FA measurements and the known, valid Fc ranges for a dozen species or so, and I am using the lm models to "predict" Fc values for a few species that have FA values but have not yet been recorded, so there are no valid echolocation call parameters for them. The R code used is below the discussion. I do two separate runs with the data using lm, one with Fcmin ~ FA and one with Fcmax ~ FA. The goal is to provide predicted (estimated) values for the species with known FA values but without verified Fc ranges.

My concern is that the predicted values returned are much lower than the true values for the verified species. Therefore I am not confident that the predicted values for the species without verified Fc ranges are useful. One very helpful person looked at a simple data set I sent and showed that the differences between the true and predicted values were not statistically significant. However, Krebs' admonishment to students eons ago, "Do not confuse statistical significance with ecological significance", is true here. The predicted ranges are far lower than reality, so the predictions for the few species that do not have field-recorded Fc values are suspect. These differences between the predicted values and the true range will make a difference when it comes to potentially IDing the unknown calls.
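To make the two-run setup concrete, here is a minimal sketch, assuming the seven rows of the table at the end of this post as the data. The object names (`bats`, `fit_min`, `fit_max`, `unknown`, `result`) are illustrative and are not the contents of Bats.robj. It fits Fcmin and Fcmax separately on numeric FA and asks `predict()` for 95% prediction intervals for the two unrecorded species, so the uncertainty can be weighed against the 5 to 10 kHz differences that matter for IDing calls.

```r
# Values copied from the table at the end of this post.
# NA means the species has a forearm measurement but no verified Fc range yet.
bats <- data.frame(
  Species = c("Myoalb", "Myoata", "Myokea", "Myonig", "Myooxy", "Myorip", "Myosim"),
  FA      = c(35.3, 37, 33.7, 34.5, 40.5, 36, 38),
  Fcmin   = c(45.7, NA, 57.8, 51.6, 45.7, 53.3, NA),
  Fcmax   = c(48.7, NA, 61.3, 55.7, 47.6, 57.5, NA)
)

# Fc is the response, FA the numeric predictor.
# lm() drops rows with NA in the response under the default na.action.
fit_min <- lm(Fcmin ~ FA, data = bats)
fit_max <- lm(Fcmax ~ FA, data = bats)

# Species to be predicted: FA known, Fc not recorded.
unknown <- bats[is.na(bats$Fcmin), c("Species", "FA")]

# interval = "prediction" returns columns fit, lwr, upr for a new observation.
pred_min <- predict(fit_min, newdata = unknown, interval = "prediction", level = 0.95)
pred_max <- predict(fit_max, newdata = unknown, interval = "prediction", level = 0.95)

result <- data.frame(
  Species   = unknown$Species,
  FA        = unknown$FA,
  Fcmin_fit = round(unname(pred_min[, "fit"]), 2),
  Fcmin_lwr = round(unname(pred_min[, "lwr"]), 2),
  Fcmin_upr = round(unname(pred_min[, "upr"]), 2),
  Fcmax_fit = round(unname(pred_max[, "fit"]), 2),
  Fcmax_lwr = round(unname(pred_max[, "lwr"]), 2),
  Fcmax_upr = round(unname(pred_max[, "upr"]), 2)
)
print(result)
```

With only five recorded species in this sample the fit has three residual degrees of freedom, so the intervals are likely to be wide. If they span more than the 5 to 10 kHz that separates species, the point predictions alone probably should not be used to assign calls.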
A difference of 10 kHz in Fc generally suggests a different species, although some species are much closer and may differ by only 5 kHz. I am looking at acoustic data sets of calls from South America, and there are many "sonospecies." These are clearly separate species based on echolocation call parameters, but they have not yet had "faces & voices" matched. We know that call parameters are diagnostic for families and genera even when the species is unknown. It is then the Fc values that assist in identifying the species within a cluster of calls from the same genus.

Sample of R code used:

    > Bats <- dget('C:/=Bat data working/Acoustic Parameters/_Working/=Vespertilionidae/Bats.robj')
    > model.lm <- lm(formula=Fc ~ as.factor(FA),data=Bats,na.action=na.omit)
    > Anova(model.lm,type='II')
    Error in solve.default(L %*% V %*% t(L)) :
      system is computationally singular: reciprocal condition number = 0
    > summary(model.lm)

    Call:
    lm(formula = Fc ~ as.factor(FA), data = Bats, na.action = na.omit)

    Residuals:
    ALL 5 residuals are 0: no residual degrees of freedom!

    Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
    (Intercept)           53.3         NA      NA       NA
    as.factor(FA)34.4     -4.6         NA      NA       NA
    as.factor(FA)35.4      2.3         NA      NA       NA
    as.factor(FA)35.5      9.0         NA      NA       NA
    as.factor(FA)40.5     -7.3         NA      NA       NA

    Residual standard error: NaN on 0 degrees of freedom
      (2 observations deleted due to missingness)
    Multiple R-squared:      1,  Adjusted R-squared:    NaN
    F-statistic:   NaN on 4 and 0 DF,  p-value: NA

    > tmp<-predict(model.lm)
    > Bats[names(tmp),"predicted"]<-tmp
    > rm('tmp')
    > rm('model.lm')
    >
    > model.lm <- lm(formula= Fc~ FA,data=Bats,na.action=na.omit)
    > Anova(model.lm,type='II')
    > summary(model.lm)
    > tmp<-predict(model.lm,Bats)
    > Bats[names(tmp),"Predicted.Fc"]<-tmp
    > rm('tmp')
    > rm('model.lm')

In the results below, the predicted Fc values (right-hand columns) are not close to the true Fc values (left-hand columns), which makes me hesitant to accept the predictions for the two species whose true Fc values are NA. FYI, species are given as simple six-letter codes for genus and species.
Species  FA    Fcmin  Fcmax  FcMinpredic  FcMaxpredic
Myoalb   35.3  45.7   48.7   51.73        55.26
Myoata   37    NA     NA     49.52        52.59
Myokea   33.7  57.8   61.3   53.80        57.77
Myonig   34.5  51.6   55.7   52.77        56.52
Myooxy   40.5  45.7   47.6   44.98        47.09
Myorip   36    53.3   57.5   50.82        54.16
Myosim   38    NA     NA     48.23        51.02

Perhaps simple linear regression is not the method to use? Thanks for any additional suggestions.

Bruce

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
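A note on the first `summary()` output in the transcript above, offered as a probable reading rather than a diagnosis of Bats.robj: wrapping FA in `as.factor()` gives every distinct forearm value its own coefficient, so five observations are fitted with five parameters and nothing is left to estimate error. The sketch below reuses the five recorded FA and Fcmin values from the table as illustrative data (the names `bats5`, `fit_factor`, `fit_numeric` and `new_fa` are hypothetical) to show the difference between the factor and numeric fits.

```r
# Five recorded species from the table (FA and Fcmin); illustrative only.
bats5 <- data.frame(
  FA = c(35.3, 33.7, 34.5, 40.5, 36),
  Fc = c(45.7, 57.8, 51.6, 45.7, 53.3)
)

# Factor coding: one level per distinct FA value, i.e. a saturated model.
fit_factor <- lm(Fc ~ as.factor(FA), data = bats5)

# Numeric coding: a single slope for FA, the usual simple regression.
fit_numeric <- lm(Fc ~ FA, data = bats5)

# Residual degrees of freedom: none for the factor fit, n - 2 for the numeric fit.
print(df.residual(fit_factor))
print(df.residual(fit_numeric))

# The numeric fit has an intercept and one slope; the slope should be
# negative if larger forearms go with lower frequencies.
print(coef(fit_numeric))

# New forearm values that were not in the fitting data.
new_fa <- data.frame(FA = c(37, 38))

# The numeric model can interpolate to unseen FA values.
print(predict(fit_numeric, newdata = new_fa))

# The factor model has no coefficient for an unseen FA level, so predict()
# is expected to stop with an error; capture its message instead of halting.
res <- tryCatch(
  predict(fit_factor, newdata = new_fa),
  error = function(e) conditionMessage(e)
)
print(res)
```

The factor fit reproduces each observed value exactly, which is why R-squared is 1 and `Anova()` fails on a singular system. Only the numeric fit can say anything about a species whose FA was not in the data.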