On Tue, Mar 13, 2012 at 9:39 AM, Davy <davykavan...@gmail.com> wrote:
> Thanks for the reply. Sorry to be a pain, but could perhaps explain what you
> mean by
> "you can center each SNP variable at its mean to make the interaction
> term uncorrelated with the main effects".

Suppose rs1 and rs2 are your SNPs:

  rs1cent <- rs1 - mean(rs1)
  rs2cent <- rs2 - mean(rs2)
  rs12interaction <- rs1cent * rs2cent

Now you can approximately test for interaction by testing for correlation
between phenotype and rs12interaction. The approximation isn't good enough
to be relied on for final results, but it is good enough to screen out, say,
the bottom 99% of the models in settings where there is not strong linkage
disequilibrium (correlation between SNPs).

The advantage of this is not just the lack of glms, but the fact that
rs12interaction can be computed for a lot of pairs at once, allowing
efficient vectorized code. Perhaps even for all pairs at once, if you have
enough memory.

> Also, I have never heard of a scores test before but some googling has
> turned up the Lagrange multiplier test. Is this the one you mentioned.

I meant the efficient score test, also called the Rao score test. (The
Lagrange multiplier test is the econometric name for the same test.) It's
based on fitting the model without interaction and testing whether the
efficient score, the derivative of the loglikelihood with respect to the
interaction coefficient, is zero at the null model. This doesn't require
fitting the interaction model, which is why it saves time.

Getting large-scale SNP association tests to run fast does require some
reasonable familiarity with what is actually going on in the internals of
the tests. Or, as many people eventually decide is easier, brute force
computing power.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland
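
A rough sketch of the two ideas above, for anyone who wants to try them. It is not code from the original thread. It assumes genotypes coded 0/1/2, a binary phenotype with a logistic model, and simulated data with independent SNPs; the names G, y, screen and score_test are made up for illustration. The score statistic is written out by hand from the null fit, using the usual form for a GLM with canonical link.

```r
# Sketch only: simulated data, hypothetical names, logistic model assumed.
set.seed(1)
n <- 500   # individuals
p <- 20    # SNPs

# Genotypes coded as minor-allele counts 0/1/2, SNPs independent (no LD)
G <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n,
            dimnames = list(NULL, paste0("rs", seq_len(p))))

# Binary phenotype with a simulated rs1 x rs2 interaction
eta <- -1 + 0.2 * G[, 1] + 0.2 * G[, 2] + 0.8 * G[, 1] * G[, 2]
y <- rbinom(n, size = 1, prob = plogis(eta))

# Step 1: centre each SNP at its mean
Gc <- scale(G, center = TRUE, scale = FALSE)

# Step 2: products of centred SNPs for all pairs at once
idx <- combn(p, 2)                          # 2 x choose(p, 2) matrix of column indices
inter <- Gc[, idx[1, ]] * Gc[, idx[2, ]]    # n x choose(p, 2) matrix

# Step 3: screen by correlation between phenotype and each product
r <- drop(cor(inter, y))
screen <- data.frame(snp1 = colnames(G)[idx[1, ]],
                     snp2 = colnames(G)[idx[2, ]],
                     r = r,
                     stringsAsFactors = FALSE)
screen <- screen[order(-abs(screen$r)), ]

# Step 4: Rao score test for one pair, using only the no-interaction fit
score_test <- function(y, x1, x2) {
  fit0 <- glm(y ~ x1 + x2, family = binomial)   # null model: main effects only
  mu <- fitted(fit0)
  w <- mu * (1 - mu)                            # binomial variance weights
  X <- model.matrix(fit0)
  z <- x1 * x2                                  # interaction column under test
  U <- sum(z * (y - mu))                        # score for the interaction coefficient
  XtWz <- crossprod(X, w * z)
  # Variance of the score after adjusting for the main effects
  V <- sum(w * z^2) - drop(crossprod(XtWz, solve(crossprod(X, w * X), XtWz)))
  stat <- U^2 / V
  c(statistic = stat, p.value = pchisq(stat, df = 1, lower.tail = FALSE))
}

# Follow up the top-ranked pair from the screen
top <- screen[1, ]
st <- score_test(y, G[, top$snp1], G[, top$snp2])
print(head(screen))
print(st)
```

With real data the pairs that survive the screen would still need a properly fitted interaction model before anything is reported, and for large numbers of SNPs the inter matrix would have to be built in blocks to fit in memory.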