On 17 Nov 2002 20:00:23 -0500, Elliot Cramer <[EMAIL PROTECTED]> wrote:
> In sci.stat.edu Radford Neal <[EMAIL PROTECTED]> wrote:
>
>
> : You don't know what you are talking about.  There are many, many
> : situations in which data is analysed when there are more variables
> : than observations.
>
> but if you know anything about statistics, you don't analyze them as
> variables but condense them based on your knowledge to many fewer
> variables than observations
>
>
> : The absurdity of saying you can't do anything with more variables than
> : observations is well illustrated by the case of spectroscopic data,
> : where the number of variables is just the number of frequencies (or
> : that you have to throw away the extra data from the better instrument
> : before analysing it.
> see above
>
> : PCA isn't necessarily the best way of analysing such data, but it
> : isn't senseless.
>
> It's senseless

When I saw a PCA on power-spectral data, the first components
were - neatly - the overall power, the frequency (linear trend),
the quadratic, and so on.  The result wasn't senseless.  Maybe it
was best to look at it as confirmation, or as a source of
coefficients.  In fact, I still wonder how much use it would have
been, if the "sense" had not been obvious.

For the same data, (I'm not sure, but) I think it would be a
mistake to use *all* the components if you are comparing to new
data.  The fit that was achieved was necessarily, arbitrarily
perfect.

On the other hand, for the data from genetic micro-arrays, and
other bio-assays, I have been assuming that PCA would give little
help.  I guess, when I wonder some more, I can accept the
possibility, if the samples are big enough.  But I think they are
stuck with a lot of separate assays.  Also, p-levels of
statistical tests are misleading when the observed proportions
have a huge range:  The experiment has practically no test-power
for a gene that is seldom seen.  I have figured that they do a
lot of tabulation of "perfect-but-rare-prediction" in order to
get candidates.
Eventually, with tons of data in hand, they will have to do a
heck-of-a-lot of Bonferroni correction.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
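
A note on the "necessarily, arbitrarily perfect" fit mentioned
above.  With n observations and more than n variables, the centered
data matrix has rank at most n - 1, so keeping every non-zero
component reproduces the sample exactly, whatever the data are.  The
sketch below is only an illustration under stated assumptions: the
data are simulated Gaussian noise (not spectra), the sizes 10 x 200
are arbitrary, and pca_svd is a hypothetical helper name.

```python
import numpy as np


def pca_svd(X):
    """PCA by SVD of the column-centered data (hypothetical helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Economy-size SVD: rows of Vt are the component directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s  # coordinates of each observation on the components
    return mean, scores, Vt, s


rng = np.random.default_rng(0)
n_obs, n_vars = 10, 200                # assumed sizes: more variables than cases
X = rng.normal(size=(n_obs, n_vars))   # pure noise, no structure at all

mean, scores, components, s = pca_svd(X)

# Count components whose singular value is not numerically zero.
tol = s.max() * max(X.shape) * np.finfo(float).eps
n_nonzero = int((s > tol).sum())


def reconstruct(k):
    """Rebuild the original sample from the first k components."""
    return mean + scores[:, :k] @ components[:k]


err_all = np.abs(X - reconstruct(n_nonzero)).max()  # all non-zero components
err_few = np.abs(X - reconstruct(3)).max()          # only the first three

# New cases from the same noise source, projected on the old components.
X_new = rng.normal(size=(5, n_vars))
scores_new = (X_new - mean) @ components[:n_nonzero].T
err_new = np.abs(X_new - (mean + scores_new @ components[:n_nonzero])).max()

print("non-zero components:", n_nonzero)
print("max error, all components, original data:", err_all)
print("max error, 3 components, original data:", err_few)
print("max error, all components, new data:", err_new)
```

The original sample is reproduced to rounding error even though it
is noise, while the new cases are not.  That is the sense in which
the perfect fit carries no information by itself.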
