You might consider a form of PLS (partial least squares) - your measurements
may be highly correlated, and only a very few can do you any good.  You have a
great many output vars, and few enough inputs.
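
To make the suggestion concrete, here is a minimal sketch of single-response
PLS (PLS1, NIPALS style) using only NumPy. Everything in it is an assumption
for illustration: the function name pls1_fit is hypothetical, the data are
synthetic stand-ins for your extracted variables, and the response y is taken
to be a single numeric score (a genre label would first need numeric coding).
It is not a definitive implementation, just a way to see how PLS pulls a few
useful directions out of many correlated measurements.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 (one response) via NIPALS. Returns coefficients, intercept, weights."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                         # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                            # scores for this component
        tt = t @ t
        p = Xr.T @ t / tt                     # X loadings
        c = yr @ t / tt                       # y loading
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - c * t                       # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # coefficients in original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept, W

# Synthetic example: 40 "files", 6 variables driven by 2 hidden factors.
rng = np.random.default_rng(0)
n = 40
t1, t2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [t1 + 0.05 * rng.normal(size=n) for _ in range(3)]    # three near-copies of t1
    + [t2 + 0.05 * rng.normal(size=n) for _ in range(3)]  # three near-copies of t2
)
y = 2.0 * t1 + 0.1 * rng.normal(size=n)       # response depends on t1 only

coef, intercept, W = pls1_fit(X, y, n_components=1)
y_hat = X @ coef + intercept
print("correlation of fit with y:", np.corrcoef(y, y_hat)[0, 1])
print("weights of first component:", np.round(W[:, 0], 2))
```

The weights show which of the correlated variables carry the information
about y; variables with weights near zero are candidates for dropping.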

Jay

Rishabh Gupta wrote:

> Hi All,
>     I'm a research student at the Department Of Electronics, University Of
> York, UK. I'm working on a project related to music analysis and
> classification. I am at the stage where I perform some analysis on music
> files (currently only in MIDI format) and extract about 500 variables that
> are related to music properties like pitch, rhythm, polyphony and volume. I
> am performing basic analysis like mean and standard deviation but then I
> also perform more elaborate analysis like measuring complexity of melody and
> rhythm.
>
> The aim is that the variables obtained can be used to perform a number of
> different operations.
>     - The variables can be used to classify / categorise each piece of
> music, on its own, in terms of some meta classifier (e.g. rock, pop,
> classical).
>     - The variables can be used to perform comparison between two files. A
> variable from one music file can be compared to the equivalent variable in
> the other music file. By comparing all the variables in one file with the
> equivalent variable in the other file, an overall similarity measurement can
> be obtained.
>
> The next stage is to test the ability of the variables obtained to
> perform the classification / comparison. I need to identify variables that
> are redundant (redundant in the sense of 'they do not provide any
> information' and 'they provide the same information as the other variable')
> so that they can be removed and I need to identify variables that are
> distinguishing (provide the most amount of information).
>
> My Basic Questions Are:
>     - What are the best statistical techniques / methods that should be
> applied here? E.g. I have looked at Principal Component Analysis; this would
> be a good method to remove the redundant variables and hence reduce the
> amount of data that needs to be processed. Can anyone suggest any other
> sensible statistical analysis methods?
>     - What are the ideal tools / software to perform the clustering /
> classification. I have access to SPSS software but I have never used it
> before and am not really sure how to apply it or whether it is any good when
> dealing with 100s of variables.
>
> So far I have been analysing each variable on its own 'by eye' by plotting
> the mean and sd for all music files. However this approach is not feasible
> in the long term since I am dealing with such a large number of variables.
> In addition, by looking at each variable on its own, I do not find clusters
> / patterns that are only visible through multivariate analysis. If anyone
> can recommend a better approach I would greatly appreciate it.
>
> Any help or suggestion that can be offered will be greatly appreciated.
>
> Many Thanks!
>
> Rishabh Gupta
>
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at
>                   http://jse.stat.ncsu.edu/
> =================================================================

--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?





