Re: [Matplotlib-users] matplotlib.mlab PCA analysis
On Wed, Feb 11, 2009 at 8:00 PM, Marjolaine Rouault mroua...@csir.co.za wrote:

> Hi, Thanks a lot for your comments. I did try earlier on to remove the bad points but came across some problems when re-ordering my array. I will try out the method sent to me and check the reference.
> Regards, Marjolaine.

Yep, the compacting/reordering method is appropriate for fixed missing values (typically a grid mask) but not appropriate for randomly placed missing values.

I didn't read these references, but a simple approach you can implement in Python (using, for example, numpy.linalg.eig applied to your covariance matrix to compute your EOFs, and then your PCs) consists of an iterative method that fills your missing values and lets your EOFs/PCs converge:

0) First, save your mask once: mask = data.mask.copy()
1) Fill the missing values with the mean: data = data.filled(data.mean())
2) After your PCA, make a reconstruction of your data using a limited number of modes (let's say 10), EOF[1:10] . PC[1:10], and call it datarec
3) Replace your original missing data with your reconstructed field: data = numpy.where(mask, datarec, data)
4) Repeat from the PCA in 2) (the mean fill in 1) is only needed on the first pass), either a fixed number of times or until a criterion based, for example, on the eigenvalues is met
5) Finally, you can use your EOFs and PCs and, as a bonus, you have filled your data!

--
Stephane Raynaud
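A minimal sketch of the iterative filling outlined in steps 0) to 5), for anyone who wants something to start from. It is only an illustration under assumptions: the data are taken to be a 2-D masked array of shape (nt, ns), the helper names pca_eig and iterative_fill are hypothetical, numpy.linalg.eigh is used in place of numpy.linalg.eig because the covariance matrix is symmetric, and the stopping test (relative change of the leading eigenvalues) is just one possible reading of "a criterion based on the eigenvalues".

```python
import numpy as np


def pca_eig(data, nmodes):
    """PCA of data (nt, ns) from the eigenvectors of its covariance matrix."""
    anom = data - data.mean(axis=0)                 # remove the time mean
    cov = anom.T @ anom / (anom.shape[0] - 1)       # (ns, ns) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:nmodes]      # keep the leading modes
    eofs = eigvecs[:, order].T                      # (nmodes, ns)
    pcs = anom @ eofs.T                             # (nt, nmodes)
    return eofs, pcs, eigvals[order]


def iterative_fill(masked, nmodes=10, niter=50, rtol=1e-6):
    """Fill masked values by repeated truncated-PCA reconstruction."""
    masked = np.ma.asarray(masked, dtype=float)
    mask = np.ma.getmaskarray(masked).copy()        # step 0: save the mask
    data = masked.filled(masked.mean())             # step 1: fill with the mean
    prev = None
    for _ in range(niter):
        eofs, pcs, eigvals = pca_eig(data, nmodes)
        datarec = data.mean(axis=0) + pcs @ eofs    # step 2: reconstruction
        data = np.where(mask, datarec, data)        # step 3: update the holes
        # step 4 (assumed criterion): stop once the eigenvalues settle
        if prev is not None and np.allclose(eigvals, prev, rtol=rtol, atol=0.0):
            break
        prev = eigvals
    return data, eofs, pcs                          # step 5: filled data, EOFs, PCs
```

The values at unmasked positions are never modified, so the result can be compared against the original array directly. The number of retained modes matters: keeping too many modes tends to reproduce the initial fill rather than improve it.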
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
Re: [Matplotlib-users] matplotlib.mlab PCA analysis
Hi,

Thanks a lot for your comments. I did try earlier on to remove the bad points but came across some problems when re-ordering my array. I will try out the method sent to me and check the reference.

Regards,
Marjolaine.

kgdunn+nab...@gmail.com, 02/11/09 4:06 PM:

> Marjolaine,
>
> I am assuming your masked array entries are missing data. Multivariate analysis with missing data can be handled in several standard ways; however, these methods don't appear in most Python libraries. Here are some references on the topic that will help you:
>
> [1] P.R.C. Nelson and J.F. MacGregor, 1996, Missing data methods in PCA and PLS: Score calculations with incomplete observations, Chemometrics and Intelligent Laboratory Systems, v35, p45-65.
> [2] F. Arteaga and A. Ferrer, 2002, Dealing with missing data in MSPC: several methods, different interpretations, some examples, Journal of Chemometrics, v16, p408-418.
>
> Paper [1] deals with building a model with missing data, while paper [2] looks at applying an existing PCA model to new data that contains missing entries.
>
> Hope these help,
> Kevin
>
> Marjolaine Rouault wrote:
>> Hi, I am struggling to do a PCA analysis on a masked array. Does anybody have suggestions on how to deal with masked arrays when doing PCA? Best regards, Marjolaine.
>
> Quoted from: http://www.nabble.com/matplotlib.mlab-PCA-analysis-tp21932808p21932808.html
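As a rough illustration of the second kind of problem mentioned above (applying an existing PCA model to a new observation with missing entries), one simple option is a least-squares fit of the observed values on the matching rows of the loadings. This is a generic sketch and is not taken from either paper; the function name scores_with_missing and its arguments are hypothetical, and the loadings and mean are assumed to come from a model fitted elsewhere.

```python
import numpy as np


def scores_with_missing(x, loadings, mean):
    """Estimate PCA scores for one observation with masked entries.

    x        : masked array of shape (nvars,), masked where data are missing
    loadings : array of shape (nvars, ncomp) from an existing PCA model
    mean     : array of shape (nvars,), the mean used to centre that model
    """
    x = np.ma.asarray(x, dtype=float)
    obs = ~np.ma.getmaskarray(x)                 # True where a value is present
    resid = x.compressed() - mean[obs]           # centred observed values
    # Least-squares fit using only the loading rows of the observed variables
    t, *_ = np.linalg.lstsq(loadings[obs], resid, rcond=None)
    return t
```

The fit needs at least as many observed variables as components. With fewer, the problem is underdetermined and numpy.linalg.lstsq returns a minimum-norm solution that should not be trusted.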
[Matplotlib-users] matplotlib.mlab PCA analysis
Hi,

I am struggling to do a PCA analysis on a masked array. Does anybody have suggestions on how to deal with masked arrays when doing PCA?

Best regards,
Marjolaine.
Re: [Matplotlib-users] matplotlib.mlab PCA analysis
Hi Marjolaine,

On Tue, Feb 10, 2009 at 12:31 PM, Marjolaine Rouault mroua...@csir.co.za wrote:

> Hi, I am struggling to do a PCA analysis on a masked array. Does anybody have suggestions on how to deal with masked arrays when doing PCA?
> Best regards, Marjolaine.

You need to remove the missing values at each time step. This requires that your missing data are always at the same place. Maybe something like this can work:

    import numpy

    # Let's say we analyse myfullvar(nt,ny,nx), a masked array
    nt, ny, nx = myfullvar.shape
    mask = numpy.ma.getmaskarray(myfullvar[0])  # mask of the first time step
    ns = numpy.count_nonzero(~mask)             # number of valid points
    myvar = numpy.zeros((nt, ns))
    for it in range(nt):
        myvar[it] = myfullvar[it].compressed()

    # Then you make a PCA decomposition of myvar and you get back
    # your EOFs myeofs(neof,ns), which you put back on the full grid
    myfulleofs = numpy.ma.masked_all((neof, ny, nx))
    for ieof in range(neof):
        myfulleofs[ieof, ~mask] = myeofs[ieof]

--
Stephane Raynaud
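For completeness, here is a self-contained sketch of the compacting approach above, run on synthetic data so that it executes as is. The assumptions are mine, not from the thread: the grid sizes, the random field and the mask named land are made up, the mask is the same at every time step, and the PCA step uses numpy.linalg.svd on the anomalies (which gives the same EOFs as an eigen-decomposition of the covariance matrix, up to sign).

```python
import numpy as np

# Synthetic masked field (nt, ny, nx) with a fixed mask, e.g. land points
nt, ny, nx, neof = 30, 4, 5, 3
rng = np.random.default_rng(0)
field = rng.standard_normal((nt, ny, nx))
land = np.zeros((ny, nx), dtype=bool)
land[0, :2] = True
land[3, 4] = True
myfullvar = np.ma.masked_array(field, mask=np.tile(land, (nt, 1, 1)))

# Compact: keep only the valid points of each time step -> (nt, ns)
mask = np.ma.getmaskarray(myfullvar[0])
good = ~mask.ravel()
ns = np.count_nonzero(good)
myvar = np.ma.getdata(myfullvar).reshape(nt, ny * nx)[:, good]

# PCA on the compacted array via SVD of the anomalies
anom = myvar - myvar.mean(axis=0)
u, s, vt = np.linalg.svd(anom, full_matrices=False)
myeofs = vt[:neof]                    # (neof, ns) spatial patterns
mypcs = u[:, :neof] * s[:neof]        # (nt, neof) time series

# Expand: put the EOFs back on the full grid, masked where data were missing
myfulleofs = np.ma.masked_all((neof, ny * nx))
myfulleofs[:, good] = myeofs
myfulleofs = myfulleofs.reshape(neof, ny, nx)
```

Working on the flattened (ny * nx) axis avoids the explicit loops over time steps and EOFs, at the cost of one reshape at each end.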