Re: [Matplotlib-users] matplotlib.mlab PCA analysis

2009-02-12 Thread Stephane Raynaud
On Wed, Feb 11, 2009 at 8:00 PM, Marjolaine Rouault mroua...@csir.co.za wrote:

 Hi,

 Thanks a lot for your comments.  I did try earlier on to remove the bad
 points but came across some problems when re-ordering my array. I will try
 out the method sent to me and check the reference.


Yep, the compacting/reordering method is appropriate for fixed missing
values (typically a grid mask) but not appropriate for randomly placed
missing values.
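
One way to tell which of the two cases applies is to check whether the mask changes along the time axis. This is only a minimal sketch, assuming the data are a numpy masked array of shape (nt, ny, nx); the function name has_fixed_mask and the small test arrays are made up for illustration:

```python
import numpy as np

def has_fixed_mask(data):
    """Return True if the missing values sit at the same place at every time step."""
    mask = np.ma.getmaskarray(data)        # full boolean mask, even if nothing is masked
    return bool(np.all(mask == mask[0]))   # compare every time step with the first one

# Made-up example: 3 time steps on a 2x2 grid
values = np.arange(12, dtype=float).reshape(3, 2, 2)
land = np.array([[False, True], [False, False]])

# Same gap at every time step (typical grid mask)
fixed = np.ma.array(values, mask=np.broadcast_to(land, values.shape).copy())

# One extra gap at a single time step (randomly placed missing value)
gaps = np.broadcast_to(land, values.shape).copy()
gaps[1, 1, 1] = True
moving = np.ma.array(values, mask=gaps)
```

If the check returns True, the compacting method is applicable; otherwise a filling method is the safer choice.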

I didn't read these references, but a simple approach you can implement in
python (using for example numpy.linalg.eig applied to your covariance matrix
to compute your EOFs, and then your PCs) consists of an iterative method that
fills your missing values and lets your EOFs/PCs converge:
0) first save your mask once: mask = data.mask.copy()
1) fill the missing values with the mean: data = data.filled(data.mean())
2) after your PCA analysis, reconstruct your data from a limited number of
modes (let's say 10), i.e. EOF[1:10] . PC[1:10]: datarec
3) replace your original missing data with your reconstructed field: data =
numpy.where(mask, datarec, data)
4) restart from 2) a number of times that you can fix in advance or detect
with a criterion based, for example, on the eigenvalues
5) finally, you can use your EOFs and PCs and, as a bonus, your data are
filled!
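
The steps above can be sketched roughly as follows. This is a sketch under several assumptions, not a reference implementation: the data are taken to be a 2-D masked array of shape (time, space), numpy.linalg.eigh is used instead of numpy.linalg.eig because the covariance matrix is symmetric, and the loop stops when the filled values stop changing rather than on an eigenvalue criterion. The name eof_fill and its parameters are made up for illustration.

```python
import numpy as np

def eof_fill(data, nmodes=10, niter=50, tol=1e-6):
    """Iteratively fill the masked entries of a (time, space) masked array.

    Returns the filled array, the EOFs (nmodes, nspace) and the PCs (ntime, nmodes).
    """
    mask = np.ma.getmaskarray(data).copy()                   # 0) save the mask once
    filled = np.ma.filled(data.astype(float), data.mean())   # 1) first guess: the mean
    for _ in range(niter):
        mean = filled.mean(axis=0)
        anom = filled - mean                                 # anomalies about the time mean
        cov = anom.T @ anom / (anom.shape[0] - 1)            # space-space covariance matrix
        evals, evecs = np.linalg.eigh(cov)
        order = np.argsort(evals)[::-1][:nmodes]             # leading modes first
        eofs = evecs[:, order].T
        pcs = anom @ eofs.T
        datarec = pcs @ eofs + mean                          # 2) truncated reconstruction
        change = np.abs(datarec[mask] - filled[mask]).max() if mask.any() else 0.0
        filled = np.where(mask, datarec, filled)             # 3) update only the gaps
        if change < tol:                                     # 4) stop once the gaps settle
            break
    return filled, eofs, pcs

# Made-up test field: two modes, with about 10% of the points removed at random
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 60)
x = np.linspace(0, 1, 20)
truth = np.outer(np.sin(t), x) + np.outer(np.cos(2 * t), x ** 2)
gaps = rng.random(truth.shape) < 0.1
data = np.ma.array(truth, mask=gaps)

filled, eofs, pcs = eof_fill(data, nmodes=2, niter=200)
```

The number of retained modes matters: with too many modes the reconstruction simply reproduces the first guess in the gaps, so it is usually chosen well below the number of available modes.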





 Regards, Marjolaine.



  kgdunn+nab...@gmail.com 02/11/09 4:06 PM
 
 Marjolaine,

 I am assuming your masked array entries are missing data.  Multivariate
 analysis with missing data can be handled in several standard ways; however,
 these methods don't appear in most Python libraries.

 Here are some references on the topic that will help you:

 [1] P.R.C. Nelson and J.F. MacGregor, 1996, Missing data methods in PCA
 and PLS: Score calculations with incomplete observations, Chemometrics and
 Intelligent Laboratory Systems, v35, p 45-65.

 [2] F. Arteaga and A. Ferrer, 2002, Dealing with missing data in MSPC:
 several methods, different interpretations, some examples, Journal of
 Chemometrics, v16, p408-418.

 Paper [1] deals with building a model with missing data, while paper [2]
 looks at applying an existing PCA model to new data that contains missing
 entries.

 Hope these help,
 Kevin

 Marjolaine Rouault wrote:
 
  Hi,
 
  I am struggling to do a PCA analysis on a masked array. Does anybody have
  suggestions on how to deal with masked arrays when doing PCA?
 
  Best regards, Marjolaine.
 

 Quoted from:
 http://www.nabble.com/matplotlib.mlab-PCA-analysis-tp21932808p21932808.html



 --
 This message is subject to the CSIR's copyright terms and conditions,
 e-mail legal notice, and implemented Open Document Format (ODF) standard.
 The full disclaimer details can be found at
 http://www.csir.co.za/disclaimer.html.

 This message has been scanned for viruses and dangerous content by
 MailScanner, and is believed to be clean.  MailScanner thanks Transtec
 Computers for their support.



 --
 Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
 software. With Adobe AIR, Ajax developers can use existing skills and code to
 build responsive, highly engaging applications that combine the power of local
 resources and data with the reach of the web. Download the Adobe AIR SDK and
 Ajax docs to start building applications today-
 http://p.sf.net/sfu/adobe-com
 ___
 Matplotlib-users mailing list
 Matplotlib-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/matplotlib-users




-- 
Stephane Raynaud


Re: [Matplotlib-users] matplotlib.mlab PCA analysis

2009-02-11 Thread Marjolaine Rouault
Hi,

Thanks a lot for your comments.  I did try earlier on to remove the bad points 
but came across some problems when re-ordering my array. I will try out the 
method sent to me and check the reference.

Regards, Marjolaine.



 kgdunn+nab...@gmail.com 02/11/09 4:06 PM 
Marjolaine,

I am assuming your masked array entries are missing data.  Multivariate
analysis with missing data can be handled in several standard ways; however,
these methods don't appear in most Python libraries.

Here are some references on the topic that will help you:

[1] P.R.C. Nelson and J.F. MacGregor, 1996, Missing data methods in PCA and 
PLS: Score calculations with incomplete observations, Chemometrics and 
Intelligent Laboratory Systems, v35, p 45-65.

[2] F. Arteaga and A. Ferrer, 2002, Dealing with missing data in MSPC: several 
methods, different interpretations, some examples, Journal of Chemometrics, 
v16, p408-418.

Paper [1] deals with building a model with missing data, while paper [2] looks 
at applying an existing PCA model to new data that contains missing entries.

Hope these help,
Kevin

Marjolaine Rouault wrote:
 
 Hi,
 
 I am struggling to do a PCA analysis on a masked array. Does anybody have
 suggestions on how to deal with masked arrays when doing PCA?
 
 Best regards, Marjolaine.


Quoted from: 
http://www.nabble.com/matplotlib.mlab-PCA-analysis-tp21932808p21932808.html





[Matplotlib-users] matplotlib.mlab PCA analysis

2009-02-10 Thread Marjolaine Rouault
Hi,

I am struggling to do a PCA analysis on a masked array. Does anybody have
suggestions on how to deal with masked arrays when doing PCA?

Best regards, Marjolaine.





Re: [Matplotlib-users] matplotlib.mlab PCA analysis

2009-02-10 Thread Stephane Raynaud
Hi Marjolaine,

On Tue, Feb 10, 2009 at 12:31 PM, Marjolaine Rouault mroua...@csir.co.za wrote:

 Hi,

 I am struggling to do a PCA analysis on a masked array. Does anybody have
 suggestions on how to deal with masked arrays when doing PCA?



You need to remove the missing values at each time step.
This assumes that your missing data are always at the same place.
Maybe something like this can work:

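import numpy

# Let's say we analyse myfullvar(nt,ny,nx), a masked array whose mask
# is the same at every time step
nt, ny, nx = myfullvar.shape
mask = numpy.ma.getmaskarray(myfullvar[0])   # True where data are missing
ns = numpy.count_nonzero(~mask)              # number of valid points
myvar = numpy.zeros((nt, ns))
for it in range(nt):
  myvar[it] = myfullvar[it].compressed()     # keep only the valid points

# Then you make a PCA decomposition of myvar and you get back your EOFs
# myeofs(neof,ns)
myfulleofs = numpy.ma.masked_all((neof, ny, nx))
for ieof in range(neof):
  myfulleofs[ieof, ~mask] = myeofs[ieof]     # put each EOF back on the grid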
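
For a version that runs end to end, the PCA step itself has to be filled in. The sketch below is one possible way to do it, under assumptions that are not part of the original message: the decomposition uses numpy.linalg.svd on the time anomalies, and the function name masked_eofs and the random test field are made up for illustration.

```python
import numpy as np

def masked_eofs(fullvar, neof):
    """EOFs of a (nt, ny, nx) masked array whose mask is the same at every time step.

    Returns the EOFs on the full grid (neof, ny, nx), masked where data are
    missing, and the PCs (nt, neof).
    """
    nt, ny, nx = fullvar.shape
    mask = np.ma.getmaskarray(fullvar[0])                  # True where data are missing
    # Compact the valid points into a plain (nt, ns) array
    var = np.ma.getdata(fullvar)[:, ~mask].astype(float)
    anom = var - var.mean(axis=0)                          # remove the time mean
    # SVD of the anomalies: the rows of vt are the EOFs
    u, s, vt = np.linalg.svd(anom, full_matrices=False)
    eofs = vt[:neof]
    pcs = u[:, :neof] * s[:neof]
    # Put the EOFs back on the full grid, leaving missing points masked
    fulleofs = np.ma.masked_all((neof, ny, nx))
    fulleofs[:, ~mask] = eofs
    return fulleofs, pcs

# Made-up example: 10 time steps on a 3x4 grid with one permanently missing point
rng = np.random.default_rng(0)
land = np.zeros((3, 4), dtype=bool)
land[0, 0] = True
field = np.ma.array(rng.normal(size=(10, 3, 4)),
                    mask=np.broadcast_to(land, (10, 3, 4)).copy())

fulleofs, pcs = masked_eofs(field, neof=2)
```

The number of EOFs that can be requested is limited by the smaller of the number of time steps and the number of valid points.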




 Best regards, Marjolaine.







-- 
Stephane Raynaud