Re: [scikit-learn] Finding the PC that captures a specific variable

2021-01-22 Thread Mahmood Naderan
Hi, thanks for the replies. I read about the available functions in the PCA section. Consider the following code:
    x = StandardScaler().fit_transform(x)
    pca = PCA()
    principalComponents = pca.fit_transform(x)
    principalDf = pd.DataFrame(data = principalComponents)
    loadings = pca.components_
    finalDf = p

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Guillaume Lemaître
OK, so the normal install is working. Now, to fix your issue, we need to understand how `sc.install_pypi_package` works and, mainly, how it calls `pip`. We need to make sure that it calls the right pip (the system `pip3` in your case). On Fri, 22 Jan 2021 at 14:39, Bertrand B. wrote: > Tha

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Bertrand B.
Thank you Guillaume for your help, I am using (running on AWS EMR-6.2):
    pip3 --version
    pip 9.0.3 from /usr/lib/python3.7/site-packages (python 3.7)
    pip3 install scikit-learn
    Collecting scikit-learn
    Using cached https://files.pythonhosted.org/packages/f4/7b/d415b0c89babf23dcd8ee631015f043e2d7

Re: [scikit-learn] Finding the PC that captures a specific variable

2021-01-22 Thread Nicolas Hug
Hi Mahmood, There are different pieces of info that you can get from PCA:
1. How important is a given PC to reconstruct the entire dataset -> This is given by `explained_variance_ratio_` as Guillaume suggested
2. What is the contribution of each feature to each PC (remember that a PC is a line
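The two pieces of information listed above can be read off a fitted PCA, as in the minimal sketch below. It is not code from the thread. The helper name `dominant_component_per_feature` and the synthetic data are hypothetical, and scaling each row of `components_` by the square root of `explained_variance_` before comparing is an assumption made here so that a large weight on a near-zero-variance PC does not dominate; drop the scaling to compare the raw weights instead.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def dominant_component_per_feature(x, feature_names):
    """Map each feature to the index of the PC on which it has the largest scaled weight."""
    x_scaled = StandardScaler().fit_transform(x)
    pca = PCA().fit(x_scaled)
    # components_ has shape (n_components, n_features): row i holds the weights of PC i.
    # Assumption: scale each row by sqrt(explained_variance_) before comparing.
    scaled = pca.components_ * np.sqrt(pca.explained_variance_)[:, np.newaxis]
    best_pc = np.abs(scaled).argmax(axis=0)
    mapping = {name: int(pc) for name, pc in zip(feature_names, best_pc)}
    return mapping, pca.explained_variance_ratio_


# Hypothetical data: "a" and "b" are almost identical, "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.05 * rng.normal(size=500)
c = rng.normal(size=500)
x = np.column_stack([a, b, c])

mapping, ratios = dominant_component_per_feature(x, ["a", "b", "c"])
print(mapping)  # feature name -> index of the PC that carries it most strongly
print(ratios)   # share of total variance explained by each PC
```

PC indices are zero-based here, so index 0 is the first principal component.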

Re: [scikit-learn] Finding the PC that captures a specific variable

2021-01-22 Thread Julio Antonio Soto
Hi Mahmood, I believe your question is answered here: https://stackoverflow.com/questions/22984335/recovering-features-names-of-explained-variance-ratio-in-pca-with-sklearn > On 22 Jan 2021, at 10:26, Guillaume Lemaître wrote: > > I am not really understanding the question, sorry

Re: [scikit-learn] Finding the PC that captures a specific variable

2021-01-22 Thread Guillaume Lemaître
I don't really understand the question, sorry. Are you looking for the `explained_variance_ratio_` attribute, which gives you the relative values of the eigenvalues associated with the eigenvectors? On Fri, 22 Jan 2021 at 10:16, Mahmood Naderan wrote: > Hi > I have a question about PCA and that is,

[scikit-learn] Finding the PC that captures a specific variable

2021-01-22 Thread Mahmood Naderan
Hi, I have a question about PCA: how can we determine which factor (principal component) best captures a given variable X? For example, a variable may have a low weight in the first PC but a higher weight in the fifth PC. When I use the PCA from Scikit, I have to manually wor

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Guillaume Lemaître
@Bertrand Could you tell us which version of `pip` you use? (You need pip >= 19.0 for manylinux2010 and pip >= 19.3 for manylinux2014.) On Fri, 22 Jan 2021 at 09:49, Guillaume Lemaître wrote: > We might experience an issue with PyPI not selecting the manylinux2010 > wheel: https://github.com/sc
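The version thresholds quoted above can be turned into a quick check, sketched below. The function name `supported_manylinux_tags` and the lookup table are hypothetical. The sketch encodes only the two minimums stated in the message and says nothing about older wheel tags such as manylinux1.

```python
import re

# Minimum pip versions as quoted in the message above (nothing else is assumed).
MIN_PIP_FOR_TAG = {
    "manylinux2010": (19, 0),
    "manylinux2014": (19, 3),
}


def supported_manylinux_tags(pip_version):
    """Return the tags from MIN_PIP_FOR_TAG that the given pip version is new enough for."""
    match = re.match(r"(\d+)(?:\.(\d+))?", pip_version)
    if match is None:
        raise ValueError(f"cannot parse pip version: {pip_version!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return [tag for tag, minimum in MIN_PIP_FOR_TAG.items() if (major, minor) >= minimum]


# pip 9.0.3 is the version reported elsewhere in this thread.
print(supported_manylinux_tags("9.0.3"))
print(supported_manylinux_tags("19.3.1"))
```

The version string can be taken from the output of `pip3 --version`.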

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Guillaume Lemaître
We might be experiencing an issue with PyPI not selecting the manylinux2010 wheel: https://github.com/scikit-learn/scikit-learn/issues/19233 We have to check, but we will probably upload manylinux1 wheels shortly, which should resolve the issue. I am curious if fetching the wheel by hand and installing vi