Hi Luca,
From the docs,
fit(X, y, n_jobs=1) <http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.fit>
    Fit linear model.
    Parameters:
    X : numpy array or sparse matrix of shape [n_samples, n_features]
        Training data
With your example code, my traceback is
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
  File "/home/jaques/scikit-learn/sklearn/linear_model/base.py", line 363, in fit
    X, y, self.fit_intercept, self.normalize, self.copy_X)
  File "/home/jaques/scikit-learn/sklearn/linear_model/base.py", line 103, in center_data
    X_std = np.ones(X.shape[1])
IndexError: tuple index out of range
This is why your reshaping fixes the problem: a 1-D array has no second
axis, so X.shape[1] raises the IndexError above. To match the API, your X
must be of shape [n_samples, n_features].
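
To make the shape issue concrete, here is a minimal sketch using only NumPy. It repeats the np.ones(X.shape[1]) call from the traceback on a 1-D array and then on the reshaped 2-D array. The variable names are mine, chosen for illustration.

```python
import numpy as np

# Same kind of input as in the example: a 1-D array of 201 samples.
x = np.linspace(0, 1, 201)
print(x.shape)  # (201,) -- there is no second axis

# center_data calls np.ones(X.shape[1]); on a 1-D array the shape tuple
# has a single entry, so indexing position 1 fails.
error_message = None
try:
    np.ones(x.shape[1])
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# Reshaping to [n_samples, n_features] with one feature gives a second axis.
X = x.reshape(-1, 1)
print(X.shape)  # (201, 1)
X_std = np.ones(X.shape[1])  # now works: one entry per feature
```

Passing X (rather than x) to clf.fit is equivalent to the x.reshape((x.shape[0], -1)) workaround in the original message.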
So I don't think an issue is necessary, since this is the expected
behaviour, although a better error message describing what the input
should look like could be useful.
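
As a rough idea of what a friendlier message might look like, here is a sketch of a shape check. Note that require_2d is a hypothetical helper I made up for illustration, not an existing scikit-learn function, and the wording of the message is only a suggestion.

```python
import numpy as np


def require_2d(X):
    """Hypothetical helper (not part of scikit-learn): reject non-2-D input."""
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(
            "X must have shape [n_samples, n_features], got shape %s; "
            "for a single feature try X.reshape(-1, 1)" % (X.shape,)
        )
    return X


# A 2-D array passes through unchanged.
ok = require_2d(np.zeros((3, 1)))

# A 1-D array is rejected with a message that names the expected shape.
message = None
try:
    require_2d(np.zeros(3))
except ValueError as exc:
    message = str(exc)
print(message)
```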
Thoughts, list?
Hope this helps
Kind Regards,
Jaques
2013/9/24 Luca Cerone <[email protected]>
> Dear all,
>
> I have noticed that Linear Regression fails when it is run on a dataset
> and target that are plain (one-dimensional) arrays.
>
> You can replicate this as follows:
>
> from pylab import linspace, permutation, randn
> from sklearn import linear_model
>
> >>>
> clf = linear_model.LinearRegression()
>
> x = linspace(0,1,201)
> noise = 0.2 * randn(*x.shape)
> y = 0.5 + 2 * x + noise
>
> clf.fit(x,y)
>
> fails with the following message:
>
> TypeError Traceback (most recent call last)
> <ipython-input-134-5c1831092d7a> in <module>()
> ----> 1 clf.fit(x,y)
>
> /home/lcerone/CNOVE/local/lib/python2.7/site-packages/sklearn/linear_model/base.pyc
> in fit(self, X, y, n_jobs)
> 361
> 362 X, y, X_mean, y_mean, X_std = self._center_data(
> --> 363 X, y, self.fit_intercept, self.normalize, self.copy_X)
> 364
> 365 if sp.issparse(X):
>
> /home/lcerone/CNOVE/local/lib/python2.7/site-packages/sklearn/linear_model/base.pyc
> in center_data(X, y, fit_intercept, normalize, copy, sample_weight)
> 98 if normalize:
> 99 X_std = np.sqrt(np.sum(X ** 2, axis=0))
> --> 100 X_std[X_std == 0] = 1
> 101 X /= X_std
> 102 else:
>
> TypeError: 'numpy.float64' object does not support item assignment
>
> <<<
>
> This however can be solved by reshaping the arrays x,y (x has only 1
> dimension and so does y)
>
> xx = x.reshape((x.shape[0],-1))
> yy = y.reshape((y.shape[0],-1))
> clf.fit(xx,yy)
>
> correctly solves the regression problem.
>
> Similarly, if I now try to run prediction using an array zz generated with
> linspace, the task fails, but this is easily solved by reshaping the array zz.
>
> I was wondering if this is the intended behaviour or if I should submit an
> issue on github.
>
> Have a nice day,
> Cheers,
> Luca
>
>
> ------------------------------------------------------------------------------
> October Webinars: Code for Performance
> Free Intel webinars can help you accelerate application performance.
> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
> from
> the latest Intel processors and coprocessors. See abstracts and register >
> http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>