Hello, I am trying to perform ridge regression on a relatively large data set: roughly 70 million examples and 24 million very sparse features.
For example, I have created an X matrix with dimensions (73725855, 24652292), an associated y vector with dimensions (73725855,), and a sample_weights vector with the same dimensions, (73725855,). Here the y vector holds ratings, and sample_weights records how many times each rating occurred. I need to use one of the sparse solvers, because the data set does not fit in memory as a dense matrix; however, none of the sparse solvers seems to accept a sample_weights vector. Does anyone have experience with weighted ridge regression on large sparse matrices? I am new to the world of machine learning, so please forgive any vocabulary mistakes!

Thanks,
Cory
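To make the setup above concrete, here is a minimal sketch of one possible workaround, not a confirmed scikit-learn feature. It assumes no intercept is fitted, in which case weighted ridge is equivalent to ordinary ridge after scaling each row of X and each entry of y by the square root of its weight. The helper name weighted_ridge_sparse, the alpha values, and the tiny synthetic data are all hypothetical; the solver used is scipy.sparse.linalg.lsqr, whose damp parameter adds the penalty damp**2 * ||coef||**2.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr


def weighted_ridge_sparse(X, y, sample_weights, alpha=1.0):
    """Minimise sum_i w_i * (y_i - x_i . coef)**2 + alpha * ||coef||**2.

    Hypothetical helper: no intercept is fitted, and X is never densified.
    """
    sqrt_w = np.sqrt(sample_weights)
    # Scaling row i by sqrt(w_i) turns the weighted problem into an
    # unweighted one; the product of two sparse matrices stays sparse.
    Xw = sp.diags(sqrt_w) @ X
    yw = sqrt_w * y
    # lsqr minimises ||Xw @ coef - yw||**2 + damp**2 * ||coef||**2,
    # so damp = sqrt(alpha) gives the ridge penalty.
    result = lsqr(Xw, yw, damp=np.sqrt(alpha), atol=1e-10, btol=1e-10)
    return result[0]


# Tiny synthetic stand-in for the real (73725855, 24652292) matrix.
rng = np.random.default_rng(0)
X = sp.random(200, 30, density=0.1, format="csr", random_state=0)
y = rng.normal(size=200)
sample_weights = rng.integers(1, 5, size=200).astype(float)

coef = weighted_ridge_sparse(X, y, sample_weights, alpha=2.0)
```

The scaled matrix Xw has the same sparsity pattern as X, so the extra memory is about one more copy of the nonzero entries. If an intercept is needed, it would have to be handled separately, for example by centring y with its weighted mean, which this sketch does not do.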
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
