A basic question about the _chisquare implementation in
feature_selection/univariate_selection.py. (I could have run it in a
debugger to understand it, but the "chi2" implementation exhausts all
memory during numpy.dot(), just before the call to _chisquare, so I
had to verify the code by hand.)
Copying 2 lines from def _chisquare(f_obs, f_exp):
>>>
f_obs = np.asarray(f_obs, dtype=np.float64)
k = len(f_obs)
<<<
Before the call to np.asarray, 'f_obs' is a sparse matrix (CSC, per
the output further below); after the call it is an array wrapping that
same matrix. Now what is len(f_obs) supposed to be? When I tried to
mimic these operations in pdb, I got
"TypeError('len() of unsized object',)"
I am confused as to what 'k' is supposed to be. Should it be the
number of rows of f_obs?
Type before function call:
<325056x1617899 sparse matrix of type '<type 'numpy.int64'>'
with 150016582 stored elements in Compressed Sparse Column format>
Type after function call:
array(<325056x1617899 sparse matrix of type '<type 'numpy.int64'>'
with 150016582 stored elements in Compressed Sparse Column
format>, dtype=object)
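
The behaviour can be reproduced on a small example. This is only a minimal sketch: the tiny matrix is made up, and dtype is left off the np.asarray call because the output above shows dtype=object. It also shows what len() returns once the data is a dense 2-D array, which assumes the matrix is small enough to densify with toarray().

```python
import numpy as np
import scipy.sparse as sp

# Small stand-in for the large sparse matrix (made-up data).
X = sp.csc_matrix(np.array([[1, 0, 2], [0, 3, 0]], dtype=np.int64))

# np.asarray does not densify a scipy sparse matrix; it wraps the
# whole object in a 0-d array of dtype=object.
wrapped = np.asarray(X)
print(wrapped.shape, wrapped.dtype)

# A 0-d array has no length, so len() raises TypeError.
try:
    len(wrapped)
    len_error = None
except TypeError as exc:
    len_error = str(exc)
print(len_error)

# On a dense 2-D array, len() returns the size of the first axis,
# i.e. the number of rows.
dense = np.asarray(X.toarray(), dtype=np.float64)
k = len(dense)
print(k)
```

So with a dense 2-D f_obs, k would be the number of rows; with a sparse f_obs, shape[0] gives the row count without densifying.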
-Anitha