Peter Otten wrote:

> Daiyue Weng wrote:
>
>> Hi, I tried to use DataFrame.values to convert a list of columns in a
>> dataframe to a numpy ndarray/matrix,
>>
>> matrix = df.values[:, list_of_cols]
>>
>> but got an error,
>>
>> IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis
>> (None) and integer or boolean arrays are valid indices
>>
>> so what's the problem with the list of columns I passed in?
>>
>> many thanks
>
> Your suggestively named list_of_cols is probably not a list. Have your
> script print its value and type before the failing operation:
>
> print(type(list_of_cols), list_of_cols)
>> matrix = df.values[:, list_of_cols]

On Thu, May 26 2016, 09:21:59, Daiyue Weng wrote:

[If you had sent this to the list I would have seen it earlier. Just in case
you didn't solve the problem in the meantime:]

> it prints
>
> <class 'list'> ['key1', 'key2']

So my initial assumption was wrong -- list_of_cols is a list. However,
df.values is a numpy array and therefore expects integer indices:

>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns="key1 key2 key3".split())
>>> df
   key1  key2  key3
0     1     2     3
1     4     5     6

[2 rows x 3 columns]
>>> df.values
array([[1, 2, 3],
       [4, 5, 6]])
>>> df.values[["key1", "key2"]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'key1'

(I get a different error message, probably because we use different versions
of numpy)

To fix the problem you can either use integers

>>> df.values[:,[0, 1]]
array([[1, 2],
       [4, 5]])

or select the columns in pandas:

>>> df[["key1", "key2"]].values
array([[1, 2],
       [4, 5]])
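
If you want to keep indexing df.values directly but only have the column names, one possible approach is to translate the names into integer positions first. The following is a minimal sketch, not something from the original script: it reuses the example frame from above, and the names positions, by_label and by_position are made up for illustration. It assumes the column labels are unique.

```python
import pandas as pd

# Same example frame as in the session above.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns="key1 key2 key3".split())
list_of_cols = ["key1", "key2"]

# Option 1: select by label in pandas, then convert to a numpy array.
by_label = df[list_of_cols].values

# Option 2: map the labels to integer positions, then index the array.
# Note: get_indexer() returns -1 for a label that is not present, which
# numpy would silently treat as "last column", so check for that first.
positions = df.columns.get_indexer(list_of_cols)
if (positions < 0).any():
    raise KeyError("some columns in list_of_cols are not in df")
by_position = df.values[:, positions]

print(positions)
print(by_position)
```

Both options should give the same 2x2 array here. Selecting in pandas first is usually the simpler choice, because a misspelled column name raises a KeyError right away.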