Re: [Scikit-learn-general] Strange KFold behavior with shuffling.

2014-12-15 Thread Andy
On 12/13/2014 01:09 AM, He-chien Tsai wrote:
> Thanks for the reply. I misused random.seed, as it returns None.
> I passed an integer to random_state, but the unexpected
> behavior persisted.
> After I cloned the estimator with sklearn.base.clone, the results
> became reasonable.
>
> clfs = [ (clone(

Re: [Scikit-learn-general] Strange KFold behavior with shuffling.

2014-12-12 Thread He-chien Tsai
Thanks for the reply. I misused random.seed, as it returns None. I passed an integer to random_state, but the unexpected behavior persisted. After I cloned the estimator with sklearn.base.clone, the results became reasonable.
clfs = [ (clone(pipe).fit(x[train_index], y[train_index]), (x[test_index], y
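A minimal sketch of why cloning can matter here, under stated assumptions: the data below is synthetic, the pipeline is a simplified stand-in for the poster's `pipe`, and `KFold` is imported from `sklearn.model_selection` with `n_splits` (the modern API, not the `sklearn.cross_validation` interface used in 2014). One plausible reading of the thread is that fitting the same `pipe` object inside a list comprehension leaves every list entry pointing at the one estimator, which ends up fitted on the last training split only.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline

# Synthetic data, purely for illustration (not the poster's dataset).
rng = np.random.RandomState(0)
x = rng.rand(200, 5)
y = (x[:, 0] > 0.5).astype(int)

pipe = Pipeline([
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=1234)),
])
kf = KFold(n_splits=5, shuffle=True, random_state=1234)

# Without clone: fit() returns the estimator itself, so every entry is the
# same object, refitted in place on each new training split.
shared = [pipe.fit(x[train_index], y[train_index])
          for train_index, _ in kf.split(x)]

# With clone: each fold gets its own independent, freshly fitted copy.
fitted = [(clone(pipe).fit(x[train_index], y[train_index]), test_index)
          for train_index, test_index in kf.split(x)]

# Score each independent model on its own held-out fold.
scores = [model.score(x[test_index], y[test_index])
          for model, test_index in fitted]
print(scores)
```

In the uncloned list, scoring an "earlier" fold would use a model that has already seen that fold's test rows during its final fit, which could make every fold except the last look better than it should.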

Re: [Scikit-learn-general] Strange KFold behavior with shuffling.

2014-12-12 Thread Andy
random.seed returns nothing, and the random module is not used anyway; scikit-learn uses numpy.random. You should just pass the integer.
On 12/09/2014 06:50 PM, He-chien Tsai wrote: Thanks for your approach. I didn't notice that cross_val_score accepts a cross-validator as cv. Your approach made that strange beha
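The point about `random.seed` can be checked with the standard library alone. This is a small sketch; the variable names are illustrative and not taken from the thread.

```python
import random

# random.seed() seeds Python's global generator in place and returns None,
# so random_state=random.seed(1234) is equivalent to random_state=None.
seed_result = random.seed(1234)
print(seed_result)

# What the reply suggests instead: pass the integer itself as random_state.
random_state = 1234
```

Passing `None` as `random_state` leaves the estimator unseeded, so results would differ from run to run.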

Re: [Scikit-learn-general] Strange KFold behavior with shuffling.

2014-12-09 Thread He-chien Tsai
Thanks for your approach. I didn't notice that cross_val_score accepts a cross-validator as cv. Your approach made that strange behavior disappear! But I still can't figure out what mistake I made; my original code looks fine to me. BTW, I used a pipeline because I planned to use data transformati
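For reference, passing a cross-validator object as `cv` might look like the sketch below. It assumes a recent scikit-learn, where `KFold` lives in `sklearn.model_selection` and takes `n_splits`; the thread itself used the older `KFold(n=..., ...)` signature. The data is synthetic and the hyperparameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic data, purely for illustration.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)

pipe = Pipeline([
    # An integer seed, not the return value of random.seed().
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=1234)),
])

# The shuffled splitter is handed to cross_val_score directly.
cv = KFold(n_splits=5, shuffle=True, random_state=1234)

# cross_val_score clones the estimator for every fold, so no fitted state
# is shared between folds.
scores = cross_val_score(pipe, X, y, cv=cv)
print(scores)
```

Because the cloning happens inside `cross_val_score`, there is no need to call `pipe.fit()` beforehand.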

Re: [Scikit-learn-general] Strange KFold behavior with shuffling.

2014-12-09 Thread Sebastian Raschka
What is your dataset size? I am a little bit curious whether you need the pipe.fit(); I'd usually do the CV like this:
clf1 = Pipeline([ ('classifier', RandomForestClassifier(n_estimators=100, min_samples_leaf=10,random_state=random.seed(1234)))
cv = KFold(n=X_train.shape[0], n_f

[Scikit-learn-general] Strange KFold behavior with shuffling.

2014-12-09 Thread He-chien Tsai
I got two strange cross-validation scores even though I tried different values of random_state in KFold; the last fold is significantly lower than the other folds, like this: [0.66555285540704734, 0.64459295261239369, 0.64611178614823817, 0.6488456865127582, 0.65268915223336377, 0.65603160133697969, 0.6