Could you generate more samples, set the penalty to none, reduce the tolerance, and check the coefficients instead of the predictions? This is just to make sure that the difference is not merely a numerical error.
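For instance, something along these lines (a minimal sketch of the suggested check, not the original code; the sample size, tolerance and max_iter values below are only illustrative choices):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# more samples than in the original script, same feature setup
X, y = make_classification(n_samples=20000, n_features=45, n_informative=10,
                           n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1, random_state=2, shuffle=False)
X = StandardScaler().fit_transform(X)

coefs = {}
for solver in ('lbfgs', 'saga'):
    # penalty='none' removes regularisation entirely (newer scikit-learn
    # versions spell this penalty=None); tol is tightened well below the default
    clf = LogisticRegression(penalty='none', tol=1e-8, max_iter=100000,
                             solver=solver, random_state=2)
    clf.fit(X, y)
    coefs[solver] = clf.coef_.ravel()
    print(solver, 'n_iter_:', clf.n_iter_)

# If the two coefficient vectors agree to within a small tolerance, the
# differing predictions are only a numerical artefact of where each solver stops.
print('max abs coefficient difference:',
      np.abs(coefs['lbfgs'] - coefs['saga']).max())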
Sent from my phone - sorry for being brief and for any misspellings.

Original Message
From: benoit.pres...@u-bourgogne.fr
Sent: 8 October 2019 20:27
To: scikit-learn@python.org
Reply to: scikit-learn@python.org
Subject: [scikit-learn] logistic regression results are not stable between solvers

Dear scikit-learn users,

I am using logistic regression to make some predictions. On my own data, I do not get the same results between solvers. I managed to reproduce this issue on synthetic data (see the code below). All solvers seem to converge (n_iter_ < max_iter), so why do I get different results? If results between solvers are not stable, which one should I choose?

Best regards,
Ben

------------------------------------------
Here is the code I used to generate synthetic data:

from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

RANDOM_SEED = 2

X_sim, y_sim = make_classification(n_samples=200,
                                    n_features=45,
                                    n_informative=10,
                                    n_redundant=0,
                                    n_repeated=0,
                                    n_classes=2,
                                    n_clusters_per_class=1,
                                    random_state=RANDOM_SEED,
                                    shuffle=False)

sss = StratifiedShuffleSplit(n_splits=10, test_size=0.2,
                             random_state=RANDOM_SEED)
for train_index_split, test_index_split in sss.split(X_sim, y_sim):
    X_split_train, X_split_test = X_sim[train_index_split], X_sim[test_index_split]
    y_split_train, y_split_test = y_sim[train_index_split], y_sim[test_index_split]

    ss = StandardScaler()
    X_split_train = ss.fit_transform(X_split_train)
    X_split_test = ss.transform(X_split_test)

    classifier_lbfgs = LogisticRegression(fit_intercept=True, max_iter=20000000,
                                          verbose=1, random_state=RANDOM_SEED,
                                          C=1e9, solver='lbfgs')
    classifier_lbfgs.fit(X_split_train, y_split_train)
    print('classifier lbfgs iter:', classifier_lbfgs.n_iter_)

    classifier_saga = LogisticRegression(fit_intercept=True, max_iter=20000000,
                                         verbose=1, random_state=RANDOM_SEED,
                                         C=1e9, solver='saga')
    classifier_saga.fit(X_split_train, y_split_train)
    print('classifier saga iter:', classifier_saga.n_iter_)

    y_pred_lbfgs = classifier_lbfgs.predict(X_split_test)
    y_pred_saga = classifier_saga.predict(X_split_test)

    if not (y_pred_lbfgs == y_pred_saga).all():
        print('lbfgs does not give the same results as saga :-( !')
        exit()

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn