As you can see in the code below, I do scale the data (with StandardScaler). I do not get any convergence warning, and n_iter_ is always smaller than max_iter.
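
One way to double-check this, beyond looking at n_iter_, is to compare the fitted coefficients of the two solvers directly and see whether the gap shrinks when tol is tightened. The snippet below is only a rough sketch of that idea, not a diagnosis: the fit() helper and the gaps dict are names made up for illustration, the two tol values are arbitrary, and C=1.0 is an assumption chosen so that it runs quickly (the original code uses C=1e9; swap it back to reproduce the actual setting, which may be much slower with saga).

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

RANDOM_SEED = 2

# Same synthetic data as in the original post, scaled once on the full set.
X, y = make_classification(n_samples=200, n_features=45, n_informative=10,
                           n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1, random_state=RANDOM_SEED,
                           shuffle=False)
X = StandardScaler().fit_transform(X)


def fit(solver, tol):
    """Fit one solver; return the model and whether a ConvergenceWarning fired."""
    # C=1.0 is an assumption for this sketch (the original post uses C=1e9).
    clf = LogisticRegression(C=1.0, solver=solver, tol=tol, max_iter=10000,
                             random_state=RANDOM_SEED)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        clf.fit(X, y)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return clf, warned


# Largest absolute coefficient difference between solvers, per tolerance.
gaps = {}
for tol in (1e-4, 1e-8):
    lbfgs, lbfgs_warned = fit("lbfgs", tol)
    saga, saga_warned = fit("saga", tol)
    gaps[tol] = float(np.max(np.abs(lbfgs.coef_ - saga.coef_)))
    print(f"tol={tol:g}: lbfgs n_iter={lbfgs.n_iter_[0]} (warned={lbfgs_warned}), "
          f"saga n_iter={saga.n_iter_[0]} (warned={saga_warned}), "
          f"max |coef difference|={gaps[tol]:.3g}")
```

If the coefficient gap shrinks as tol gets smaller, the differences between solvers are more likely due to where each one stops than to the solvers reaching different optima.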

> On 8 Oct 2019 at 19:51, Andreas Mueller <t3k...@gmail.com> wrote:
>
> I'm pretty sure SAGA is not converging. Unless you scale the data, SAGA is
> very slow to converge.
>
>> On 10/8/19 7:19 PM, Benoît Presles wrote:
>> Dear scikit-learn users,
>>
>> I am using logistic regression to make some predictions. On my own data, I
>> do not get the same results between solvers. I managed to reproduce this
>> issue on synthetic data (see the code below).
>> All solvers seem to converge (n_iter_ < max_iter), so why do I get different
>> results?
>> If results between solvers are not stable, which one to choose?
>>
>>
>> Best regards,
>> Ben
>>
>> ------------------------------------------
>>
>> Here is the code I used to generate synthetic data:
>>
>> from sklearn.datasets import make_classification
>> from sklearn.model_selection import StratifiedShuffleSplit
>> from sklearn.preprocessing import StandardScaler
>> from sklearn.linear_model import LogisticRegression
>> #
>> RANDOM_SEED = 2
>> #
>> X_sim, y_sim = make_classification(n_samples=200,
>>                                    n_features=45,
>>                                    n_informative=10,
>>                                    n_redundant=0,
>>                                    n_repeated=0,
>>                                    n_classes=2,
>>                                    n_clusters_per_class=1,
>>                                    random_state=RANDOM_SEED,
>>                                    shuffle=False)
>> #
>> sss = StratifiedShuffleSplit(n_splits=10, test_size=0.2,
>>                              random_state=RANDOM_SEED)
>> for train_index_split, test_index_split in sss.split(X_sim, y_sim):
>>     X_split_train, X_split_test = X_sim[train_index_split], X_sim[test_index_split]
>>     y_split_train, y_split_test = y_sim[train_index_split], y_sim[test_index_split]
>>     ss = StandardScaler()
>>     X_split_train = ss.fit_transform(X_split_train)
>>     X_split_test = ss.transform(X_split_test)
>>     #
>>     classifier_lbfgs = LogisticRegression(fit_intercept=True,
>>         max_iter=20000000, verbose=1, random_state=RANDOM_SEED, C=1e9,
>>         solver='lbfgs')
>>     classifier_lbfgs.fit(X_split_train, y_split_train)
>>     print('classifier lbfgs iter:', classifier_lbfgs.n_iter_)
>>     classifier_saga = LogisticRegression(fit_intercept=True,
>>         max_iter=20000000, verbose=1, random_state=RANDOM_SEED, C=1e9,
>>         solver='saga')
>>     classifier_saga.fit(X_split_train, y_split_train)
>>     print('classifier saga iter:', classifier_saga.n_iter_)
>>     #
>>     y_pred_lbfgs = classifier_lbfgs.predict(X_split_test)
>>     y_pred_saga = classifier_saga.predict(X_split_test)
>>     #
>>     if (y_pred_lbfgs==y_pred_saga).all() == False:
>>         print('lbfgs does not give the same results as saga :-( !')
>>         exit()
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn