Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package python-scikit-learn for openSUSE:Factory checked in at 2022-10-27 13:53:08
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-scikit-learn (Old)
 and /work/SRC/openSUSE:Factory/.python-scikit-learn.new.2275 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-scikit-learn"

Thu Oct 27 13:53:08 2022 rev:19 rq:1030924 version:1.1.2

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-scikit-learn/python-scikit-learn.changes 2022-09-17 20:08:13.760791552 +0200
+++ /work/SRC/openSUSE:Factory/.python-scikit-learn.new.2275/python-scikit-learn.changes 2022-10-27 13:53:23.116326617 +0200
@@ -1,0 +2,8 @@
+Tue Oct 11 13:10:22 UTC 2022 - Ben Greiner <c...@bnavigator.de>
+
+- Update dependencies
+- Add sklearn-pr24283-gradient-segfault.patch
+  * gh#scikit-learn/scikit-learn#24283
+- Update test suite setup.
+
+-------------------------------------------------------------------

New:
----
  sklearn-pr24283-gradient-segfault.patch

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-scikit-learn.spec ++++++
--- /var/tmp/diff_new_pack.95fVps/_old 2022-10-27 13:53:23.736329780 +0200
+++ /var/tmp/diff_new_pack.95fVps/_new 2022-10-27 13:53:23.748329841 +0200
@@ -16,8 +16,6 @@
 #


-%{?!python_module:%define python_module() python3-%{**}}
-%define skip_python2 1
 %global flavor @BUILD_FLAVOR@%{nil}
 %if "%{flavor}" == "test-py38"
 %define psuffix -test-py38
@@ -42,6 +40,8 @@
 %bcond_with test
 %endif
 %bcond_with extratest
+# enable pytest color output for local debugging: osc --with pytestcolor
+%bcond_with pytestcolor
 Name: python-scikit-learn%{psuffix}
 Version: 1.1.2
 Release: 0
@@ -49,20 +49,22 @@
 License: BSD-3-Clause
 URL: https://scikit-learn.org/
 Source0: https://files.pythonhosted.org/packages/source/s/scikit-learn/scikit-learn-%{version}.tar.gz
-BuildRequires: %{python_module Cython >= 0.28.5}
-BuildRequires: %{python_module devel}
-BuildRequires: %{python_module joblib >= 0.11}
+# PATCH-FIX-UPSTREAM sklearn-pr24283-gradient-segfault.patch gh#scikit-learn/scikit-learn#24283
+Patch0: sklearn-pr24283-gradient-segfault.patch
+BuildRequires: %{python_module Cython >= 0.29.24}
+BuildRequires: %{python_module devel >= 3.8}
+BuildRequires: %{python_module joblib >= 1.0.0}
 BuildRequires: %{python_module numpy-devel >= 1.17.3}
 BuildRequires: %{python_module scipy >= 1.3.2}
 BuildRequires: %{python_module setuptools}
 BuildRequires: %{python_module threadpoolctl >= 2.0.0}
-BuildRequires: %{python_module xml}
 BuildRequires: fdupes
 BuildRequires: gcc-c++
 BuildRequires: gcc-fortran
 BuildRequires: openblas-devel
 BuildRequires: python-rpm-macros
-Requires: python-joblib >= 0.11
+# Check sklearn/_min_dependencies.py for dependencies
+Requires: python-joblib >= 1.0.0
 Requires: python-numpy >= 1.17.3
 Requires: python-scipy >= 1.3.2
 Requires: python-threadpoolctl >= 2.0.0
@@ -70,18 +72,20 @@
 Suggests: python-matplotlib
 Suggests: python-pandas
 Suggests: python-seaborn
-Provides: python-sklearn
+Provides: python-sklearn = %{version}-%{release}
 %if "%{python_flavor}" == "python3" || "%{?python_provides}" == "python3"
-Provides: sklearn
+Provides: sklearn = %{version}-%{release}
 %endif
 # SECTION test requirements
 %if %{with test}
-BuildRequires: %{python_module pytest >= 4.0}
-BuildRequires: %{python_module scikit-learn}
+BuildRequires: %{python_module pytest >= 5.0.1}
+BuildRequires: %{python_module pytest-rerunfailures}
+BuildRequires: %{python_module pytest-xdist}
+BuildRequires: %{python_module scikit-learn = %{version}}
 %if %{with extratest}
 BuildRequires: %{python_module matplotlib >= 3.1.2}
-BuildRequires: %{python_module pandas >= 0.25.0}
-BuildRequires: %{python_module scikit-image >= 0.13}
+BuildRequires: %{python_module pandas >= 1.0.5}
+BuildRequires: %{python_module scikit-image >= 0.16.2}
 %endif
 %endif
 # /SECTION
@@ -93,8 +97,10 @@

 %prep
 %autosetup -p1 -n scikit-learn-%{version}
-
 rm -rf sklearn/.pytest_cache
+%if !%{with pytestcolor}
+sed -i '/--color=yes/d' setup.cfg
+%endif

 %build
 %if !%{with test}
@@ -108,30 +114,20 @@
 %endif

 %if %{with test}
-# Precision-related errors on non-x86 platforms
-%ifarch %{ix86} x86_64
 %check
+mkdir test_dir
+pushd test_dir
 export SKLEARN_SKIP_NETWORK_TESTS=1
-NO_TESTS="test_feature_importance_regression or test_minibatch_with_many_reassignments"
-NO_TESTS+=" or test_sparse_coder_parallel_mmap or test_explained_variances"
-# test_negative_sample_weights_mask_all_samples[weights-are-zero-NuSVC] Fatal Python error: Aborted
-NO_TESTS+=" or test_negative_sample_weights_mask_all_samples"
-# Disable test_fetch_openml_verify_checksum for now, no clue why it fail.
-NO_TESTS+=" or test_fetch_openml_verify_checksum[True]"
-NO_TESTS+=" or test_fetch_openml_verify_checksum[False]"
-
-# Precision-related errors on 32 bit arch
+NO_TESTS="dummyprefix"
+%ifarch %{ix86} %{arm}
+# Precision-related errors on 32 bit
 # https://github.com/scikit-learn/scikit-learn/issues/19230
-%ifarch i586 %{arm}
 NO_TESTS+=" or test_convergence_dtype_consistency"
+NO_TESTS+=" or test_imputation_missing_value_in_test_array"
 %endif
-
-mkdir test_dir
-pushd test_dir
-%pytest_arch -p no:cacheprovider -v -k "not ($NO_TESTS)" %{$python_sitearch}/sklearn
+%pytest_arch -v --pyargs sklearn -n auto -k "not ($NO_TESTS)"
 popd
 %endif
-%endif

 %if !%{with test}
 %files %{python_files}

++++++ sklearn-pr24283-gradient-segfault.patch ++++++
>From dd7de2bfbf39222153f8c2deb203a0e1efef8640 Mon Sep 17 00:00:00 2001
From: "Thomas J. Fan" <thomasjp...@gmail.com>
Date: Sat, 27 Aug 2022 10:28:03 -0400
Subject: FIX Treat negative categoricals as unknown during predict

PR: #24282
Fixes #24274

Index: scikit-learn-1.1.2/sklearn/ensemble/_hist_gradient_boosting/_predictor.pyx
===================================================================
--- scikit-learn-1.1.2.orig/sklearn/ensemble/_hist_gradient_boosting/_predictor.pyx
+++ scikit-learn-1.1.2/sklearn/ensemble/_hist_gradient_boosting/_predictor.pyx
@@ -66,7 +66,10 @@ cdef inline Y_DTYPE_C _predict_one_from_
             else:
                 node_idx = node.right
         elif node.is_categorical:
-            if in_bitset_2d_memoryview(
+            if data_val < 0:
+                # data_val is not in the accepted range, so it is treated as missing value
+                node_idx = node.left if node.missing_go_to_left else node.right
+            elif in_bitset_2d_memoryview(
                     raw_left_cat_bitsets,
                     <X_BINNED_DTYPE_C>data_val,
                     node.bitset_idx):
Index: scikit-learn-1.1.2/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py
===================================================================
--- scikit-learn-1.1.2.orig/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py
+++ scikit-learn-1.1.2/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py
@@ -1159,3 +1159,28 @@ def test_no_user_warning_with_scoring():
     with warnings.catch_warnings():
         warnings.simplefilter("error", UserWarning)
         est.fit(X_df, y)
+
+
+def test_unknown_category_that_are_negative():
+    """Check that unknown categories that are negative does not error.
+
+    Non-regression test for #24274.
+    """
+    rng = np.random.RandomState(42)
+    n_samples = 1000
+    X = np.c_[rng.rand(n_samples), rng.randint(4, size=n_samples)]
+    y = np.zeros(shape=n_samples)
+    y[X[:, 1] % 2 == 0] = 1
+
+    hist = HistGradientBoostingRegressor(
+        random_state=0,
+        categorical_features=[False, True],
+        max_iter=10,
+    ).fit(X, y)
+
+    # Check that negative values from the second column are treated like a
+    # missing category
+    X_test_neg = np.asarray([[1, -2], [3, -4]])
+    X_test_nan = np.asarray([[1, np.nan], [3, np.nan]])
+
+    assert_allclose(hist.predict(X_test_neg), hist.predict(X_test_nan))
Index: scikit-learn-1.1.2/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py
===================================================================
--- scikit-learn-1.1.2.orig/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py
+++ scikit-learn-1.1.2/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py
@@ -1186,6 +1186,8 @@ class HistGradientBoostingRegressor(Regr

         For each categorical feature, there must be at most `max_bins` unique
         categories, and each categorical value must be in [0, max_bins -1].
+        During prediction, categories encoded as a negative value are treated as
+        missing values.

         Read more in the :ref:`User Guide <categorical_support_gbdt>`.

@@ -1515,6 +1517,8 @@ class HistGradientBoostingClassifier(Cla

         For each categorical feature, there must be at most `max_bins` unique
         categories, and each categorical value must be in [0, max_bins -1].
+        During prediction, categories encoded as a negative value are treated as
+        missing values.

         Read more in the :ref:`User Guide <categorical_support_gbdt>`.

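
Note on the patch above: the routing rule it adds to _predictor.pyx can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the upstream implementation: CategoricalNode and next_node are hypothetical names, and a frozenset stands in for the bitset lookup done by in_bitset_2d_memoryview. Only the branch order (NaN first, then negative values, then the category lookup) is taken from the diff.

```python
import math
from dataclasses import dataclass


@dataclass
class CategoricalNode:
    """Hypothetical stand-in for one categorical split node."""
    left_categories: frozenset  # category codes routed to the left child
    missing_go_to_left: bool    # where missing values are sent
    left: int                   # index of the left child
    right: int                  # index of the right child


def next_node(node: CategoricalNode, data_val: float) -> int:
    """Return the index of the child that data_val is routed to."""
    if math.isnan(data_val):
        return node.left if node.missing_go_to_left else node.right
    if data_val < 0:
        # A negative code is outside the accepted range, so it is
        # routed exactly like a missing value (the branch the patch adds).
        return node.left if node.missing_go_to_left else node.right
    # Assumption: a set membership test replaces the real bitset lookup.
    return node.left if int(data_val) in node.left_categories else node.right


node = CategoricalNode(frozenset({0, 2}), missing_go_to_left=False, left=1, right=2)
# Negative codes and NaN end up in the same child.
assert next_node(node, -2.0) == next_node(node, float("nan"))
```

Handling the negative case before the lookup means the value is never cast to an unsigned bin index, which is presumably what the patch name ("gradient-segfault") refers to.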