Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package python-sklearn-pandas for 
openSUSE:Factory checked in at 2022-02-13 19:51:06
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-sklearn-pandas (Old)
 and      /work/SRC/openSUSE:Factory/.python-sklearn-pandas.new.1956 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-sklearn-pandas"

Sun Feb 13 19:51:06 2022 rev:7 rq:954095 version:2.2.0

Changes:
--------
--- 
/work/SRC/openSUSE:Factory/python-sklearn-pandas/python-sklearn-pandas.changes  
    2021-02-02 14:44:59.341002057 +0100
+++ 
/work/SRC/openSUSE:Factory/.python-sklearn-pandas.new.1956/python-sklearn-pandas.changes
    2022-02-13 19:51:52.142344769 +0100
@@ -1,0 +2,13 @@
+Sat Feb 12 23:37:31 UTC 2022 - Arun Persaud <a...@gmx.de>
+
+- update to version 2.2.0:
+  * Added the ability to provide callable functions instead of a static
+    column list.
+
+- changes from version 2.1.0:
+  * Removed test for Python 3.6 and added Python 3.9
+  * Added deprecation warning for NumericalTransformer
+  * Fixed pickling issue causing integration issues with Baikal.
+  * Started publishing package to conda repo
+
+-------------------------------------------------------------------
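
For context, the first entry above (callable column selectors) lets DataFrameMapper resolve columns at fit time instead of requiring a static list. A minimal sketch, assuming sklearn-pandas 2.2.0 with scikit-learn >= 0.23 and pandas installed (the column names below are illustrative only):

    import pandas as pd
    from sklearn.compose import make_column_selector
    from sklearn.preprocessing import StandardScaler
    from sklearn_pandas import DataFrameMapper

    # Toy frame; which columns are float is only known once the data arrives.
    df = pd.DataFrame({'petal length (cm)': [1.0, 2.0, 3.0],
                       'petal width (cm)':  [1.0, 2.0, 3.0]})

    mapper = DataFrameMapper([
        # A callable selector replaces the usual static column list.
        (make_column_selector(dtype_include=float), StandardScaler(), {'alias': 'x'}),
    ], df_out=True, default=False)

    print(mapper.fit_transform(df).shape)   # expected: (3, 2)
    print(mapper.transformed_names_)        # expected: ['x_0', 'x_1']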

Old:
----
  sklearn-pandas-2.0.4-gh.tar.gz

New:
----
  sklearn-pandas-2.2.0-gh.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-sklearn-pandas.spec ++++++
--- /var/tmp/diff_new_pack.FzEXl4/_old  2022-02-13 19:51:52.602345995 +0100
+++ /var/tmp/diff_new_pack.FzEXl4/_new  2022-02-13 19:51:52.610346015 +0100
@@ -1,7 +1,7 @@
 #
 # spec file for package python-sklearn-pandas
 #
-# Copyright (c) 2021 SUSE LLC
+# Copyright (c) 2022 SUSE LLC
 #
 # All modifications and additions to the file contributed by third parties
 # remain the property of their copyright owners, unless otherwise agreed
@@ -21,10 +21,10 @@
 # SciPy 1.6.0 dropped support for Python 3.6
 %define         skip_python36 1
 Name:           python-sklearn-pandas
-Version:        2.0.4
+Version:        2.2.0
 Release:        0
 Summary:        Pandas integration with sklearn
-License:        Zlib AND BSD-2-Clause
+License:        BSD-2-Clause AND Zlib
 Group:          Development/Languages/Python
 URL:            https://github.com/scikit-learn-contrib/sklearn-pandas
 Source:         %{url}/archive/v%{version}.tar.gz#/sklearn-pandas-%{version}-gh.tar.gz

++++++ sklearn-pandas-2.0.4-gh.tar.gz -> sklearn-pandas-2.2.0-gh.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/.circleci/config.yml 
new/sklearn-pandas-2.2.0/.circleci/config.yml
--- old/sklearn-pandas-2.0.4/.circleci/config.yml       2021-01-06 
17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/.circleci/config.yml       2021-05-08 
10:05:59.000000000 +0200
@@ -1,15 +1,30 @@
 version: 2
 jobs:
-  test:
+  test37:
     docker:
-      - image: circleci/python:3.6
+      - image: circleci/python:3.7
     steps:
       - checkout
       - run: pip install --user nox
       - run: ~/.local/bin/nox
+  test38:
+    docker:
+      - image: circleci/python:3.8
+    steps:
+      - checkout
+      - run: pip install --user nox
+      - run: ~/.local/bin/nox  
+  test39:
+    docker:
+      - image: cimg/python:3.9.1
+    steps:
+      - checkout
+      - run: pip install --user nox
+      - run: ~/.local/bin/nox            
 
 workflows:
   version: 2
   build_and_test:
     jobs:
-      - test
+      - test37
+      - test39
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' 
old/sklearn-pandas-2.0.4/.github/workflows/python-publish.yml 
new/sklearn-pandas-2.2.0/.github/workflows/python-publish.yml
--- old/sklearn-pandas-2.0.4/.github/workflows/python-publish.yml       
2021-01-06 17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/.github/workflows/python-publish.yml       
1970-01-01 01:00:00.000000000 +0100
@@ -1,50 +0,0 @@
-# This workflows will upload a Python Package using Twine when a release is created
-# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
-
-name: Upload Python Package
-
-on: 
-  workflow_dispatch:
-    branches:
-      - main
-
-jobs:
-  test:
-    runs-on: ubuntu-latest
-    strategy:
-      matrix:
-        python-version: [3.6, 3.7, 3.8]
-    steps:
-    - uses: actions/checkout@v2
-    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v1
-      with:
-        python-version: ${{ matrix.python-version }}  
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install nox
-    - name: Test with pytest
-      run: nox
-            
-  deploy:
-    needs: test
-    runs-on: ubuntu-latest
-
-    steps:
-    - uses: actions/checkout@v2
-    - name: Set up Python
-      uses: actions/setup-python@v2
-      with:
-        python-version: '3.x'
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install setuptools wheel twine
-    - name: Build and publish
-      env:
-        TWINE_USERNAME: __token__
-        TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
-      run: |
-        python setup.py sdist bdist_wheel
-        twine upload --repository pypi dist/*
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' 
old/sklearn-pandas-2.0.4/.github/workflows/step1_test.yml 
new/sklearn-pandas-2.2.0/.github/workflows/step1_test.yml
--- old/sklearn-pandas-2.0.4/.github/workflows/step1_test.yml   1970-01-01 
01:00:00.000000000 +0100
+++ new/sklearn-pandas-2.2.0/.github/workflows/step1_test.yml   2021-05-08 
10:05:59.000000000 +0200
@@ -0,0 +1,28 @@
+# This workflow will upload a Python Package using Twine when a release is created
+# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
+
+name: 1 Test Package
+
+on: 
+  workflow_dispatch:
+    branches:
+      - main
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: [3.7, 3.8, 3.9]
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v1
+      with:
+        python-version: ${{ matrix.python-version }}  
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install nox
+    - name: Test with pytest
+      run: nox
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' 
old/sklearn-pandas-2.0.4/.github/workflows/step2_release.yml 
new/sklearn-pandas-2.2.0/.github/workflows/step2_release.yml
--- old/sklearn-pandas-2.0.4/.github/workflows/step2_release.yml        
1970-01-01 01:00:00.000000000 +0100
+++ new/sklearn-pandas-2.2.0/.github/workflows/step2_release.yml        
2021-05-08 10:05:59.000000000 +0200
@@ -0,0 +1,32 @@
+name: 2 Release Package
+
+on: 
+  workflow_dispatch:
+    branches:
+      - main    
+
+jobs:
+  
+  release:
+    runs-on: ubuntu-latest
+  
+    steps:
+      - name: Checkout Code
+        uses: actions/checkout@v2
+        with:
+          fetch-depth: 0          
+      - name: Changelog
+        uses: scottbrenner/generate-changelog-action@master
+        id: Changelog
+      - name: Create Release
+        id: create_release
+        uses: actions/create-release@latest
+        env: 
+           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # This token is provided by Actions, you do not need to create your own token
+        with:
+          tag_name: ${{ github.ref }}
+          release_name: ${{ github.ref }}
+          body: |
+            ${{ steps.Changelog.outputs.changelog }}
+          draft: false
+          prerelease: false
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' 
old/sklearn-pandas-2.0.4/.github/workflows/step3_pypi_deploy.yml 
new/sklearn-pandas-2.2.0/.github/workflows/step3_pypi_deploy.yml
--- old/sklearn-pandas-2.0.4/.github/workflows/step3_pypi_deploy.yml    
1970-01-01 01:00:00.000000000 +0100
+++ new/sklearn-pandas-2.2.0/.github/workflows/step3_pypi_deploy.yml    
2021-05-08 10:05:59.000000000 +0200
@@ -0,0 +1,31 @@
+name: 3 PyPI Deploy
+
+on: 
+  workflow_dispatch:
+    branches:
+      - main 
+
+jobs:
+
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout Code
+        uses: actions/checkout@v2
+        with:
+          fetch-depth: 0          
+      - name: Set up Python
+        uses: actions/setup-python@v2
+        with:
+          python-version: "3.x"            
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install setuptools wheel twine
+      - name: Build and publish PyPI
+        env:
+          TWINE_USERNAME: __token__
+          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}  
+        run: |
+          python setup.py sdist bdist_wheel                 
+          twine upload --repository pypi dist/*
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' 
old/sklearn-pandas-2.0.4/.github/workflows/step4_conda_deploy.yml 
new/sklearn-pandas-2.2.0/.github/workflows/step4_conda_deploy.yml
--- old/sklearn-pandas-2.0.4/.github/workflows/step4_conda_deploy.yml   
1970-01-01 01:00:00.000000000 +0100
+++ new/sklearn-pandas-2.2.0/.github/workflows/step4_conda_deploy.yml   
2021-05-08 10:05:59.000000000 +0200
@@ -0,0 +1,21 @@
+name: 4 Conda Deploy
+
+on: 
+  workflow_dispatch:
+    branches:
+      - main 
+
+jobs:
+
+  conda_deploy:
+    runs-on: ubuntu-latest
+    # needs: test
+  
+    steps:
+      - uses: actions/checkout@v2              
+      - name: publish-to-conda
+        uses: fcakyon/conda-publish-action@v1.3
+        with:
+          subdir: 'conda'
+          anacondatoken: ${{ secrets.ANACONDA_TOKEN }}
+          platforms: 'win osx linux'          
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/README.rst 
new/sklearn-pandas-2.2.0/README.rst
--- old/sklearn-pandas-2.0.4/README.rst 2021-01-06 17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/README.rst 2021-05-08 10:05:59.000000000 +0200
@@ -4,6 +4,10 @@
 
 .. image:: https://circleci.com/gh/scikit-learn-contrib/sklearn-pandas.svg?style=svg
     :target: https://circleci.com/gh/scikit-learn-contrib/sklearn-pandas
+.. image:: https://img.shields.io/pypi/v/sklearn-pandas.svg
+   :target: https://pypi.python.org/pypi/sklearn-pandas/
+.. image:: https://anaconda.org/conda-forge/sklearn-pandas/badges/version.svg
+   :target: https://anaconda.org/conda-forge/sklearn-pandas/
 
 This module provides a bridge between `Scikit-Learn <http://scikit-learn.org/stable>`__'s machine learning methods and `pandas <https://pandas.pydata.org>`__-style Data Frames. In particular, it provides a way to map ``DataFrame`` columns to transformations, which are later recombined into features.
@@ -15,6 +19,9 @@
 
     # pip install sklearn-pandas
 
+or conda-forge::
+
+    # conda install -c conda-forge sklearn-pandas
 
 Tests
 -----
@@ -23,9 +30,11 @@
 
     # python -m doctest README.rst
 
+
 Usage
 -----
 
+
 Import
 ******
 
@@ -43,25 +52,33 @@
     >>> import pandas as pd
     >>> import numpy as np
     >>> import sklearn.preprocessing, sklearn.decomposition, \
-    ...     sklearn.linear_model, sklearn.pipeline, sklearn.metrics
+    ...     sklearn.linear_model, sklearn.pipeline, sklearn.metrics, \
+    ...     sklearn.compose
     >>> from sklearn.feature_extraction.text import CountVectorizer
 
+
 Load some Data
 **************
 
+
 Normally you'll read the data from a file, but for demonstration purposes we'll create a data frame from a Python dict::

    >>> data = pd.DataFrame({'pet':      ['cat', 'dog', 'dog', 'fish', 'cat', 'dog', 'cat', 'fish'],
     ...                      'children': [4., 6, 3, 3, 2, 3, 5, 4],
     ...                      'salary':   [90., 24, 44, 27, 32, 59, 36, 27]})
 
+
 Transformation Mapping
 ----------------------
 
+
 Map the Columns to Transformations
 **********************************
 
-The mapper takes a list of tuples. The first element of each tuple is a column name from the pandas DataFrame, or a list containing one or multiple columns (we will see an example with multiple columns later). The second element is an object which will perform the transformation which will be applied to that column. The third one is optional and is a dictionary containing the transformation options, if applicable (see "custom column names for transformed features" below).
+The mapper takes a list of tuples. Each tuple has three elements:
+  1. column name(s): The first element is a column name from the pandas DataFrame, or a list containing one or multiple columns (we will see an example with multiple columns later) or an instance of a callable function such as `make_column_selector <https://scikit-learn.org/stable/modules/generated/sklearn.compose.make_column_selector.html>`__.
 
+  2. transformer(s): The second element is an object which will perform the transformation which will be applied to that column.
+  3. attributes: The third one is optional and is a dictionary containing the transformation options, if applicable (see "custom column names for transformed features" below).
 
 Let's see an example::
 
@@ -70,7 +87,7 @@
     ...     (['children'], sklearn.preprocessing.StandardScaler())
     ... ])
 
-The difference between specifying the column selector as ``'column'`` (as a simple string) and ``['column']`` (as a list with one element) is the shape of the array that is passed to the transformer. In the first case, a one dimensional array will be passed, while in the second case it will be a 2-dimensional array with one column, i.e. a column vector.
+The difference between specifying the column selector as ``'column'`` (as a simple string) and ``['column']`` (as a list with one element) is the shape of the array that is passed to the transformer. In the first case, a one dimensional array will be passed, while in the second case it will be a 2-dimensional array with one column, i.e. a column vector. 

 This behaviour mimics the same pattern as pandas' dataframes ``__getitem__``  indexing:
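
For reference, the shape difference described above can be checked directly with pandas (illustrative snippet, not part of the upstream README context shown in this diff)::

    >>> df_check = pd.DataFrame({'children': [4., 6, 3]})
    >>> df_check['children'].shape
    (3,)
    >>> df_check[['children']].shape
    (3, 1)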
 
@@ -81,6 +98,7 @@
 
 Be aware that some transformers expect a 1-dimensional input (the label-oriented ones) while some others, like ``OneHotEncoder`` or ``Imputer``, expect 2-dimensional input, with the shape ``[n_samples, n_features]``.
 
+
 Test the Transformation
 ***********************
 
@@ -143,6 +161,46 @@
   >>> mapper_alias.transformed_names_
   ['standard_scaled_children', 'children_raw']
 
+
+Dynamic Columns
+***********************
+In some situations the columns are not known beforehand and we would like to dynamically select them during the fit operation. As shown below, in such situations you can provide either a custom callable or use `make_column_selector <https://scikit-learn.org/stable/modules/generated/sklearn.compose.make_column_selector.html>`__.
 
+
+
+    >>> class GetColumnsStartingWith:
+    ...   def __init__(self, start_str):
+    ...     self.pattern = start_str
+    ...
+    ...   def __call__(self, X:pd.DataFrame=None):
+    ...     return [c for c in X.columns if c.startswith(self.pattern)]
+    ...
+    >>> df = pd.DataFrame({
+    ...    'sepal length (cm)': [1.0, 2.0, 3.0],
+    ...    'sepal width (cm)': [1.0, 2.0, 3.0],
+    ...    'petal length (cm)': [1.0, 2.0, 3.0],
+    ...    'petal width (cm)': [1.0, 2.0, 3.0]
+    ... })
+    >>> t = DataFrameMapper([
+    ...     (
+    ...       sklearn.compose.make_column_selector(dtype_include=float),
+    ...       sklearn.preprocessing.StandardScaler(),
+    ...       {'alias': 'x'}
+    ...     ),
+    ...     (
+    ...       GetColumnsStartingWith('petal'),
+    ...       None,
+    ...       {'alias': 'petal'}
+    ...     )], df_out=True, default=False)
+    >>> t.fit(df).transform(df).shape
+    (3, 6)
+    >>> t.transformed_names_
+    ['x_0', 'x_1', 'x_2', 'x_3', 'petal_0', 'petal_1']
+
+
+
+Above we use `make_column_selector` to select all columns that are of type float and also use a custom callable function to select columns that start with the word 'petal'.
+
+
 Passing Series/DataFrames to the transformers
 *********************************************
 
@@ -455,6 +513,20 @@
 Changelog
 ---------
 
+
+2.2.0 (2021-05-07)
+******************
+* Added the ability to provide callable functions instead of a static column list.
+
+
+2.1.0 (2021-02-26)
+******************
+* Removed test for Python 3.6 and added Python 3.9
+* Added deprecation warning for NumericalTransformer
+* Fixed pickling issue causing integration issues with Baikal.  
+* Started publishing package to conda repo
+
+
 2.0.4 (2020-11-06)
 ******************
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/conda/conda_build_config.yml 
new/sklearn-pandas-2.2.0/conda/conda_build_config.yml
--- old/sklearn-pandas-2.0.4/conda/conda_build_config.yml       1970-01-01 
01:00:00.000000000 +0100
+++ new/sklearn-pandas-2.2.0/conda/conda_build_config.yml       2021-05-08 
10:05:59.000000000 +0200
@@ -0,0 +1,4 @@
+python:
+    - 3.7
+    - 3.8
+    - 3.9
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/conda/meta.yaml 
new/sklearn-pandas-2.2.0/conda/meta.yaml
--- old/sklearn-pandas-2.0.4/conda/meta.yaml    1970-01-01 01:00:00.000000000 
+0100
+++ new/sklearn-pandas-2.2.0/conda/meta.yaml    2021-05-08 10:05:59.000000000 
+0200
@@ -0,0 +1,37 @@
+{% set data = load_setup_py_data() %}
+
+package:
+  name: sklearn-pandas
+  version: {{ data['version'] }}
+
+source:
+  path: ..
+
+build:
+  number: 0
+  script: python setup.py install --single-version-externally-managed --record=record.txt
+
+requirements:
+  build:
+    - python
+    - scikit-learn>=0.23.0
+    - scipy>=1.5.1
+    - pandas>=1.1.4
+    - numpy>=1.18.1
+
+  run:
+    - python
+    - scikit-learn>=0.23.0
+    - scipy>=1.5.1
+    - pandas>=1.1.4
+    - numpy>=1.18.1
+
+test:
+  imports:
+    - sklearn_pandas
+
+about:
+  home: {{ data['url'] }}
+  license: {{ data['license'] }}
+
+summary: {{ data['description'] }}
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/noxfile.py 
new/sklearn-pandas-2.2.0/noxfile.py
--- old/sklearn-pandas-2.0.4/noxfile.py 2021-01-06 17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/noxfile.py 2021-05-08 10:05:59.000000000 +0200
@@ -2,20 +2,20 @@
 
 @nox.session
 def lint(session):
-    session.install('pytest==5.3.5', 'setuptools==45.2',
-                    'wheel==0.34.2', 'flake8==3.7.9',
-                    'numpy==1.18.1', 'pandas==1.0.5')
+    session.install('pytest>=5.3.5', 'setuptools>=45.2',
+                    'wheel>=0.34.2', 'flake8>=3.7.9',
+                    'numpy==1.18.1', 'pandas==1.1.4')
     session.install('.')
     session.run('flake8', 'sklearn_pandas/', 'tests')
 
 @nox.session
-@nox.parametrize('numpy', ['1.18.1', '1.19.4'])
-@nox.parametrize('scipy', ['1.4.1', '1.5.4'])
-@nox.parametrize('pandas', ['1.0.5', '1.1.4'])
+@nox.parametrize('numpy', ['1.18.1', '1.19.4', '1.20.1'])
+@nox.parametrize('scipy', ['1.5.4', '1.6.0'])
+@nox.parametrize('pandas', ['1.1.4', '1.2.2'])
 def tests(session, numpy, scipy, pandas):
-    session.install('pytest==5.3.5', 
-                    'setuptools==45.2',
-                    'wheel==0.34.2',
+    session.install('pytest>=5.3.5', 
+                    'setuptools>=45.2',
+                    'wheel>=0.34.2',
                     f'numpy=={numpy}',
                     f'scipy=={scipy}',
                     f'pandas=={pandas}'
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/setup.py 
new/sklearn-pandas-2.2.0/setup.py
--- old/sklearn-pandas-2.0.4/setup.py   2021-01-06 17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/setup.py   2021-05-08 10:05:59.000000000 +0200
@@ -39,10 +39,11 @@
       keywords=['scikit', 'sklearn', 'pandas'],
       install_requires=[
           'scikit-learn>=0.23.0',
-          'scipy>=1.4.1',
-          'pandas>=1.0.5',
+          'scipy>=1.5.1',
+          'pandas>=1.1.4',
           'numpy>=1.18.1'
       ],
       tests_require=['pytest', 'mock'],
       cmdclass={'test': PyTest},
-      )
+      license='MIT License'
+)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/sklearn_pandas/__init__.py 
new/sklearn-pandas-2.2.0/sklearn_pandas/__init__.py
--- old/sklearn-pandas-2.0.4/sklearn_pandas/__init__.py 2021-01-06 
17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/sklearn_pandas/__init__.py 2021-05-08 
10:05:59.000000000 +0200
@@ -1,4 +1,4 @@
-__version__ = '2.0.4'
+__version__ = '2.2.0'
 
 import logging
 logger = logging.getLogger(__name__)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' 
old/sklearn-pandas-2.0.4/sklearn_pandas/dataframe_mapper.py 
new/sklearn-pandas-2.2.0/sklearn_pandas/dataframe_mapper.py
--- old/sklearn-pandas-2.0.4/sklearn_pandas/dataframe_mapper.py 2021-01-06 
17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/sklearn_pandas/dataframe_mapper.py 2021-05-08 
10:05:59.000000000 +0200
@@ -1,11 +1,9 @@
 import contextlib
-
 from datetime import datetime
 import pandas as pd
 import numpy as np
 from scipy import sparse
 from sklearn.base import BaseEstimator, TransformerMixin
-
 from .cross_validation import DataWrapper
 from .pipeline import make_transformer_pipeline, _call_fit, TransformerPipeline
 from . import logger
@@ -29,8 +27,14 @@
     return transformers
 
 
-def _build_feature(columns, transformers, options={}):
-    return (columns, _build_transformer(transformers), options)
+def _build_feature(columns, transformers, options={}, X=None):
+    if X is None:
+        return (columns, _build_transformer(transformers), options)
+    return (
+        columns(X) if callable(columns) else columns,
+        _build_transformer(transformers),
+        options
+    )
 
 
 def _elapsed_secs(t1):
@@ -116,14 +120,16 @@
         if (df_out and (sparse or default)):
             raise ValueError("Can not use df_out with sparse or default")
 
-    def _build(self):
+    def _build(self, X=None):
         """
         Build attributes built_features and built_default.
         """
         if isinstance(self.features, list):
-            self.built_features = [_build_feature(*f) for f in self.features]
+            self.built_features = [
+                _build_feature(*f, X=X) for f in self.features
+            ]
         else:
-            self.built_features = self.features
+            self.built_features = _build_feature(*self.features, X=X)
         self.built_default = _build_transformer(self.default)
 
     @property
@@ -156,6 +162,7 @@
 
     def __setstate__(self, state):
         # compatibility for older versions of sklearn-pandas
+        super().__setstate__(state)
         self.features = [_build_feature(*feat) for feat in state['features']]
         self.sparse = state.get('sparse', False)
         self.default = state.get('default', False)
@@ -184,11 +191,13 @@
         Get a subset of columns from the given table X.
 
         X       a Pandas dataframe; the table to select columns from
-        cols    a string or list of strings representing the columns
-                to select
+        cols    a string or list of strings representing the columns to select.
+                It can also be a callable that returns True or False, i.e.
+                compatible with the built-in filter function.
 
         Returns a numpy array with the data from the selected columns
         """
+
         if isinstance(cols, string_types):
             return_vector = True
             cols = [cols]
@@ -225,7 +234,7 @@
         y       the target vector relative to X, optional
 
         """
-        self._build()
+        self._build(X=X)
 
         for columns, transformers, options in self.built_features:
             t1 = datetime.now()
@@ -314,7 +323,7 @@
         fit_transform.
         """
         if do_fit:
-            self._build()
+            self._build(X=X)
 
         extracted = []
         transformed_names_ = []
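
The `_build_feature`/`_build` changes above resolve a callable column selector against the training frame when ``fit`` runs. A standalone sketch of just that resolution step (hypothetical helper name, not the package API):

    import pandas as pd

    def resolve_columns(columns, X):
        # Same idea as the X=... branch in _build_feature: call the selector
        # with the DataFrame, pass static column specs through untouched.
        return columns(X) if callable(columns) else columns

    def petal_cols(X):
        return [c for c in X.columns if c.startswith('petal')]

    df = pd.DataFrame({'petal length (cm)': [1.0], 'sepal width (cm)': [2.0]})
    print(resolve_columns(petal_cols, df))            # ['petal length (cm)']
    print(resolve_columns(['sepal width (cm)'], df))  # ['sepal width (cm)']
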
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/sklearn_pandas/transformers.py 
new/sklearn-pandas-2.2.0/sklearn_pandas/transformers.py
--- old/sklearn-pandas-2.0.4/sklearn_pandas/transformers.py     2021-01-06 
17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/sklearn_pandas/transformers.py     2021-05-08 
10:05:59.000000000 +0200
@@ -1,6 +1,7 @@
 import numpy as np
 import pandas as pd
 from sklearn.base import TransformerMixin
+import warnings
 
 
 def _get_mask(X, value):
@@ -30,6 +31,13 @@
                 in SUPPORTED_FUNCTIONS variable. Throws assertion error if the
                 not supported.
         """
+
+        warnings.warn("""
+            NumericalTransformer will be deprecated in version 3.0.
+            Please use sklearn.base.TransformerMixin to write
+            custom transformers
+            """, DeprecationWarning)
+
         assert func in self.SUPPORTED_FUNCTIONS, \
             f"Only following func are supported: {self.SUPPORTED_FUNCTIONS}"
         super(NumericalTransformer, self).__init__()
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/test.py 
new/sklearn-pandas-2.2.0/test.py
--- old/sklearn-pandas-2.0.4/test.py    1970-01-01 01:00:00.000000000 +0100
+++ new/sklearn-pandas-2.2.0/test.py    2021-05-08 10:05:59.000000000 +0200
@@ -0,0 +1,30 @@
+import pytest
+from unittest.mock import Mock
+import numpy as np
+import pandas as pd
+from sklearn_pandas import DataFrameMapper
+from sklearn.compose import make_column_selector
+from sklearn.preprocessing import StandardScaler
+
+
+class GetStartWith:
+    def __init__(self, start_str):
+        self.start_str = start_str
+
+    def __call__(self, X: pd.DataFrame) -> list:
+        return [c for c in X.columns if c.startswith(self.start_str)]
+
+
+df = pd.DataFrame({
+    'sepal length (cm)': [1.0, 2.0, 3.0],
+    'sepal width (cm)': [1.0, 2.0, 3.0],
+    'petal length (cm)': [1.0, 2.0, 3.0],
+    'petal width (cm)': [1.0, 2.0, 3.0]
+})
+t = DataFrameMapper([
+    (make_column_selector(dtype_include=float), StandardScaler(), {'alias': 'x'}),
+    (GetStartWith('petal'), None, {'alias': 'petal'})
+], df_out=True, default=False)
+
+t.fit(df)
+print(t.transform(df).shape)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' 
'--exclude=.svnignore' old/sklearn-pandas-2.0.4/tests/test_dataframe_mapper.py 
new/sklearn-pandas-2.2.0/tests/test_dataframe_mapper.py
--- old/sklearn-pandas-2.0.4/tests/test_dataframe_mapper.py     2021-01-06 
17:28:10.000000000 +0100
+++ new/sklearn-pandas-2.2.0/tests/test_dataframe_mapper.py     2021-05-08 
10:05:59.000000000 +0200
@@ -20,6 +20,7 @@
 import numpy as np
 from numpy.testing import assert_array_equal
 import pickle
+from sklearn.compose import make_column_selector
 
 from sklearn_pandas import DataFrameMapper
 from sklearn_pandas.dataframe_mapper import _handle_feature, _build_transformer
@@ -969,3 +970,19 @@
     dft = M.fit_transform(df)
     assert dft['feat1'].dtype == np.dtype('int64')
     assert dft['feat2'].dtype == np.dtype('float64')
+
+
+def test_make_column_selector(iris_dataframe):
+    t = DataFrameMapper([
+        (make_column_selector(dtype_include=float), None, {'alias': 'x'}),
+        ('sepal length (cm)', None),
+    ], df_out=True, default=False)
+
+    xt = t.fit(iris_dataframe).transform(iris_dataframe)
+    expected = ['x_0', 'x_1', 'x_2', 'x_3', 'sepal length (cm)']
+    assert list(xt.columns) == expected
+
+    pickled = pickle.dumps(t)
+    t2 = pickle.loads(pickled)
+    xt2 = t2.transform(iris_dataframe)
+    assert np.array_equal(xt.values, xt2.values)
