Script 'mail_helper' called by obssrc

Hello community,

here is the log from the commit of package python-Scrapy for openSUSE:Factory checked in at 2021-10-08 00:06:30

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-Scrapy (Old)
 and      /work/SRC/openSUSE:Factory/.python-Scrapy.new.2443 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-Scrapy" Fri Oct 8 00:06:30 2021 rev:11 rq:924057 version:2.5.1 Changes: -------- --- /work/SRC/openSUSE:Factory/python-Scrapy/python-Scrapy.changes 2021-09-09 23:08:09.300873495 +0200 +++ /work/SRC/openSUSE:Factory/.python-Scrapy.new.2443/python-Scrapy.changes 2021-10-08 00:07:29.293899563 +0200 @@ -1,0 +2,31 @@ +Thu Oct 7 14:35:57 UTC 2021 - Ben Greiner <c...@bnavigator.de> + +- Update to 2.5.1, Security bug fix + * boo#1191446, CVE-2021-41125 + * If you use HttpAuthMiddleware (i.e. the http_user and + http_pass spider attributes) for HTTP authentication, + any request exposes your credentials to the request + target. + * To prevent unintended exposure of authentication + credentials to unintended domains, you must now + additionally set a new, additional spider attribute, + http_auth_domain, and point it to the specific domain to + which the authentication credentials must be sent. + * If the http_auth_domain spider attribute is not set, the + domain of the first request will be considered the HTTP + authentication target, and authentication credentials + will only be sent in requests targeting that domain. + * If you need to send the same HTTP authentication + credentials to multiple domains, you can use + w3lib.http.basic_auth_header instead to set the value of + the Authorization header of your requests. + * If you really want your spider to send the same HTTP + authentication credentials to any domain, set the + http_auth_domain spider attribute to None. + * Finally, if you are a user of scrapy-splash, know that + this version of Scrapy breaks compatibility with + scrapy-splash 0.7.2 and earlier. You will need to upgrade + scrapy-splash to a greater version for it to continue to + work. + +------------------------------------------------------------------- Old: ---- Scrapy-2.5.0.tar.gz New: ---- Scrapy-2.5.1.tar.gz ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-Scrapy.spec ++++++ --- /var/tmp/diff_new_pack.ueQEjY/_old 2021-10-08 00:07:29.773900377 +0200 +++ /var/tmp/diff_new_pack.ueQEjY/_new 2021-10-08 00:07:29.777900383 +0200 @@ -21,7 +21,7 @@ # python-uvloop does not support python3.6 %define skip_python36 1 Name: python-Scrapy -Version: 2.5.0 +Version: 2.5.1 Release: 0 Summary: A high-level Python Screen Scraping framework License: BSD-3-Clause ++++++ Scrapy-2.5.0.tar.gz -> Scrapy-2.5.1.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/PKG-INFO new/Scrapy-2.5.1/PKG-INFO --- old/Scrapy-2.5.0/PKG-INFO 2021-04-06 16:48:12.813729500 +0200 +++ new/Scrapy-2.5.1/PKG-INFO 2021-10-05 15:48:14.482129600 +0200 @@ -1,6 +1,6 @@ -Metadata-Version: 1.2 +Metadata-Version: 2.1 Name: Scrapy -Version: 2.5.0 +Version: 2.5.1 Summary: A high-level Web Crawling and Web Scraping framework Home-page: https://scrapy.org Author: Scrapy developers @@ -10,116 +10,6 @@ Project-URL: Documentation, https://docs.scrapy.org/ Project-URL: Source, https://github.com/scrapy/scrapy Project-URL: Tracker, https://github.com/scrapy/scrapy/issues -Description: ====== - Scrapy - ====== - - .. image:: https://img.shields.io/pypi/v/Scrapy.svg - :target: https://pypi.python.org/pypi/Scrapy - :alt: PyPI Version - - .. image:: https://img.shields.io/pypi/pyversions/Scrapy.svg - :target: https://pypi.python.org/pypi/Scrapy - :alt: Supported Python Versions - - .. 
image:: https://github.com/scrapy/scrapy/workflows/Ubuntu/badge.svg - :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AUbuntu - :alt: Ubuntu - - .. image:: https://github.com/scrapy/scrapy/workflows/macOS/badge.svg - :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AmacOS - :alt: macOS - - .. image:: https://github.com/scrapy/scrapy/workflows/Windows/badge.svg - :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AWindows - :alt: Windows - - .. image:: https://img.shields.io/badge/wheel-yes-brightgreen.svg - :target: https://pypi.python.org/pypi/Scrapy - :alt: Wheel Status - - .. image:: https://img.shields.io/codecov/c/github/scrapy/scrapy/master.svg - :target: https://codecov.io/github/scrapy/scrapy?branch=master - :alt: Coverage report - - .. image:: https://anaconda.org/conda-forge/scrapy/badges/version.svg - :target: https://anaconda.org/conda-forge/scrapy - :alt: Conda Version - - - Overview - ======== - - Scrapy is a fast high-level web crawling and web scraping framework, used to - crawl websites and extract structured data from their pages. It can be used for - a wide range of purposes, from data mining to monitoring and automated testing. - - Scrapy is maintained by Zyte_ (formerly Scrapinghub) and `many other - contributors`_. - - .. _many other contributors: https://github.com/scrapy/scrapy/graphs/contributors - .. _Zyte: https://www.zyte.com/ - - Check the Scrapy homepage at https://scrapy.org for more information, - including a list of features. - - - Requirements - ============ - - * Python 3.6+ - * Works on Linux, Windows, macOS, BSD - - Install - ======= - - The quick way:: - - pip install scrapy - - See the install section in the documentation at - https://docs.scrapy.org/en/latest/intro/install.html for more details. - - Documentation - ============= - - Documentation is available online at https://docs.scrapy.org/ and in the ``docs`` - directory. - - Releases - ======== - - You can check https://docs.scrapy.org/en/latest/news.html for the release notes. - - Community (blog, twitter, mail list, IRC) - ========================================= - - See https://scrapy.org/community/ for details. - - Contributing - ============ - - See https://docs.scrapy.org/en/master/contributing.html for details. - - Code of Conduct - --------------- - - Please note that this project is released with a Contributor Code of Conduct - (see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md). - - By participating in this project you agree to abide by its terms. - Please report unacceptable behavior to opensou...@zyte.com. - - Companies using Scrapy - ====================== - - See https://scrapy.org/companies/ for a list. - - Commercial Support - ================== - - See https://scrapy.org/support/ for details. - Platform: UNKNOWN Classifier: Framework :: Scrapy Classifier: Development Status :: 5 - Production/Stable @@ -139,3 +29,117 @@ Classifier: Topic :: Software Development :: Libraries :: Application Frameworks Classifier: Topic :: Software Development :: Libraries :: Python Modules Requires-Python: >=3.6 +License-File: LICENSE +License-File: AUTHORS + +====== +Scrapy +====== + +.. image:: https://img.shields.io/pypi/v/Scrapy.svg + :target: https://pypi.python.org/pypi/Scrapy + :alt: PyPI Version + +.. image:: https://img.shields.io/pypi/pyversions/Scrapy.svg + :target: https://pypi.python.org/pypi/Scrapy + :alt: Supported Python Versions + +.. 
image:: https://github.com/scrapy/scrapy/workflows/Ubuntu/badge.svg + :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AUbuntu + :alt: Ubuntu + +.. image:: https://github.com/scrapy/scrapy/workflows/macOS/badge.svg + :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AmacOS + :alt: macOS + +.. image:: https://github.com/scrapy/scrapy/workflows/Windows/badge.svg + :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AWindows + :alt: Windows + +.. image:: https://img.shields.io/badge/wheel-yes-brightgreen.svg + :target: https://pypi.python.org/pypi/Scrapy + :alt: Wheel Status + +.. image:: https://img.shields.io/codecov/c/github/scrapy/scrapy/master.svg + :target: https://codecov.io/github/scrapy/scrapy?branch=master + :alt: Coverage report + +.. image:: https://anaconda.org/conda-forge/scrapy/badges/version.svg + :target: https://anaconda.org/conda-forge/scrapy + :alt: Conda Version + + +Overview +======== + +Scrapy is a fast high-level web crawling and web scraping framework, used to +crawl websites and extract structured data from their pages. It can be used for +a wide range of purposes, from data mining to monitoring and automated testing. + +Scrapy is maintained by Zyte_ (formerly Scrapinghub) and `many other +contributors`_. + +.. _many other contributors: https://github.com/scrapy/scrapy/graphs/contributors +.. _Zyte: https://www.zyte.com/ + +Check the Scrapy homepage at https://scrapy.org for more information, +including a list of features. + + +Requirements +============ + +* Python 3.6+ +* Works on Linux, Windows, macOS, BSD + +Install +======= + +The quick way:: + + pip install scrapy + +See the install section in the documentation at +https://docs.scrapy.org/en/latest/intro/install.html for more details. + +Documentation +============= + +Documentation is available online at https://docs.scrapy.org/ and in the ``docs`` +directory. + +Releases +======== + +You can check https://docs.scrapy.org/en/latest/news.html for the release notes. + +Community (blog, twitter, mail list, IRC) +========================================= + +See https://scrapy.org/community/ for details. + +Contributing +============ + +See https://docs.scrapy.org/en/master/contributing.html for details. + +Code of Conduct +--------------- + +Please note that this project is released with a Contributor Code of Conduct +(see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md). + +By participating in this project you agree to abide by its terms. +Please report unacceptable behavior to opensou...@zyte.com. + +Companies using Scrapy +====================== + +See https://scrapy.org/companies/ for a list. + +Commercial Support +================== + +See https://scrapy.org/support/ for details. 
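The changelog above points to w3lib.http.basic_auth_header as the way to send the same HTTP authentication credentials to more than one domain. A minimal sketch of that approach (the spider name, URLs and credentials below are made up for illustration)::

    import scrapy
    from w3lib.http import basic_auth_header


    class MultiDomainSpider(scrapy.Spider):
        # Hypothetical spider: sends one set of credentials to two
        # explicitly chosen domains instead of using http_user/http_pass.
        name = 'multidomain'
        start_urls = [
            'https://one.example.com/',
            'https://two.example.org/',
        ]

        def start_requests(self):
            auth = basic_auth_header('someuser', 'somepass')
            for url in self.start_urls:
                # HttpAuthMiddleware leaves requests alone when an
                # Authorization header is already present, so the
                # credentials go only where they are set explicitly.
                yield scrapy.Request(url, headers={'Authorization': auth})

        def parse(self, response):
            self.logger.info('fetched %s', response.url)
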
+ + diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/Scrapy.egg-info/PKG-INFO new/Scrapy-2.5.1/Scrapy.egg-info/PKG-INFO --- old/Scrapy-2.5.0/Scrapy.egg-info/PKG-INFO 2021-04-06 16:48:12.000000000 +0200 +++ new/Scrapy-2.5.1/Scrapy.egg-info/PKG-INFO 2021-10-05 15:48:14.000000000 +0200 @@ -1,6 +1,6 @@ -Metadata-Version: 1.2 +Metadata-Version: 2.1 Name: Scrapy -Version: 2.5.0 +Version: 2.5.1 Summary: A high-level Web Crawling and Web Scraping framework Home-page: https://scrapy.org Author: Scrapy developers @@ -10,116 +10,6 @@ Project-URL: Documentation, https://docs.scrapy.org/ Project-URL: Source, https://github.com/scrapy/scrapy Project-URL: Tracker, https://github.com/scrapy/scrapy/issues -Description: ====== - Scrapy - ====== - - .. image:: https://img.shields.io/pypi/v/Scrapy.svg - :target: https://pypi.python.org/pypi/Scrapy - :alt: PyPI Version - - .. image:: https://img.shields.io/pypi/pyversions/Scrapy.svg - :target: https://pypi.python.org/pypi/Scrapy - :alt: Supported Python Versions - - .. image:: https://github.com/scrapy/scrapy/workflows/Ubuntu/badge.svg - :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AUbuntu - :alt: Ubuntu - - .. image:: https://github.com/scrapy/scrapy/workflows/macOS/badge.svg - :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AmacOS - :alt: macOS - - .. image:: https://github.com/scrapy/scrapy/workflows/Windows/badge.svg - :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AWindows - :alt: Windows - - .. image:: https://img.shields.io/badge/wheel-yes-brightgreen.svg - :target: https://pypi.python.org/pypi/Scrapy - :alt: Wheel Status - - .. image:: https://img.shields.io/codecov/c/github/scrapy/scrapy/master.svg - :target: https://codecov.io/github/scrapy/scrapy?branch=master - :alt: Coverage report - - .. image:: https://anaconda.org/conda-forge/scrapy/badges/version.svg - :target: https://anaconda.org/conda-forge/scrapy - :alt: Conda Version - - - Overview - ======== - - Scrapy is a fast high-level web crawling and web scraping framework, used to - crawl websites and extract structured data from their pages. It can be used for - a wide range of purposes, from data mining to monitoring and automated testing. - - Scrapy is maintained by Zyte_ (formerly Scrapinghub) and `many other - contributors`_. - - .. _many other contributors: https://github.com/scrapy/scrapy/graphs/contributors - .. _Zyte: https://www.zyte.com/ - - Check the Scrapy homepage at https://scrapy.org for more information, - including a list of features. - - - Requirements - ============ - - * Python 3.6+ - * Works on Linux, Windows, macOS, BSD - - Install - ======= - - The quick way:: - - pip install scrapy - - See the install section in the documentation at - https://docs.scrapy.org/en/latest/intro/install.html for more details. - - Documentation - ============= - - Documentation is available online at https://docs.scrapy.org/ and in the ``docs`` - directory. - - Releases - ======== - - You can check https://docs.scrapy.org/en/latest/news.html for the release notes. - - Community (blog, twitter, mail list, IRC) - ========================================= - - See https://scrapy.org/community/ for details. - - Contributing - ============ - - See https://docs.scrapy.org/en/master/contributing.html for details. 
- - Code of Conduct - --------------- - - Please note that this project is released with a Contributor Code of Conduct - (see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md). - - By participating in this project you agree to abide by its terms. - Please report unacceptable behavior to opensou...@zyte.com. - - Companies using Scrapy - ====================== - - See https://scrapy.org/companies/ for a list. - - Commercial Support - ================== - - See https://scrapy.org/support/ for details. - Platform: UNKNOWN Classifier: Framework :: Scrapy Classifier: Development Status :: 5 - Production/Stable @@ -139,3 +29,117 @@ Classifier: Topic :: Software Development :: Libraries :: Application Frameworks Classifier: Topic :: Software Development :: Libraries :: Python Modules Requires-Python: >=3.6 +License-File: LICENSE +License-File: AUTHORS + +====== +Scrapy +====== + +.. image:: https://img.shields.io/pypi/v/Scrapy.svg + :target: https://pypi.python.org/pypi/Scrapy + :alt: PyPI Version + +.. image:: https://img.shields.io/pypi/pyversions/Scrapy.svg + :target: https://pypi.python.org/pypi/Scrapy + :alt: Supported Python Versions + +.. image:: https://github.com/scrapy/scrapy/workflows/Ubuntu/badge.svg + :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AUbuntu + :alt: Ubuntu + +.. image:: https://github.com/scrapy/scrapy/workflows/macOS/badge.svg + :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AmacOS + :alt: macOS + +.. image:: https://github.com/scrapy/scrapy/workflows/Windows/badge.svg + :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AWindows + :alt: Windows + +.. image:: https://img.shields.io/badge/wheel-yes-brightgreen.svg + :target: https://pypi.python.org/pypi/Scrapy + :alt: Wheel Status + +.. image:: https://img.shields.io/codecov/c/github/scrapy/scrapy/master.svg + :target: https://codecov.io/github/scrapy/scrapy?branch=master + :alt: Coverage report + +.. image:: https://anaconda.org/conda-forge/scrapy/badges/version.svg + :target: https://anaconda.org/conda-forge/scrapy + :alt: Conda Version + + +Overview +======== + +Scrapy is a fast high-level web crawling and web scraping framework, used to +crawl websites and extract structured data from their pages. It can be used for +a wide range of purposes, from data mining to monitoring and automated testing. + +Scrapy is maintained by Zyte_ (formerly Scrapinghub) and `many other +contributors`_. + +.. _many other contributors: https://github.com/scrapy/scrapy/graphs/contributors +.. _Zyte: https://www.zyte.com/ + +Check the Scrapy homepage at https://scrapy.org for more information, +including a list of features. + + +Requirements +============ + +* Python 3.6+ +* Works on Linux, Windows, macOS, BSD + +Install +======= + +The quick way:: + + pip install scrapy + +See the install section in the documentation at +https://docs.scrapy.org/en/latest/intro/install.html for more details. + +Documentation +============= + +Documentation is available online at https://docs.scrapy.org/ and in the ``docs`` +directory. + +Releases +======== + +You can check https://docs.scrapy.org/en/latest/news.html for the release notes. + +Community (blog, twitter, mail list, IRC) +========================================= + +See https://scrapy.org/community/ for details. + +Contributing +============ + +See https://docs.scrapy.org/en/master/contributing.html for details. 
+ +Code of Conduct +--------------- + +Please note that this project is released with a Contributor Code of Conduct +(see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md). + +By participating in this project you agree to abide by its terms. +Please report unacceptable behavior to opensou...@zyte.com. + +Companies using Scrapy +====================== + +See https://scrapy.org/companies/ for a list. + +Commercial Support +================== + +See https://scrapy.org/support/ for details. + + diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/Scrapy.egg-info/SOURCES.txt new/Scrapy-2.5.1/Scrapy.egg-info/SOURCES.txt --- old/Scrapy-2.5.0/Scrapy.egg-info/SOURCES.txt 2021-04-06 16:48:12.000000000 +0200 +++ new/Scrapy-2.5.1/Scrapy.egg-info/SOURCES.txt 2021-10-05 15:48:14.000000000 +0200 @@ -25,7 +25,6 @@ docs/faq.rst docs/index.rst docs/news.rst -docs/pip.txt docs/requirements.txt docs/versioning.rst docs/_ext/scrapydocs.py diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/docs/news.rst new/Scrapy-2.5.1/docs/news.rst --- old/Scrapy-2.5.0/docs/news.rst 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/docs/news.rst 2021-10-05 15:48:05.000000000 +0200 @@ -3,6 +3,44 @@ Release notes ============= +.. _release-2.5.1: + +Scrapy 2.5.1 (2021-10-05) +------------------------- + +* **Security bug fix:** + + If you use + :class:`~scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware` + (i.e. the ``http_user`` and ``http_pass`` spider attributes) for HTTP + authentication, any request exposes your credentials to the request target. + + To prevent unintended exposure of authentication credentials to unintended + domains, you must now additionally set a new, additional spider attribute, + ``http_auth_domain``, and point it to the specific domain to which the + authentication credentials must be sent. + + If the ``http_auth_domain`` spider attribute is not set, the domain of the + first request will be considered the HTTP authentication target, and + authentication credentials will only be sent in requests targeting that + domain. + + If you need to send the same HTTP authentication credentials to multiple + domains, you can use :func:`w3lib.http.basic_auth_header` instead to + set the value of the ``Authorization`` header of your requests. + + If you *really* want your spider to send the same HTTP authentication + credentials to any domain, set the ``http_auth_domain`` spider attribute + to ``None``. + + Finally, if you are a user of `scrapy-splash`_, know that this version of + Scrapy breaks compatibility with scrapy-splash 0.7.2 and earlier. You will + need to upgrade scrapy-splash to a greater version for it to continue to + work. + +.. _scrapy-splash: https://github.com/scrapy-plugins/scrapy-splash + + .. _release-2.5.0: Scrapy 2.5.0 (2021-04-06) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/docs/pip.txt new/Scrapy-2.5.1/docs/pip.txt --- old/Scrapy-2.5.0/docs/pip.txt 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/docs/pip.txt 1970-01-01 01:00:00.000000000 +0100 @@ -1,3 +0,0 @@ -# In pip 20.3-21.0, the default dependency resolver causes the build in -# ReadTheDocs to fail due to memory exhaustion or timeout. 
-pip<20.3 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/docs/requirements.txt new/Scrapy-2.5.1/docs/requirements.txt --- old/Scrapy-2.5.0/docs/requirements.txt 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/docs/requirements.txt 2021-10-05 15:48:05.000000000 +0200 @@ -1,4 +1,4 @@ Sphinx>=3.0 sphinx-hoverxref>=0.2b1 sphinx-notfound-page>=0.4 -sphinx_rtd_theme>=0.4 +sphinx-rtd-theme>=0.5.2 \ No newline at end of file diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/docs/topics/downloader-middleware.rst new/Scrapy-2.5.1/docs/topics/downloader-middleware.rst --- old/Scrapy-2.5.0/docs/topics/downloader-middleware.rst 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/docs/topics/downloader-middleware.rst 2021-10-05 15:48:05.000000000 +0200 @@ -323,8 +323,21 @@ This middleware authenticates all requests generated from certain spiders using `Basic access authentication`_ (aka. HTTP auth). - To enable HTTP authentication from certain spiders, set the ``http_user`` - and ``http_pass`` attributes of those spiders. + To enable HTTP authentication for a spider, set the ``http_user`` and + ``http_pass`` spider attributes to the authentication data and the + ``http_auth_domain`` spider attribute to the domain which requires this + authentication (its subdomains will be also handled in the same way). + You can set ``http_auth_domain`` to ``None`` to enable the + authentication for all requests but you risk leaking your authentication + credentials to unrelated domains. + + .. warning:: + In previous Scrapy versions HttpAuthMiddleware sent the authentication + data with all requests, which is a security problem if the spider + makes requests to several different domains. Currently if the + ``http_auth_domain`` attribute is not set, the middleware will use the + domain of the first request, which will work for some spiders but not + for others. In the future the middleware will produce an error instead. Example:: @@ -334,6 +347,7 @@ http_user = 'someuser' http_pass = 'somepass' + http_auth_domain = 'intranet.example.com' name = 'intranet.example.com' # .. rest of the spider code omitted ... 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/scrapy/VERSION new/Scrapy-2.5.1/scrapy/VERSION --- old/Scrapy-2.5.0/scrapy/VERSION 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/scrapy/VERSION 2021-10-05 15:48:05.000000000 +0200 @@ -1 +1 @@ -2.5.0 +2.5.1 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/scrapy/downloadermiddlewares/httpauth.py new/Scrapy-2.5.1/scrapy/downloadermiddlewares/httpauth.py --- old/Scrapy-2.5.0/scrapy/downloadermiddlewares/httpauth.py 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/scrapy/downloadermiddlewares/httpauth.py 2021-10-05 15:48:05.000000000 +0200 @@ -3,10 +3,14 @@ See documentation in docs/topics/downloader-middleware.rst """ +import warnings from w3lib.http import basic_auth_header from scrapy import signals +from scrapy.exceptions import ScrapyDeprecationWarning +from scrapy.utils.httpobj import urlparse_cached +from scrapy.utils.url import url_is_from_any_domain class HttpAuthMiddleware: @@ -24,8 +28,23 @@ pwd = getattr(spider, 'http_pass', '') if usr or pwd: self.auth = basic_auth_header(usr, pwd) + if not hasattr(spider, 'http_auth_domain'): + warnings.warn('Using HttpAuthMiddleware without http_auth_domain is deprecated and can cause security ' + 'problems if the spider makes requests to several different domains. http_auth_domain ' + 'will be set to the domain of the first request, please set it to the correct value ' + 'explicitly.', + category=ScrapyDeprecationWarning) + self.domain_unset = True + else: + self.domain = spider.http_auth_domain + self.domain_unset = False def process_request(self, request, spider): auth = getattr(self, 'auth', None) if auth and b'Authorization' not in request.headers: - request.headers[b'Authorization'] = auth + domain = urlparse_cached(request).hostname + if self.domain_unset: + self.domain = domain + self.domain_unset = False + if not self.domain or url_is_from_any_domain(request.url, [self.domain]): + request.headers[b'Authorization'] = auth diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/tests/test_downloadermiddleware_httpauth.py new/Scrapy-2.5.1/tests/test_downloadermiddleware_httpauth.py --- old/Scrapy-2.5.0/tests/test_downloadermiddleware_httpauth.py 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/tests/test_downloadermiddleware_httpauth.py 2021-10-05 15:48:05.000000000 +0200 @@ -1,13 +1,60 @@ import unittest +from w3lib.http import basic_auth_header + from scrapy.http import Request from scrapy.downloadermiddlewares.httpauth import HttpAuthMiddleware from scrapy.spiders import Spider +class TestSpiderLegacy(Spider): + http_user = 'foo' + http_pass = 'bar' + + class TestSpider(Spider): http_user = 'foo' http_pass = 'bar' + http_auth_domain = 'example.com' + + +class TestSpiderAny(Spider): + http_user = 'foo' + http_pass = 'bar' + http_auth_domain = None + + +class HttpAuthMiddlewareLegacyTest(unittest.TestCase): + + def setUp(self): + self.spider = TestSpiderLegacy('foo') + + def test_auth(self): + mw = HttpAuthMiddleware() + mw.spider_opened(self.spider) + + # initial request, sets the domain and sends the header + req = Request('http://example.com/') + assert mw.process_request(req, self.spider) is None + self.assertEqual(req.headers['Authorization'], basic_auth_header('foo', 'bar')) + + # subsequent request to the same domain, should send the header + req = Request('http://example.com/') + assert 
mw.process_request(req, self.spider) is None + self.assertEqual(req.headers['Authorization'], basic_auth_header('foo', 'bar')) + + # subsequent request to a different domain, shouldn't send the header + req = Request('http://example-noauth.com/') + assert mw.process_request(req, self.spider) is None + self.assertNotIn('Authorization', req.headers) + + def test_auth_already_set(self): + mw = HttpAuthMiddleware() + mw.spider_opened(self.spider) + req = Request('http://example.com/', + headers=dict(Authorization='Digest 123')) + assert mw.process_request(req, self.spider) is None + self.assertEqual(req.headers['Authorization'], b'Digest 123') class HttpAuthMiddlewareTest(unittest.TestCase): @@ -20,13 +67,45 @@ def tearDown(self): del self.mw + def test_no_auth(self): + req = Request('http://example-noauth.com/') + assert self.mw.process_request(req, self.spider) is None + self.assertNotIn('Authorization', req.headers) + + def test_auth_domain(self): + req = Request('http://example.com/') + assert self.mw.process_request(req, self.spider) is None + self.assertEqual(req.headers['Authorization'], basic_auth_header('foo', 'bar')) + + def test_auth_subdomain(self): + req = Request('http://foo.example.com/') + assert self.mw.process_request(req, self.spider) is None + self.assertEqual(req.headers['Authorization'], basic_auth_header('foo', 'bar')) + + def test_auth_already_set(self): + req = Request('http://example.com/', + headers=dict(Authorization='Digest 123')) + assert self.mw.process_request(req, self.spider) is None + self.assertEqual(req.headers['Authorization'], b'Digest 123') + + +class HttpAuthAnyMiddlewareTest(unittest.TestCase): + + def setUp(self): + self.mw = HttpAuthMiddleware() + self.spider = TestSpiderAny('foo') + self.mw.spider_opened(self.spider) + + def tearDown(self): + del self.mw + def test_auth(self): - req = Request('http://scrapytest.org/') + req = Request('http://example.com/') assert self.mw.process_request(req, self.spider) is None - self.assertEqual(req.headers['Authorization'], b'Basic Zm9vOmJhcg==') + self.assertEqual(req.headers['Authorization'], basic_auth_header('foo', 'bar')) def test_auth_already_set(self): - req = Request('http://scrapytest.org/', + req = Request('http://example.com/', headers=dict(Authorization='Digest 123')) assert self.mw.process_request(req, self.spider) is None self.assertEqual(req.headers['Authorization'], b'Digest 123') diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.5.0/tox.ini new/Scrapy-2.5.1/tox.ini --- old/Scrapy-2.5.0/tox.ini 2021-04-06 16:48:02.000000000 +0200 +++ new/Scrapy-2.5.1/tox.ini 2021-10-05 15:48:05.000000000 +0200 @@ -19,6 +19,8 @@ mitmproxy >= 4.0.4, < 5; python_version >= '3.6' and python_version < '3.7' and platform_system != 'Windows' and implementation_name != 'pypy' # Extras botocore>=1.4.87 + # Peek support breaks tests. + queuelib < 1.6.0 passenv = S3_TEST_FILE_URI AWS_ACCESS_KEY_ID @@ -63,7 +65,8 @@ reppy robotexclusionrulesparser # Test dependencies - pylint + # Force the pylint version used in CI for the 2.5.0 tag + pylint==2.7.4 commands = pylint conftest.py docs extras scrapy setup.py tests
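
The legacy tests above pin down the fallback behaviour for spiders that still lack http_auth_domain: the middleware warns once, then locks the credentials to the domain of the first request. A standalone sketch of that behaviour, with made-up domain names and the same direct middleware calls the tests use::

    import warnings

    from scrapy.downloadermiddlewares.httpauth import HttpAuthMiddleware
    from scrapy.http import Request
    from scrapy.spiders import Spider


    class LegacySpider(Spider):
        # http_auth_domain deliberately not set, as in TestSpiderLegacy
        http_user = 'someuser'
        http_pass = 'somepass'


    mw = HttpAuthMiddleware()
    spider = LegacySpider('legacy')

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        mw.spider_opened(spider)  # emits ScrapyDeprecationWarning
    print(caught[0].message)

    first = Request('http://first.example.com/')
    mw.process_request(first, spider)
    print('Authorization' in first.headers)  # True: first domain locked in

    other = Request('http://other.example.org/')
    mw.process_request(other, spider)
    print('Authorization' in other.headers)  # False: different domain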