Script 'mail_helper' called by obssrc

Hello community,

here is the log from the commit of package python-Scrapy for openSUSE:Factory checked in at 2022-11-09 12:56:49

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-Scrapy (Old) and /work/SRC/openSUSE:Factory/.python-Scrapy.new.1597 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-Scrapy"

Wed Nov 9 12:56:49 2022 rev:17 rq:1034478 version:2.7.1

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-Scrapy/python-Scrapy.changes     2022-10-29 20:18:02.822511252 +0200
+++ /work/SRC/openSUSE:Factory/.python-Scrapy.new.1597/python-Scrapy.changes   2022-11-09 12:57:09.704255337 +0100
@@ -1,0 +2,9 @@
+Mon Nov 7 20:35:15 UTC 2022 - Yogalakshmi Arunachalam <yarunacha...@suse.com>
+
+- Update to v2.7.1
+  * Relaxed the restriction introduced in 2.6.2 so that the Proxy-Authentication header can again be set explicitly in certain cases,
+    restoring compatibility with scrapy-zyte-smartproxy 2.1.0 and older
+  Bug fixes
+  * full change-log https://docs.scrapy.org/en/latest/news.html#scrapy-2-7-1-2022-11-02
+
+-------------------------------------------------------------------

Old:
----
  Scrapy-2.7.0.tar.gz

New:
----
  Scrapy-2.7.1.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-Scrapy.spec ++++++
--- /var/tmp/diff_new_pack.hp86AX/_old  2022-11-09 12:57:10.312258764 +0100
+++ /var/tmp/diff_new_pack.hp86AX/_new  2022-11-09 12:57:10.328258855 +0100
@@ -19,7 +19,7 @@
 %{?!python_module:%define python_module() python3-%{**}}
 %define skip_python2 1
 Name:           python-Scrapy
-Version:        2.7.0
+Version:        2.7.1
 Release:        0
 Summary:        A high-level Python Screen Scraping framework
 License:        BSD-3-Clause

++++++ Scrapy-2.7.0.tar.gz -> Scrapy-2.7.1.tar.gz ++++++

diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/PKG-INFO new/Scrapy-2.7.1/PKG-INFO
--- old/Scrapy-2.7.0/PKG-INFO   2022-10-17 15:11:39.746693400 +0200
+++ new/Scrapy-2.7.1/PKG-INFO   2022-11-02 12:18:23.573532600 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: Scrapy
-Version: 2.7.0
+Version: 2.7.1
 Summary: A high-level Web Crawling and Web Scraping framework
 Home-page: https://scrapy.org
 Author: Scrapy developers
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/Scrapy.egg-info/PKG-INFO new/Scrapy-2.7.1/Scrapy.egg-info/PKG-INFO
--- old/Scrapy-2.7.0/Scrapy.egg-info/PKG-INFO   2022-10-17 15:11:39.000000000 +0200
+++ new/Scrapy-2.7.1/Scrapy.egg-info/PKG-INFO   2022-11-02 12:18:23.000000000 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: Scrapy
-Version: 2.7.0
+Version: 2.7.1
 Summary: A high-level Web Crawling and Web Scraping framework
 Home-page: https://scrapy.org
 Author: Scrapy developers
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/intro/install.rst new/Scrapy-2.7.1/docs/intro/install.rst
--- old/Scrapy-2.7.0/docs/intro/install.rst     2022-10-17 15:11:29.000000000 +0200
+++ new/Scrapy-2.7.1/docs/intro/install.rst     2022-11-02 12:18:03.000000000 +0100
@@ -52,7 +52,7 @@
 * `twisted`_, an asynchronous networking framework
 * `cryptography`_ and `pyOpenSSL`_, to deal with various network-level security needs

-Some of these packages themselves depends on non-Python packages
+Some of these packages themselves depend on non-Python packages
 that might require additional installation steps depending on your platform.
 Please check :ref:`platform-specific guides below <intro-install-platform-notes>`.
@@ -187,7 +187,7 @@ * Install `homebrew`_ following the instructions in https://brew.sh/ * Update your ``PATH`` variable to state that homebrew packages should be - used before system packages (Change ``.bashrc`` to ``.zshrc`` accordantly + used before system packages (Change ``.bashrc`` to ``.zshrc`` accordingly if you're using `zsh`_ as default shell):: echo "export PATH=/usr/local/bin:/usr/local/sbin:$PATH" >> ~/.bashrc diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/news.rst new/Scrapy-2.7.1/docs/news.rst --- old/Scrapy-2.7.0/docs/news.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/news.rst 2022-11-02 12:18:03.000000000 +0100 @@ -3,6 +3,58 @@ Release notes ============= +.. _release-2.7.1: + +Scrapy 2.7.1 (2022-11-02) +------------------------- + +New features +~~~~~~~~~~~~ + +- Relaxed the restriction introduced in 2.6.2 so that the + ``Proxy-Authentication`` header can again be set explicitly, as long as the + proxy URL in the :reqmeta:`proxy` metadata has no other credentials, and + for as long as that proxy URL remains the same; this restores compatibility + with scrapy-zyte-smartproxy 2.1.0 and older (:issue:`5626`). + +Bug fixes +~~~~~~~~~ + +- Using ``-O``/``--overwrite-output`` and ``-t``/``--output-format`` options + together now produces an error instead of ignoring the former option + (:issue:`5516`, :issue:`5605`). + +- Replaced deprecated :mod:`asyncio` APIs that implicitly use the current + event loop with code that explicitly requests a loop from the event loop + policy (:issue:`5685`, :issue:`5689`). + +- Fixed uses of deprecated Scrapy APIs in Scrapy itself (:issue:`5588`, + :issue:`5589`). + +- Fixed uses of a deprecated Pillow API (:issue:`5684`, :issue:`5692`). + +- Improved code that checks if generators return values, so that it no longer + fails on decorated methods and partial methods (:issue:`5323`, + :issue:`5592`, :issue:`5599`, :issue:`5691`). + +Documentation +~~~~~~~~~~~~~ + +- Upgraded the Code of Conduct to Contributor Covenant v2.1 (:issue:`5698`). + +- Fixed typos (:issue:`5681`, :issue:`5694`). + +Quality assurance +~~~~~~~~~~~~~~~~~ + +- Re-enabled some erroneously disabled flake8 checks (:issue:`5688`). + +- Ignored harmless deprecation warnings from :mod:`typing` in tests + (:issue:`5686`, :issue:`5697`). + +- Modernized our CI configuration (:issue:`5695`, :issue:`5696`). + + .. _release-2.7.0: Scrapy 2.7.0 (2022-10-17) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/commands.rst new/Scrapy-2.7.1/docs/topics/commands.rst --- old/Scrapy-2.7.0/docs/topics/commands.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/commands.rst 2022-11-02 12:18:03.000000000 +0100 @@ -271,11 +271,31 @@ Start crawling using a spider. +Supported options: + +* ``-h, --help``: show a help message and exit + +* ``-a NAME=VALUE``: set a spider argument (may be repeated) + +* ``--output FILE`` or ``-o FILE``: append scraped items to the end of FILE (use - for stdout), to define format set a colon at the end of the output URI (i.e. ``-o FILE:FORMAT``) + +* ``--overwrite-output FILE`` or ``-O FILE``: dump scraped items into FILE, overwriting any existing file, to define format set a colon at the end of the output URI (i.e. 
``-O FILE:FORMAT``) + +* ``--output-format FORMAT`` or ``-t FORMAT``: deprecated way to define format to use for dumping items, does not work in combination with ``-O`` + Usage examples:: $ scrapy crawl myspider [ ... myspider starts crawling ... ] + $ scrapy -o myfile:csv myspider + [ ... myspider starts crawling and appends the result to the file myfile in csv format ... ] + + $ scrapy -O myfile:json myspider + [ ... myspider starts crawling and saves the result in myfile in json format overwriting the original content... ] + + $ scrapy -o myfile -t csv myspider + [ ... myspider starts crawling and appends the result to the file myfile in csv format ... ] .. command:: check diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/contracts.rst new/Scrapy-2.7.1/docs/topics/contracts.rst --- old/Scrapy-2.7.0/docs/topics/contracts.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/contracts.rst 2022-11-02 12:18:03.000000000 +0100 @@ -102,7 +102,7 @@ .. method:: Contract.post_process(output) This allows processing the output of the callback. Iterators are - converted listified before being passed to this hook. + converted to lists before being passed to this hook. Raise :class:`~scrapy.exceptions.ContractFail` from :class:`~scrapy.contracts.Contract.pre_process` or diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/extensions.rst new/Scrapy-2.7.1/docs/topics/extensions.rst --- old/Scrapy-2.7.0/docs/topics/extensions.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/extensions.rst 2022-11-02 12:18:03.000000000 +0100 @@ -17,7 +17,7 @@ It is customary for extensions to prefix their settings with their own name, to avoid collision with existing (and future) extensions. For example, a -hypothetic extension to handle `Google Sitemaps`_ would use settings like +hypothetical extension to handle `Google Sitemaps`_ would use settings like ``GOOGLESITEMAP_ENABLED``, ``GOOGLESITEMAP_DEPTH``, and so on. .. _Google Sitemaps: https://en.wikipedia.org/wiki/Sitemaps diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/leaks.rst new/Scrapy-2.7.1/docs/topics/leaks.rst --- old/Scrapy-2.7.0/docs/topics/leaks.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/leaks.rst 2022-11-02 12:18:03.000000000 +0100 @@ -154,7 +154,7 @@ If your project has too many spiders executed in parallel, the output of :func:`prefs()` can be difficult to read. For this reason, that function has a ``ignore`` argument which can be used to -ignore a particular class (and all its subclases). For +ignore a particular class (and all its subclasses). For example, this won't show any live references to spiders: >>> from scrapy.spiders import Spider diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/request-response.rst new/Scrapy-2.7.1/docs/topics/request-response.rst --- old/Scrapy-2.7.0/docs/topics/request-response.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/request-response.rst 2022-11-02 12:18:03.000000000 +0100 @@ -446,7 +446,7 @@ Scenarios where changing the request fingerprinting algorithm may cause undesired results include, for example, using the HTTP cache middleware (see :class:`~scrapy.downloadermiddlewares.httpcache.HttpCacheMiddleware`). 
-Changing the request fingerprinting algorithm would invalidade the current +Changing the request fingerprinting algorithm would invalidate the current cache, requiring you to redownload all requests again. Otherwise, set :setting:`REQUEST_FINGERPRINTER_IMPLEMENTATION` to ``'2.7'`` in diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/pytest.ini new/Scrapy-2.7.1/pytest.ini --- old/Scrapy-2.7.0/pytest.ini 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/pytest.ini 2022-11-02 12:18:03.000000000 +0100 @@ -24,3 +24,5 @@ filterwarnings = ignore:scrapy.downloadermiddlewares.decompression is deprecated ignore:Module scrapy.utils.reqser is deprecated + ignore:typing.re is deprecated + ignore:typing.io is deprecated diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/VERSION new/Scrapy-2.7.1/scrapy/VERSION --- old/Scrapy-2.7.0/scrapy/VERSION 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/VERSION 2022-11-02 12:18:03.000000000 +0100 @@ -1 +1 @@ -2.7.0 +2.7.1 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/commands/__init__.py new/Scrapy-2.7.1/scrapy/commands/__init__.py --- old/Scrapy-2.7.0/scrapy/commands/__init__.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/commands/__init__.py 2022-11-02 12:18:03.000000000 +0100 @@ -115,9 +115,11 @@ parser.add_argument("-a", dest="spargs", action="append", default=[], metavar="NAME=VALUE", help="set spider argument (may be repeated)") parser.add_argument("-o", "--output", metavar="FILE", action="append", - help="append scraped items to the end of FILE (use - for stdout)") + help="append scraped items to the end of FILE (use - for stdout)," + " to define format set a colon at the end of the output URI (i.e. -o FILE:FORMAT)") parser.add_argument("-O", "--overwrite-output", metavar="FILE", action="append", - help="dump scraped items into FILE, overwriting any existing file") + help="dump scraped items into FILE, overwriting any existing file," + " to define format set a colon at the end of the output URI (i.e. 
-O FILE:FORMAT)") parser.add_argument("-t", "--output-format", metavar="FORMAT", help="format to use for dumping items") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/core/engine.py new/Scrapy-2.7.1/scrapy/core/engine.py --- old/Scrapy-2.7.0/scrapy/core/engine.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/core/engine.py 2022-11-02 12:18:03.000000000 +0100 @@ -257,9 +257,7 @@ def download(self, request: Request, spider: Optional[Spider] = None) -> Deferred: """Return a Deferred which fires with a Response as result, only downloader middlewares are applied""" - if spider is None: - spider = self.spider - else: + if spider is not None: warnings.warn( "Passing a 'spider' argument to ExecutionEngine.download is deprecated", category=ScrapyDeprecationWarning, @@ -267,7 +265,7 @@ ) if spider is not self.spider: logger.warning("The spider '%s' does not match the open spider", spider.name) - if spider is None: + if self.spider is None: raise RuntimeError(f"No open spider to crawl: {request}") return self._download(request, spider).addBoth(self._downloaded, request, spider) @@ -278,11 +276,14 @@ self.slot.remove_request(request) return self.download(result, spider) if isinstance(result, Request) else result - def _download(self, request: Request, spider: Spider) -> Deferred: + def _download(self, request: Request, spider: Optional[Spider]) -> Deferred: assert self.slot is not None # typing self.slot.add_request(request) + if spider is None: + spider = self.spider + def _on_success(result: Union[Response, Request]) -> Union[Response, Request]: if not isinstance(result, (Response, Request)): raise TypeError(f"Incorrect type: expected Response or Request, got {type(result)}: {result!r}") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/downloadermiddlewares/httpproxy.py new/Scrapy-2.7.1/scrapy/downloadermiddlewares/httpproxy.py --- old/Scrapy-2.7.0/scrapy/downloadermiddlewares/httpproxy.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/downloadermiddlewares/httpproxy.py 2022-11-02 12:18:03.000000000 +0100 @@ -78,4 +78,7 @@ del request.headers[b'Proxy-Authorization'] del request.meta['_auth_proxy'] elif b'Proxy-Authorization' in request.headers: - del request.headers[b'Proxy-Authorization'] + if proxy_url: + request.meta['_auth_proxy'] = proxy_url + else: + del request.headers[b'Proxy-Authorization'] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/pipelines/images.py new/Scrapy-2.7.1/scrapy/pipelines/images.py --- old/Scrapy-2.7.0/scrapy/pipelines/images.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/pipelines/images.py 2022-11-02 12:18:03.000000000 +0100 @@ -160,7 +160,14 @@ if size: image = image.copy() - image.thumbnail(size, self._Image.ANTIALIAS) + try: + # Image.Resampling.LANCZOS was added in Pillow 9.1.0 + # remove this try except block, + # when updating the minimum requirements for Pillow. 
+ resampling_filter = self._Image.Resampling.LANCZOS + except AttributeError: + resampling_filter = self._Image.ANTIALIAS + image.thumbnail(size, resampling_filter) buf = BytesIO() image.save(buf, 'JPEG') diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/conf.py new/Scrapy-2.7.1/scrapy/utils/conf.py --- old/Scrapy-2.7.0/scrapy/utils/conf.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/conf.py 2022-11-02 12:18:03.000000000 +0100 @@ -155,6 +155,15 @@ raise UsageError( "Please use only one of -o/--output and -O/--overwrite-output" ) + if output_format: + raise UsageError( + "-t/--output-format is a deprecated command line option" + " and does not work in combination with -O/--overwrite-output." + " To specify a format please specify it after a colon at the end of the" + " output URI (i.e. -O <URI>:<FORMAT>)." + " Example working in the tutorial: " + "scrapy crawl quotes -O quotes.json:json" + ) output = overwrite_output overwrite = True @@ -162,9 +171,13 @@ if len(output) == 1: check_valid_format(output_format) message = ( - 'The -t command line option is deprecated in favor of ' - 'specifying the output format within the output URI. See the ' - 'documentation of the -o and -O options for more information.' + "The -t/--output-format command line option is deprecated in favor of " + "specifying the output format within the output URI using the -o/--output or the" + " -O/--overwrite-output option (i.e. -o/-O <URI>:<FORMAT>). See the documentation" + " of the -o or -O option or the following examples for more information. " + "Examples working in the tutorial: " + "scrapy crawl quotes -o quotes.csv:csv or " + "scrapy crawl quotes -O quotes.json:json" ) warnings.warn(message, ScrapyDeprecationWarning, stacklevel=2) return {output[0]: {'format': output_format}} diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/defer.py new/Scrapy-2.7.1/scrapy/utils/defer.py --- old/Scrapy-2.7.0/scrapy/utils/defer.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/defer.py 2022-11-02 12:18:03.000000000 +0100 @@ -26,7 +26,7 @@ from twisted.python.failure import Failure from scrapy.exceptions import IgnoreRequest -from scrapy.utils.reactor import is_asyncio_reactor_installed +from scrapy.utils.reactor import is_asyncio_reactor_installed, get_asyncio_event_loop_policy def defer_fail(_failure: Failure) -> Deferred: @@ -269,7 +269,8 @@ return ensureDeferred(o) else: # wrapping the coroutine into a Future and then into a Deferred, this requires AsyncioSelectorReactor - return Deferred.fromFuture(asyncio.ensure_future(o)) + event_loop = get_asyncio_event_loop_policy().get_event_loop() + return Deferred.fromFuture(asyncio.ensure_future(o, loop=event_loop)) return o @@ -320,7 +321,8 @@ d = treq.get('https://example.com/additional') additional_response = await deferred_to_future(d) """ - return d.asFuture(asyncio.get_event_loop()) + policy = get_asyncio_event_loop_policy() + return d.asFuture(policy.get_event_loop()) def maybe_deferred_to_future(d: Deferred) -> Union[Deferred, Future]: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/misc.py new/Scrapy-2.7.1/scrapy/utils/misc.py --- old/Scrapy-2.7.0/scrapy/utils/misc.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/misc.py 2022-11-02 12:18:03.000000000 +0100 @@ -9,6 +9,7 @@ from contextlib import 
contextmanager from importlib import import_module from pkgutil import iter_modules +from functools import partial from w3lib.html import replace_entities @@ -226,7 +227,18 @@ return value is None or isinstance(value, ast.NameConstant) and value.value is None if inspect.isgeneratorfunction(callable): - code = re.sub(r"^[\t ]+", "", inspect.getsource(callable)) + func = callable + while isinstance(func, partial): + func = func.func + + src = inspect.getsource(func) + pattern = re.compile(r"(^[\t ]+)") + code = pattern.sub("", src) + + match = pattern.match(src) # finds indentation + if match: + code = re.sub(f"\n{match.group(0)}", "\n", code) # remove indentation + tree = ast.parse(code) for node in walk_callable(tree): if isinstance(node, ast.Return) and not returns_none(node): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/reactor.py new/Scrapy-2.7.1/scrapy/utils/reactor.py --- old/Scrapy-2.7.0/scrapy/utils/reactor.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/reactor.py 2022-11-02 12:18:03.000000000 +0100 @@ -51,6 +51,19 @@ return self._func(*self._a, **self._kw) +def get_asyncio_event_loop_policy(): + policy = asyncio.get_event_loop_policy() + if ( + sys.version_info >= (3, 8) + and sys.platform == "win32" + and not isinstance(policy, asyncio.WindowsSelectorEventLoopPolicy) + ): + policy = asyncio.WindowsSelectorEventLoopPolicy() + asyncio.set_event_loop_policy(policy) + + return policy + + def install_reactor(reactor_path, event_loop_path=None): """Installs the :mod:`~twisted.internet.reactor` with the specified import path. Also installs the asyncio event loop with the specified import @@ -58,16 +71,14 @@ reactor_class = load_object(reactor_path) if reactor_class is asyncioreactor.AsyncioSelectorReactor: with suppress(error.ReactorAlreadyInstalledError): - if sys.version_info >= (3, 8) and sys.platform == "win32": - policy = asyncio.get_event_loop_policy() - if not isinstance(policy, asyncio.WindowsSelectorEventLoopPolicy): - asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) + policy = get_asyncio_event_loop_policy() if event_loop_path is not None: event_loop_class = load_object(event_loop_path) event_loop = event_loop_class() asyncio.set_event_loop(event_loop) else: - event_loop = asyncio.get_event_loop() + event_loop = policy.get_event_loop() + asyncioreactor.install(eventloop=event_loop) else: *module, _ = reactor_path.split(".") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/url.py new/Scrapy-2.7.1/scrapy/utils/url.py --- old/Scrapy-2.7.0/scrapy/utils/url.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/url.py 2022-11-02 12:18:03.000000000 +0100 @@ -34,6 +34,7 @@ lowercase_path = parse_url(url).path.lower() return any(lowercase_path.endswith(ext) for ext in extensions) + def parse_url(url, encoding=None): """Return urlparsed url from the given argument (which could be an already parsed url) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_downloadermiddleware_httpproxy.py new/Scrapy-2.7.1/tests/test_downloadermiddleware_httpproxy.py --- old/Scrapy-2.7.0/tests/test_downloadermiddleware_httpproxy.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_downloadermiddleware_httpproxy.py 2022-11-02 12:18:03.000000000 +0100 @@ -400,6 +400,9 @@ self.assertNotIn(b'Proxy-Authorization', 
request.headers) def test_proxy_authentication_header_proxy_without_credentials(self): + """As long as the proxy URL in request metadata remains the same, the + Proxy-Authorization header is used and kept, and may even be + changed.""" middleware = HttpProxyMiddleware() request = Request( 'https://example.com', @@ -408,7 +411,16 @@ ) assert middleware.process_request(request, spider) is None self.assertEqual(request.meta['proxy'], 'https://example.com') - self.assertNotIn(b'Proxy-Authorization', request.headers) + self.assertEqual(request.headers['Proxy-Authorization'], b'Basic foo') + + assert middleware.process_request(request, spider) is None + self.assertEqual(request.meta['proxy'], 'https://example.com') + self.assertEqual(request.headers['Proxy-Authorization'], b'Basic foo') + + request.headers['Proxy-Authorization'] = b'Basic bar' + assert middleware.process_request(request, spider) is None + self.assertEqual(request.meta['proxy'], 'https://example.com') + self.assertEqual(request.headers['Proxy-Authorization'], b'Basic bar') def test_proxy_authentication_header_proxy_with_same_credentials(self): middleware = HttpProxyMiddleware() diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_loader.py new/Scrapy-2.7.1/tests/test_loader.py --- old/Scrapy-2.7.0/tests/test_loader.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_loader.py 2022-11-02 12:18:03.000000000 +0100 @@ -295,7 +295,7 @@ l.add_css('name', 'div::text') self.assertEqual(l.get_output_value('name'), ['Marta']) - + def test_init_method_with_base_response(self): """Selector should be None after initialization""" response = Response("https://scrapy.org") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_utils_asyncio.py new/Scrapy-2.7.1/tests/test_utils_asyncio.py --- old/Scrapy-2.7.0/tests/test_utils_asyncio.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_utils_asyncio.py 2022-11-02 12:18:03.000000000 +0100 @@ -1,3 +1,4 @@ +import warnings from unittest import TestCase from pytest import mark @@ -13,5 +14,6 @@ self.assertEqual(is_asyncio_reactor_installed(), self.reactor_pytest == 'asyncio') def test_install_asyncio_reactor(self): - # this should do nothing - install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor") + with warnings.catch_warnings(record=True) as w: + install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor") + self.assertEqual(len(w), 0) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_utils_misc/test_return_with_argument_inside_generator.py new/Scrapy-2.7.1/tests/test_utils_misc/test_return_with_argument_inside_generator.py --- old/Scrapy-2.7.0/tests/test_utils_misc/test_return_with_argument_inside_generator.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_utils_misc/test_return_with_argument_inside_generator.py 2022-11-02 12:18:03.000000000 +0100 @@ -1,5 +1,6 @@ import unittest import warnings +from functools import partial from unittest import mock from scrapy.utils.misc import is_generator_with_return_value, warn_on_generator_with_return_value @@ -165,9 +166,99 @@ warn_on_generator_with_return_value(None, l2) self.assertEqual(len(w), 0) + def test_generators_return_none_with_decorator(self): + def decorator(func): + def inner_func(): + func() + return inner_func + + @decorator + def f3(): + yield 1 + return None + + 
@decorator + def g3(): + yield 1 + return + + @decorator + def h3(): + yield 1 + + @decorator + def i3(): + yield 1 + yield from generator_that_returns_stuff() + + @decorator + def j3(): + yield 1 + + def helper(): + return 0 + + yield helper() + + @decorator + def k3(): + """ +docstring + """ + url = """ +https://example.org + """ + yield url + return + + @decorator + def l3(): + return + + assert not is_generator_with_return_value(top_level_return_none) + assert not is_generator_with_return_value(f3) + assert not is_generator_with_return_value(g3) + assert not is_generator_with_return_value(h3) + assert not is_generator_with_return_value(i3) + assert not is_generator_with_return_value(j3) # not recursive + assert not is_generator_with_return_value(k3) # not recursive + assert not is_generator_with_return_value(l3) + + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, top_level_return_none) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, f3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, g3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, h3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, i3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, j3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, k3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, l3) + self.assertEqual(len(w), 0) + @mock.patch("scrapy.utils.misc.is_generator_with_return_value", new=_indentation_error) def test_indentation_error(self): with warnings.catch_warnings(record=True) as w: warn_on_generator_with_return_value(None, top_level_return_none) self.assertEqual(len(w), 1) self.assertIn('Unable to determine', str(w[0].message)) + + def test_partial(self): + def cb(arg1, arg2): + yield {} + + partial_cb = partial(cb, arg1=42) + assert not is_generator_with_return_value(partial_cb)
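
For illustration, a minimal, hypothetical sketch (not part of the packaged sources) of how the relaxed restriction described in the 2.7.1 release notes above can be used from a spider: the credential-less proxy URL goes into the ``proxy`` request meta key and the ``Proxy-Authorization`` header is set explicitly; as of 2.7.1, HttpProxyMiddleware keeps that header for as long as the proxy URL stays the same. The spider name, proxy URL and credentials below are made up::

    import scrapy
    from w3lib.http import basic_auth_header


    class ProxyAuthExampleSpider(scrapy.Spider):
        # Hypothetical spider, for illustration only.
        name = "proxy_auth_example"
        start_urls = ["https://example.com"]

        def start_requests(self):
            for url in self.start_urls:
                yield scrapy.Request(
                    url,
                    meta={
                        # Proxy URL without embedded credentials.
                        "proxy": "https://proxy.example.com:8011",
                    },
                    headers={
                        # Explicit header; kept by HttpProxyMiddleware as long
                        # as the proxy URL above remains unchanged.
                        "Proxy-Authorization": basic_auth_header("user", "pass"),
                    },
                )

        def parse(self, response):
            yield {"status": response.status}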
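
The Pillow compatibility fix in scrapy/pipelines/images.py above follows a small fallback pattern that can also be used outside Scrapy. A self-contained sketch, assuming only that Pillow is installed (the function name and thumbnail size are arbitrary)::

    from io import BytesIO

    from PIL import Image  # Pillow


    def make_jpeg_thumbnail(data: bytes, size=(128, 128)) -> bytes:
        """Return JPEG thumbnail bytes for the given image bytes."""
        image = Image.open(BytesIO(data)).convert("RGB")
        try:
            # Image.Resampling.LANCZOS was added in Pillow 9.1.0.
            resampling_filter = Image.Resampling.LANCZOS
        except AttributeError:
            # Older Pillow versions only offer the deprecated constant.
            resampling_filter = Image.ANTIALIAS
        image.thumbnail(size, resampling_filter)
        buf = BytesIO()
        image.save(buf, "JPEG")
        return buf.getvalue()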