Script 'mail_helper' called by obssrc

Hello community,

here is the log from the commit of package python-Scrapy for openSUSE:Factory checked in at 2022-11-09 12:56:49

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-Scrapy (Old) and /work/SRC/openSUSE:Factory/.python-Scrapy.new.1597 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-Scrapy"

Wed Nov 9 12:56:49 2022 rev:17 rq:1034478 version:2.7.1

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-Scrapy/python-Scrapy.changes     2022-10-29 20:18:02.822511252 +0200
+++ /work/SRC/openSUSE:Factory/.python-Scrapy.new.1597/python-Scrapy.changes   2022-11-09 12:57:09.704255337 +0100
@@ -1,0 +2,9 @@
+Mon Nov 7 20:35:15 UTC 2022 - Yogalakshmi Arunachalam <yarunacha...@suse.com>
+
+- Update to v2.7.1
+  * Relaxed the restriction introduced in 2.6.2 so that the Proxy-Authentication header can again be set explicitly in certain cases,
+    restoring compatibility with scrapy-zyte-smartproxy 2.1.0 and older
+  Bug fixes
+  * full change-log https://docs.scrapy.org/en/latest/news.html#scrapy-2-7-1-2022-11-02
+
+-------------------------------------------------------------------

Old:
----
  Scrapy-2.7.0.tar.gz

New:
----
  Scrapy-2.7.1.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-Scrapy.spec ++++++
--- /var/tmp/diff_new_pack.hp86AX/_old  2022-11-09 12:57:10.312258764 +0100
+++ /var/tmp/diff_new_pack.hp86AX/_new  2022-11-09 12:57:10.328258855 +0100
@@ -19,7 +19,7 @@
 %{?!python_module:%define python_module() python3-%{**}}
 %define skip_python2 1
 Name:           python-Scrapy
-Version:        2.7.0
+Version:        2.7.1
 Release:        0
 Summary:        A high-level Python Screen Scraping framework
 License:        BSD-3-Clause

++++++ Scrapy-2.7.0.tar.gz -> Scrapy-2.7.1.tar.gz ++++++

diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/PKG-INFO new/Scrapy-2.7.1/PKG-INFO
--- old/Scrapy-2.7.0/PKG-INFO   2022-10-17 15:11:39.746693400 +0200
+++ new/Scrapy-2.7.1/PKG-INFO   2022-11-02 12:18:23.573532600 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: Scrapy
-Version: 2.7.0
+Version: 2.7.1
 Summary: A high-level Web Crawling and Web Scraping framework
 Home-page: https://scrapy.org
 Author: Scrapy developers
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/Scrapy.egg-info/PKG-INFO new/Scrapy-2.7.1/Scrapy.egg-info/PKG-INFO
--- old/Scrapy-2.7.0/Scrapy.egg-info/PKG-INFO   2022-10-17 15:11:39.000000000 +0200
+++ new/Scrapy-2.7.1/Scrapy.egg-info/PKG-INFO   2022-11-02 12:18:23.000000000 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: Scrapy
-Version: 2.7.0
+Version: 2.7.1
 Summary: A high-level Web Crawling and Web Scraping framework
 Home-page: https://scrapy.org
 Author: Scrapy developers
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/intro/install.rst new/Scrapy-2.7.1/docs/intro/install.rst
--- old/Scrapy-2.7.0/docs/intro/install.rst     2022-10-17 15:11:29.000000000 +0200
+++ new/Scrapy-2.7.1/docs/intro/install.rst     2022-11-02 12:18:03.000000000 +0100
@@ -52,7 +52,7 @@
 * `twisted`_, an asynchronous networking framework
 * `cryptography`_ and `pyOpenSSL`_, to deal with various network-level security needs

-Some of these packages themselves depends on non-Python packages
+Some of these packages themselves depend on non-Python packages
 that might require additional installation steps depending on your platform.
 Please check :ref:`platform-specific guides below <intro-install-platform-notes>`.
@@ -187,7 +187,7 @@ * Install `homebrew`_ following the instructions in https://brew.sh/ * Update your ``PATH`` variable to state that homebrew packages should be - used before system packages (Change ``.bashrc`` to ``.zshrc`` accordantly + used before system packages (Change ``.bashrc`` to ``.zshrc`` accordingly if you're using `zsh`_ as default shell):: echo "export PATH=/usr/local/bin:/usr/local/sbin:$PATH" >> ~/.bashrc diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/news.rst new/Scrapy-2.7.1/docs/news.rst --- old/Scrapy-2.7.0/docs/news.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/news.rst 2022-11-02 12:18:03.000000000 +0100 @@ -3,6 +3,58 @@ Release notes ============= +.. _release-2.7.1: + +Scrapy 2.7.1 (2022-11-02) +------------------------- + +New features +~~~~~~~~~~~~ + +- Relaxed the restriction introduced in 2.6.2 so that the + ``Proxy-Authentication`` header can again be set explicitly, as long as the + proxy URL in the :reqmeta:`proxy` metadata has no other credentials, and + for as long as that proxy URL remains the same; this restores compatibility + with scrapy-zyte-smartproxy 2.1.0 and older (:issue:`5626`). + +Bug fixes +~~~~~~~~~ + +- Using ``-O``/``--overwrite-output`` and ``-t``/``--output-format`` options + together now produces an error instead of ignoring the former option + (:issue:`5516`, :issue:`5605`). + +- Replaced deprecated :mod:`asyncio` APIs that implicitly use the current + event loop with code that explicitly requests a loop from the event loop + policy (:issue:`5685`, :issue:`5689`). + +- Fixed uses of deprecated Scrapy APIs in Scrapy itself (:issue:`5588`, + :issue:`5589`). + +- Fixed uses of a deprecated Pillow API (:issue:`5684`, :issue:`5692`). + +- Improved code that checks if generators return values, so that it no longer + fails on decorated methods and partial methods (:issue:`5323`, + :issue:`5592`, :issue:`5599`, :issue:`5691`). + +Documentation +~~~~~~~~~~~~~ + +- Upgraded the Code of Conduct to Contributor Covenant v2.1 (:issue:`5698`). + +- Fixed typos (:issue:`5681`, :issue:`5694`). + +Quality assurance +~~~~~~~~~~~~~~~~~ + +- Re-enabled some erroneously disabled flake8 checks (:issue:`5688`). + +- Ignored harmless deprecation warnings from :mod:`typing` in tests + (:issue:`5686`, :issue:`5697`). + +- Modernized our CI configuration (:issue:`5695`, :issue:`5696`). + + .. _release-2.7.0: Scrapy 2.7.0 (2022-10-17) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/commands.rst new/Scrapy-2.7.1/docs/topics/commands.rst --- old/Scrapy-2.7.0/docs/topics/commands.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/commands.rst 2022-11-02 12:18:03.000000000 +0100 @@ -271,11 +271,31 @@ Start crawling using a spider. +Supported options: + +* ``-h, --help``: show a help message and exit + +* ``-a NAME=VALUE``: set a spider argument (may be repeated) + +* ``--output FILE`` or ``-o FILE``: append scraped items to the end of FILE (use - for stdout), to define format set a colon at the end of the output URI (i.e. ``-o FILE:FORMAT``) + +* ``--overwrite-output FILE`` or ``-O FILE``: dump scraped items into FILE, overwriting any existing file, to define format set a colon at the end of the output URI (i.e. 
``-O FILE:FORMAT``) + +* ``--output-format FORMAT`` or ``-t FORMAT``: deprecated way to define format to use for dumping items, does not work in combination with ``-O`` + Usage examples:: $ scrapy crawl myspider [ ... myspider starts crawling ... ] + $ scrapy -o myfile:csv myspider + [ ... myspider starts crawling and appends the result to the file myfile in csv format ... ] + + $ scrapy -O myfile:json myspider + [ ... myspider starts crawling and saves the result in myfile in json format overwriting the original content... ] + + $ scrapy -o myfile -t csv myspider + [ ... myspider starts crawling and appends the result to the file myfile in csv format ... ] .. command:: check diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/contracts.rst new/Scrapy-2.7.1/docs/topics/contracts.rst --- old/Scrapy-2.7.0/docs/topics/contracts.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/contracts.rst 2022-11-02 12:18:03.000000000 +0100 @@ -102,7 +102,7 @@ .. method:: Contract.post_process(output) This allows processing the output of the callback. Iterators are - converted listified before being passed to this hook. + converted to lists before being passed to this hook. Raise :class:`~scrapy.exceptions.ContractFail` from :class:`~scrapy.contracts.Contract.pre_process` or diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/extensions.rst new/Scrapy-2.7.1/docs/topics/extensions.rst --- old/Scrapy-2.7.0/docs/topics/extensions.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/extensions.rst 2022-11-02 12:18:03.000000000 +0100 @@ -17,7 +17,7 @@ It is customary for extensions to prefix their settings with their own name, to avoid collision with existing (and future) extensions. For example, a -hypothetic extension to handle `Google Sitemaps`_ would use settings like +hypothetical extension to handle `Google Sitemaps`_ would use settings like ``GOOGLESITEMAP_ENABLED``, ``GOOGLESITEMAP_DEPTH``, and so on. .. _Google Sitemaps: https://en.wikipedia.org/wiki/Sitemaps diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/leaks.rst new/Scrapy-2.7.1/docs/topics/leaks.rst --- old/Scrapy-2.7.0/docs/topics/leaks.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/leaks.rst 2022-11-02 12:18:03.000000000 +0100 @@ -154,7 +154,7 @@ If your project has too many spiders executed in parallel, the output of :func:`prefs()` can be difficult to read. For this reason, that function has a ``ignore`` argument which can be used to -ignore a particular class (and all its subclases). For +ignore a particular class (and all its subclasses). For example, this won't show any live references to spiders: >>> from scrapy.spiders import Spider diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/docs/topics/request-response.rst new/Scrapy-2.7.1/docs/topics/request-response.rst --- old/Scrapy-2.7.0/docs/topics/request-response.rst 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/docs/topics/request-response.rst 2022-11-02 12:18:03.000000000 +0100 @@ -446,7 +446,7 @@ Scenarios where changing the request fingerprinting algorithm may cause undesired results include, for example, using the HTTP cache middleware (see :class:`~scrapy.downloadermiddlewares.httpcache.HttpCacheMiddleware`). 
-Changing the request fingerprinting algorithm would invalidade the current +Changing the request fingerprinting algorithm would invalidate the current cache, requiring you to redownload all requests again. Otherwise, set :setting:`REQUEST_FINGERPRINTER_IMPLEMENTATION` to ``'2.7'`` in diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/pytest.ini new/Scrapy-2.7.1/pytest.ini --- old/Scrapy-2.7.0/pytest.ini 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/pytest.ini 2022-11-02 12:18:03.000000000 +0100 @@ -24,3 +24,5 @@ filterwarnings = ignore:scrapy.downloadermiddlewares.decompression is deprecated ignore:Module scrapy.utils.reqser is deprecated + ignore:typing.re is deprecated + ignore:typing.io is deprecated diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/VERSION new/Scrapy-2.7.1/scrapy/VERSION --- old/Scrapy-2.7.0/scrapy/VERSION 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/VERSION 2022-11-02 12:18:03.000000000 +0100 @@ -1 +1 @@ -2.7.0 +2.7.1 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/commands/__init__.py new/Scrapy-2.7.1/scrapy/commands/__init__.py --- old/Scrapy-2.7.0/scrapy/commands/__init__.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/commands/__init__.py 2022-11-02 12:18:03.000000000 +0100 @@ -115,9 +115,11 @@ parser.add_argument("-a", dest="spargs", action="append", default=[], metavar="NAME=VALUE", help="set spider argument (may be repeated)") parser.add_argument("-o", "--output", metavar="FILE", action="append", - help="append scraped items to the end of FILE (use - for stdout)") + help="append scraped items to the end of FILE (use - for stdout)," + " to define format set a colon at the end of the output URI (i.e. -o FILE:FORMAT)") parser.add_argument("-O", "--overwrite-output", metavar="FILE", action="append", - help="dump scraped items into FILE, overwriting any existing file") + help="dump scraped items into FILE, overwriting any existing file," + " to define format set a colon at the end of the output URI (i.e. 
-O FILE:FORMAT)") parser.add_argument("-t", "--output-format", metavar="FORMAT", help="format to use for dumping items") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/core/engine.py new/Scrapy-2.7.1/scrapy/core/engine.py --- old/Scrapy-2.7.0/scrapy/core/engine.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/core/engine.py 2022-11-02 12:18:03.000000000 +0100 @@ -257,9 +257,7 @@ def download(self, request: Request, spider: Optional[Spider] = None) -> Deferred: """Return a Deferred which fires with a Response as result, only downloader middlewares are applied""" - if spider is None: - spider = self.spider - else: + if spider is not None: warnings.warn( "Passing a 'spider' argument to ExecutionEngine.download is deprecated", category=ScrapyDeprecationWarning, @@ -267,7 +265,7 @@ ) if spider is not self.spider: logger.warning("The spider '%s' does not match the open spider", spider.name) - if spider is None: + if self.spider is None: raise RuntimeError(f"No open spider to crawl: {request}") return self._download(request, spider).addBoth(self._downloaded, request, spider) @@ -278,11 +276,14 @@ self.slot.remove_request(request) return self.download(result, spider) if isinstance(result, Request) else result - def _download(self, request: Request, spider: Spider) -> Deferred: + def _download(self, request: Request, spider: Optional[Spider]) -> Deferred: assert self.slot is not None # typing self.slot.add_request(request) + if spider is None: + spider = self.spider + def _on_success(result: Union[Response, Request]) -> Union[Response, Request]: if not isinstance(result, (Response, Request)): raise TypeError(f"Incorrect type: expected Response or Request, got {type(result)}: {result!r}") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/downloadermiddlewares/httpproxy.py new/Scrapy-2.7.1/scrapy/downloadermiddlewares/httpproxy.py --- old/Scrapy-2.7.0/scrapy/downloadermiddlewares/httpproxy.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/downloadermiddlewares/httpproxy.py 2022-11-02 12:18:03.000000000 +0100 @@ -78,4 +78,7 @@ del request.headers[b'Proxy-Authorization'] del request.meta['_auth_proxy'] elif b'Proxy-Authorization' in request.headers: - del request.headers[b'Proxy-Authorization'] + if proxy_url: + request.meta['_auth_proxy'] = proxy_url + else: + del request.headers[b'Proxy-Authorization'] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/pipelines/images.py new/Scrapy-2.7.1/scrapy/pipelines/images.py --- old/Scrapy-2.7.0/scrapy/pipelines/images.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/pipelines/images.py 2022-11-02 12:18:03.000000000 +0100 @@ -160,7 +160,14 @@ if size: image = image.copy() - image.thumbnail(size, self._Image.ANTIALIAS) + try: + # Image.Resampling.LANCZOS was added in Pillow 9.1.0 + # remove this try except block, + # when updating the minimum requirements for Pillow. 
+ resampling_filter = self._Image.Resampling.LANCZOS + except AttributeError: + resampling_filter = self._Image.ANTIALIAS + image.thumbnail(size, resampling_filter) buf = BytesIO() image.save(buf, 'JPEG') diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/conf.py new/Scrapy-2.7.1/scrapy/utils/conf.py --- old/Scrapy-2.7.0/scrapy/utils/conf.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/conf.py 2022-11-02 12:18:03.000000000 +0100 @@ -155,6 +155,15 @@ raise UsageError( "Please use only one of -o/--output and -O/--overwrite-output" ) + if output_format: + raise UsageError( + "-t/--output-format is a deprecated command line option" + " and does not work in combination with -O/--overwrite-output." + " To specify a format please specify it after a colon at the end of the" + " output URI (i.e. -O <URI>:<FORMAT>)." + " Example working in the tutorial: " + "scrapy crawl quotes -O quotes.json:json" + ) output = overwrite_output overwrite = True @@ -162,9 +171,13 @@ if len(output) == 1: check_valid_format(output_format) message = ( - 'The -t command line option is deprecated in favor of ' - 'specifying the output format within the output URI. See the ' - 'documentation of the -o and -O options for more information.' + "The -t/--output-format command line option is deprecated in favor of " + "specifying the output format within the output URI using the -o/--output or the" + " -O/--overwrite-output option (i.e. -o/-O <URI>:<FORMAT>). See the documentation" + " of the -o or -O option or the following examples for more information. " + "Examples working in the tutorial: " + "scrapy crawl quotes -o quotes.csv:csv or " + "scrapy crawl quotes -O quotes.json:json" ) warnings.warn(message, ScrapyDeprecationWarning, stacklevel=2) return {output[0]: {'format': output_format}} diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/defer.py new/Scrapy-2.7.1/scrapy/utils/defer.py --- old/Scrapy-2.7.0/scrapy/utils/defer.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/defer.py 2022-11-02 12:18:03.000000000 +0100 @@ -26,7 +26,7 @@ from twisted.python.failure import Failure from scrapy.exceptions import IgnoreRequest -from scrapy.utils.reactor import is_asyncio_reactor_installed +from scrapy.utils.reactor import is_asyncio_reactor_installed, get_asyncio_event_loop_policy def defer_fail(_failure: Failure) -> Deferred: @@ -269,7 +269,8 @@ return ensureDeferred(o) else: # wrapping the coroutine into a Future and then into a Deferred, this requires AsyncioSelectorReactor - return Deferred.fromFuture(asyncio.ensure_future(o)) + event_loop = get_asyncio_event_loop_policy().get_event_loop() + return Deferred.fromFuture(asyncio.ensure_future(o, loop=event_loop)) return o @@ -320,7 +321,8 @@ d = treq.get('https://example.com/additional') additional_response = await deferred_to_future(d) """ - return d.asFuture(asyncio.get_event_loop()) + policy = get_asyncio_event_loop_policy() + return d.asFuture(policy.get_event_loop()) def maybe_deferred_to_future(d: Deferred) -> Union[Deferred, Future]: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/misc.py new/Scrapy-2.7.1/scrapy/utils/misc.py --- old/Scrapy-2.7.0/scrapy/utils/misc.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/misc.py 2022-11-02 12:18:03.000000000 +0100 @@ -9,6 +9,7 @@ from contextlib import 
contextmanager from importlib import import_module from pkgutil import iter_modules +from functools import partial from w3lib.html import replace_entities @@ -226,7 +227,18 @@ return value is None or isinstance(value, ast.NameConstant) and value.value is None if inspect.isgeneratorfunction(callable): - code = re.sub(r"^[\t ]+", "", inspect.getsource(callable)) + func = callable + while isinstance(func, partial): + func = func.func + + src = inspect.getsource(func) + pattern = re.compile(r"(^[\t ]+)") + code = pattern.sub("", src) + + match = pattern.match(src) # finds indentation + if match: + code = re.sub(f"\n{match.group(0)}", "\n", code) # remove indentation + tree = ast.parse(code) for node in walk_callable(tree): if isinstance(node, ast.Return) and not returns_none(node): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/reactor.py new/Scrapy-2.7.1/scrapy/utils/reactor.py --- old/Scrapy-2.7.0/scrapy/utils/reactor.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/reactor.py 2022-11-02 12:18:03.000000000 +0100 @@ -51,6 +51,19 @@ return self._func(*self._a, **self._kw) +def get_asyncio_event_loop_policy(): + policy = asyncio.get_event_loop_policy() + if ( + sys.version_info >= (3, 8) + and sys.platform == "win32" + and not isinstance(policy, asyncio.WindowsSelectorEventLoopPolicy) + ): + policy = asyncio.WindowsSelectorEventLoopPolicy() + asyncio.set_event_loop_policy(policy) + + return policy + + def install_reactor(reactor_path, event_loop_path=None): """Installs the :mod:`~twisted.internet.reactor` with the specified import path. Also installs the asyncio event loop with the specified import @@ -58,16 +71,14 @@ reactor_class = load_object(reactor_path) if reactor_class is asyncioreactor.AsyncioSelectorReactor: with suppress(error.ReactorAlreadyInstalledError): - if sys.version_info >= (3, 8) and sys.platform == "win32": - policy = asyncio.get_event_loop_policy() - if not isinstance(policy, asyncio.WindowsSelectorEventLoopPolicy): - asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) + policy = get_asyncio_event_loop_policy() if event_loop_path is not None: event_loop_class = load_object(event_loop_path) event_loop = event_loop_class() asyncio.set_event_loop(event_loop) else: - event_loop = asyncio.get_event_loop() + event_loop = policy.get_event_loop() + asyncioreactor.install(eventloop=event_loop) else: *module, _ = reactor_path.split(".") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/scrapy/utils/url.py new/Scrapy-2.7.1/scrapy/utils/url.py --- old/Scrapy-2.7.0/scrapy/utils/url.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/scrapy/utils/url.py 2022-11-02 12:18:03.000000000 +0100 @@ -34,6 +34,7 @@ lowercase_path = parse_url(url).path.lower() return any(lowercase_path.endswith(ext) for ext in extensions) + def parse_url(url, encoding=None): """Return urlparsed url from the given argument (which could be an already parsed url) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_downloadermiddleware_httpproxy.py new/Scrapy-2.7.1/tests/test_downloadermiddleware_httpproxy.py --- old/Scrapy-2.7.0/tests/test_downloadermiddleware_httpproxy.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_downloadermiddleware_httpproxy.py 2022-11-02 12:18:03.000000000 +0100 @@ -400,6 +400,9 @@ self.assertNotIn(b'Proxy-Authorization', 
request.headers) def test_proxy_authentication_header_proxy_without_credentials(self): + """As long as the proxy URL in request metadata remains the same, the + Proxy-Authorization header is used and kept, and may even be + changed.""" middleware = HttpProxyMiddleware() request = Request( 'https://example.com', @@ -408,7 +411,16 @@ ) assert middleware.process_request(request, spider) is None self.assertEqual(request.meta['proxy'], 'https://example.com') - self.assertNotIn(b'Proxy-Authorization', request.headers) + self.assertEqual(request.headers['Proxy-Authorization'], b'Basic foo') + + assert middleware.process_request(request, spider) is None + self.assertEqual(request.meta['proxy'], 'https://example.com') + self.assertEqual(request.headers['Proxy-Authorization'], b'Basic foo') + + request.headers['Proxy-Authorization'] = b'Basic bar' + assert middleware.process_request(request, spider) is None + self.assertEqual(request.meta['proxy'], 'https://example.com') + self.assertEqual(request.headers['Proxy-Authorization'], b'Basic bar') def test_proxy_authentication_header_proxy_with_same_credentials(self): middleware = HttpProxyMiddleware() diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_loader.py new/Scrapy-2.7.1/tests/test_loader.py --- old/Scrapy-2.7.0/tests/test_loader.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_loader.py 2022-11-02 12:18:03.000000000 +0100 @@ -295,7 +295,7 @@ l.add_css('name', 'div::text') self.assertEqual(l.get_output_value('name'), ['Marta']) - + def test_init_method_with_base_response(self): """Selector should be None after initialization""" response = Response("https://scrapy.org") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_utils_asyncio.py new/Scrapy-2.7.1/tests/test_utils_asyncio.py --- old/Scrapy-2.7.0/tests/test_utils_asyncio.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_utils_asyncio.py 2022-11-02 12:18:03.000000000 +0100 @@ -1,3 +1,4 @@ +import warnings from unittest import TestCase from pytest import mark @@ -13,5 +14,6 @@ self.assertEqual(is_asyncio_reactor_installed(), self.reactor_pytest == 'asyncio') def test_install_asyncio_reactor(self): - # this should do nothing - install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor") + with warnings.catch_warnings(record=True) as w: + install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor") + self.assertEqual(len(w), 0) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/Scrapy-2.7.0/tests/test_utils_misc/test_return_with_argument_inside_generator.py new/Scrapy-2.7.1/tests/test_utils_misc/test_return_with_argument_inside_generator.py --- old/Scrapy-2.7.0/tests/test_utils_misc/test_return_with_argument_inside_generator.py 2022-10-17 15:11:29.000000000 +0200 +++ new/Scrapy-2.7.1/tests/test_utils_misc/test_return_with_argument_inside_generator.py 2022-11-02 12:18:03.000000000 +0100 @@ -1,5 +1,6 @@ import unittest import warnings +from functools import partial from unittest import mock from scrapy.utils.misc import is_generator_with_return_value, warn_on_generator_with_return_value @@ -165,9 +166,99 @@ warn_on_generator_with_return_value(None, l2) self.assertEqual(len(w), 0) + def test_generators_return_none_with_decorator(self): + def decorator(func): + def inner_func(): + func() + return inner_func + + @decorator + def f3(): + yield 1 + return None + + 
@decorator + def g3(): + yield 1 + return + + @decorator + def h3(): + yield 1 + + @decorator + def i3(): + yield 1 + yield from generator_that_returns_stuff() + + @decorator + def j3(): + yield 1 + + def helper(): + return 0 + + yield helper() + + @decorator + def k3(): + """ +docstring + """ + url = """ +https://example.org + """ + yield url + return + + @decorator + def l3(): + return + + assert not is_generator_with_return_value(top_level_return_none) + assert not is_generator_with_return_value(f3) + assert not is_generator_with_return_value(g3) + assert not is_generator_with_return_value(h3) + assert not is_generator_with_return_value(i3) + assert not is_generator_with_return_value(j3) # not recursive + assert not is_generator_with_return_value(k3) # not recursive + assert not is_generator_with_return_value(l3) + + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, top_level_return_none) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, f3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, g3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, h3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, i3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, j3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, k3) + self.assertEqual(len(w), 0) + with warnings.catch_warnings(record=True) as w: + warn_on_generator_with_return_value(None, l3) + self.assertEqual(len(w), 0) + @mock.patch("scrapy.utils.misc.is_generator_with_return_value", new=_indentation_error) def test_indentation_error(self): with warnings.catch_warnings(record=True) as w: warn_on_generator_with_return_value(None, top_level_return_none) self.assertEqual(len(w), 1) self.assertIn('Unable to determine', str(w[0].message)) + + def test_partial(self): + def cb(arg1, arg2): + yield {} + + partial_cb = partial(cb, arg1=42) + assert not is_generator_with_return_value(partial_cb)
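
For illustration, a minimal, hypothetical sketch (not part of the packaged sources) of how the relaxed restriction described in the 2.7.1 release notes above can be used from a spider: the credential-less proxy URL goes into the ``proxy`` request meta key and the ``Proxy-Authorization`` header is set explicitly; as of 2.7.1, HttpProxyMiddleware keeps that header for as long as the proxy URL stays the same. The spider name, proxy URL and credentials below are made up::

    import scrapy
    from w3lib.http import basic_auth_header


    class ProxyAuthExampleSpider(scrapy.Spider):
        # Hypothetical spider, for illustration only.
        name = "proxy_auth_example"
        start_urls = ["https://example.com"]

        def start_requests(self):
            for url in self.start_urls:
                yield scrapy.Request(
                    url,
                    meta={
                        # Proxy URL without embedded credentials.
                        "proxy": "https://proxy.example.com:8011",
                    },
                    headers={
                        # Explicit header; kept by HttpProxyMiddleware as long
                        # as the proxy URL above remains unchanged.
                        "Proxy-Authorization": basic_auth_header("user", "pass"),
                    },
                )

        def parse(self, response):
            yield {"status": response.status}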
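
The Pillow compatibility fix in scrapy/pipelines/images.py above follows a small fallback pattern that can also be used outside Scrapy. A self-contained sketch, assuming only that Pillow is installed (the function name and thumbnail size are arbitrary)::

    from io import BytesIO

    from PIL import Image  # Pillow


    def make_jpeg_thumbnail(data: bytes, size=(128, 128)) -> bytes:
        """Return JPEG thumbnail bytes for the given image bytes."""
        image = Image.open(BytesIO(data)).convert("RGB")
        try:
            # Image.Resampling.LANCZOS was added in Pillow 9.1.0.
            resampling_filter = Image.Resampling.LANCZOS
        except AttributeError:
            # Older Pillow versions only offer the deprecated constant.
            resampling_filter = Image.ANTIALIAS
        image.thumbnail(size, resampling_filter)
        buf = BytesIO()
        image.save(buf, "JPEG")
        return buf.getvalue()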