Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package python-tesserocr for openSUSE:Factory checked in at 2023-03-19 00:31:08
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-tesserocr (Old)
 and      /work/SRC/openSUSE:Factory/.python-tesserocr.new.31432 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-tesserocr"

Sun Mar 19 00:31:08 2023 rev:15 rq:1072758 version:2.6.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-tesserocr/python-tesserocr.changes 2023-02-27 12:56:07.307701106 +0100
+++ /work/SRC/openSUSE:Factory/.python-tesserocr.new.31432/python-tesserocr.changes 2023-03-19 00:31:08.508251627 +0100
@@ -1,0 +2,15 @@
+Fri Mar 17 22:09:28 UTC 2023 - Mia Herkt <m...@0x0.st>
+
+- Update to 2.6.0
+  * _pix_to_image now works with binary images
+    gh#sirfz/tesserocr#274
+  * SetImage with alpha channels support
+    gh#sirfz/tesserocr#280
+  * Leptonica 1.83.0 support
+    gh#sirfz/tesserocr#306
+  * Pointsize should be returned even if fontname doesn't exist
+    gh#sirfz/tesserocr#308
+  * Added Python 3.10, 3.11 setup classifiers
+- Drop 1441bec703cf68161acce5e85907ddd71c47fdc3.patch
+
+-------------------------------------------------------------------

Old:
----
  1441bec703cf68161acce5e85907ddd71c47fdc3.patch
  tesserocr-2.5.2.tar.gz

New:
----
  tesserocr-2.6.0.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-tesserocr.spec ++++++
--- /var/tmp/diff_new_pack.RxOr3S/_old 2023-03-19 00:31:09.664257111 +0100
+++ /var/tmp/diff_new_pack.RxOr3S/_new 2023-03-19 00:31:09.668257130 +0100
@@ -17,14 +17,13 @@


 Name: python-tesserocr
-Version: 2.5.2
+Version: 2.6.0
 Release: 0
 Summary: A Python wrapper around tesseract-ocr
 License: MIT
 Group: Development/Languages/Python
 URL: https://github.com/sirfz/tesserocr
 Source: https://files.pythonhosted.org/packages/source/t/tesserocr/tesserocr-%{version}.tar.gz
-Patch1: 1441bec703cf68161acce5e85907ddd71c47fdc3.patch
 BuildRequires: %{python_module Cython}
 BuildRequires: %{python_module Pillow}
 BuildRequires: %{python_module devel}
@@ -52,7 +51,6 @@

 %prep
 %setup -q -n tesserocr-%{version}
-%patch1 -p1

 %build
 %python_build

++++++ tesserocr-2.5.2.tar.gz -> tesserocr-2.6.0.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/PKG-INFO new/tesserocr-2.6.0/PKG-INFO
--- old/tesserocr-2.5.2/PKG-INFO 2021-06-19 23:08:30.000000000 +0200
+++ new/tesserocr-2.6.0/PKG-INFO 2023-03-14 13:54:18.325920600 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: tesserocr
-Version: 2.5.2
+Version: 2.6.0
 Summary: A simple, Pillow-friendly, Python wrapper around tesseract-ocr API using Cython
 Home-page: https://github.com/sirfz/tesserocr
 Author: Fayez Zouheiry
@@ -14,8 +14,8 @@
         wrapper around the ``tesseract-ocr`` API for Optical Character Recognition
         (OCR).

-        .. image:: https://travis-ci.org/sirfz/tesserocr.svg?branch=master
-            :target: https://travis-ci.org/sirfz/tesserocr
+        .. image:: https://travis-ci.com/sirfz/tesserocr.svg?branch=master
+            :target: https://travis-ci.com/sirfz/tesserocr
             :alt: TravisCI build status

         .. image:: https://img.shields.io/pypi/v/tesserocr.svg?maxAge=2592000
@@ -293,6 +293,8 @@
 Classifier: Programming Language :: Python :: 3.7
 Classifier: Programming Language :: Python :: 3.8
 Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: Implementation :: CPython
 Classifier: Programming Language :: Python :: Implementation :: PyPy
 Classifier: Programming Language :: Cython
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/README.rst new/tesserocr-2.6.0/README.rst
--- old/tesserocr-2.5.2/README.rst 2021-06-19 23:05:59.000000000 +0200
+++ new/tesserocr-2.6.0/README.rst 2021-09-14 14:55:56.000000000 +0200
@@ -6,8 +6,8 @@
 wrapper around the ``tesseract-ocr`` API for Optical Character Recognition
 (OCR).

-.. image:: https://travis-ci.org/sirfz/tesserocr.svg?branch=master
-    :target: https://travis-ci.org/sirfz/tesserocr
+.. image:: https://travis-ci.com/sirfz/tesserocr.svg?branch=master
+    :target: https://travis-ci.com/sirfz/tesserocr
     :alt: TravisCI build status

 .. image:: https://img.shields.io/pypi/v/tesserocr.svg?maxAge=2592000
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/setup.py new/tesserocr-2.6.0/setup.py
--- old/tesserocr-2.5.2/setup.py 2021-06-19 22:02:07.000000000 +0200
+++ new/tesserocr-2.6.0/setup.py 2023-03-13 16:14:00.000000000 +0100
@@ -317,6 +317,8 @@
         'Programming Language :: Python :: 3.7',
         'Programming Language :: Python :: 3.8',
         'Programming Language :: Python :: 3.9',
+        'Programming Language :: Python :: 3.10',
+        'Programming Language :: Python :: 3.11',
         'Programming Language :: Python :: Implementation :: CPython',
         'Programming Language :: Python :: Implementation :: PyPy',
         'Programming Language :: Cython',
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/tesseract.pxd new/tesserocr-2.6.0/tesseract.pxd
--- old/tesserocr-2.5.2/tesseract.pxd 2021-06-19 22:09:33.000000000 +0200
+++ new/tesserocr-2.6.0/tesseract.pxd 2021-09-14 18:48:53.000000000 +0200
@@ -8,6 +8,7 @@
 cdef extern from "leptonica/allheaders.h" nogil:
     struct Pix:
         int informat
+        int d

     struct Box:
         int x
@@ -36,6 +37,7 @@
     Pix *pixReadMemBmp(cuchar_t *, size_t)
     int pixWriteMemJpeg(unsigned char **, size_t *, Pix *, int, int)
     int pixWriteMem(unsigned char **, size_t *, Pix *, int)
+    Pix *pixConvertTo8(Pix *, int)
     void pixDestroy(Pix **)
     void ptaDestroy(Pta **)
     int setMsgSeverity(int)
@@ -236,6 +238,10 @@
     cdef cppclass TessTextRenderer(TessResultRenderer):
         TessTextRenderer(cchar_t *) except +

+    IF TESSERACT_VERSION >= 0x3999800:
+        cdef cppclass TessAltoRenderer(TessResultRenderer):
+            TessAltoRenderer(cchar_t *) except +
+
     cdef cppclass TessHOcrRenderer(TessResultRenderer):
         TessHOcrRenderer(cchar_t *, bool) except +

diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/tesseract5.pxd new/tesserocr-2.6.0/tesseract5.pxd
--- old/tesserocr-2.5.2/tesseract5.pxd 2021-06-19 22:53:48.000000000 +0200
+++ new/tesserocr-2.6.0/tesseract5.pxd 2023-03-13 16:13:32.000000000 +0100
@@ -9,6 +9,7 @@
 cdef extern from "leptonica/allheaders.h" nogil:
     struct Pix:
         int informat
+        int d

     struct Box:
         int x
@@ -37,6 +38,7 @@
     Pix *pixReadMemBmp(cuchar_t *, size_t)
     int pixWriteMemJpeg(unsigned char **, size_t *, Pix *, int, int)
     int pixWriteMem(unsigned char **, size_t *, Pix *, int)
+    Pix *pixConvertTo8(Pix *, int)
     void pixDestroy(Pix **)
     void ptaDestroy(Pta **)
     int setMsgSeverity(int)
@@ -52,6 +54,16 @@
     L_SEVERITY_ERROR = 5 # Print error and higher messages
     L_SEVERITY_NONE = 6 # Highest severity: print no messages

+cdef extern from *:
+    """
+    #if (LIBLEPT_MAJOR_VERSION > 1) || (LIBLEPT_MINOR_VERSION > 82)
+    // The public API of Leptonica 1.83.0 hides details of some data
+    // structures which are used by tesserocr (see Pix, Box, ... above).
+    // Get those details by including a private header file.
+    #include <leptonica/pix_internal.h>
+    #endif
+    """
+
 cdef extern from "tesseract/publictypes.h" namespace "tesseract" nogil:
     cdef enum PolyBlockType:
         PT_UNKNOWN # Type is not yet known. Keep as the first element.
@@ -175,6 +187,9 @@
     cdef cppclass TessTextRenderer(TessResultRenderer):
         TessTextRenderer(cchar_t *) except +

+    cdef cppclass TessAltoRenderer(TessResultRenderer):
+        TessAltoRenderer(cchar_t *) except +
+
     cdef cppclass TessHOcrRenderer(TessResultRenderer):
         TessHOcrRenderer(cchar_t *, bool) except +

diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/tesserocr.egg-info/PKG-INFO new/tesserocr-2.6.0/tesserocr.egg-info/PKG-INFO
--- old/tesserocr-2.5.2/tesserocr.egg-info/PKG-INFO 2021-06-19 23:08:29.000000000 +0200
+++ new/tesserocr-2.6.0/tesserocr.egg-info/PKG-INFO 2023-03-14 13:54:17.000000000 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: tesserocr
-Version: 2.5.2
+Version: 2.6.0
 Summary: A simple, Pillow-friendly, Python wrapper around tesseract-ocr API using Cython
 Home-page: https://github.com/sirfz/tesserocr
 Author: Fayez Zouheiry
@@ -14,8 +14,8 @@
         wrapper around the ``tesseract-ocr`` API for Optical Character Recognition
         (OCR).

-        .. image:: https://travis-ci.org/sirfz/tesserocr.svg?branch=master
-            :target: https://travis-ci.org/sirfz/tesserocr
+        .. image:: https://travis-ci.com/sirfz/tesserocr.svg?branch=master
+            :target: https://travis-ci.com/sirfz/tesserocr
             :alt: TravisCI build status

         .. image:: https://img.shields.io/pypi/v/tesserocr.svg?maxAge=2592000
@@ -293,6 +293,8 @@
 Classifier: Programming Language :: Python :: 3.7
 Classifier: Programming Language :: Python :: 3.8
 Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: Implementation :: CPython
 Classifier: Programming Language :: Python :: Implementation :: PyPy
 Classifier: Programming Language :: Cython
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tesserocr-2.5.2/tesserocr.pyx new/tesserocr-2.6.0/tesserocr.pyx
--- old/tesserocr-2.5.2/tesserocr.pyx 2021-06-19 22:09:33.000000000 +0200
+++ new/tesserocr-2.6.0/tesserocr.pyx 2023-03-14 13:52:05.000000000 +0100
@@ -18,7 +18,7 @@
      ['eng', 'osd', 'equ'])
 """

-__version__ = '2.5.2'
+__version__ = '2.6.0'

 import os
 from io import BytesIO
@@ -329,6 +329,9 @@
 cdef bytes _image_buffer(image):
     """Return raw bytes of a PIL Image"""
     with BytesIO() as f:
+        # Pix and BMP only allow alpha as RGBA:
+        if image.mode in ['LA', 'PA', 'RGBa', 'La']:
+            image = image.convert('RGBA')
         image.save(f, image.format or 'BMP')
         return f.getvalue()

@@ -340,6 +343,9 @@
         size_t size
         int result
         int fmt = pix.informat
+    if pix.d == 1:
+        # prevent catastrophic 8-bit conversion
+        pix = pixConvertTo8(pix, 0)
     if fmt > 0:
         result = pixWriteMem(&buff, &size, pix, fmt)
     else:
@@ -889,7 +895,7 @@
             &is_monospace, &is_serif, &is_smallcaps, &pointsize, &font_id)
         if font_name == NULL:
-            return None
+            font_name = ""
         return {
             'font_name': font_name,
             'bold': is_bold,
@@ -2069,10 +2075,19 @@
             renderer = new TessOsdRenderer(outputbase)
             return renderer

+        IF TESSERACT_VERSION >= 0x3999800:
+            self._baseapi.GetBoolVariable("tessedit_create_alto", &b)
+            if b:
+                renderer = new TessAltoRenderer(outputbase)
+
         self._baseapi.GetBoolVariable("tessedit_create_hocr", &b)
         if b:
             self._baseapi.GetBoolVariable("hocr_font_info", &font_info)
-            renderer = new TessHOcrRenderer(outputbase, font_info)
+            temp = new TessHOcrRenderer(outputbase, font_info)
+            if renderer == NULL:
+                renderer = temp
+            else:
+                renderer.insert(temp)

         self._baseapi.GetBoolVariable("tessedit_create_pdf", &b)
         if b:
@@ -2120,6 +2135,9 @@
         Set at least one of the following variables to enable renderers
         before calling this method::

+            tessedit_create_alto (bool): ALTO Renderer
+                Make sure to set ``document_title`` to the image filename if you
+                want to have the ALTO-XML reference it.
             tessedit_create_hocr (bool): hOCR Renderer
                 if ``font_info`` is ``True`` then it'll be included in the output.
             tessedit_create_pdf (bool): PDF Renderer
@@ -2135,8 +2153,13 @@
         Args:
             outputbase (str): The name of the output file excluding extension.
                 For example, "/path/to/chocolate-chip-cookie-recipe".
+                Must not be empty. Use "-" or "stdout" to write to the current
+                process' standard output.
             filename (str): Can point to a single image, a multi-page TIFF,
-                or a plain text list of image filenames.
+                or a plain text list of image filenames. If Tesseract is built
+                with libcurl support, and ``str`` is a URL starting with "http:"
+                or "https:" then the image file is downloaded from that location
+                to the current working directory first.

         Kwargs:
             retry_config (str): Is useful for debugging. If specified, you can fall
@@ -2174,11 +2197,13 @@
         """Turn a single image into symbolic text.

         See :meth:`ProcessPages` for descriptions of the keyword arguments
-        and all other details.
+        and all other details (esp. output renderers).

         Args:
             outputbase (str): The name of the output file excluding extension.
                 For example, "/path/to/chocolate-chip-cookie-recipe".
+                Must not be empty. Use "-" or "stdout" to write to the current
+                process' standard output.
             image (:class:`PIL.Image`): The image processed.
             page_index (int): Page index (metadata).
             filename (str): `filename` and `page_index` are metadata