Ninovolador added a comment.
In T352524#9374563 <https://phabricator.wikimedia.org/T352524#9374563>, @Samwilson wrote:

>> Maybe someone at Wikimedia OCR project can tell us how they manage to get the full resolution image from every page
>
> The Wikisource extension uses the same image that's already in the Page namespace page (i.e. from ProofreadPage). The width of this image can be customized per Index page, but is usually somewhere around 1000 pixels.

That's curious. I went to check and noticed that when you zoom out in the OpenSeadragon viewer, the OCR button uses the lower-resolution image. I don't think this is expected. Compare:

https://ocr.wmcloud.org/api.php?engine=tesseract&langs[]=es&image=https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Origen_de_las_especies_por_medio_de_la_selecci%C3%B3n_natural.djvu/page141-987px-Origen_de_las_especies_por_medio_de_la_selecci%C3%B3n_natural.djvu.jpg&uselang=es

https://ocr.wmcloud.org/api.php?engine=tesseract&langs[]=es&image=https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Origen_de_las_especies_por_medio_de_la_selecci%C3%B3n_natural.djvu/page141-98700px-Origen_de_las_especies_por_medio_de_la_selecci%C3%B3n_natural.djvu.jpg&uselang=es

Fortunately, if you request an arbitrarily large image size (I guess something like 3000px is enough for most cases), the thumbnail server returns the highest resolution available, and OCR quality improves.
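The workaround described above can be sketched in Python. This is only an illustration, not how the Wikisource extension or the OCR tool actually does it: the helper names `with_thumb_width` and `build_ocr_url` are made up for this example, the 3000px default is just the guess from the comment above, and the assumption that the thumbnail server falls back to the highest available resolution rests on the observation reported here. The code only builds URLs and makes no network requests.

```python
import re
from urllib.parse import urlencode

# Endpoint copied from the URLs quoted in the comment above.
OCR_API = "https://ocr.wmcloud.org/api.php"


def with_thumb_width(thumb_url: str, width: int = 3000) -> str:
    """Return thumb_url with the '<N>px-' width in its last path segment replaced.

    Hypothetical helper: 3000 is only the guess mentioned in the comment.
    """
    head, sep, last = thumb_url.rpartition("/")
    # The width appears either at the start of the segment ("800px-File.jpg")
    # or after a page prefix ("page141-987px-File.djvu.jpg").
    new_last, count = re.subn(
        r"(^|-)\d+px-",
        lambda m: f"{m.group(1)}{width}px-",
        last,
        count=1,
    )
    if count == 0:
        raise ValueError("no '<N>px-' width segment found in thumbnail URL")
    return head + sep + new_last


def build_ocr_url(image_url: str, lang: str, engine: str = "tesseract") -> str:
    """Build an OCR API URL using the parameters seen in the quoted URLs.

    Hypothetical helper: the query string is percent-encoded by urlencode,
    so it will not look byte-for-byte like the hand-written URLs above.
    """
    query = urlencode(
        [("engine", engine), ("langs[]", lang), ("image", image_url), ("uselang", lang)]
    )
    return f"{OCR_API}?{query}"
```

For example, `build_ocr_url(with_thumb_width(thumb_url), "es")` would take the 987px thumbnail URL from the first link and produce an OCR request for the 3000px variant instead.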
