[Bug 21526] Bug in Djvu text layer extraction

2010-12-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 MZMcBride changed: What|Removed |Added CC||b...@mzmcbride.com --- Comment #17 from MZ

[Bug 21526] Bug in Djvu text layer extraction

2010-12-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #16 from Simon Lipp 2010-12-08 11:41:32 UTC --- @Tim Starling: I wasn’t aware of the performance issues of using create_function, sorry. But since the created function is static, it should be trivial to factor it out; I used create

[Bug 21526] Bug in Djvu text layer extraction

2010-12-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #15 from Andrew Billinghurst 2010-12-08 11:28:21 UTC --- Many thanks to all. As a side note to Wikisourcerers: the files need to be purged at Commons to get them to reload the text layer properly.

[Bug 21526] Bug in Djvu text layer extraction

2010-12-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 Tim Starling changed: What|Removed |Added CC||tstarl...@wikimedia.org --- Comment #14

[Bug 21526] Bug in Djvu text layer extraction

2010-11-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #13 from Andrew Billinghurst 2010-11-03 10:55:19 UTC --- This has been reported as fixed for some time, and even after asking nicely for it to be given some priority for the Wikisource sites, there is neither action nor evidenc

[Bug 21526] Bug in Djvu text layer extraction

2010-07-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #12 from Simon Lipp 2010-07-25 15:17:25 UTC --- Well, in the meantime, it’s still possible to manually fix the broken DjVu files; my own PDF-to-DjVu converter has these lines: # Workaround for MediaWiki bug #21526 # see https:/

[Bug 21526] Bug in Djvu text layer extraction

2010-07-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #11 from Andrew Billinghurst 2010-07-25 14:44:47 UTC --- It would be nice if this bug fix could be considered out of session, so that it can be implemented at the Wikisource sites ahead of scheduled updates (next full application review

[Bug 21526] Bug in Djvu text layer extraction

2010-07-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 ThomasV changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|

[Bug 21526] Bug in Djvu text layer extraction

2010-07-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #9 from Simon Lipp 2010-07-07 10:15:52 UTC --- Created an attachment (id=7557) --> (https://bugzilla.wikimedia.org/attachment.cgi?id=7557) Patch Found the problem (I dropped the empty-page case). Attached an updated patch that fi

[Bug 21526] Bug in Djvu text layer extraction

2010-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #8 from Simon Lipp 2010-07-06 16:10:27 UTC --- > With the patch, pages are no longer aligned with the text. Strange; when I made the patch, I didn’t see this problem. I’ll look at it this week.

[Bug 21526] Bug in Djvu text layer extraction

2010-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #7 from Andrew Billinghurst 2010-07-06 16:02:48 UTC --- http://en.wikisource.org/wiki/Index:Blackwood%27s_Magazine_volume_003.djvu has a problem at 122 http://en.wikisource.org/w/index.php?title=Page:Blackwood%27s_Magazine_volume_0

[Bug 21526] Bug in Djvu text layer extraction

2010-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #6 from ThomasV 2010-07-06 08:58:21 UTC --- I tested your patch on this djvu file: http://fr.wikisource.org/wiki/Livre:Revue_des_Romans_%281839%29.djvu The file does not have the bug; djvu text extraction works without the patch. W

[Bug 21526] Bug in Djvu text layer extraction

2010-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #5 from Simon Lipp 2010-07-06 08:32:09 UTC --- > or provide a posix regexp ? That’s not possible. Matching C-like quoted strings needs look-ahead and possessive operators, which are not available in POSIX syntax. But if you have a
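For readers unfamiliar with the Perl-compatible syntax discussed in this comment, here is a minimal Python sketch of one common way to match C-like quoted strings with backslash escapes. It is not the regexp from the attached patch: the pattern, the name `find_quoted_strings`, and the sample input are all illustrative assumptions. Python's `re` module uses Perl-style syntax rather than POSIX.

```python
import re

# Illustrative pattern (an assumption, not the patch's regexp):
# an opening quote, then any run of characters that are neither a quote
# nor a backslash, or a backslash followed by any character, then a
# closing quote.
QUOTED = re.compile(r'"(?:[^"\\]|\\.)*"', re.DOTALL)


def find_quoted_strings(text):
    """Return every C-like quoted string found in text, quotes included."""
    return QUOTED.findall(text)


# Made-up sample containing escaped quotes and an escaped backslash.
sample = r'(word 1 2 3 4 "say \"hi\"") (word 5 6 7 8 "back\\slash")'
for match in find_quoted_strings(sample):
    print(match)
```

The two alternatives inside the group cannot match the same character, so an escaped quote is always consumed as part of the string rather than being mistaken for its end.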

[Bug 21526] Bug in Djvu text layer extraction

2010-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #4 from ThomasV 2010-07-06 08:12:44 UTC --- The proposed patch is a Perl-compatible regexp. I am not familiar with that syntax, which is why I have not committed it. Could someone have a look at it, or provide a POSIX regexp?

[Bug 21526] Bug in Djvu text layer extraction

2010-07-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 Andrew Billinghurst changed: What|Removed |Added CC||billinghu...@gmail.com A

[Bug 21526] Bug in Djvu text layer extraction

2010-04-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 Lars Aronsson changed: What|Removed |Added Priority|Normal |High Severity|enhancement

[Bug 21526] Bug in Djvu text layer extraction

2010-04-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 --- Comment #2 from Lars Aronsson 2010-05-01 01:12:56 UTC --- To extract the OCR text (without pixel coordinates for each word) for page NNN, this command should do: djvused -e 'select NNN; print-pure-txt' FILENAME.djvu
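The djvused command quoted in this comment can be wrapped in a script. The following is a minimal Python sketch, assuming the `djvused` binary from DjVuLibre is installed and on the PATH. The function names `build_djvused_command` and `extract_page_text` are hypothetical, and decoding the output as UTF-8 is an assumption, not something the thread states.

```python
import subprocess


def build_djvused_command(filename, page):
    """Build the argument list for the djvused call quoted above."""
    # int() rejects anything that is not a page number before it
    # reaches the djvused script string.
    return ["djvused", "-e", f"select {int(page)}; print-pure-txt", filename]


def extract_page_text(filename, page):
    """Run djvused and return the plain OCR text of a single page.

    Assumes djvused is installed; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        build_djvused_command(filename, page),
        capture_output=True,
        check=True,
    )
    # Assumption: the text layer is UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")


print(build_djvused_command("FILENAME.djvu", 12))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or apostrophes.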

[Bug 21526] Bug in Djvu text layer extraction

2010-04-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 Lars Aronsson changed: What|Removed |Added CC||l...@aronsson.se --- Comment #1 from L

[Bug 21526] Bug in Djvu text layer extraction

2009-11-15 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21526 ThomasV changed: What|Removed |Added CC||thoma...@gmx.de Status|NEW