[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-19 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #39 from Ziyuan Yao yaoziy...@gmail.com 2012-01-19 23:50:38 UTC 
---
Have the render servers been updated yet? I still see Chinese lines that do not
take up a page's full width (there is much space left on the right side of each
Chinese line).

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
You are on the CC list for the bug.

___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #37 from Volker Haas volker.h...@pediapress.com 2012-01-13 
14:08:02 UTC ---
I updated to the latest reportlab version. The problem with mixing CJK and
non-CJK text should be fixed. The render servers will be updated sometime next
week.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #38 from Ziyuan Yao yaoziy...@gmail.com 2012-01-13 14:25:05 UTC 
---
Volker: Appreciate your hard work!



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #34 from Volker Haas volker.h...@pediapress.com 2012-01-12 
13:04:48 UTC ---
I just found out that the latest reportlab version seems to handle non-CJK text
inside CJK text (with wordWrap='CJK') correctly. My earlier installation of the
newest reportlab version had failed, and I didn't realize that.
Merging the latest reportlab version should therefore solve this problem.
I'll see if I can do this...

One problem with non-CJK text inside CJK text remains: the text isn't justified
correctly anymore, but I'd just ignore that...



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #35 from Ziyuan Yao yaoziy...@gmail.com 2012-01-12 13:14:08 UTC 
---
Great to hear that. Eager to see a sample PDF of your latest finding.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #36 from Ziyuan Yao yaoziy...@gmail.com 2012-01-12 16:40:16 UTC 
---
I confirm. I downloaded and installed the latest snapshot
reportlab-20120111203740 successfully and ran your test script. It does wrap
both CJK and Western text correctly.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

Volker Haas volker.h...@pediapress.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #23 from Volker Haas volker.h...@pediapress.com 2012-01-11 
09:19:55 UTC ---
As you also found out, reportlab does not support zero-width space characters.
I needed that for other purposes in the past as well. The best solution/hack I
could come up with was to use a space and set its font size to the smallest
possible value.

I implemented the following:

In all non-CJK wikis, the text is checked for CJK characters. If CJK characters
are found, fake zero-width space characters are inserted. I tested this on a
couple of articles and the strategy seems to make sense.
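
The strategy described above can be sketched roughly as follows. This is only an illustration, not the actual render-server code: the function names, the code point ranges used to detect CJK characters, and the `<font size="1"> </font>` markup for the "fake zero-width space" are all assumptions.

```python
# Hypothetical sketch; none of these names come from the real implementation.

# Assumed markup for a fake zero-width space: an ordinary space in a tiny font.
TINY_SPACE = '<font size="1"> </font>'


def is_cjk(ch):
    """Rough CJK test; the code point ranges are an illustrative assumption."""
    cp = ord(ch)
    return (0x3000 <= cp <= 0x30FF or 0x3400 <= cp <= 0x4DBF
            or 0x4E00 <= cp <= 0x9FFF or 0xAC00 <= cp <= 0xD7AF
            or 0xFF00 <= cp <= 0xFFEF)


def add_fake_zero_width_spaces(text):
    """Insert a tiny space after each CJK character.

    Text that contains no CJK characters is returned unchanged.
    """
    if not any(is_cjk(ch) for ch in text):
        return text
    out = []
    for ch in text:
        out.append(ch)
        if is_cjk(ch):
            out.append(TINY_SPACE)
    return "".join(out)
```

With Western-style wrapping, each inserted space then becomes a possible line break while staying almost invisible in the output.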

As for your suggestion to use a PDF framework other than reportlab: doing this
would be a huge amount of work, so it is not an option at the moment.

The render servers will be updated in the next 24 hours. I'll close this as
fixed.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #24 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 10:31:56 UTC 
---
Great news. Can you give me a PDF that demonstrates your smallest-size spaces?



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #25 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 10:36:18 UTC 
---
I like your solution for non-cjk wikis (using tiny spaces). But you didn't
mention what to do with cjk wikis. I assume you will use wordwrap=CJK for them,
right?



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #26 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 10:40:57 UTC 
---
I just tried out your tiniest space concept in LibreOffice. Perfect!
Virtually invisible spaces! You're a genius. No need to show me the PDF now.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #27 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 10:46:51 UTC 
---
One more question: Your tiny-space idea is a universal solution that can also
apply to CJK wikis, because a CJK wiki can also contain Western words (which
are better wrapped at spaces).

What is the reason you don't apply it to cjk wikis?



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #28 from Volker Haas volker.h...@pediapress.com 2012-01-11 
13:03:00 UTC ---
For CJK wikis the built-in CJK word wrapping of reportlab is used. This
probably breaks non-CJK text that is embedded... But I am pretty sure that, at
least for Japanese, the algorithms to break lines are more sophisticated than
just splitting after any letter. I am hoping that the built-in reportlab word
wrapping function does that. But I am not sure...



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #29 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 14:07:10 UTC 
---
First, using ReportLab's cjk wordwrap algorithm will break English words into
two lines. This is well demonstrated by your own test script.

Second, also ReportLab's cjk wordwrap algorithm can more sophisticatedly break
Japanese sentences, this benefit is very small, while the drawback of cutting
Western words in halves is very significant.

In Chinese, some full-width punctuation marks such as ,。;” generally don't
appear at the beginning of a line either, but as a Chinese I consider this an
expendable rule if we can keep Western words uncut.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #30 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 14:07:48 UTC 
---
s/also/although



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #31 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 14:22:29 UTC 
---
Here are two Wikipedia links that talk about the so-called CJK wordwrap rules:

http://en.wikipedia.org/wiki/Word_wrap#Word_wrapping_in_text_containing_Chinese.2C_Japanese.2C_and_Korean

http://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_language#Line_breaking_rules_in_Japanese_text_.28Kinsoku_Shori.29

I have reviewed them all. Not a single one of them is as important as not
breaking Western words across two lines. They can be ignored altogether. Most
text editors and viewers don't obey these rules anyway.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #32 from Ziyuan Yao yaoziy...@gmail.com 2012-01-12 04:32:58 UTC 
---
Found a problem with the tiny space approach: Chinese characters don't take up
the full space of a line; there is still much space left on the right side of
each line. For example: try http://www.mediawiki.org/wiki/MediaWiki/zh-hans

I guess this is caused by how ReportLab counts the text length of what's
already put on a line: after putting each word, it adds that word's length and
a normal space's width. But now there are actually two kinds of space width:
normal width (as between two English words) and tiny width (as between two
Chinese characters). It seems ReportLab thinks all spaces are using the normal
width, therefore starting a new line prematurely.

Can this be fixed? Can you let ReportLab count tiny spaces as tiny spaces, not
normal spaces?
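
The guess above can be illustrated with a small simulation. It is a sketch of the hypothesis only, not of ReportLab's real layout code; the function name and all widths are made-up values chosen for the example.

```python
def chars_per_line(line_width, char_width, real_gap, assumed_gap):
    """Count how many equal-width characters a greedy wrapper puts on a line.

    real_gap is the width each separator really occupies; assumed_gap is the
    width the wrapper charges for it when deciding whether one more character
    still fits. Returns (character count, width actually used).
    """
    count = 0
    charged = 0.0
    while True:
        # The first character needs no separator in front of it.
        extra = char_width if count == 0 else assumed_gap + char_width
        if charged + extra > line_width:
            break
        charged += extra
        count += 1
    used = count * char_width + max(count - 1, 0) * real_gap
    return count, used


# Hypothetical numbers: 400pt line, 10pt glyphs, 0.5pt tiny space, 2.5pt normal space.
correct = chars_per_line(400, 10, real_gap=0.5, assumed_gap=0.5)
suspected = chars_per_line(400, 10, real_gap=0.5, assumed_gap=2.5)
print(correct)    # (38, 398.5)
print(suspected)  # (32, 335.5)
```

Under these assumed numbers, charging a normal space for every tiny space ends the line six characters early and leaves 64.5pt unused on the right, which matches the kind of gap described above.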



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #33 from Ziyuan Yao yaoziy...@gmail.com 2012-01-12 05:07:13 UTC 
---
If we can't easily modify ReportLab to distinguish tiny space widths from
normal space widths, I'd rather see this arrangement:

For non-cjk wikis, insert a normal-sized space after each CJK character, and
then use wordwrap=Western.

For cjk wikis, use wordwrap=cjk.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

Volker Haas volker.h...@pediapress.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |volker.h...@pediapress.com

--- Comment #13 from Volker Haas volker.h...@pediapress.com 2012-01-10 
14:45:01 UTC ---
We need to distinguish two different cases:

1) rendering a PDF from the chinese/japanese Wikipedia.

2) rendering a PDF from any other wikipedia which has some chinese/japanese
text embedded inside the article.

The example you (Ziyuan) give at the very top is case 2). Your last post
suggests that this case can be handled correctly with a recent reportlab
version.

I believe this is not true. I checked out the latest reportlab version from
their subversion repository and made a little test script (I'll attach that).
The result seems to indicate that mixed CJK and non-CJK text can't be
rendered correctly. The line breaks are correct either for CJK or for non-CJK
text, but not for both. (Line wrapping behaviour can be toggled by enabling or
disabling the CJK wordWrapping.)

(I didn't bother to use a proper font for cjk text - but that should not
matter, except that all cjk letters are rendered as black boxes.)

Case 1) is a different matter: this should basically work. If not please
provide a minimal example / article URL.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #14 from Volker Haas volker.h...@pediapress.com 2012-01-10 
14:46:35 UTC ---
Created attachment 9833
  -- https://bugzilla.wikimedia.org/attachment.cgi?id=9833
test script for linebreak check for mixed cjk and non-cjk text



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #15 from Ziyuan Yao yaoziy...@gmail.com 2012-01-10 15:36:52 UTC 
---
First, I don't have MediaWiki installed on my computer so I can't run your test
script.

If ReportLab doesn't support line wrapping for mixed cjk and non-cjk text
correctly, I suggest we do the following:

Step 1: For every CJK character in the text, insert a Unicode control character
U+200B zero-width space after it. This is supposed to cause a line-wrapping
after a CJK character when a line is full.

Step 2: Disable CJK wordWrapping. Use Western-style word wrapping.

Step 3: Now you should see a long CJK string wrapped at the end of a line.
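
Step 1 could be sketched like this. The helper names and the code point ranges used to recognise CJK characters are assumptions made for illustration; whether a given PDF library treats U+200B as a break opportunity is a separate question that this sketch does not answer.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def is_cjk(ch):
    """Rough CJK test; the code point ranges are an illustrative assumption."""
    cp = ord(ch)
    return (0x3000 <= cp <= 0x30FF or 0x3400 <= cp <= 0x4DBF
            or 0x4E00 <= cp <= 0x9FFF or 0xAC00 <= cp <= 0xD7AF
            or 0xFF00 <= cp <= 0xFFEF)


def insert_zwsp(text):
    """Return text with U+200B appended after every CJK character."""
    return "".join(ch + ZWSP if is_cjk(ch) else ch for ch in text)
```

Western words are left untouched, so they can still only break at ordinary whitespace.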



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #16 from Ziyuan Yao yaoziy...@gmail.com 2012-01-10 15:44:17 UTC 
---
The line-wrapping rule for CJK/non-CJK mixed text is actually very simple: You
should either wrap the line at a whitespace (as in a Western text), or after a
CJK character.

So, if possible, use the above rule to pre-wrap a text before feeding it to
ReportLab.
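
As a minimal sketch of that rule, the function below lists every position where a line is allowed to end. The names and the CJK code point ranges are assumptions for illustration; a real pre-wrapper would also need font metrics to decide which of these positions to use.

```python
def is_cjk(ch):
    """Rough CJK test; the code point ranges are an illustrative assumption."""
    cp = ord(ch)
    return (0x3000 <= cp <= 0x30FF or 0x3400 <= cp <= 0x4DBF
            or 0x4E00 <= cp <= 0x9FFF or 0xAC00 <= cp <= 0xD7AF
            or 0xFF00 <= cp <= 0xFFEF)


def break_points(text):
    """Return indices i such that a new line may start at text[i].

    A break is allowed after whitespace (Western rule) or after a CJK
    character, but never inside a run of non-CJK letters.
    """
    points = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if (prev.isspace() or is_cjk(prev)) and not cur.isspace():
            points.append(i)
    return points


print(break_points("MediaWiki 是一个wiki"))  # [10, 11, 12, 13]
```

Note that no break is offered inside "MediaWiki" or "wiki", which is the point of the rule.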



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #17 from Ziyuan Yao yaoziy...@gmail.com 2012-01-10 15:51:36 UTC 
---
OK, now I installed python-reportlab in my Fedora 16 and can run your test
script. I understand your problem. I'll test if I can insert U+200B after every
CJK character. If U+200B fails, we can insert a normal space after every CJK
character. This will definitely wrap a line after a CJK character, but with the
drawback that all CJK characters will be separated by spaces (instead of
sticking together).



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #18 from Ziyuan Yao yaoziy...@gmail.com 2012-01-10 16:05:04 UTC 
---
OK. I tried. U+200B doesn't work with ReportLab:

p1 = Paragraph(
    u"MediaWiki\u200B是\u200B一\u200B个\u200B最\u200B初\u200B用\u200B于\u200B维\u200B基\u200B百\u200B科\u200B的\u200B自\u200B由\u200Bwiki\u200B程\u200B序\u200B包\u200B,\u200B用\u200BPHP\u200B语\u200B言\u200B写\u200B成\u200B。\u200B现\u200B在\u200B,\u200B非\u200B营\u200B利\u200B的\u200B维\u200B基\u200B媒\u200B体\u200B基\u200B金\u200B会\u200B的\u200B其\u200B他\u200B计\u200B划\u200B、\u200B许\u200B多\u200B其\u200B他\u200Bwiki\u200B网\u200B站\u200B以\u200B及\u200B本\u200B网\u200B站\u200B(\u200BMediaWiki\u200B主\u200B页\u200B)\u200B都\u200B在\u200B使\u200B用\u200B这\u200B个\u200B程\u200B序\u200B包\u200B。",
    s)

But normal spaces do:

p1 = Paragraph(
    u"MediaWiki 是 一 个 最 初 用 于 维 基 百 科 的 自 由 wiki 程 序 包 , 用 PHP 语 言 写 成 。 现 在 , 非 营 利 的 维 基 媒 体 基 金 会 的 其 他 计 划 、 许 多 其 他 wiki 网 站 以 及 本 网 站 ( MediaWiki 主 页 ) 都 在 使 用 这 个 程 序 包 。",
    s)



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #19 from Ziyuan Yao yaoziy...@gmail.com 2012-01-10 16:08:13 UTC 
---
I'll write to ReportLab's mailing list and suggest that they create a new
wordWrap option, "mixed", so that ReportLab can directly support wrapping mixed
text.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #20 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 03:34:56 UTC 
---
ReportLab says working on this problem is not their priority. So I'm trying to
fix it personally in their source code.

I found (and they told me) their source code is actually very old (2006). It's
before Unicode went mainstream, which is why they don't support mixed text
wrapping well.

So, is it hard for PediaPress to switch to a more modern PDF library, such as
TCPDF, which I have seen people describe as good at Unicode support?



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #21 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 04:26:15 UTC 
---
Cite http://en.wikipedia.org/wiki/TCPDF :

TCPDF is currently the only PHP-based library that includes complete support
for UTF-8 Unicode and right-to-left languages, including the bidirectional
algorithm.[1]



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #22 from Ziyuan Yao yaoziy...@gmail.com 2012-01-11 06:11:47 UTC 
---
OK, Volker Haas, I have come up with a simple way to fix all this:

We will first determine whether a wiki page is "mostly Western" (then we'll use
wordwrap=Western) or not (then we'll use wordwrap=CJK).

The definition of "mostly Western" can be: the longest consecutive CJK string
in the page is shorter than 10 characters.
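
A minimal sketch of this heuristic, assuming a rough code point test for CJK characters. The function names are hypothetical, and the returned strings are just the labels used in this comment, not necessarily the literal values ReportLab expects.

```python
def is_cjk(ch):
    """Rough CJK test; the code point ranges are an illustrative assumption."""
    cp = ord(ch)
    return (0x3000 <= cp <= 0x30FF or 0x3400 <= cp <= 0x4DBF
            or 0x4E00 <= cp <= 0x9FFF or 0xAC00 <= cp <= 0xD7AF
            or 0xFF00 <= cp <= 0xFFEF)


def longest_cjk_run(text):
    """Length of the longest run of consecutive CJK characters in text."""
    longest = current = 0
    for ch in text:
        current = current + 1 if is_cjk(ch) else 0
        longest = max(longest, current)
    return longest


def choose_word_wrap(text, threshold=10):
    """Return 'Western' for a mostly Western page, otherwise 'CJK'."""
    return "Western" if longest_cjk_run(text) < threshold else "CJK"


print(choose_word_wrap("Tokyo (東京) is a city"))    # Western
print(choose_word_wrap("维基百科的自由程序包用"))      # CJK
```

The threshold of 10 is the value proposed above; it is kept as a parameter because the right cut-off would need testing on real pages.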



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #10 from Ziyuan Yao yaoziy...@gmail.com 2012-01-02 11:10:35 UTC 
---
It seems currently MediaWiki's Collection extension uses the ReportLab PDF
library to render PDF files
(http://www.mediawiki.org/wiki/Extension:PDF_Writer#Technical).

ReportLab is one of the PDF libraries listed in the above Wikipedia reference.

Maybe we should persuade ReportLab to fix this problem first.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #11 from Ziyuan Yao yaoziy...@gmail.com 2012-01-02 11:19:32 UTC 
---
On ReportLab's Samples page
(http://www.reportlab.com/software/documentation/rml-samples/), there is a
test_031_japanese.pdf
(http://www.reportlab.com/examples/rml/test/test_031_japanese.pdf) which shows
that ReportLab can do Japanese text wrapping perfectly, while MediaWiki's
"Download as PDF" can't wrap a long Japanese line at all. Why is that? I'm also
asking this on ReportLab's mailing list
(http://two.pairlist.net/mailman/listinfo/reportlab-users).



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2012-01-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #12 from Ziyuan Yao yaoziy...@gmail.com 2012-01-02 12:05:17 UTC 
---
Good news, everybody! The solution to this problem has been given by
ReportLab's personnel, as follows:

On 2 January 2012 11:33, Yao Ziyuan yaoziy...@gmail.com wrote:
> So now I'm confused. Is it MediaWiki's or ReportLab's fault for the
> line wrapping problem described in the above bug report
> (https://bugzilla.wikimedia.org/show_bug.cgi?id=33430)?


MediaWiki (actually PediaPress.de) decided to use our library a few years ago;
we did some work to improve inline images to support equations, but
they did not mention Asian line wrapping at the time and I did not know
about this limitation.

I guess they are simply not using our wordwrap=CJK option.  Our library
needs to be told "this is Japanese/Chinese, use a different algorithm";
it does not auto-detect based on the encoding.

Also, until some time last year, we could not properly handle mixed text
in the same sentence.  We have improved this now.


- Andy



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #3 from Ziyuan Yao yaoziy...@gmail.com 2011-12-30 11:47:20 UTC ---
Enabling the Chinese Wikipedia to provide ebook creation properly can help
spread Wikipedia knowledge in China freely.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

Mark A. Hershberger m...@everybody.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Unprioritized               |High
                 CC|                            |m...@everybody.org



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

Christoph Kepper christoph.kep...@pediapress.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |christoph.kepper@pediapress.com

--- Comment #4 from Christoph Kepper christoph.kep...@pediapress.com 
2011-12-30 20:30:18 UTC ---
Fixing this bug will probably be only a partial success. About 18 months ago we
(PediaPress) were experimenting a little bit with Japanese, but we encountered
numerous problems (text-direction, layout rules, lack of support both from the
community and our tools) that scared us off pursuing this further. Imo it will
take a lot of determination and perseverance as well as ongoing support from
native speakers/developers to create decent ebooks.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #5 from Ziyuan Yao yaoziy...@gmail.com 2011-12-30 21:08:24 UTC ---
(In reply to comment #4)
> Fixing this bug will probably be only a partial success. About 18 months ago we
> (PediaPress) were experimenting a little bit with Japanese, but we encountered
> numerous problems (text-direction, layout rules, lack of support both from the
> community and our tools) that scared us off pursuing this further. Imo it will
> take a lot of determination and perseverance as well as ongoing support from
> native speakers/developers to create decent ebooks.

First, fixing this bug alone will improve the usefulness of Chinese/Japanese
ebooks from 1% to 99.9%.

Second, I suggest MediaWiki reuses a mature HTML rendering engine (e.g. WebKit)
or text rendering engine (e.g. Pango), instead of reinventing all the wheels
again.

Third, MediaWiki can for now ignore complex formatting features such as
text direction and layout rules, and just focus on drawing plain text lines and
images correctly. "Keep it simple, stupid" for the first version.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #6 from Ziyuan Yao yaoziy...@gmail.com 2011-12-30 21:31:07 UTC ---
I have played around with some Chinese pages on mediawiki.org, and so far the
only problem I have seen is the lack of line wrapping. I don't see the other
problems you mentioned, such as text direction; note that Chinese and Japanese
also use the left-to-right text direction, just like English. Text direction is
only a problem for Middle Eastern languages such as Arabic and Hebrew.

As far as I can see, MediaWiki already draws the basics correctly (text, images
and tables); only line wrapping for Chinese/Japanese is missing.

Here is a simple rule set for line wrapping:

IF there is a whitespace near the page's right margin THEN
    break the line at that whitespace;
ELSE IF there is a Chinese/Japanese character near the page's right margin THEN
    break before or after that character;
ELSE
    break forcibly at the page's right margin (and optionally draw a soft
    return character to indicate this forced break).



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #7 from Ziyuan Yao yaoziy...@gmail.com 2011-12-30 21:42:48 UTC ---
Although Chinese and Japanese don't use spaces to separate words, you can
think of every Chinese/Japanese character as having an invisible space before
and after it, and this invisible space is always a good line-wrapping point,
just like a normal space.

Unicode actually has a character for this invisible space concept: the format
character U+200B ZERO WIDTH SPACE
(http://en.wikipedia.org/wiki/Zero-width_space).

With U+200B in mind, we can simplify our line-wrapping rule set to:

add a U+200B after every Chinese/Japanese character;
IF there is a whitespace (including U+200B) near the page's right margin THEN
    break the line at that whitespace;
ELSE
    break forcibly at the page's right margin (and optionally draw a soft
    return character to indicate this forced break).
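
A minimal Python sketch of this simplified rule set, again purely illustrative:
add_zwsp, wrap_zwsp and the other helper names are invented for the example,
and the column-based width model (wide characters 2 columns, U+200B 0 columns,
everything else 1) is an assumption in place of real font metrics. Note that
Python's str.isspace() returns False for U+200B, so it has to be checked
explicitly.

```python
import unicodedata

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def is_wide(ch):
    """True for East Asian wide/fullwidth characters (Han, kana, ...)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def add_zwsp(text):
    """Step 1: append U+200B after every wide (Chinese/Japanese) character."""
    return "".join(ch + ZWSP if is_wide(ch) else ch for ch in text)


def char_cols(ch):
    """Assumed width model: U+200B is 0 columns, wide characters 2, others 1."""
    if ch == ZWSP:
        return 0
    return 2 if is_wide(ch) else 1


def is_break(ch):
    """Whitespace in the rule set's sense, i.e. including U+200B."""
    return ch == ZWSP or ch.isspace()


def wrap_zwsp(text, max_cols):
    """Wrap using only whitespace (incl. U+200B) breaks, else a forced break."""
    if max_cols < 2:
        raise ValueError("max_cols must be at least 2 to fit a wide character")
    text = add_zwsp(text)
    lines, start, n = [], 0, len(text)
    while start < n:
        # Advance `end` while text[start:end] still fits within max_cols.
        used, end = 0, start
        while end < n and used + char_cols(text[end]) <= max_cols:
            used += char_cols(text[end])
            end += 1
        if end == n:
            lines.append(text[start:])
            break
        cut = end  # default: forced break at the margin
        for i in range(end, start, -1):
            if is_break(text[i]):
                cut = i
                break
        lines.append(text[start:cut])
        # The whitespace (or U+200B) used as the break point is dropped.
        start = cut + 1 if is_break(text[cut]) else cut
    # The inserted U+200B characters are removed again from the output lines.
    return [line.replace(ZWSP, "") for line in lines]
```

With this approach the wrapping loop itself needs no knowledge of
Chinese/Japanese characters; all of that is confined to add_zwsp.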



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #8 from Ziyuan Yao yaoziy...@gmail.com 2011-12-30 21:46:06 UTC ---
Either of the above two rule sets can solve the line-wrapping problem, although
in the long run I recommend using a mature HTML-to-PDF library instead of
reinventing the wheel.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #9 from Ziyuan Yao yaoziy...@gmail.com 2011-12-31 03:46:09 UTC ---
I just did a little research on what FOSS PDF libraries are available. Here's a
good list:

http://en.wikipedia.org/wiki/List_of_PDF_software#Development_libraries

TCPDF (http://en.wikipedia.org/wiki/TCPDF) seems to be a good candidate.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

p858snake p858sn...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |developm...@pediapress.com,
                   |                            |p858sn...@gmail.com
          Component|General/Unknown             |Collection
            Version|1.18.0                      |any
            Product|MediaWiki                   |MediaWiki extensions



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

shi zhao shiz...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |accessibility
                 CC|                            |shiz...@gmail.com



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

Benjamin Chen cn.chenmi...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cn.chenmi...@gmail.com

--- Comment #1 from Benjamin Chen cn.chenmi...@gmail.com 2011-12-30 07:46:54 UTC ---
(In reply to comment #0)
 The Chinese and Japanese languages are the only two languages in the world
 that don't use spaces to separate words.

Not true, many other languages do not use spaces either :)

IIRC, the wrapping of Chinese paragraphs worked fine around September, when I
used it to generate several files. Probably some change in the config or in the
extension itself caused this problem.



[Bug 33430] Create a book and Download as PDF don't wrap Chinese or Japanese lines

2011-12-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=33430

--- Comment #2 from Ziyuan Yao yaoziy...@gmail.com 2011-12-30 07:58:28 UTC ---
Benjamin Chen: AFAIK, only Chinese and Japanese qualify. Korean uses
square-like characters, but it does put spaces between words.
