On 03/10/2013 12:19 AM, Victor Vasiliev wrote:
> After recent discussion on this list I realized that this has been
> in discussion for as long as four years. I went WTF and decided to
> Just Go Ahead and Fix It. As a result, I made a patch to MediaWiki
> which allows it to output the recent changes feed in JSON:
> <https://gerrit.wikimedia.org/r/#/c/52922/>
> 
> Also, I wrote a daemon which captures this feed and serves it
> through WebSockets and a simple text-oriented protocol [...]:
> <https://github.com/wikimedia/mediawiki-rcsub>
> 
> This daemon is written in Python using Twisted and Autobahn and it
> takes ~200 lines of code (initial version took ~80).

One thing you should consider is whether to escape non-ASCII
characters (characters above U+007F) or to encode them using UTF-8.

Python's json.dumps() escapes these characters by default
(ensure_ascii=True), writing each one as \uXXXX escapes of hex-encoded
UTF-16 code units. If you don't want them escaped, it's best to decide
now, before clients with broken UTF-8 support come into use.
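
To make the default concrete, here is a minimal sketch using only
Python's standard json module. The sample strings are illustrative
and are not taken from the feed.

```python
import json

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps("\u00e9")          # U+00E9, LATIN SMALL LETTER E WITH ACUTE

# A character outside the BMP is written as a UTF-16 surrogate pair.
astral = json.dumps("\U0001F600")       # U+1F600

# ensure_ascii=False keeps the characters as-is; the caller then
# encodes the resulting str as UTF-8 before putting it on the wire.
raw = json.dumps("\u00e9", ensure_ascii=False)
wire = raw.encode("utf-8")

print(escaped)    # "\u00e9"
print(astral)     # "\ud83d\ude00"
print(len(wire))  # 4 (two quotes plus a two-byte UTF-8 sequence)
```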

I recently made a [patch][1] (not yet merged) that would add an opt-in
"UTF8_OK" feature to FormatJson::encode(). The new option would leave
everything above U+007F unescaped (except for U+2028 and U+2029, which
stay escaped for compatibility with JavaScript eval()-based parsing).
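
The same policy can be sketched on the Python side. This is not the
code from the patch (FormatJson is PHP); encode_utf8_ok is a
hypothetical helper name, and the sample value is made up for
illustration.

```python
import json

def encode_utf8_ok(value):
    """Hypothetical Python analogue of the proposed UTF8_OK behaviour."""
    # Leave non-ASCII characters unescaped in the output.
    text = json.dumps(value, ensure_ascii=False)
    # U+2028 and U+2029 are legal inside JSON strings, but JavaScript
    # engines have historically treated them as line terminators, which
    # breaks eval()-based parsing. In json.dumps() output they can only
    # occur inside string literals, so a plain replace is sufficient.
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")

sample = {"comment": "繁简\u2028破坏"}
encoded = encode_utf8_ok(sample)

# The result is still valid JSON and decodes to the original value.
assert json.loads(encoded) == sample
```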

> I hope that getting recent changes in a reasonable format is now a
> matter of code review and deployment, and we will finally get
> something reasonable to work with (with access from web
> browsers!).

I don't consider encoding "撤销由158.64.77.102于2013年1月22日 (二)
16:46的版本24659468中的繁简破坏" (90 bytes using UTF-8) as

"\u64a4\u9500\u7531158.64.77.102\u4e8e2013\u5e741\u670822\u65e5
(\u4e8c)
16:46\u7684\u7248\u672c24659468\u4e2d\u7684\u7e41\u7b80\u7834\u574f"
(141 bytes)

to be reasonable at all for a brand-new protocol running over an 8-bit
clean channel.
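
The two sizes can be reproduced with a short Python check. It assumes
a single space wherever the comment was wrapped across lines in this
email, and it counts the surrounding JSON quotes as part of each
encoded string.

```python
import json

# Edit summary from the example above, joined back onto one line.
comment = "撤销由158.64.77.102于2013年1月22日 (二) 16:46的版本24659468中的繁简破坏"

# Size with the default \uXXXX escaping (pure ASCII output).
escaped_size = len(json.dumps(comment).encode("ascii"))

# Size with the characters left as-is and encoded as UTF-8.
utf8_size = len(json.dumps(comment, ensure_ascii=False).encode("utf-8"))

print(escaped_size, utf8_size)  # 141 90
```

Each of the 17 CJK characters takes 3 bytes in UTF-8 but 6 bytes as a
\uXXXX escape, which accounts for the 51-byte difference.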

[1]: https://gerrit.wikimedia.org/r/#/c/50140/

-- 
Wikipedia user PleaseStand
http://en.wikipedia.org/wiki/User:PleaseStand

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
