Ragib,

(copied Tamil Wiki list)

We've faced an issue similar to Bug #5948. Because titles are not canonicalised,
there are two articles with the same title in Tamil Wikipedia!
http://ta.wikipedia.org/wiki/%E0%AE%AA%E0%AF%87%E0%AE%9A%E0%AF%8D%E0%AE%9A%E0%AF%81:%E0%AE%AE%E2%80%8C%E0%AE%9E%E0%AF%8D%E0%AE%9A%E2%80%8C%E0%AE%B3%E0%AF%8D_%E0%AE%95%E0%AE%BE%E0%AE%AE%E0%AE%BE%E0%AE%B2%E0%AF%88
(discussion in Tamil)
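
For illustration, here is a minimal Python sketch that decodes the title in the URL above and lists its code points. The title happens to contain U+200C (ZERO WIDTH NON-JOINER) twice; that this is what produced the duplicate article is an assumption, not something confirmed in the discussion. `title_key` is a hypothetical helper, not a MediaWiki function, and dropping ZWNJ is only reasonable for spotting candidate duplicates, since the character can be meaningful in some scripts.

```python
import unicodedata
from urllib.parse import unquote

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Percent-encoded page title copied from the URL above.
ENCODED_TITLE = ("%E0%AE%AA%E0%AF%87%E0%AE%9A%E0%AF%8D%E0%AE%9A%E0%AF%81:"
                 "%E0%AE%AE%E2%80%8C%E0%AE%9E%E0%AF%8D%E0%AE%9A%E2%80%8C"
                 "%E0%AE%B3%E0%AF%8D_%E0%AE%95%E0%AE%BE%E0%AE%AE%E0%AE%BE"
                 "%E0%AE%B2%E0%AF%88")


def title_key(title: str) -> str:
    """Hypothetical comparison key: NFC-normalise, then drop ZWNJ."""
    return unicodedata.normalize("NFC", title).replace(ZWNJ, "")


title = unquote(ENCODED_TITLE)  # decodes the UTF-8 percent-escapes

# List every code point so invisible characters become visible.
for ch in title:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")

# Check whether NFC alone removes ZWNJ (it has no decomposition, so it stays).
print(ZWNJ in unicodedata.normalize("NFC", title))

# Check whether the hypothetical key treats both spellings as the same page.
print(title_key(title) == title_key(title.replace(ZWNJ, "")))
```

A key like this could be used to flag pairs of titles for human review; it is not a substitute for a proper fix in the software.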

- Sundar

 "That language is an instrument of human reason, and not merely a medium for 
the expression of thought, is a truth generally admitted."
- George Boole, quoted in Iverson's Turing Award Lecture



----- Original Message ----
> From: Ragib Hasan <ragibha...@gmail.com>
> To: Discussion list on Indian language projects of Wikimedia.
> <wikimediaindia-l@lists.wikimedia.org>
> Sent: Wed, December 29, 2010 10:23:06 AM
> Subject: Re: [Wikimediaindia-l] Indic languages & unicode issues.
> 
> I'm curious about the issue you are discussing ... is this similar to
> a long-standing bug that affects the Bengali, Assamese, and Bishnupriya
> Manipuri Wikipedias?
> https://bugzilla.wikimedia.org/show_bug.cgi?id=5948
> 
> 
> Ragib
> 
> 
> User:Ragib  on en and bn
> 
> 
> --
> Ragib Hasan, Ph.D
> NSF Computing Innovation Fellow and
> Assistant Research Scientist
> 
> Dept of Computer Science
> Johns Hopkins University
> 3400 N Charles Street
> Baltimore, MD  21218
> 
> Website:
> http://www.ragibhasan.com
> 
> 
> 
> On Mon, Dec 27, 2010 at 1:29 AM, BalaSundaraRaman <sundarbe...@yahoo.com> wrote:
> >> Unicode's decision to bring the second encoding into the
> >> standard was widely debated and opposed, mainly by the Malayalam FOSS
> >> developer community. Unicode announced the dual encoding scheme
> >> without a canonical equivalence definition in 2005 and reverted it when
> >> scholars and developers opposed it.
> >
> > Sadly, you're not alone in this, Santhosh.
> > We have had canonical non-equivalence issues and many more (similar to the
> > atomic chillu issue) in Tamil too. :(
> > Part of it was inherited from the umbrella-style ISCII model (done with good
> > intentions, I believe).
> > They put the abugidas of the Indo-Aryan languages and other systems like Tamil
> > (I haven't studied other writing systems enough to comment on them) into one
> > bucket, and we're still suffering for that. They cite stability when legitimate
> > changes are sought, but allow such breaking changes.
> >
> > I'm sure you'll be working with the search engines to map the equivalent
> > glyph sequences. Also, please explore MediaWiki tech solutions to add
> > redirects or hidden text (though not ideal).
> >
> > - Sundar
> >
> > "That language is an instrument of human reason, and not merely a medium for
> > the expression of thought, is a truth generally admitted."
> > - George Boole, quoted in Iverson's Turing Award Lecture
> >
> >
> >
> > ----- Original Message ----
> >> From: Santhosh Thottingal <santhosh.thottin...@gmail.com>
> >> To: Discussion list on Indian language projects of Wikimedia.
> >> <wikimediaindia-l@lists.wikimedia.org>
> >> Sent: Sun, December 26, 2010 10:28:17 PM
> >> Subject: Re: [Wikimediaindia-l] Indic languages & unicode issues.
> >>
> >> On Sun, Dec 26, 2010 at 7:43 PM, CherianTinu Abraham
> >> <tinucher...@gmail.com> wrote:
> >> > Hi all,
> >> > Happened to see Gerard's blog post on issues with Malayalam Wikipedia
> >> > & Unicode upgrade to 5.1:
> >> > http://ultimategerardm.blogspot.com/2010/12/malayalam-enigma.html
> >>
> >>
> >> The issue is very complex. There were heated debates on this topic
> >> on the Unicode Indic mailing list for years. In short, the issue is about
> >> dual encoding: representing the same letter with two different Unicode
> >> code point sequences. Unicode's decision to bring the second encoding into
> >> the standard was widely debated and opposed, mainly by the Malayalam FOSS
> >> developer community. Unicode announced the dual encoding scheme
> >> without a canonical equivalence definition in 2005 and reverted it when
> >> scholars and developers opposed it.
> >> The same proposal was then introduced again. The FOSS community and language
> >> scholars protested the proposal. The SMC community submitted a document with
> >> 17 reasons why dual encoding should not be introduced; see
> >> http://wiki.smc.org.in/images/2/23/SMC_Unicode_5.1.pdf
> >> Similarly, a seminar conducted by the University of Kerala to discuss the
> >> issue opposed the proposal; see
> >> http://images2.wikia.nocookie.net/__cb20080131071131/fci/images/1/19/Report_of_Workshop.pdf
> >> But the Unicode Consortium did not bother to answer either of
> >> these reports and went ahead with the decision in Unicode 5.1. The
> >> dual encoding scheme has no canonical equivalence definition.
> >> Since equivalence is not in the standard, I doubt whether operating systems
> >> will implement it, not to mention search engines.
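
To make the missing canonical equivalence concrete, here is a small Python sketch. It assumes chillu N as the example letter: the Unicode 5.0 style sequence NA + VIRAMA + ZWJ (U+0D28 U+0D4D U+200D) versus the atomic code point U+0D7B added in Unicode 5.1. `CHILLU_FOLD` and `search_key` are hypothetical names for an application-level mapping of the kind a search index might apply; the table lists only this one letter and is not taken from any standard or tool.

```python
import unicodedata

SEQ_5_0 = "\u0d28\u0d4d\u200d"  # NA + VIRAMA + ZERO WIDTH JOINER
ATOMIC_5_1 = "\u0d7b"           # MALAYALAM LETTER CHILLU N

# Compare the two spellings under each standard normalization form.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    same = (unicodedata.normalize(form, SEQ_5_0)
            == unicodedata.normalize(form, ATOMIC_5_1))
    print(form, same)

# Hypothetical fold table: atomic chillu -> older sequence (one entry only).
CHILLU_FOLD = {ATOMIC_5_1: SEQ_5_0}


def search_key(text: str) -> str:
    """Fold both spellings to one form so they compare equal."""
    text = unicodedata.normalize("NFC", text)
    for atomic, sequence in CHILLU_FOLD.items():
        text = text.replace(atomic, sequence)
    return text


# Compare the two spellings after the application-level fold.
print(search_key(SEQ_5_0) == search_key(ATOMIC_5_1))
```

The direction of the fold (towards the older sequence) is arbitrary here; the point is only that, without an equivalence in the standard, every application has to carry such a table itself.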
> >>
> >> Since the new encoding scheme is defined without backward
> >> compatibility, in violation of Unicode's stability policy, the Malayalam
> >> FOSS community decided not to implement it until the issues are resolved
> >> and is continuing with the Unicode 5.0 encoding. Malayalam news portals also
> >> follow Unicode 5.0. Most of the tools from Google also continue with
> >> Unicode 5.0 based encoding. Malayalam Wikipedia decided to go ahead
> >> with the latest version of Unicode. I had resisted this move on the
> >> discussion pages of Malayalam Wikipedia. The decision was taken based
> >> on voting by a small community of editors and not based on proper
> >> technical analysis.
> >>
> >>
> >> Believe it or not, this is how Malayalam wiki is rendered on a Windows XP
> >> IE 8 box with the OS default font:
> >> http://thottingal.in/tmp/ml-wiki-winxp-IE8.png
> >> I hope it gives some clue about the issue that Gerard mentioned.
> >>
> >> Most of the discussion around the encoding issue happened in
> >> Malayalam (on Malayalam wiki or in blogs), but this English blog post
> >> might summarize it:
> >> http://www.j4v4m4n.in/2009/11/07/unicode-or-malayalam/
> >>
> >>
> >> The discussions happened on Malayalam Wikipedia (content in the Malayalam
> >> language):
> >> http://ml.wikipedia.org/wiki/വിക്കിപീഡിയ:പഞ്ചായത്ത്_(സാങ്കേതികം)/യൂണികോഡ്_5.1.0/ചർച്ച_(പഴയവ)
> >>
> >>
> >> Thanks
> >> Santhosh Thottingal
> >> http://thottingal.in
> >>
> >> _______________________________________________
> >> Wikimediaindia-l mailing list
> >> Wikimediaindia-l@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l