Re: [Wikitech-l] Unicode equivalence

2010-05-29 Thread Robert Ullmann
I've looked at this a bit more. There are more serious problems.

Apparently, no-one converted the 5.0 titles in the wiki to 5.1 when
normalization was turned on; there are pages that can't be accessed (!).
For example, try this (Malayalam for fish):

http://ml.wiktionary.org/wiki/%E0%B4%AE%E0%B5%80%E0%B4%A8%E0%B5%8D%E2%80%8D

That gets normalized to 5.1, which is a redirect to the 5.0 form (in
this case), which is in turn normalized back to 5.1. (There is a variation
in the 5.0 form too that complicates it.) The content page exists (I can
see it in the XML dump), but it can't be accessed because there is no way
of referring to it.
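
The loop described above can be modeled in a few lines. This is only a sketch, assuming a lookup that normalizes every requested title before consulting storage. The names `pages`, `normalize_title`, and `fetch` are hypothetical, not MediaWiki code, and only the CHILLU N pair is mapped.

```python
from typing import Optional

# Hypothetical model of a title lookup that normalizes before every access.
# Only the CHILLU N pair is mapped; none of this is MediaWiki code.
OLD_CHILLU_N = "\u0d28\u0d4d\u200d"  # NA, VIRAMA, ZWJ (5.0 sequence)
NEW_CHILLU_N = "\u0d7b"              # CHILLU N (5.1 code point)


def normalize_title(title: str) -> str:
    """Rewrite the 5.0 sequence to the 5.1 code point."""
    return title.replace(OLD_CHILLU_N, NEW_CHILLU_N)


# The title from the URL above: MA, VOWEL SIGN II, NA, VIRAMA, ZWJ.
old_title = "\u0d2e\u0d40" + OLD_CHILLU_N
new_title = normalize_title(old_title)

# Rows as they might have been stored before normalization was switched on.
pages = {
    old_title: "content page",
    new_title: "#REDIRECT [[" + old_title + "]]",
}


def fetch(requested: str) -> Optional[str]:
    """Every request is normalized first, so only normalized keys are reachable."""
    return pages.get(normalize_title(requested))


# Stored titles that no request can ever reach through fetch().
unreachable = [title for title in pages if normalize_title(title) != title]
```

In this model both spellings of the request return the redirect row, and the redirect's own target normalizes back to the redirect, so the content row stays in `unreachable`.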

Was it necessary to force the normalization to 5.1? I would think just
using the 5.1 forms by convention would be/would have been entirely
adequate? Maybe with a bit of bot conversion? (Moving 5.0 to 5.1
leaving redirects, converting text while leaving iwiki links alone.)

The present state apparently can't be bot-fixed, as (some) content
pages can't be read.

As it is, it is impossible to write valid iwiki language links to 5.0
forms on other wikis. One could create 5.1 redirects on the other
wikis and link to them; but that doesn't help cases like above where
one can't even access the content page. There are 998 of them (as of
the last XML dump, 3 April) in this state apparently.

Mind you, I'm not sure I have all the details right yet, and I'd like
to read through a current dump, now that they are running again.

Robert

On Sat, May 22, 2010 at 2:57 PM, Platonides platoni...@gmail.com wrote:
 We should probably normalise to 5.1 on all wikis.
 I can view the 5.0 characters but not the 5.1 ones, though.

 But would
 someone tell me where in the server code this is done? I have not been
 able to find it. Then I can understand a bit better, possibly just fix
 it in the bot code somehow, or suggest a fix server-side.

 http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/classes/LanguageMl.php?revision=61282&view=markup




___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Unicode equivalence

2010-05-29 Thread Platonides
Robert Ullmann wrote:
 I've looked at this a bit more. There are more serious problems.
 
 Apparently, no-one converted the 5.0 titles in the wiki to 5.1 when
 normalization was turned on; there are pages that can't be accessed.
 (!) for example, try this (Malayalam for fish):
 
 http://ml.wiktionary.org/wiki/%E0%B4%AE%E0%B5%80%E0%B4%A8%E0%B5%8D%E2%80%8D
 
 that gets normalized to 5.1, which is a redirect to the 5.0 form (in
 this case) which is normalized back to 5.1. (there is a variation in
 the 5.0 form too that complicates it) The content page exists (I can
 see it in the XML dump), but can't be accessed because there is no way
 of referring to it.


Request on bugzilla a run of cleanupTitles.php on mlwiki.




Re: [Wikitech-l] Unicode equivalence

2010-05-29 Thread Robert Ullmann
In December, Praveen Prakash wrote:
We are currently using Unicode 5.1 Redirect to unicode 5.0 titled
articles in some cases. After converting both these titles become same.
Is that a problem?

Yes, it is ...

Will cleanupTitles resolve collisions with redirects?

Robert



Re: [Wikitech-l] Unicode equivalence

2010-05-29 Thread Robert Ullmann
To answer my own question, no, it won't: it will move all the 5.0
pages that have 5.1 redirects to pages named Broken/ID: and the
result is a huge mess.

Either the redirects must be deleted first, or the code in
cleanupTitles fixed to move over redirects.

Robert



Re: [Wikitech-l] Unicode equivalence

2010-05-29 Thread Ilmari Karonen
On 05/29/2010 05:42 PM, Robert Ullmann wrote:
 To answer my own question, no, it won't: it will move all the 5.0
 pages that have 5.1 redirects to pages named Broken/ID: and the
 result is a huge mess.

 Either the redirects must be deleted first, or the code in
 cleanupTitles fixed to move over redirects.

Fixing cleanupTitles.php to allow moving pages over one-rev 
self-redirects, just like with normal page moving, would IMHO be a good 
idea anyway.  The code can be adapted from Title::isValidMoveTarget() 
and Title::moveOverExistingRedirect().  In fact, the former method could 
probably be used as is -- while the situation in cleanupTitles is a bit 
unusual, in that title normalization will cause the redirect to point to 
itself, Title::isValidMoveTarget() seems to already have code in place 
to handle that special case.
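
The rule suggested here (allow the move when the target is a one-revision redirect that, after normalization, points back at the page being moved) can be sketched as follows. This is not the code of Title::isValidMoveTarget(); the `Page` record, `redirect_target`, and `can_move_over` are hypothetical names used only to illustrate the check.

```python
import re
from dataclasses import dataclass

# 5.0 sequence (base letter, VIRAMA, ZWJ) -> 5.1 atomic chillu.
CHILLU_5_0_TO_5_1 = {
    base + "\u0d4d\u200d": chillu
    for base, chillu in [
        ("\u0d23", "\u0d7a"), ("\u0d28", "\u0d7b"), ("\u0d30", "\u0d7c"),
        ("\u0d32", "\u0d7d"), ("\u0d33", "\u0d7e"),
    ]
}


def normalize_title(title: str) -> str:
    for sequence, chillu in CHILLU_5_0_TO_5_1.items():
        title = title.replace(sequence, chillu)
    return title


@dataclass
class Page:
    """Hypothetical stand-in for a stored page."""
    title: str
    text: str
    revision_count: int


REDIRECT_RE = re.compile(r"#REDIRECT\s*\[\[([^\]|#]+)", re.IGNORECASE)


def redirect_target(text: str):
    """Return the redirect destination, or None if the text is not a redirect."""
    match = REDIRECT_RE.match(text.strip())
    return match.group(1).strip() if match else None


def can_move_over(source_title: str, target: Page) -> bool:
    """True if target is a one-revision redirect back to the source title."""
    if target.revision_count != 1:
        return False
    destination = redirect_target(target.text)
    if destination is None:
        return False
    # After normalization the redirect points at itself, i.e. at the
    # normalized form of the page we want to move there.
    return normalize_title(destination) == normalize_title(source_title)
```

Redirects with more than one revision, and pages with real content, are deliberately rejected so that nothing with history is overwritten.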

-- 
Ilmari Karonen



Re: [Wikitech-l] Unicode equivalence

2010-05-29 Thread Junaid P V
Hi,
Tim Starling ran title cleanups on all Malayalam wikis just after 5.1
normalization was activated. To avoid title conflicts, he renamed all titles
that used 5.0 chillus by prepending 'Broken/'. You can see those pages here:
http://ml.wiktionary.org/w/index.php?title=%E0%B4%AA%E0%B5%8D%E0%B4%B0%E0%B4%A4%E0%B5%8D%E0%B4%AF%E0%B5%87%E0%B4%95%E0%B4%82%3A%E0%B4%AA%E0%B5%82%E0%B5%BC%E0%B4%B5%E0%B5%8D%E0%B4%B5%E0%B4%AA%E0%B4%A6%E0%B4%B8%E0%B5%82%E0%B4%9A%E0%B4%BF%E0%B4%95&prefix=Broken%2F&namespace=0

We have restored those titles on ml.wikipedia, but not yet on ml.wiktionary
:(


-- 
http://junaidpv.in


Re: [Wikitech-l] Unicode equivalence

2010-05-22 Thread Robert Ullmann
Hi,

In case you no longer have this thread, the background is that the
Malayalam projects want to use, and are using, Unicode 5.1 for five
characters that have composed code points in 5.1 and are decomposed in
5.0. The equivalences are:

Character   5.0 sequence       5.1
CHILLU NN   0D23, 0D4D, 200D   0D7A
CHILLU N    0D28, 0D4D, 200D   0D7B
CHILLU RR   0D30, 0D4D, 200D   0D7C
CHILLU L    0D32, 0D4D, 200D   0D7D
CHILLU LL   0D33, 0D4D, 200D   0D7E
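
The table translates directly into a replacement map. The following is a minimal sketch, assuming plain string replacement is sufficient; `CHILLU_5_0_TO_5_1`, `to_5_1`, and `code_points` are illustrative names, not the server's code.

```python
VIRAMA = "\u0d4d"
ZWJ = "\u200d"

# 5.0 sequence (base letter, VIRAMA, ZWJ) -> 5.1 atomic chillu, per the table.
CHILLU_5_0_TO_5_1 = {
    "\u0d23" + VIRAMA + ZWJ: "\u0d7a",  # CHILLU NN
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",  # CHILLU N
    "\u0d30" + VIRAMA + ZWJ: "\u0d7c",  # CHILLU RR
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",  # CHILLU L
    "\u0d33" + VIRAMA + ZWJ: "\u0d7e",  # CHILLU LL
}


def to_5_1(text: str) -> str:
    """Replace each 5.0 chillu sequence with its 5.1 code point."""
    for sequence, chillu in CHILLU_5_0_TO_5_1.items():
        text = text.replace(sequence, chillu)
    return text


def code_points(text: str) -> str:
    """Render a string as space-separated hex code points for inspection."""
    return " ".join(f"{ord(ch):04X}" for ch in text)
```

A letter followed by VIRAMA without the ZWJ is left alone, since only the full three-code-point sequence appears in the table.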

Somewhere in the server code, these are normalized to 5.1 for the ml
projects. Problem:

http://ml.wiktionary.org/w/index.php?title=%E0%B4%95%E0%B5%81%E0%B4%B1%E0%B5%81%E0%B4%95%E0%B5%8D%E0%B4%95%E0%B5%BB&action=history

What you see happening is Interwicket trying to create the language
links. It adds the correct link(s), to the 5.0 forms on the other
wikts; then on the next scan of the language links tables it removes
the links as invalid, as the 5.1 titles don't exist on the other
wikts. This then repeats. (;-)

The problem is that the bot can't write the correct link, because the text
normalization rewrites it to the 5.1 form.

The other direction isn't a problem: the links are to the 5.0 forms and,
when followed, are normalized to 5.1 in the title lookup, so the page is
found.
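
One possible bot-side workaround, sketched under assumptions: when validating a link read back from an ml project, try the 5.0 spelling on the remote wiki as well as the stored 5.1 spelling. `candidate_titles` and `link_is_valid` are hypothetical helpers, not Interwicket code, and `remote_titles` stands in for whatever title index the bot keeps. Any further variation in 5.0 spellings is not covered.

```python
# 5.1 atomic chillu -> 5.0 sequence (base letter, VIRAMA, ZWJ).
TO_5_0 = str.maketrans({
    "\u0d7a": "\u0d23\u0d4d\u200d",  # CHILLU NN
    "\u0d7b": "\u0d28\u0d4d\u200d",  # CHILLU N
    "\u0d7c": "\u0d30\u0d4d\u200d",  # CHILLU RR
    "\u0d7d": "\u0d32\u0d4d\u200d",  # CHILLU L
    "\u0d7e": "\u0d33\u0d4d\u200d",  # CHILLU LL
})


def candidate_titles(stored_title):
    """Spellings to try on the remote wiki: the stored form, then the 5.0 form."""
    old_form = stored_title.translate(TO_5_0)
    if old_form == stored_title:
        return [stored_title]
    return [stored_title, old_form]


def link_is_valid(stored_title, remote_titles):
    """Treat the link as valid if any candidate spelling exists remotely."""
    return any(title in remote_titles for title in candidate_titles(stored_title))
```

With a check like this the bot would stop removing links whose 5.0 target exists, although the link text stored on the ml side would still be the 5.1 form.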

I'm not (yet) suggesting a particular solution, there are several
possibilities (from fairly decent to grotesque hackery ...). But would
someone tell me where in the server code this is done? I have not been
able to find it. Then I can understand a bit better, possibly just fix
it in the bot code somehow, or suggest a fix server-side.

Best Regards,
Robert



Re: [Wikitech-l] Unicode equivalence

2010-05-22 Thread Platonides
We should probably normalise to 5.1 on all wikis.
I can view the 5.0 characters but not the 5.1 ones, though.

 But would
 someone tell me where in the server code this is done? I have not been
 able to find it. Then I can understand a bit better, possibly just fix
 it in the bot code somehow, or suggest a fix server-side.

http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/classes/LanguageMl.php?revision=61282&view=markup




Re: [Wikitech-l] Unicode equivalence

2010-01-26 Thread Praveen Prakash
On Wed, Dec 23, 2009 at 7:40 AM, Praveen Prakash me.prav...@gmail.com wrote:

 Tim Starling wrote:

 If you say Unicode 5.1 is the best solution for your community, then
 I'm willing to take your word for that.


 Hi,

 voting on this subject

 http://ml.wikipedia.org/wiki/WP:Panchayath_(Technical)/Unicode_5.1.0

 Praveen


IE 6 fix is simple.
http://ml.wikipedia.org/wiki/Help:To_Read_in_Malayalam#For_Windows.2FIE_6_users


Re: [Wikitech-l] Unicode equivalence

2009-12-22 Thread Praveen Prakash
Tim Starling wrote:
 If you say Unicode 5.1 is the best solution for your community, then
 I'm willing to take your word for that.
   
Hi,

voting on this subject

http://ml.wikipedia.org/wiki/WP:Panchayath_(Technical)/Unicode_5.1.0

Praveen





Re: [Wikitech-l] Unicode equivalence

2009-12-02 Thread Praveen Prakash
Tim Starling wrote:
 The only way we can implement equivalence is by converting to a
 canonical form, this is how it is done in every part of MediaWiki. The
 ability to treat strings as binary, and to compare them byte-by-byte,
 is essential to the performance of the system.

 It may be possible to convert to the Unicode 5.0 form when we generate
 the edit page for certain browsers, and to convert back to 5.1 when
 they save the page. But that would be more complicated to develop than
 to just convert to Unicode 5.1 all the time.
   
If possible, that would be very useful. I'm afraid IE6 still has its share.
 If you say Unicode 5.1 is the best solution for your community, then
 I'm willing to take your word for that.
http://ml.wikipedia.org/w/index.php?title=???:???_(??)oldid=508212#.E0.B4.AF.E0.B5.82.E0.B4.A3.E0.B4.BF.E0.B4.95.E0.B5.8B.E0.B4.A1.E0.B5.8D_5.1.0
 

At this link, most people proposed changing to 5.1.

We are currently using Unicode 5.1 redirects to Unicode 5.0-titled
articles in some cases. After conversion, both of these titles become the same.
Is that a problem?


-- 
Wikipedia Affiliate Button 
http://wikimediafoundation.org/wiki/Support_Wikipedia/en


Re: [Wikitech-l] Unicode equivalence

2009-12-01 Thread Marcus Buck
Tim Starling wrote:
 Gerard Meijssen wrote:
   
 Hoi,
 Given that we should be moving forward not backward, it makes more sense to
 provide Unicode 5.1 characters and webfonts.

 The big thing of MediaWiki was that it supported Unicode when this was still
 a new thing to do. We should support the latest and the best Unicode
 support.
 

 You did read the post didn't you? Forcing everyone to buy Windows 7 is
 not generally the way we do things. Unless the client situation is not
 as bad as it sounds, we will need to have a transition period where we
 support older clients until their market share falls far lower than
 50%, which is where, by Praveen's figures, it is now.
   
I guess you are both right. To me the best solution seems to be: accept
both as input (obviously), normalize everything to 5.1 and store it in
that form (so our data is consistently 5.1). For output, convert it to
5.0 to avoid problems with clients not yet ready for 5.1. The advantage
is that our data is stored in the most modern format, while clients
are still served data that they can process.
If there are performance problems with the conversion on serving or
anything like that, storing the data in 5.0 is of course still good
enough. More important than discussing the specific technical details is
actually implementing it.

Marcus Buck
User:Slomox



Re: [Wikitech-l] Unicode equivalence

2009-12-01 Thread Roan Kattouw
2009/12/1 Marcus Buck w...@marcusbuck.org:
 If there are performance problems with the conversion on serving or
 anything like that, of course storing the data in 5.0 is still good
 enough.
You're answering your own question here: converting the data once and
storing it in 5.0 so it's ready-to-serve is of course faster and
easier than juggling between 5.0 and 5.1 all the time.

Roan Kattouw (Catrope)



Re: [Wikitech-l] Unicode equivalence

2009-12-01 Thread Marco Schuster
2009/12/1 Praveen Prakash me.prav...@gmail.com

 Popular transliteration tool for Malayalam typing (*Varamozhi*) and popular
 font (*Anjali OldLipi*) are currently supporting Unicode 5.1 in windows.
 Recently (two or three days before) Microsoft announced their own tool for
 Malayalam typing which also supporting 5.1. Microsoft's default Karthika
 font for Malayalam also now supporting 5.1. But IE6 is not supporting
 unicode 5.1 even with supporting fonts.

Is dynamic reverse conversion on the client side using JavaScript possible?
This way we could output Unicode 5.1 to everything that supports it, and older /
crappy browsers / OSes could still display the text correctly.
Sure, it adds a JS dependency, but I do think we can require JS for that.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

Re: [Wikitech-l] Unicode equivalence

2009-12-01 Thread Aryeh Gregor
On Tue, Dec 1, 2009 at 9:30 AM, Marco Schuster
ma...@harddisk.is-a-geek.org wrote:
 Is dynamic reverse conversion at clientside using javascript possible?

I can't see any justification for requiring JavaScript here.  We
should be able to do it server-side if it needs to be done at all.



Re: [Wikitech-l] Unicode equivalence

2009-12-01 Thread Tim Starling
Praveen Prakash wrote:
 This is the condition of current stage of Malayalam Computing. So I thought
 it is better to put equivalence than switching to a particular version. If
 it not possible switching as said by Tim is appreciable. I prefer 5.1 which
 is future.

The only way we can implement equivalence is by converting to a
canonical form; this is how it is done in every part of MediaWiki. The
ability to treat strings as binary, and to compare them byte-by-byte,
is essential to the performance of the system.
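
A small sketch of why a canonical form is needed for byte-by-byte comparison. The helper names are hypothetical and only the CHILLU N pair is handled. It also shows that standard NFC normalization does not unify the two spellings, because the chillu code points carry no canonical decomposition in the Unicode data, so a project-specific mapping is required.

```python
import unicodedata

OLD = "\u0d2e\u0d40\u0d28\u0d4d\u200d"  # 5.0 spelling: MA, II, NA, VIRAMA, ZWJ
NEW = "\u0d2e\u0d40\u0d7b"              # 5.1 spelling: MA, II, CHILLU N


def canonical(text: str) -> str:
    """Project-specific canonical form (CHILLU N pair only, for illustration)."""
    return text.replace("\u0d28\u0d4d\u200d", "\u0d7b")


def same_title(a: str, b: str) -> bool:
    """Compare byte-by-byte, but only after converting both to canonical form."""
    return canonical(a).encode("utf-8") == canonical(b).encode("utf-8")


# Raw bytes differ, and NFC alone does not make the spellings equal.
raw_equal = OLD.encode("utf-8") == NEW.encode("utf-8")
nfc_equal = unicodedata.normalize("NFC", OLD) == unicodedata.normalize("NFC", NEW)
```

Once everything stored is already canonical, the comparison itself stays a plain byte comparison; the conversion cost is paid once, on input.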

It may be possible to convert to the Unicode 5.0 form when we generate
the edit page for certain browsers, and to convert back to 5.1 when
they save the page. But that would be more complicated to develop than
to just convert to Unicode 5.1 all the time.

If you say Unicode 5.1 is the best solution for your community, then
I'm willing to take your word for that.

-- Tim Starling




Re: [Wikitech-l] Unicode equivalence

2009-11-30 Thread Tim Starling
Praveen Prakash wrote:
 Currently Windows 7 is the only operating system 
 which supports Unicode 5.1.0. (? according to my knowledge), but lot of 
 third-party tools for writing and reading Malayalam supports new 
 version. 

So do you want everything to be converted to the Unicode 5.0 version,
including page titles, namespaces and article content, and for Unicode
5.1 characters sent by browsers during editing to be converted to
Unicode 5.0 before storage? We can probably set that up.

-- Tim Starling




Re: [Wikitech-l] Unicode equivalence

2009-11-30 Thread Praveen Prakash
Is it possible to implement some method to tell the server that both
characters are the same? I have heard that more changes are coming in future
versions of Unicode. Also, almost half of the incoming data is now in the
Unicode 5.1 form. I am not sure about reverse conversion.

-- 
Wikipedia Affiliate Button 
http://wikimediafoundation.org/wiki/Support_Wikipedia/en


Re: [Wikitech-l] Unicode equivalence

2009-11-30 Thread Gerard Meijssen
Hoi,
Given that we should be moving forward not backward, it makes more sense to
provide Unicode 5.1 characters and webfonts.

The big thing about MediaWiki was that it supported Unicode when this was still
a new thing to do. We should provide the latest and best Unicode
support.

NB this is not an issue that is problematic for Malayalam alone. Another
script that was updated was Devanagari, used for Hindi for instance. We
have a request for support for fonts for the Ge'ez script, used for
Amharic and a few others.
Thanks,
 GerardM



Re: [Wikitech-l] Unicode equivalence

2009-11-30 Thread Tim Starling
Praveen Prakash wrote:
 Is it possible to implement some method to tell server both characters 
 are same?? 

No.

 It is heard that more changes coming in future versions of 
 unicode. And now almost half of the data coming is in unicode 5.1 
 version. I am not sure about reverse converting.

That link you gave in your last post had a conversion table, it looks
pretty straightforward:

CHILLU NN -> NNA, VIRAMA, ZWJ
CHILLU N  -> NA, VIRAMA, ZWJ
CHILLU RR -> RA, VIRAMA, ZWJ
CHILLU L  -> LA, VIRAMA, ZWJ
CHILLU LL -> LLA, VIRAMA, ZWJ

The other new characters would remain unconverted.
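
As an illustration of that conversion table, here is a minimal sketch of the 5.1-to-5.0 direction using a character translation map. `TO_5_0` and `to_5_0` are hypothetical names; any character not in the table, including the other code points added in 5.1, passes through unchanged.

```python
NNA, NA, RA, LA, LLA = "\u0d23", "\u0d28", "\u0d30", "\u0d32", "\u0d33"
VIRAMA, ZWJ = "\u0d4d", "\u200d"

# 5.1 atomic chillu -> 5.0 sequence, following the conversion table.
TO_5_0 = str.maketrans({
    "\u0d7a": NNA + VIRAMA + ZWJ,  # CHILLU NN
    "\u0d7b": NA + VIRAMA + ZWJ,   # CHILLU N
    "\u0d7c": RA + VIRAMA + ZWJ,   # CHILLU RR
    "\u0d7d": LA + VIRAMA + ZWJ,   # CHILLU L
    "\u0d7e": LLA + VIRAMA + ZWJ,  # CHILLU LL
})


def to_5_0(text: str) -> str:
    """Expand each 5.1 chillu into its three-code-point 5.0 sequence."""
    return text.translate(TO_5_0)
```

Because each chillu is a single code point, a per-character translation is enough in this direction; the opposite direction has to match three-code-point sequences instead.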

Gerard Meijssen wrote:
 Hoi,
 Given that we should be moving forward not backward, it makes more sense to
 provide Unicode 5.1 characters and webfonts.
 
 The big thing of MediaWiki was that it supported Unicode when this was still
 a new thing to do. We should support the latest and the best Unicode
 support.

You did read the post didn't you? Forcing everyone to buy Windows 7 is
not generally the way we do things. Unless the client situation is not
as bad as it sounds, we will need to have a transition period where we
support older clients until their market share falls far lower than
50%, which is where, by Praveen's figures, it is now.

-- Tim Starling




Re: [Wikitech-l] Unicode equivalence

2009-11-30 Thread Praveen Prakash
The letter k (ൿ) was undefined prior to Unicode 5.1. To my knowledge, the
letters nta (ന്റ) and tta (റ്റ) need explicit OS support (?) to display,
which is not available yet (*Windows 7*??). I am sorry; the exclusion of these
letters was not intentional. I included the chillaksharams only because they
are causing most of the problems. Frankly, we have not faced any problems
caused by the other letters, but it is possible.

A popular transliteration tool for Malayalam typing (*Varamozhi*) and a popular
font (*Anjali OldLipi*) currently support Unicode 5.1 on Windows.
Recently (two or three days ago) Microsoft announced its own tool for
Malayalam typing, which also supports 5.1. Microsoft's default Malayalam
font, Karthika, now supports 5.1 as well. But IE6 does not support
Unicode 5.1 even with supporting fonts.

On Linux there are third-party input definitions for SCIM (Mozhi
and Inscript) and widely used default fonts (*Rachana*, *Meera*, etc.) that
have been altered for Unicode 5.1. No Linux distribution supports Unicode 5.1
for Malayalam by default, but this can be fixed with a Firefox extension,
altered tools and fonts, etc. Sometimes that needs technical knowledge.

Mac OS has no proper support for Malayalam by default, but some
Malayalis have fixed it, and the fix uses Unicode 5.1.

So the number of people using Unicode 5.1 is increasing day by day.

But there are a lot of people who believe that the new chillaksharam
definitions are grammatically incorrect and who are not ready to switch. I
personally am not with them. Do I need to invite some people from both sides
here? I'm afraid none of them is ready to reach an agreement. :-( We have been
discussing this problem since before the release of Unicode 5.1 (more than
2 years), and it is still not solved.

Unicode once implemented these atomic chillaksharams (I think in 2004)
and then removed them in the next version pending further discussion. This is
the second inclusion of those characters. It doesn't look like those characters
will be at risk in future versions.

This is the current state of Malayalam computing. So I thought it would be
better to implement equivalence than to switch to a particular version. If
that is not possible, switching as Tim suggested would be appreciated. I
prefer 5.1, which is the future.


