Re: [Ilugc] counting tamil characters with python

2010-11-26 Thread Bharathi Subramanian
On Wed, Nov 24, 2010 at 1:43 PM, Kenneth Gonsalves wrote: > I am not counting syllables, I am counting characters in the > conventional sense of the word. The contents stored in a file/array are character encoding values, and the actual characters or images (glyphs) shown on the screen are complete
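
To illustrate the distinction drawn here between stored encoding values and what is displayed, a minimal sketch follows. It assumes Python 3 (the thread itself used Python 2), and the helper name describe_code_points is hypothetical, made up for this example. It only lists the code points stored for one displayed Tamil character.

```python
import unicodedata


def describe_code_points(text):
    """Return (code point, Unicode name) pairs for every code point in text.

    Hypothetical helper for illustration; not part of the code posted in
    this thread.
    """
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN")) for ch in text]


# One displayed character, stored as two code points:
# the letter KA followed by the vowel sign I.
ki = "\u0b95\u0bbf"

print(len(ki))  # number of stored code points, not displayed characters
for code_point, name in describe_code_points(ki):
    print(code_point, name)
```

So len() on a Python 3 string counts stored code points, which is why a plain length does not match the number of characters a reader sees.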

Re: [Ilugc] counting tamil characters with python

2010-11-23 Thread Kenneth Gonsalves
On Tue, 2010-11-23 at 20:03 -0800, Santhosh Thottingal wrote: > >if this is not readable in your mail client, the code is here: > > > >http://bitbucket.org/lawgon/tamtrans/src/21197e0f1388/syllcount.py > > Number of syllables is not the number of characters excluding > vowels. > For example

Re: [Ilugc] counting tamil characters with python

2010-11-23 Thread Kenneth Gonsalves
On Wed, 2010-11-24 at 10:28 +0530, Arun Venkataswamy wrote: > > hi, > > > > in context of the discussion on counting tamil characters, here is > one > > solution: > > > > > Hi Kenneth, > > I am not good at Python, but does your code handle vowels coming in > the > front? That is, vowels used in th

Re: [Ilugc] counting tamil characters with python

2010-11-23 Thread Santhosh Thottingal
On Tue, 23 Nov 2010 19:29:32 -0800 Kenneth Gonsalves wrote >On Wed, 2010-11-24 at 08:50 +0530, Kenneth Gonsalves wrote: >> in context of the discussion on counting tamil characters, here is one >> solution: > >if this is not readable in your mail client, the code is here: > >htt

Re: [Ilugc] counting tamil characters with python

2010-11-23 Thread Kenneth Gonsalves
On Wed, 2010-11-24 at 08:50 +0530, Kenneth Gonsalves wrote: > in context of the discussion on counting tamil characters, here is one > solution: if this is not readable in your mail client, the code is here: http://bitbucket.org/lawgon/tamtrans/src/21197e0f1388/syllcount.py -- regards KG http:/

[Ilugc] counting tamil characters with python

2010-11-23 Thread Kenneth Gonsalves
hi,

in context of the discussion on counting tamil characters, here is one solution:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import codecs

def countsyll(instring):
    s = codecs.utf_8_encode(instring)
    x = codecs.utf_8_decode(s[0])[0]
    syllen = 0
    vowels = [u'\u0bbe',u'\u0bb
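
The posted code is cut off in this archive view, so what follows is a separate minimal sketch, not KG's original. It assumes Python 3, and the name count_tamil_letters is hypothetical. Rather than keeping a hand-written list of vowel signs, it asks the unicodedata module for each code point's general category and counts only Tamil-block letters (category "Lo"), so the dependent vowel signs and the pulli (categories "Mc" and "Mn") are skipped. Whether this matches "characters in the conventional sense" is an assumption: a consonant carrying a pulli counts as one, and a conjunct such as க்ஷ counts as two.

```python
import unicodedata

# Tamil Unicode block, U+0B80 through U+0BFF.
TAMIL_BLOCK = range(0x0B80, 0x0C00)


def count_tamil_letters(text):
    """Count Tamil base letters, skipping vowel signs and the pulli.

    Hypothetical helper for illustration. Independent vowels and
    consonants have general category "Lo"; dependent vowel signs and
    the pulli are combining marks ("Mc" or "Mn") and are not counted.
    Non-Tamil characters, digits and whitespace are ignored.
    """
    count = 0
    for ch in text:
        if ord(ch) in TAMIL_BLOCK and unicodedata.category(ch) == "Lo":
            count += 1
    return count


# Five code points (three letters, one vowel sign, one pulli).
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # தமிழ்
print(len(word), count_tamil_letters(word))
```

On the question of vowels coming at the front of a word: independent vowels such as அ are letters in their own right (category "Lo"), so this approach counts them, while only the dependent vowel signs are skipped.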