Yes, this is a trimmed-down version of Padma, a generic Indic transliteration
tool. Thanks for the comments, though.

Terry Reedy wrote:
> "Vyz" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > It's a module to transliterate Telugu written in roman script into
> > native unicode. Right now it's running in a browser window at
> > www.lekhini.org. I intend to translate it into python so that I can
> > integrate it into other tools I have. Being able to pass arguments to
> > the script and get output from it would also be OK. Or how about ways
> > to wrap these javascript functions with python?
>
> Leaving aside the code that manipulates the display and user interaction,
> the code should be pretty straightforward logic (if-else statements) and
> table lookups, so translation to Python should be straightforward also.
>
> I checked parser.js. I don't know javascript but it looks to me like a
> mixture of C and Python. The for loop headers have to be rewritten, and
> the switch changed to if-elif. What looks different is that method
> functions are attached as attributes to functions rather than to classes.
>
> As for 'wrapping': can you get a standard javascript interpreter? If so,
> you could possibly adjust the js so you can pipe a roman string to the js
> program and have it pipe back the Telugu unicode version.
>
> >> > I have a script with hundreds of lines of javascript spread across 7
> >> > files. Is there any tool out there to automatically or
> >> > semi-automatically translate the code into python?
>
> unicode.js is mostly a few hundred verbose lines like
>
> Unicode.codePoints[Padma.lang_TELUGU].letter_PHA = "\u0C2B";
>
> that set up the translation dict. Because the object model is different, I
> suspect that these all need to be changed, but, I also suspect, in a
> mechanical way.
>
> If one were starting in Python, one might either just define a dict more
> compactly like
> TEL_uni = {'letter_PHA': u'\u0C2B', ...}
> *or* probably better, use the builtin unicodedata module as much as
> possible.
>
> >>> import unicodedata as u
> >>> pha = u.name(u'\u0c2b')
> >>> pha
> 'TELUGU LETTER PHA'
> >>> u.lookup(pha)
> u'\u0c2b'
>
> I don't know what you do with js statements like this:
> Unicode.toPadma[Unicode.codePoints[Padma.lang_TELUGU].misc_VIRAMA +
> Unicode.codePoints[Padma.lang_TELUGU].letter_KA] = Padma.vattu_KA;
> where a constant seems to be assigned to a sum. But whatever these do
> might correspond to the u.normalize function.
>
> This appears to be based on a generic Indian-script transliteration program
> (Padma), so there may be functions not really needed for Telugu. (I am
> familiar with Devanagari but know nothing of Telugu and its script except
> that it is Dravidian rather than Indo-European-Sanskritic.)
>
> Good luck.
>
> Terry Jan Reedy
--
http://mail.python.org/mailman/listinfo/python-list
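
For what it's worth, the two suggestions in the quoted message (a compact
dict, and unicodedata for the character names) can be combined. Below is a
minimal sketch in Python 3, not the actual Padma code. The names NAMES,
TEL_uni and to_padma, and the placeholder value "vattu_KA", are assumptions
made up for illustration. It also assumes the code points are stored as
strings, as in the letter_PHA line, in which case the "sum" in the toPadma
statement is a string concatenation that forms a two-character lookup key.

```python
import unicodedata

# Short names in the style of unicode.js, mapped to official Unicode names.
# Only three entries are shown; this mapping is an illustration, not Padma's.
NAMES = {
    "letter_KA": "TELUGU LETTER KA",
    "letter_PHA": "TELUGU LETTER PHA",
    "misc_VIRAMA": "TELUGU SIGN VIRAMA",
}

# Let unicodedata supply the code points instead of typing "\u0C2B" by hand.
TEL_uni = {short: unicodedata.lookup(name) for short, name in NAMES.items()}

# Python analogue of the toPadma statement: the key is the two-character
# string VIRAMA + KA, and the value is a stand-in for Padma.vattu_KA.
to_padma = {
    TEL_uni["misc_VIRAMA"] + TEL_uni["letter_KA"]: "vattu_KA",
}

if __name__ == "__main__":
    for short, char in sorted(TEL_uni.items()):
        print(short, "U+%04X" % ord(char), unicodedata.name(char))
```

If that reading of the js is right, the statement is an ordinary dict
assignment with a compound key and no normalization is involved, though the
real unicode.js would have to be checked to be sure.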