Gora and others,

I have been using Hindi typing on Linux quite a bit recently. Needless to say, it is really good, and far superior to anything I have seen on Windows. I have two small problems that you may be able to help with.
First, I still have some trouble with keyboard layouts. I use the phonetic layout most of the time, but I still do not know where some of the letters are. It would be useful if somebody could pass along an image file that shows the keyboard layout.

Second, gnome-terminal does not show Hindi text correctly. I use mutt for my e-mail and run it in gnome-terminal, and Hindi text is not readable there. The half letters are shown only with a "halant", and the "matras" are also not placed correctly. How does one extend Hindi support to gnome-terminal?

Vikas

On Wed, Jun 27, 2007 at 10:53:11AM +0530, Gora Mohanty wrote:
> On Tue, 2007-06-26 at 22:05 +0100, Vivek Rai wrote:
> > > (c) Vastly improve existing spell-checking dictionaries, as these
> > >     rules are not of much use without an adequate dictionary. One
> > >     way is to have someone type in dictionaries that are out of
> > >     copyright.
> > I understand aspell is based on a list of valid words, right? How about
> > this as a quick way to generate a starting version of such a list?
> >
> > a) store any public domain text in hindi/oriya in a text file. select
> >    inputs with reliable spellings.
> > b) create a script that splits the file into words on different lines,
> >    does sort -u, and then finally generates a sorted list.
> > c) each of us run this script on any public domain local language text
> >    that we find, and upload our lists to the indlinux website.
> >
> > a master list can then be built from this?
>
> We have done broadly what you suggest on several texts, and along with
> input from other sources, we now have a pretty comprehensive list of
> Hindi words, which should number about 30-40K by now. The problem is
> to have it proof-read. At the time of proof-reading, we should also
> have people add affix information for aspell. I will post a note about
> this soon, and we can talk about a web interface to let people easily
> do the proof-reading, and add affix information.
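
For reference, the word-list script described in step (b) above could be sketched roughly as follows. This is only a minimal sketch, not the script that was actually used on the list. It assumes the input is UTF-8 text, and that a "word" is any run of characters from the Devanagari block (U+0900 to U+097F) with the danda signs, the Devanagari digits and the abbreviation sign left out. The function name unique_words is made up for illustration. Oriya text would need a different range.

```python
import re
import sys
import unicodedata

# Assumption: a word is a run of Devanagari characters. U+0964-U+0970
# (danda, double danda, digits, abbreviation sign) are excluded so that
# punctuation and numbers do not end up in the word list.
DEVANAGARI_WORD = re.compile(r"[\u0900-\u0963\u0971-\u097F]+")


def unique_words(text):
    """Return the distinct Devanagari words in text, sorted by code point."""
    # Normalise first so that equivalent encodings of a word compare equal.
    text = unicodedata.normalize("NFC", text)
    return sorted(set(DEVANAGARI_WORD.findall(text)))


if __name__ == "__main__":
    # Merge the words from every file named on the command line and
    # print one word per line, much like piping through "sort -u".
    words = set()
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            words.update(unique_words(handle.read()))
    for word in sorted(words):
        print(word)
```

The output is sorted by Unicode code point, which is not the same as Hindi dictionary order, and zero-width joiners inside words are treated as word breaks. Both would need attention before the list is used for anything beyond proof-reading.
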
>
> The other thing I am having a summer intern work on is building a page
> scraper in Python that will crawl web pages, and grab text within a
> specified Unicode range. As this would have to be proof-read for
> validity, I see this as being more beneficial for (a) getting an idea
> of common mis-spellings, (b) building a corpus in various domains, and
> (c) as a snapshot of how the language evolves, and how new words come
> in. Maybe tie this to Newsrack (http://newsrack.in).
>
> > this wouldn't be perfect, and we will still have to manually keep on
> > adding to it, sorting any misspellings from our sources etc, but i
> > think this could still be a good start.
> >
> > Going forward, I would also suggest that we now converse more in Hindi
> > on this mailing list, so that we have as much Hindi text available as
> > possible.
>
> You are right, but there are several difficulties with this. For one,
> my Hindi is weak, so it takes me longer to write in Hindi. Many people
> have even more difficulty than I do. So my suggestion would be that
> people write in the language of their choice, and replies may come in
> another language as well.
>
> Regards,
> Gora
>
>
> _______________________________________________
> ilugd mailinglist -- ilugd@lists.linux-delhi.org
> http://frodo.hserus.net/mailman/listinfo/ilugd
> Archives at: http://news.gmane.org/gmane.user-groups.linux.delhi
> http://www.mail-archive.com/ilugd@lists.linux-delhi.org/