[Apertium-stuff] [GSoC 2014] Improving support for non-standard text input

2014-03-17 Thread Roque Lopez
Hello Apertium team, First, congratulations on the great work on the Apertium project. I hope to be part of your team this summer. I am Roque López, a master's student in Computer Science at the University of São Paulo (USP), Brazil. I have a strong interest in Natural Language Processing, for this reason

[Apertium-stuff] [GSOC] Code/Improving support for non-standard text input

2014-03-17 Thread Aymen B.S
Hello, Project idea: Code/Improving support for non-standard text input As far as I could understand, there is no indication for a specific target language. Should I assume it is English? Otherwise, I'm majoring in computer science at Pennsylvania State University and last semester I had an NLP

Re: [Apertium-stuff] [GSOC] Unify the metadix formats Queries (Mikel Forcada)

2014-03-17 Thread Gaurav Agrawal
Hello Mikel, Basic understanding of the project: *Problem with the current scenario:* 1. The XSLT and other pre-processing need to be applied manually to the meta dictionary files before we actually compile the dictionary files. 2. The XSLT files vary from language pair to language pair. Expected: 1. T

Re: [Apertium-stuff] GSoC Project

2014-03-17 Thread Gabriel Gregori Manzano
Hi Andrei, I've read your application and it looks very promising. The only point I would make is to take into account both small and large text translations when profiling. I believe the size of the text to translate can make a difference when making optimization decisions. About the br

Re: [Apertium-stuff] GSOC Idea: Take a language pair and make it state of the art

2014-03-17 Thread Francis Tyers
On Mon, 17 Mar 2014 at 13:21 -0700, Alex Aruj wrote: > Hello Fran and list, > > > Thank you for your responses. Regarding the first topic in my last > e-mail about "visualization" of coverage and quality, I found a graph > on the wiki (http://wiki.apertium.org/wiki/File:Wikipedi

Re: [Apertium-stuff] GSOC Idea: Take a language pair and make it state of the art

2014-03-17 Thread Francis Tyers
On Mon, 17 Mar 2014 at 21:54 +0100, Xavi Ivars wrote: > Hi Alex, > > > What are the missing steps in order for it to produce a > translation (I entered "tiny" as a suitable translation and > for the purpose of practice)? > > > > > > > You

Re: [Apertium-stuff] GSOC Idea: Take a language pair and make it state of the art

2014-03-17 Thread Xavi Ivars
Hi Alex, > > What are the missing steps in order for it to produce a translation (I > entered "tiny" as a suitable translation and for the purpose of practice)? You need to "install" the new version of the dictionaries (with your additions) or call apertium with the current directory aperti

[Apertium-stuff] Need of Number of Variable Part in the Metadix

2014-03-17 Thread Gaurav Agrawal
Hello all, I am working on the Unify the Metadix format project. Unhammer suggested a point: along with unifying the metadix formats, we may also check whether the current metadix format is good enough for us or needs some evaluation. Presently we have a single variable part of the lem

Re: [Apertium-stuff] GSOC Idea: Take a language pair and make it state of the art

2014-03-17 Thread Alex Aruj
Hello Fran and list, Thank you for your responses. Regarding the first topic in my last e-mail about "visualization" of coverage and quality, I found a graph on the wiki (http://wiki.apertium.org/wiki/File:Wikipedia-n-zipf.png) that could spark some ideas about how to illustrate how effective Ape

Re: [Apertium-stuff] Improving support for non-standard text input

2014-03-17 Thread Akshay Minocha
Hi Saurabh, Thanks for the dataset review. You had suggested brb and bday. These words are already in the abbreviation list I suggested earlier in the proposal. Since the most frequently used abbreviations are unique (non-repetitive), e.g., brb is "be right back", and this is the

Re: [Apertium-stuff] Improving support for non-standard text input

2014-03-17 Thread Francis Tyers
Well, we can definitely see that en-eo has a bad time translating Turkish: @*berryhuckle *yok *ben *sana *faceten ĉeıyıım *sen *yine *çok *güzel *çıkm.ı*şsın *ben *yine *kötü *çıkm.ı*şım *ama *olsun *fotoğ*rafımızın *olması *iyi *bişeyy :)) :) F. On Mon, 17 Mar 2014 at 14:18 -0400,

Re: [Apertium-stuff] Improving support for non-standard text input

2014-03-17 Thread Saurabh Hota
Hi, I have gone through the archives, and Akshay has a good data set of shortened words which can be used to learn which vowels are dropped. We also have to note that abbreviations and shortened forms are different, like brb -> be right back and bday -> birthday. So we have to handle them separately. And

Re: [Apertium-stuff] GSoC Project

2014-03-17 Thread Andrei Sfrent
Hi, Following Francis' suggestion, I decided to apply for the VM-for-transfer project. I started to write my application [1]; any feedback would be greatly appreciated. I also included a proof of concept in the application [2] in which I started to analyze the code and hack different parts of it.

Re: [Apertium-stuff] Improving support for non-standard text input

2014-03-17 Thread Francis Tyers
On Mon, 17 Mar 2014 at 13:46 +0530, Saurabh Hota wrote: > Hi Francis > > > I am Saurabh, a fourth-year undergraduate student majoring in Computer > Science > at Indian Institute of Technology. I am interested in working on > improving support > for non-standard words (NSW). > > >

Re: [Apertium-stuff] [GSOC] Unify the metadix formats Queries (Mikel Forcada)

2014-03-17 Thread Gaurav Agrawal
Hello Jimmy, Sorry for missing this case. I will handle it, update the file on GitHub, and inform you. Can you please also share your comments on my understanding of the project? http://wiki.apertium.org/wiki/User:Ergaurav3/Discussion_Gsoc_Unify_Metadix Thanks and Regards, *Gaurav Agr

Re: [Apertium-stuff] Coding challenge - Apertium in chat clients

2014-03-17 Thread Francis Tyers
Can you please supply a Makefile for Linux? :) Fran On Mon, 17 Mar 2014 at 12:23 +0530, Sachin Shastri wrote: > Hello, I am almost done with the plugins to be made for chat clients, > but I wanted to post one of them first, to make sure I have understood > the problem statement

[Apertium-stuff] Improving support for non-standard text input

2014-03-17 Thread Saurabh Hota
Hi Francis, I am Saurabh, a fourth-year undergraduate student majoring in Computer Science at Indian Institute of Technology. I am interested in working on improving support for non-standard words (NSW). I have read some papers and have a vast collection of general tweets; from that I have observed th

Re: [Apertium-stuff] GSOC proposal : accent and diacritic restoration

2014-03-17 Thread Sarthak Nandi
Hi, I checked out charlifter, and it's already respecting the superblanks. Have I misunderstood, or is there something more to the coding challenge? Thanks, Sarthak On Mon, Mar 17, 2014 at 12:33 AM, Sarthak Nandi wrote: > Hi guys, > > Hello, > I am a student doing B.Tech from VIT Vellor