Re: New Member to LT - for Tamil

2014-07-29 Thread Panagiotis Minos
Hello, On Mon, Jul 14, 2014 at 11:02 AM, Daniel Naber < daniel.na...@languagetool.org> wrote: > On 2014-07-14 09:12, Elanjelian Venugopal wrote: > > > Hi, have installed JDK 1.8.0_05 and tested. No changes. :( > > Mhh, it seems we need to document this workaround. Panagiotis, do you > see any way

Re: New Member to LT - for Tamil

2014-07-21 Thread Elanjelian Venugopal
I had earlier built the standalone with -DskipTests. Anyway, I found the error: my bad. Please include the following in the tagger dictionary: செய்ய செய் VAN For the documentation, this explains the one tag added so far: VAN வினையெச்சம் - Passive participle (verbal or adjectival): வர,
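For readers following along, here is a minimal sketch of how an entry like the one quoted above could be sanity-checked before it is sent in. It assumes the plain-text tagger dictionary uses three tab-separated columns (inflected form, lemma, POS tag); that layout is an assumption here, not something this thread confirms. The names `TaggerEntry` and `parse_tagger_lines` are hypothetical helpers and not part of LanguageTool.

```python
from typing import List, NamedTuple


class TaggerEntry(NamedTuple):
    """One dictionary line: inflected form, lemma, POS tag (assumed layout)."""
    form: str
    lemma: str
    tag: str


def parse_tagger_lines(text: str) -> List[TaggerEntry]:
    """Parse tab-separated dictionary lines and reject malformed ones."""
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(parts)}")
        entries.append(TaggerEntry(*parts))
    return entries


# The single entry mentioned in the message, written with explicit tabs.
SAMPLE = "செய்ய\tசெய்\tVAN\n"
entries = parse_tagger_lines(SAMPLE)
print(entries[0].form, entries[0].lemma, entries[0].tag)
```

A check like this would catch entries whose tabs were lost in transit, which is easy to miss by eye in a mail client.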

Re: New Member to LT - for Tamil

2014-07-21 Thread Daniel Naber
On 2014-07-21 17:59, Elanjelian Venugopal wrote: > Hi, thanks for the guidance. Got the rules working now. The revised > grammar.xml is attached. Did you run the tests? If so, how? I'm asking because I'm getting this error: Did expect one error in: "அவர் அக்காரியத்தை உடனே செய்ய சொன்னார்." BTW,

Re: New Member to LT - for Tamil

2014-07-21 Thread Elanjelian Venugopal
Hi, thanks for the guidance. Got the rules working now. The revised grammar.xml is attached. As to documenting the tags, do you mean, for example, creating a file like tagset.txt for English that has the following documentation: CC Coordinating conjunction: and, or, either, if, as, since, once, ne

Re: New Member to LT - for Tamil

2014-07-21 Thread Daniel Naber
On 2014-07-21 14:39, Elanjelian Venugopal wrote: > I've created another set of rules for Tamil that is to be based on the > POS tagging. -e. I've added the tagger and the rules. The rules don't work yet, so they are commented out for now. You can use the "-v" option of the command line LT to s

Re: New Member to LT - for Tamil

2014-07-21 Thread Elanjelian Venugopal
Hi, I've created another set of rules for Tamil that is to be based on the POS tagging. -e. (([ஃ-ஹ][ா-்]?)+)[ா-்] க[ா-ௌ]?(([க-ஹ][ா-்]?)+)? 'போக', 'வர', 'படிக்க' போன்ற (செ(ய்)ய' என்னும்) வினையெச்சங்களின் பின் ஒற்று மிகும்.\1க் \2 என்பதே

Re: New Member to LT - for Tamil

2014-07-21 Thread Elanjelian Venugopal
Hi Daniel, No I didn't build a binary. I followed the example here: http://wiki.languagetool.org/developing-a-tagger-dictionary#toc2 and created them manually. I am trying to get hold of a POS tagger for Tamil that is open sourced; will keep you posted if it comes through. -e. On 21 July 2014 2

Re: New Member to LT - for Tamil

2014-07-21 Thread Daniel Naber
On 2014-07-21 13:30, Elanjelian Venugopal wrote: Hi Elanjelian, > I've created a first draft of the binary file for a particular type of > inflection for Tamil. Hopefully it is correct. Tamil is highly > inflected; so there would be thousands of word forms that need to be > added with different p

Re: New Member to LT - for Tamil

2014-07-15 Thread Daniel Naber
On 2014-07-15 20:01, Elanjelian Venugopal wrote: > Now, having played around with the rules, it is quite clear now that I > need to use pos tagging to make life easy. How do I implement it? It's documented here: http://wiki.languagetool.org/developing-a-tagger-dictionary#toc2 The languages we sup

Re: New Member to LT - for Tamil

2014-07-15 Thread Elanjelian Venugopal
Now, having played around with the rules, it is quite clear now that I need to use pos tagging to make life easy. How do I implement it? I looked at the English language files - but there are so many docs in its resource folder, I'm not sure where to start. I have for example a tagged word list of

Re: New Member to LT - for Tamil

2014-07-15 Thread Elanjelian Venugopal
Hi Dominique! On 14 July 2014 17:03, Dominique Pellé wrote: > replace... > > புது மா கோலம் போடு > மயிலே. > > ... with... > > புது மா > கோலம் போடு மயிலே. > > The corrections are verified when running tests which > sometimes catches mistakes in the rule. > I think this would only work when the s

Re: New Member to LT - for Tamil

2014-07-15 Thread Daniel Naber
On 2014-07-15 13:15, Elanjelian Venugopal wrote: > Tamil is not case sensitive; so rule 149 doesn't apply. The rest > apply. I have just activated them for Tamil. Regards Daniel

Re: New Member to LT - for Tamil

2014-07-15 Thread Elanjelian Venugopal
On 15 July 2014 16:24, Daniel Naber wrote: > -Translate the user interface. You should have received an invitation to > the translation tool. Most languages have a translation of the user > interface, but there's no technical reason that you have to make a > translation. For example, if almost al

Re: New Member to LT - for Tamil

2014-07-15 Thread Daniel Naber
On 2014-07-14 14:31, Elanjelian Venugopal wrote: > I've created the following. Not really sure if I'm on the right track. The next steps for Tamil support will be: -Translate the user interface. You should have received an invitation to the translation tool. Most languages have a translation of

Re: New Member to LT - for Tamil

2014-07-14 Thread Daniel Naber
On 2014-07-14 14:31, Elanjelian Venugopal wrote: > I've created the following. Not really sure if I'm on the right track. I have added it to the source code. I think it doesn't work yet, as you'll need at least one rule with break="yes", otherwise there will be no sentence boundaries at all. On
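The point about needing at least one breaking rule can be illustrated with a simplified model of SRX-style segmentation. The sketch below is not LanguageTool's segmenter: it assumes only that each rule has a "before" pattern, an "after" pattern and a break flag, and that the first rule matching at a position decides the outcome. The `Rule` type, the function name and the "திரு." abbreviation rule are hypothetical and chosen for illustration.

```python
import re
from typing import List, NamedTuple


class Rule(NamedTuple):
    is_break: bool  # True corresponds to break="yes", False to break="no"
    before: str     # regex that must match text ending at the position
    after: str      # regex that must match text starting at the position


def split_sentences(text: str, rules: List[Rule]) -> List[str]:
    """Split text at positions where the first matching rule is a break rule."""
    sentences = []
    start = 0
    for pos in range(1, len(text)):
        for rule in rules:
            before_ok = re.search(f"(?:{rule.before})\\Z", text[:pos])
            after_ok = re.match(rule.after, text[pos:])
            if before_ok and after_ok:
                if rule.is_break:
                    sentences.append(text[start:pos])
                    start = pos
                break  # first matching rule wins, whether it breaks or not
    sentences.append(text[start:])
    return sentences


RULES = [
    Rule(False, r"திரு\.\s+", r"\S"),  # exception: no break after the abbreviation
    Rule(True, r"[.?!]\s+", r"\S"),    # break after sentence-final punctuation
]
TEXT = "திரு. ராமன் வந்தார். நான் போனேன்."

print(split_sentences(TEXT, RULES))      # two sentences
print(split_sentences(TEXT, RULES[:1]))  # only a no-break rule: one sentence
```

With only the exception rule present, the whole text comes back as a single sentence, which matches the behaviour described in the message.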

Re: New Member to LT - for Tamil

2014-07-14 Thread Elanjelian Venugopal
On 11 July 2014 22:29, Daniel Naber wrote: > > Then, we will need to add a sentence tokenizer that detects sentence > boundaries. This is described at > http://wiki.languagetool.org/customizing-sentence-segmentation-in-srx-rules > . > Can you work with that? > I've created the following. Not rea

Re: New Member to LT - for Tamil

2014-07-14 Thread Marcin Miłkowski
On 2014-07-14 09:12, Elanjelian Venugopal wrote: > Hi, have installed JDK 1.8.0_05 and tested. No changes. :( > > And, BTW, how do I push my changes to grammar.xml back to you? It > appears I don't have sufficient permission to push it to the master. -e. Use a pull request. Regards, Marcin > >

Re: New Member to LT - for Tamil

2014-07-14 Thread Dominique Pellé
Elanjelian Venugopal wrote: > Added a second rule group to the grammar.xml Hi Elanjelian Since you have non-trivial suggestions with \1 etc. such as \1க்\2, I would advise using correction='...'. E.g., replace... புது மா கோலம் போடு மயிலே. ... with... புது மா கோலம் போடு மயிலே. The corrections

Re: New Member to LT - for Tamil

2014-07-14 Thread Elanjelian Venugopal
Added a second rule group to the grammar.xml On 14 July 2014 16:02, Daniel Naber wrote: > On 2014-07-14 09:12, Elanjelian Venugopal wrote: > > > Hi, have installed JDK 1.8.0_05 and tested. No changes. :( > > Mhh, it seems we need to document this workaround. Panagiotis, do you > see any way we

Re: New Member to LT - for Tamil

2014-07-14 Thread Daniel Naber
On 2014-07-14 09:12, Elanjelian Venugopal wrote: > Hi, have installed JDK 1.8.0_05 and tested. No changes. :( Mhh, it seems we need to document this workaround. Panagiotis, do you see any way we can have our own workaround so the user doesn't need to do anything? > And, BTW, how do I push my c

Re: New Member to LT - for Tamil

2014-07-14 Thread Elanjelian Venugopal
Hi, have installed JDK 1.8.0_05 and tested. No changes. :( And, BTW, how do I push my changes to grammar.xml back to you? It appears I don't have sufficient permission to push it to the master. -e. On 13 July 2014 20:33, Daniel Naber wrote: > On 2014-07-13 13:22, Panagiotis Minos wrote: > > >

Re: New Member to LT - for Tamil

2014-07-13 Thread Daniel Naber
On 2014-07-13 13:22, Panagiotis Minos wrote: > There has been a bug report about this issue for more than a year, see > https://bugs.openjdk.java.net/browse/JDK-8008572 [1] The bug report says Java 7 is affected - does it maybe work on Java 8 without a work-around? Regards Daniel

Re: New Member to LT - for Tamil

2014-07-13 Thread Panagiotis Minos
This issue is specific to the Windows version of Java (tested on my Windows 7 PC) and doesn't affect my Linux system. There has been a bug report about this issue for more than a year, see https://bugs.openjdk.java.net/browse/JDK-8008572 I followed the instructions and created a fontconfig.properties (see

Re: New Member to LT - for Tamil

2014-07-13 Thread Elanjelian Venugopal
When I changed the main.java file in the standalone folder as stated below: static final String HTML_FONT_START = ""; static final String HTML_FONT_END = ""; static final String HTML_GREY_FONT_START = ""; The Tamil fonts are displayed in the results area, but the dialog and menu areas are s

Re: New Member to LT - for Tamil

2014-07-13 Thread Elanjelian Venugopal
On 13 July 2014 15:46, Panagiotis Minos wrote: > > There is no need to modify the GUI or include the font, as long as the > font exists on the system. My Ubuntu 14.04 has the ttf-indic-fonts-core > package and I can see the glyphs (see attached screenshot). > Hi, Looks good. It uses Lohit Tamil.

Re: New Member to LT - for Tamil

2014-07-12 Thread Daniel Naber
On 2014-07-12 10:15, Elanjelian Venugopal wrote: > Tamil doesn't yet have a good universal font. Nonetheless, the > following three could be considered the standard. At least one of them > will be present in most machines: Panos is the expert for the GUI, I hope he has an idea how to cleanly imp

Re: New Member to LT - for Tamil

2014-07-12 Thread Daniel Naber
On 2014-07-12 03:27, Elanjelian Venugopal wrote: > The menu item only affects the input box. It doesn't affect the output > box and the dialogs. It also doesn't have an effect on the options box > either. Screenshot: > https://github.com/tamiliam/languagetool/blob/master/languagetool-standalone/scr

Re: New Member to LT - for Tamil

2014-07-11 Thread Elanjelian Venugopal
The menu item only affects the input box. It doesn't affect the output box or the dialogs. It also doesn't have an effect on the options box. Screenshot: https://github.com/tamiliam/languagetool/blob/master/languagetool-standalone/screenshot.jpg On 12 July 2014 01:13, Daniel Naber wrote:

Re: New Member to LT - for Tamil

2014-07-11 Thread Daniel Naber
On 2014-07-11 18:02, Elanjelian Venugopal wrote: > Compiled and tested. Works. Except the UI fonts need to be changed. Is > there an easy way to change them? The screenshot attached for > reference. There's a menu item to change the font, but I'm not sure if it affects dialogs. The string in thi

Re: New Member to LT - for Tamil

2014-07-11 Thread Daniel Naber
On 2014-07-11 16:57, Elanjelian Venugopal wrote: > This is not that clear to me. Does that mean I could only use the 26 > types that are stated between lines 117 and 142 in the rules.xsd file? Exactly. Regards Daniel

Re: New Member to LT - for Tamil

2014-07-11 Thread Elanjelian Venugopal
On 11 July 2014 22:29, Daniel Naber wrote: > > -type="இலக்கண_அமைப்பில்_சொற்கள்" -> type can only have a limited set of > values. These are defined in rules.xsd in line 117ff. > This is not that clear to me. Does that mean I could only use the 26 types that are stated between lines 117 and 142 in

Re: New Member to LT - for Tamil

2014-07-11 Thread Daniel Naber
On 2014-07-11 14:58, Elanjelian Venugopal wrote: > The latest grammar.xml file is available here: > > https://github.com/tamiliam/languagetool/tree/master/languagetool-language-modules/ta/src/main/resources/org/languagetool/rules/ta I have just added initial support for Tamil. Note that I had to

Re: New Member to LT - for Tamil

2014-07-11 Thread Elanjelian Venugopal
The latest grammar.xml file is available here: https://github.com/tamiliam/languagetool/tree/master/languagetool-language-modules/ta/src/main/resources/org/languagetool/rules/ta On 11 July 2014 19:21, Daniel Naber wrote: > On 2014-07-11 12:55, Elanjelian Venugopal wrote: > > > May I simply pas

Re: New Member to LT - for Tamil

2014-07-11 Thread Daniel Naber
On 2014-07-11 12:55, Elanjelian Venugopal wrote: > May I simply pass you the grammar.xml for Tamil so it could be > included in the next nightly build? Yes, please send me the file. Regards Daniel

Re: New Member to LT - for Tamil

2014-07-11 Thread Elanjelian Venugopal
Hi Daniel, Thanks for your response. I managed to build LT from the source files on GitHub. But despite many tries, I still could not add Tamil and build it. I get this "NoClassDefFoundError" - and I don't know Java well enough to identify what is causing the error. May I simply pass you the grammar.xml for

Re: New Member to LT - for Tamil

2014-07-11 Thread Daniel Naber
On 2014-07-11 06:04, Elanjelian Venugopal wrote: > How is the 'mvn install' command different from 'mvn clean package'? > -e. It will also install the results (the JAR files) into your local Maven repository (~/.m2/repository under Linux). That's relevant if you later compile only parts of the

Re: New Member to LT - for Tamil

2014-07-10 Thread Elanjelian Venugopal
Hi, How is the 'mvn install' command different from 'mvn clean package'? -e. On 10 July 2014 18:19, Daniel Naber wrote: > On 2014-07-10 11:30, Elanjelian Venugopal wrote: > > Hi Elanjelian, > > welcome again to LanguageTool! > > > But when I tried to run mvn archetype:generate, it downloaded a

Re: New Member to LT - for Tamil

2014-07-10 Thread Daniel Naber
On 2014-07-10 11:30, Elanjelian Venugopal wrote: Hi Elanjelian, welcome again to LanguageTool! > But when I tried to run mvn archetype:generate, it downloaded a bunch > dependencies, I think, but ultimately failed to build. Message said > "Failed to Execute goal org.apache.maven " The prope

New Member to LT - for Tamil

2014-07-10 Thread Elanjelian Venugopal
Hi, I just joined this group, and am keen to add support for Tamil. As a first step, I tested a couple of simple regex-based rules, and they appear to work here: http://community.languagetool.org/ruleEditor2/ . Next, I have: 1) Downloaded the standalone LT for desktop; 2) Forked and created a l