Re: Inflecting second token with postag from the first

2016-09-13 Thread Jaume Ortolà i Font
2016-09-13 22:27 GMT+02:00 Andriy Rysin : > Sorry if this is already written somewhere - I looked at wiki pages but > could not find anything relevant. > > I have two tokens (first name and last name) and in the suggestion I want > to inflect second token the same as the first. I tried to do this:

Re: [pt-BR] Question about language tool home page.

2016-07-16 Thread Jaume Ortolà i Font
2016-07-16 21:43 GMT+02:00 Matheus Poletto : > Hi guys; > I was looking in https://www.languagetool.org so i found the different > versions of translations like to /pt/, /uk/ ... As im doing the pt-BR i > would like ask you if is possible to translate and create a pt-BR version, > cause some word

Re: Help creating rule pt_PT

2016-07-15 Thread Jaume Ortolà i Font
positives: > "Traduzir em sexo" > > I believe I really need to have the rule as I planned it. > > Do you have any suggestions or should I create two rules with "Latim" in > one? > > Thanks! > > Kind regards, > >Marco A.G.Pinto > --

Re: Help creating rule pt_PT

2016-07-15 Thread Jaume Ortolà i Font
Hi, Most languages have the postag NCMS000, including "Latim". Try: * * Regards, Jaume Ortolà 2016-07-15 12:45 GMT+02:00 Marco A.G.Pinto : > Hello! > > I am trying to create the following rule: > traduzir *em *LANG -> traduzir *para *LANG > (translate TO LANG) > > ** > ** > * * >

Re: bulk corrections in Wikipedia using LT

2016-06-30 Thread Jaume Ortolà i Font
2016-06-30 13:02 GMT+02:00 Juan Martorell : > Great job, Jaume! > > However I found some too-greedy corrections in change_always.txt for > Spanish: > > "esta formada" > "esta constituida" > > Recently, the rule excluded the diacritic tilde for referrers, so to speak: > > from: > "Quedé con la abog

Re: bulk corrections in Wikipedia using LT

2016-06-27 Thread Jaume Ortolà i Font
2016-06-27 16:12 GMT+02:00 Mike Unwalla : > > [2] > https://github.com/jaumeortola/cawiki-roofreading/blob/master/examples/example_Spanish.txt > > I get a 404 not found message. > Sorry, I deleted a character inadvertently. Try this: https://github.com/jaumeortola/cawiki-proofreading/blob/master

bulk corrections in Wikipedia using LT

2016-06-27 Thread Jaume Ortolà i Font
Hi, For some time now I have been using the results of LT analysis to make corrections in the Catalan Wikipedia. I have done almost a million edits. There are very different types of edits. Some are just typos fixed with simple “search and replace”, and others are LT rules that need more or less s

Re: Improving the rules from yesterday

2016-06-14 Thread Jaume Ortolà i Font
Try this: É pois [!.] Usar vírgula: \1, Será verdade? É pois! Regards, Jaume Ortolà 2016-06-14 11:32 GMT+02:00 Marco A.G.Pinto : > Hello Jaume, > > I was wondering if you could help me. > > I wanted to improve yes

Re: Need help creating rule

2016-06-13 Thread Jaume Ortolà i Font
Hi Marco, If you want to take into account every possibility (only one comma present, no comma at all) and always give the proper suggestion, you'll need to write different rules. One rule for: "é pois" é pois [,;:–—\(] Us

Re: Chrome extension update

2016-06-01 Thread Jaume Ortolà i Font
2016-06-01 12:51 GMT+02:00 Daniel Naber : > On 2016-06-01 10:36, Jaume Ortolà i Font wrote: > > Could we add this option? > > > > Assume this variety of Catalan: > > Should be done now in the Beta, could you please test it? > > > https://chrome.google.com/webs

Re: Chrome extension update

2016-06-01 Thread Jaume Ortolà i Font
Daniel, Could we add this option? Assume this variety of Catalan: Catalan = ca-ES Catalan (Valencian) = ca-ES-valencia Regards, Jaume Ortolà 2016-05-31 15:54 GMT+02:00 Daniel Naber : > On 2016-05-30 18:32, Daniel Naber wrote: > > > could everyone please test this update of our Chrome extensi

Re: new HTTP API with JSON output

2016-05-25 Thread Jaume Ortolà i Font
2016-05-25 15:35 GMT+02:00 Daniel Naber : > > A prototype of a new API is now online and can be tested here: > https://languagetool.org/http-api/swagger-ui/#/default -- please provide > feedback, this API is supposed to be stable for the next 10 years... It looks good. The tag should probably

ideas for English rules

2016-05-24 Thread Jaume Ortolà i Font
Hi, This document can provide ideas for new English rules: "Misused English words and expressions in EU publications" [1] Some rules should be quite straightforward: * with the aim to (do) > with the aim of (doing) * competences > powers, jurisdiction * Concerning.../ For what concerns... > with

Re: Spanish confusion rule

2016-05-13 Thread Jaume Ortolà i Font
2016-05-13 15:14 GMT+02:00 Juan Martorell : > > I have some examples where both can be used but can have different meaning > and lead to confusion: > > "No lo digo sino lo hago" =/= "No lo digo si no lo hago" > "Esto debe hacerse ahora, si no más tarde" =?= "Esto no debe hacerse > ahora, sino más

Re: Spanish confusion rule

2016-05-13 Thread Jaume Ortolà i Font
2016-05-13 13:58 GMT+02:00 Juan Martorell : > > Some like 'si no' <-> 'sino' usage, heavily dependent on semantics. > I don't think this case is dependent on semantics (leaving aside "sino" as a noun, which usually has a determiner). It depends on the structure of the sentence. The rules used

Re: ignoring certain tokens in rules

2016-05-06 Thread Jaume Ortolà i Font
relatively rare. Regards, Jaume Ortolà 2016-05-05 16:22 GMT+02:00 Jaume Ortolà i Font : > Hi, > > I think Marcin talked about this idea some time ago. > > Sometimes tokens like quotations (or other characters) should be ignored > in some rules. That is, the sentence should be

ignoring certain tokens in rules

2016-05-05 Thread Jaume Ortolà i Font
Hi, I think Marcin talked about this idea some time ago. Sometimes tokens like quotations (or other characters) should be ignored in some rules. That is, the sentence should be checked as if this token were not present. Any idea how it could be implemented? Alternatively, tokens like this on

Re: DictionaryExporter error

2016-04-18 Thread Jaume Ortolà i Font
2016-04-18 10:27 GMT+02:00 Juan Martorell : > Hi > > On 15 April 2016 at 14:27, Jaume Ortolà i Font > wrote: > >> >> This is fixed now. >> >> In the last update of these tools, I tried not to change the input and >> output formats. But the use

Re: Error trying to update French synthesizer dictionary

2016-04-15 Thread Jaume Ortolà i Font
Hi Dominique, This script can be helpful: https://github.com/Softcatala/catalan-dict-tools/blob/master/build-morfologik-lt.sh Regards, Jaume Ortolà 2016-04-15 22:33 GMT+02:00 Dominique Pellé : > Hi > > I'm trying to upgrade the French POS tag and synthesizer > after updating my script: > > >

Re: DictionaryExporter error

2016-04-15 Thread Jaume Ortolà i Font
Hi, Juan. This is fixed now. In the last update of these tools, I tried not to change the input and output formats. But the use of "*" as a separator was an unexpected choice. Regards, Jaume Ortolà 2016-04-15 11:51 GMT+02:00 Juan Martorell : > Hi, > > I created a script >

Re: Roadmap for Spanish

2016-04-06 Thread Jaume Ortolà i Font
2016-04-06 20:27 GMT+02:00 Marcin Miłkowski : > > > To transform one adjective into an adverb, in English you use the suffix > > `-ly` and in Spanish you use the suffix `-mente`: > > > > Equal --> equally > > Igual --> igualmente > > > > I found 18340 candidates for suffixation in the Spanish dict

Re: Roadmap for Spanish

2016-04-06 Thread Jaume Ortolà i Font
2016-04-06 14:55 GMT+02:00 Juan Martorell : > But more important are some derivatives, both suffixed and prefixed. > Hi Juan, I can tell you my experience in these points. > To transform one adjective into an adverb, in English you use the suffix > `-ly` and in Spanish you use the suffix `-men

Re: Roadmap for Spanish

2016-04-05 Thread Jaume Ortolà i Font
2014-06-06 20:45 GMT+02:00 Juan Martorell : > > *1st and foremost: disambiguator:* > > My current strategy for disambiguation is starting by the longer > constructions and then downsizing to the two tokens constructions. Positive > and negative examples should be included. > I can point out some

Re: Preventing inflections in suggestions

2016-03-12 Thread Jaume Ortolà i Font
2016-03-12 10:22 GMT+01:00 Marcin Miłkowski : > I remove archaic forms for English and Polish words altogether. You're > right, removing individual forms from the synthesizer is the easiest way > (not to mention it will be computationally cheap). > > I believe I also did this for some tags (not su

Re: updating to Morfologik 2.1.0

2016-03-08 Thread Jaume Ortolà i Font
2016-03-08 18:02 GMT+01:00 Marcin Miłkowski : > I think it's almost completely irrelevant. And for some languages, the > differences are much bigger (e.g., for Polish), so fsa5 is definitely > not the best format. So please go ahead with CFSA2. > Ok. In any case, there is no need to rebuild the d

updating to Morfologik 2.1.0

2016-03-08 Thread Jaume Ortolà i Font
Hi, I have made the changes required in LT for updating to Morfologik 2.1.0. You can see them in the branch "updatemorfologik" (a code clean-up is pending). Someone should test these changes before I push them. The inputs for the dictionary builders are the same as before. As for the outputs, th

MS Word add-in translations

2016-02-07 Thread Jaume Ortolà i Font
Hi, If you want to translate the LanguageTool MS Word add-in into your language, you can do it now at transifex.com. See the file WinFormStrings.resx. Most of the strings are already translated using existing translations. Regards, Jaume Ortolà

Re: MS Word add-in for LT

2016-02-03 Thread Jaume Ortolà i Font
2016-02-03 16:10 GMT+01:00 Andriy Rysin : > Hi Jaume > > it seems that Ukrainian (uk-UA) is not in the list, can you please > take a look at that? > Sorry. I truncated the list in one place. It will be fixed in the next release. Jaume

Re: MS Word add-in for LT

2016-02-03 Thread Jaume Ortolà i Font
2016-02-03 10:30 GMT+01:00 Mike Unwalla : > Hello Jaume, > > The add-in works now. (It is not necessary for the Windows Firewall to > have an entry for Microsoft Word. Without an entry, I can still access the > server on languagetool.org.) > > Refer to the attachments for suggestions and comments.

Re: MS Word add-in for LT

2016-02-03 Thread Jaume Ortolà i Font
2016-02-02 22:24 GMT+01:00 Daniel Naber : > On 2016-02-02 21:54, Jaume Ortolà i Font wrote: > > > I am not able to test every language, especially non-Latin ones > > (Japanese, etc.). > > You could use the same document we also use for (manually) testing > LibreOffic

Re: MS Word add-in for LT

2016-02-02 Thread Jaume Ortolà i Font
Thanks, Daniel. That was the bug. I have fixed it and published a new release. I have also completed the list of languages supported by LT [1]. It seems that Asturian, Breton and Tagalog cannot be defined in a MS Word document. In order to use LanguageTool with these languages, the user has to de

Re: MS Word add-in for LT

2016-02-01 Thread Jaume Ortolà i Font
2016-02-01 19:42 GMT+01:00 Mike Unwalla : > Hello, > > I am struggling to use the Word add-in with Word 2010. I tried to install > on 2 different computers (Windows 7, Windows 8) and get the same problems > on both computers. > > I set LT to run as server on port 8081. I tested that LT for Chrome

Re: MS Word add-in for LT

2016-01-30 Thread Jaume Ortolà i Font
2016-01-30 9:55 GMT+01:00 Marcin Miłkowski : > > Why not simply port some of the code that we have for listing all > categories of rules -- or even write up a small piece of Java code to > create a resource file that would be used to create a localized dialog > for a given language? This seems the

Re: MS Word add-in for LT

2016-01-29 Thread Jaume Ortolà i Font
2016-01-29 14:07 GMT+01:00 Marcin Miłkowski : > On 29.01.2016 at 12:27, Jaume Ortolà i Font wrote: > Just tested and it works in MS Word 2007. > > There are some settings that seem to be relevant only for Catalan, > though, in the Settings dialog box (general, valencia etc.). T

Re: MS Word add-in for LT

2016-01-29 Thread Jaume Ortolà i Font
ce to use LT in their work. > > Regards, > Andriy > > 2016-01-26 17:42 GMT-05:00 Marcin Miłkowski : > > Hi Jaume, > > > > this is very good news! > > > > On 26.01.2016 at 10:47, Jaume Ortolà i Font wrote: > >> Hi, > >> > >> I

MS Word add-in for LT

2016-01-26 Thread Jaume Ortolà i Font
Hi, I have made a beta release of a MS Word add-in for LanguageTool [1]. ("Add-in" is Microsoft terminology for "plug-in"). It has some limitations, but I think it can work fine and be useful. The checking is made only in a dialog box, with the usual options in these dialogues. Unfortunately the

Re: introduce new color for style errors

2016-01-05 Thread Jaume Ortolà i Font
Hi, In some installations of LanguageTool I use the "type" attribute in the elements "category", "rulegroup", "rule" to assign the colors. [1] For example: * Red: type="misspelling" * Blue, by default including: type="grammar" type="typographical" ... * Green: type="style
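
A minimal sketch of the colour assignment described in this message, assuming that the only types with a non-default colour are the ones visible in the truncated snippet ("misspelling" and "style"). The function name and the dictionary are hypothetical and are not part of LanguageTool.

```python
# Hypothetical mapping from the "type" attribute to an underline colour.
# Only the types visible in the (truncated) message are listed here.
COLOR_BY_TYPE = {
    "misspelling": "red",
    "style": "green",
}

# Every other type ("grammar", "typographical", ...) falls back to blue.
DEFAULT_COLOR = "blue"


def color_for(issue_type):
    """Return the colour for a rule's type, using blue as the default."""
    return COLOR_BY_TYPE.get(issue_type, DEFAULT_COLOR)


print(color_for("misspelling"))
print(color_for("typographical"))
```

Keeping blue as a fallback rather than listing every type means that a new or unknown type still gets a colour.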

Re: strange results in languagetool.org

2015-12-25 Thread Jaume Ortolà i Font
2015-12-25 17:58 GMT+01:00 Daniel Naber : > On 2015-12-25 15:29, Jaume Ortolà i Font wrote: > > > The problem is probably caused by an old file that was removed from > > the project, but remains in the server installation used in > > languagetool.org [1]. > > Indeed

Re: strange results in languagetool.org

2015-12-25 Thread Jaume Ortolà i Font
2015-12-25 12:00 GMT+01:00 Daniel Naber : > On 2015-12-25 11:20, Jaume Ortolà i Font wrote: > > > Thanks, it works for me now. And what about the other problem? Words > > like "elapé" or "macroprocés" are marked as errors only in the web > > editor,

Re: strange results in languagetool.org

2015-12-25 Thread Jaume Ortolà i Font
2015-12-25 10:00 GMT+01:00 Daniel Naber : > On 2015-12-24 23:18, Jaume Ortolà i Font wrote: > > > The server running for languagetool.org [1] seems to be updated daily. > > It finds an error in a sentence with a rule I wrote yesterday (in > > Catalan): > > > >

strange results in languagetool.org

2015-12-24 Thread Jaume Ortolà i Font
Hi, I've found very strange results in the web interface of languagetool.org. The server running for languagetool.org seems to be updated daily. It finds an error in a sentence with a rule I wrote yesterday (in Catalan): - Per què sigui així. But the "rule implementation" is not available when

limit the number of suggestions in LibreOffice

2015-12-21 Thread Jaume Ortolà i Font
Hi, I would like to limit the maximum number of suggestions that are shown in LibreOffice. In Catalan the Morfologik speller is used for spelling suggestions, and this number is sometimes excessive. I'm thinking of 15 suggestions. Is it okay to do it here for everybody? [1] Regards, Jaume Ortol

Re: LanguageTool in 2015 + the future

2015-12-14 Thread Jaume Ortolà i Font
2015-12-07 19:30 GMT+01:00 Marcin Miłkowski : > I think there's a community that we haven't addressed at all: language > professionals, be it proofreaders or translators (and translation > agencies). Translators are using suboptimal tools, such as Apsic XBench, > for their proofreading tasks. If w

Re: False error given on the online Esperanto checker, can't reproduce it in command line

2015-10-29 Thread Jaume Ortolà i Font
Hi Dominique, When there is no space at the end of the sentence, the last token has the POS tag "PARA_END", and this tag makes the rule match: You can see the difference (with space vs without space) here: http://community.languagetool.org/analysis/analyzeText Regards, Jaume Ortolà 2015-10-29 2

Re: LanguageTool for Chrome

2015-10-27 Thread Jaume Ortolà i Font
It works for me! I just found that entities like &nbsp; or &amp; (not visible) are detected as spelling errors. Regards, Jaume Ortolà 2015-10-27 18:48 GMT+01:00 Daniel Naber : > On 2015-10-27 18:29, Xavi Ivars wrote: > > > I just did some small tests in Gmail, and it still broke the layout, > > remov

Re: LanguageTool for Chrome

2015-10-23 Thread Jaume Ortolà i Font
2015-10-23 16:40 GMT+02:00 Daniel Naber : > On 2015-10-22 11:21, Jaume Ortolà i Font wrote: > > I have tested it and I found some strange behavior when trying to > > replace errors with suggestions. > > thanks for your feedback, a new version (0.8.1) has just been released, &

Re: LanguageTool for Chrome

2015-10-22 Thread Jaume Ortolà i Font
Hi Daniel, Good job! I have tested it and I found some strange behavior when trying to replace errors with suggestions. - When the error is the first word of the text, the replacement is not done. Nothing happens. - In a text area, when the text has more than one newline, and I try to make a rep

Re: new syntax available

2015-10-06 Thread Jaume Ortolà i Font
Thanks, Daniel. It is very useful. Do you suggest converting all simple rules to this new syntax? Do you expect some improvement in performance? Regards, Jaume 2015-10-05 14:59 GMT+02:00 Daniel Naber : > Hi, > > there's now a first and limited implementation of the syntax in > master. Instead

Re: Idea to introduce tag in LT grammar rules.

2015-09-05 Thread Jaume Ortolà i Font
2015-09-05 16:11 GMT+02:00 Daniel Naber : > On 2015-09-04 23:21, Dominique Pellé wrote: > > > I wish I could write a rule pattern like this: > > > > plein temps#chaque fois#rude épreuve#vol > > d’oiseau > > What about a more radical approach (which would be trickier to > implement): > > a >

Re: improvements in Morfologik speller

2015-06-11 Thread Jaume Ortolà i Font
2015-06-10 9:38 GMT+02:00 Daniel Naber : > On 2015-06-08 21:27, Jaume Ortolà i Font wrote: > > > You are right. These results are not expected. I will look at them > > again. > > I have another problem now with the Morfologik snapshot and release: > "is" is

Re: improvements in Morfologik speller

2015-06-08 Thread Jaume Ortolà i Font
2015-06-08 9:39 GMT+02:00 Daniel Naber : > On 2015-06-02 15:06, Jaume Ortolà i Font wrote: > > Hi Jaume, > > sorry for the late reply. > > > There are some failures with the current German LanguageTool tests. > > Could you take a look, Daniel? You need to use replac

improvements in Morfologik speller

2015-06-02 Thread Jaume Ortolà i Font
Hi, I'm testing some minor improvements in the Morfologik speller. They are here: https://github.com/jaumeortola/morfologik-stemming The most important are: - Try all possible replacements at the same point of a word (not only the longest one). [1] - Apply the properties "ignore-diacritics" and

non-breaking space / spacebefore="no"

2015-05-18 Thread Jaume Ortolà i Font
Hi, A non-breaking space in a pattern rule is considered as spacebefore="no". Is there a reason for this behavior? Regards, Jaume Ortolà

command-line XML output

2015-05-09 Thread Jaume Ortolà i Font
Hi, I need to use the command-line XML output (with the --api option). The list of unknown words is needed but it is missing. Can we add this list to the XML? It would be something like this: Regards, Jaume Ortolà

Re: Multiple zero of min occurrences

2015-04-29 Thread Jaume Ortolà i Font
2015-04-29 19:38 GMT+02:00 Andriy Rysin : > I just found out that if I have multiple tokens with min="0" my > patterns don't match. Looking at the code it seems like if min="0" we > only check for next pattern to match but that next may also have 0 > mins. > I wrote little patch with tests that ma

Re: regexp case sensitivity

2015-04-26 Thread Jaume Ortolà i Font
2015-04-26 23:40 GMT+02:00 Andriy Rysin : > Looks like in regexp is case sensitive by default, but in > it's not. Is this only for me? If not was this by design? > It is the other way around: regexp is case insensitive and regexp is case sensitive. I'd prefer both to be case insensitive by d

Re: Concordance error pt_PT

2015-04-21 Thread Jaume Ortolà i Font
> > Kind regards, > >Marco A.G.Pinto >-- > > On 21/04/2015 13:13, Jaume Ortolà i Font wrote: > > Hi, > > You have a problem in the example correction. The rule should look like > this: > > > > > >

Re: Concordance error pt_PT

2015-04-21 Thread Jaume Ortolà i Font
Hi, You have a problem in the example correction. The rule should look like this: Erro de concordância do plural. As vaca está no pasto. A vaca está no pasto. If there is no suggestion, the correction field has to be empty. Regards, Jaume Ortolà 2015-04-21 13:57 GMT+02:00 Marc

Re: Concordance error - pt_PT

2015-04-15 Thread Jaume Ortolà i Font
This dictionary is the FreeLing Portuguese dictionary. It was added by me. Alberto suggested using another one provided by him. Regards, Jaume 2015-04-15 14:47 GMT+02:00 Marco A.G.Pinto : > Daniel, > > It returns the following (I mixed masculine with feminine words): > > > Kind regards, >

Re: Concordance error - pt_PT

2015-04-14 Thread Jaume Ortolà i Font
Hi Marco, You need a tagger dictionary if you want to find concordance errors. We talked some time ago about adding a tagger dictionary for Portuguese. Is there any news on this? Regards, Jaume 2015-04-14 15:20 GMT+02:00 Marco A.G.Pinto : > Hello! > > Could someone explain how to add concordan

Re: MultiThreadedJLanguageTool

2015-02-22 Thread Jaume Ortolà i Font
2015-02-22 15:04 GMT+01:00 Andriy Rysin : > No, the only thing I pushed that will lead to regressions was removing > more than one consecutive overlapping match in SameRuleGroupFilter > (and also making sure we remove consecutive overlaps produced by > multiple threads). The regressions above seem

proofreading long documents

2015-01-29 Thread Jaume Ortolà i Font
Hi, I use LanguageTool on the command line for proofreading long documents (whole books) and I'd like to make this process easily available to more people (without additional scripts). It could become a web service, but some people don't want to send copyrighted material to a public webpage, and so

Tests fail: concurrency problem?

2014-12-23 Thread Jaume Ortolà i Font
Hi, I get test errors in HTTPServerLoadTest with the current master branch (no other changes), the same or similar errors in different machines. Has anyone else seen this error? Regards, Jaume Ortolà Tests in error: HTTPServerLoadTest.testHTTPServer:61 » Execution java.lang.AssertionError:

Re: added.txt activated for most languages

2014-12-22 Thread Jaume Ortolà i Font
Hi Daniel, I use the manual-tagger not only as a way to add new words and tags, but also as a means of fixing tags temporarily until the next dictionary update. So if there is a manual tag, the dictionary tag is ignored. I think that makes sense. Could we do it likewise in the CombiningTagger? We
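
The override behaviour requested in this message can be sketched as follows. This is an illustration only and not the CombiningTagger code: the function name, the `overwrite` flag and the sample tags are hypothetical, and the non-overriding branch (dictionary tags followed by any extra manual tags) is an assumption.

```python
def combine_tags(word, manual, dictionary, overwrite=True):
    """Combine POS tags from a manual list and a binary dictionary.

    With overwrite=True, a manual entry replaces the dictionary tags,
    which allows a wrong dictionary tag to be fixed temporarily.
    With overwrite=False, the tags of both sources are merged.
    """
    manual_tags = manual.get(word, [])
    dict_tags = dictionary.get(word, [])
    if overwrite and manual_tags:
        return list(manual_tags)
    # Merge, keeping dictionary order and skipping duplicates.
    return list(dict_tags) + [t for t in manual_tags if t not in dict_tags]


# Hypothetical data: the dictionary tags "suas" as singular by mistake.
dictionary = {"suas": ["DP3FS"]}
manual = {"suas": ["DP3FP"]}

print(combine_tags("suas", manual, dictionary))
print(combine_tags("suas", manual, dictionary, overwrite=False))
```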

Re: Changes in UpperCaseSentenceStart

2014-12-21 Thread Jaume Ortolà i Font
> > Sat, 20 Dec 2014 11:36:13 +0100 from Jaume Ortolà i Font : > > Hi, > > I have modified the rule UpperCaseSentenceStart so that there is a match > in sentences starting with quotes like « or “ and a lower case word. > > In the nightly tests there are some new matches

Changes in UpperCaseSentenceStart

2014-12-20 Thread Jaume Ortolà i Font
Hi, I have modified the rule UpperCaseSentenceStart so that there is a match in sentences starting with quotes like « or “ and a lower case word. In the nightly tests there are some new matches for different languages. Tell me if there is any problem. In French there are new matches caused by wr

Re: bug: morfologik rule with word "ls"

2014-11-29 Thread Jaume Ortolà i Font
ATTERN = Pattern.compile(".*" + "" + "\\d+" + ".*"); to make it more robust. I have found some segments of words converted unexpectedly in accepted words. Regards, Jaume Ortolà 2014-11-29 11:05 GMT+01:00 Daniel Naber : > On 2014-11-28 23:46, Jaume Or

bug: morfologik rule with word "ls"

2014-11-28 Thread Jaume Ortolà i Font
I have found a strange bug. Take the non-existent word "ls" (LS). This happens in Catalan: 1) The POS tag is null. The word is not in the dictionary and it is not tagged by the tagger. OK 2) In MorfologikCatalanSpellerRuleTest there is a rule match. OK 3) There is no rule match in the LT web edito

Re: Question about Spanish language

2014-11-03 Thread Jaume Ortolà i Font
Hi, I'm not sure I understand the question. In Spanish LL and RR are usually double letters or digraphs except in a few cases. RR are two independent letters when they come from adding a prefix to a word: inter+relacionar = interrelacionar; hiper+realismo = hiperrealismo, etc. But the spelling of

Re: Applying matched token's POS tag to another matched token

2014-10-31 Thread Jaume Ortolà i Font
Currently it's not possible. I have needed it too sometimes. Regards, Jaume Ortolà 2014-10-30 17:37 GMT+01:00 Linas Valiukas : > Hi there, > > LanguageTool seems to provide an ability to apply POS tag of a match to a > word, like this (taken from "Development Overview" page): > > kierować > > How

Re: Case sensitivity in MultiWordChunker

2014-10-26 Thread Jaume Ortolà i Font
2014-10-26 14:03 GMT+01:00 R.J. Baars : > What does Multiwordchunker do? See a previous thread in this list: "spell checker enhancement" (Sept 16). Jaume

Re: Case sensitivity in MultiWordChunker

2014-10-26 Thread Jaume Ortolà i Font
2014-10-24 20:27 GMT+02:00 Andriy Rysin : > Was it by design that MultiWordChunker is case sensitive and we need > to duplicate most of the lines for lower and upper cases? > Would it make sense to add a flag setCaseSensitive() to make it automatic? > It makes sense. I agree and I need it too. L

Wikicheck not working for some articles

2014-10-13 Thread Jaume Ortolà i Font
Hi, Wikicheck is not working now for articles with titles that include some diacritic. See, for example, [1]. It used to work well. Regards, Jaume Ortolà [1] http://tools.wmflabs.org/languagetool/pageCheck/index?lang=ca&url=Llista_dels_rius_m%C3%A9s_llargs

Re: IndexOutOfBoundsException with min=0 attribute in pattern rule

2014-10-13 Thread Jaume Ortolà i Font
Hi, I think a token min="0" at the end or at the start of a pattern is useless. The pattern is equivalent with or without this token. The error probably comes from a bug. Nobody tried token min="0" at the end of a pattern precisely because it is useless. Regards, Jaume Ortolà 2014-10-13 17:07

Re: looking for more semantic rules

2014-10-07 Thread Jaume Ortolà i Font
Hi, I have an idea for extending the rule that checks dates and days of the week. If the year (or even the month) is not mentioned in the sentence, perhaps we could assume the current year (or month). This could be useful in informal writing (e.g. in mail messages), where this kind of error is mo
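
The idea in this message can be sketched with the standard `datetime` module. This is only an illustration of "assume the current year when none is given", not the date rule that LanguageTool implements; the function name and the `today` parameter (used to make the example reproducible) are hypothetical.

```python
import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_matches(weekday_name, day, month, year=None, today=None):
    """Check whether a written weekday agrees with a written date.

    If the sentence mentions no year, fall back to the current year,
    as proposed in the message above.
    """
    today = today or datetime.date.today()
    if year is None:
        year = today.year
    actual = datetime.date(year, month, day).weekday()  # Monday == 0
    return WEEKDAYS[actual] == weekday_name


# "Tuesday, 7 October" written in 2014: no year given, so 2014 is assumed.
reference = datetime.date(2014, 10, 7)
print(weekday_matches("Tuesday", 7, 10, today=reference))
print(weekday_matches("Monday", 7, 10, today=reference))
```

A real rule would probably also need to decide what to do around the turn of the year, when a date without a year may refer to the following January.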

Re: Morfologik speller

2014-10-03 Thread Jaume Ortolà i Font
2014-10-03 14:50 GMT+02:00 Marcin Miłkowski : > On 2014-10-03 at 13:22, R.J. Baars wrote: > > Marcin, > > > > would it be possible to use the morfologik speller as a separate program, > > to throw a list of words at, and get the alternatives? > > No. It does not tokenize words, and you need a l

Re: Large amount of rules ...

2014-09-27 Thread Jaume Ortolà i Font
2014-09-27 11:06 GMT+02:00 R.J. Baars : > > It is all about suggesting a Dutch word for a loanword. > Then why don't you use a simple replace rule (in Java)? You can use the existing one (or adapt it) and put the list of words in a text file. Jaume

Re: spell checker enhancement

2014-09-16 Thread Jaume Ortolà i Font
> > On 16-09-14 at 13:23 Jaume Ortolà i Font wrote: > > 2014-09-16 13:03 GMT+02:00 R.Baars : > >> I see. This is probably of no use for spellchecking, but it is for >> postagging. >> >> > It gives no suggestions, but it can be used for avoiding fa

Re: spell checker enhancement

2014-09-16 Thread Jaume Ortolà i Font
ed (i.e., tag the inside tokens too). Regards, Jaume > (Might come in handy for just this tagging..) > > Ruud > > On 16-09-14 at 12:56 Jaume Ortolà i Font wrote: > > Hi, Ruud. > > I don't find any documentation. It is used in Polish, French, Catalan, > Russi

Re: spell checker enhancement

2014-09-16 Thread Jaume Ortolà i Font
12:33 GMT+02:00 R.Baars : > Jaume, thanks, but I am not sure. > > Depends on its implementation I think. > > Where can I find more info? > > Ruud > > On 16-09-14 at 12:26 Jaume Ortolà i Font wrote: > > 2014-09-16 11:21 GMT+02:00 R.J. Baars : > >> We don't

Re: spell checker enhancement

2014-09-16 Thread Jaume Ortolà i Font
2014-09-16 11:21 GMT+02:00 R.J. Baars : > We don't agree. There is a spellchecker, but also a single word ignore > list for it. > There are XML rules, but also a Simplereplace rule, a compounding rule. > > So apart from the hammer and the screwdriver, there are more tools. > > There is indeed anot

Re: Multiple suggestions by SimpleReplaceRule

2014-09-13 Thread Jaume Ortolà i Font
2014-09-13 10:24 GMT+02:00 R.J. Baars : > I was wondering if the simplereplacerule supports multiple suggestions. > > I wanted to suggest 'd.m.v.' and 'door middel van' for 'dmv'. > > dmv=d.m.v > dmv=door middel van > > You can write: dmv=d.m.v.|door middel van The rule also supports multiple wr
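
A minimal sketch of how a line in the format shown above (`dmv=d.m.v.|door middel van`) can be split into the wrong form and its suggestions. It only illustrates the syntax given in this message; it is not the parser used by SimpleReplaceRule, and the function name is hypothetical.

```python
def parse_replacement_line(line):
    """Split 'wrong=suggestion1|suggestion2' into (wrong, [suggestions])."""
    wrong, _, suggestions = line.partition("=")
    return wrong.strip(), [s.strip() for s in suggestions.split("|") if s.strip()]


wrong, suggestions = parse_replacement_line("dmv=d.m.v.|door middel van")
print(wrong)
print(suggestions)
```

With this format, a single line carries all the suggestions for a word, instead of one `dmv=...` line per suggestion as in the question.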

Re: spell checking Cincinatti -> Cincinnati

2014-09-12 Thread Jaume Ortolà i Font
Hi Daniel, Another option. Use distance=1 and add replacement pairs (n nn, nn n, t tt, tt t). I made optimizations in the morfologik speller specifically for this use. What happens with performance? Regards, Jaume Ortolà 2014-09-12 17:53 GMT+02:00 Daniel Naber : > Hi, > > we currently don't
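
To see why replacement pairs help here: "Cincinatti" and "Cincinnati" differ in two positions, so a plain distance of 1 does not connect them, but applying one replacement pair first (for example tt → t) leaves a single edit. The sketch below illustrates that idea in a deliberately simplified form, assuming one replacement followed by at most one edit; it is not the Morfologik speller algorithm, and all function names are hypothetical.

```python
# The replacement pairs from the message: (from, to).
PAIRS = [("n", "nn"), ("nn", "n"), ("t", "tt"), ("tt", "t")]


def within_one_edit(a, b):
    """True if a and b are equal or differ by one insert/delete/substitute."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution
    return a[i:] == b[i + 1:]          # one insertion into a


def variants(word, pairs):
    """The word itself plus every form with one replacement pair applied."""
    result = {word}
    for old, new in pairs:
        start = word.find(old)
        while start != -1:
            result.add(word[:start] + new + word[start + len(old):])
            start = word.find(old, start + 1)
    return result


def suggest(word, dictionary, pairs=PAIRS):
    """Dictionary words within one edit of the word or of a variant."""
    candidates = variants(word, pairs)
    return sorted(entry for entry in dictionary
                  if any(within_one_edit(c, entry) for c in candidates))


print(within_one_edit("Cincinatti", "Cincinnati"))
print(suggest("Cincinatti", {"Cincinnati", "Chicago"}))
```

This brute-force version compares every variant with every dictionary entry, so it says nothing about the performance question raised in the message.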

Re: Suggestion: find POS tag of portion of a word in XML rules

2014-09-10 Thread Jaume Ortolà i Font
Hi Dominique, I think the best thing to do is to change the tokenization appropriately, and segment the pronouns into different tokens. That's what is done in Catalan. Of course, the tokenizer gets a little more complex. But, after that, you can do many more things in the rules. The alternatives s

Re: Bug is disambiguator?

2014-09-03 Thread Jaume Ortolà i Font
2014-09-03 12:12 GMT+02:00 Marcin Miłkowski : > You will see that the Catalan pattern rule breaks then. Please fix it, > and I'll see if that's everything we need. Thanks. Fixed a couple of disambiguation rules. Try now. (If you want to run all the tests right now, you need to comment out the r

Re: Bug is disambiguator?

2014-09-03 Thread Jaume Ortolà i Font
2014-09-03 9:59 GMT+02:00 Marcin Miłkowski : > > We could, in principle, try to add this kind of test to the > disambiguator action but I'm not sure if it won't break something. > Hi, I agree with Dominique. This behavior can generate errors which are very hard to detect. If we change it, it's

Re: Bug is disambiguator?

2014-09-03 Thread Jaume Ortolà i Font
Dominique, As far as I remember (it is documented somewhere), that is what happens when you try to filter a non-existent tag. You try to filter "N.*" but there is no N.* tag in the token. In your sentence "eil" is not tagged with N. You need something like this: u[ln]|a[nlr]

Re: locqualityissuetype

2014-08-27 Thread Jaume Ortolà i Font
2014-08-27 19:26 GMT+02:00 R.J. Baars : > I see. But don't understand. What I do understand is it meant to specify > something, out of an issue list. > > Is there an issue list somewhere (these documents are so complicated...) > See the list of values here: http://www.w3.org/TR/its20/#lqissue-ty

Re: The SENT_END challenge

2014-08-09 Thread Jaume Ortolà i Font
Hi, A possible and simple solution is to write two rules. One for sentences with ending punctuation: (you|thei|ou)r [.?!] And another one for sentences without ending punctuation: (you|thei|ou)r They are i

Re: enabling and disabling rules in LT command-line

2014-07-20 Thread Jaume Ortolà i Font
2014-07-20 18:07 GMT+02:00 Daniel Naber : > On 2014-07-20 11:22, Jaume Ortolà i Font wrote: > > > enabled = "list of rules..." > > disabled = "list of rules..." > > enabledOnly = yes [by default, no] > > > > Could we implement the same

enabling and disabling rules in LT command-line

2014-07-20 Thread Jaume Ortolà i Font
Hi, I need to enable and disable rules at the same time in command-line. This is already done in the server mode with three parameters[1]: enabled = "list of rules..." disabled = "list of rules..." enabledOnly = yes [by default, no] Could we implement the same approach in the command-line? Wil
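
One possible reading of the three parameters, sketched as a filter over rule IDs. The semantics below are an assumption (with enabledOnly, only the listed rules run; otherwise the enabled rules are added to the default-on set and the disabled ones removed), and the function name, arguments and rule IDs are hypothetical rather than LanguageTool code.

```python
def active_rules(all_rules, default_on, enabled=(), disabled=(), enabled_only=False):
    """Return the rule IDs that would run, in their original order.

    enabled_only=True : run exactly the rules listed in `enabled`.
    enabled_only=False: start from the default-on rules, add `enabled`,
                        then remove `disabled`.
    """
    if enabled_only:
        return [r for r in all_rules if r in enabled]
    active = (set(default_on) | set(enabled)) - set(disabled)
    return [r for r in all_rules if r in active]


# Hypothetical rule IDs; RULE_C is default="off".
rules = ["RULE_A", "RULE_B", "RULE_C"]
defaults = ["RULE_A", "RULE_B"]

print(active_rules(rules, defaults, enabled=["RULE_C"], disabled=["RULE_A"]))
print(active_rules(rules, defaults, enabled=["RULE_C"], enabled_only=True))
```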

sample Portuguese rules

2014-07-08 Thread Jaume Ortolà i Font
Here you can see the results of the sample rules I created in Portuguese: https://languagetool.org/regression-tests/20140708/result_pt_20140708.html "Suas" is wrongly tagged in the Freeling dictionary as singular. It should be plural. That explains most of the false alarms. But the rule needs so

Re: Tagger Dictionary and Minho University - pt_PT

2014-07-08 Thread Jaume Ortolà i Font
can generate tags similar to Freeling (as in Galician, Spanish or Catalan), some existing rules could be used as models, and those who are familiar with them (as myself) could contribute more readily. Regards, Jaume Ortolà On Tue, Jul 8, 2014 at 9:39 PM, Jaume Ortolà i Font > wrote: > >>

Re: Tagger Dictionary and Minho University - pt_PT

2014-07-08 Thread Jaume Ortolà i Font
; or can I still change in grammar.xml and commit? > > > Thanks! > > Kind regards, > >Marco A.G.Pinto >----------- > > > On 08/07/2014 20:43, Jaume Ortolà i Font wrote: > > Marco, > > I have committed a Portuguese tagger dictionary built

Re: Tagger Dictionary and Minho University - pt_PT

2014-07-08 Thread Jaume Ortolà i Font
Marco, I have committed a Portuguese tagger dictionary built from the Freeling dictionary. It's more than enough to start. Now you have more than a million tagged word forms. Once the code is updated, you will be able to write your own rules in the online rule editor: http://community.languagetool

Re: Morphologic Analyser to solve concordance issue for Portuguese

2014-07-08 Thread Jaume Ortolà i Font
n add the tagger dictionary in 15 > minutes if you want. Creating the dictionary from hunspell is a *BAD* > idea if you already have a tagged wordlist. > > Regards, > Marcin > > > :-P > > Kind regards, > >Marco A.G.Pinto >-

Re: Morphologic Analyser to solve concordance issue for Portuguese

2014-07-08 Thread Jaume Ortolà i Font
2014-07-08 17:34 GMT+02:00 Marco A.G.Pinto : > Hello! > > I have contacted my Minho University friends who make the pt_PT > dictionaries for Mozilla and OpenOffice/LibreOffice. > > They said they can create the postag dictionary and help. > > Hi Marco, What Marcin and I are trying to say is that there i

Re: Morphologic Analyser to solve concordance issue for Portuguese

2014-07-08 Thread Jaume Ortolà i Font
2014-07-08 9:37 GMT+02:00 Marcin Miłkowski : > > The Portuguese dictionary is already built. We simply haven't included > it yet because we usually start from a certain number of rules, and then > add the tagger. Using the tags in rules is a very good idea overall. > > I agree with Marcin. The mos

possible new English rule

2014-05-28 Thread Jaume Ortolà i Font
Could it be a useful rule? a|an|the Probably a bad construction: a/the + infinitive a compete catastrophe an argue in a complete catastrophe a show Regards, Jaume Ortolà

rules default="off" are enabled in Wikipedia check

2014-05-06 Thread Jaume Ortolà i Font
Hi, This happens now in the WikiCheck and in the nightly differences. For example, with this rule from Catalan grammar.xml: It was caused by some change today. Regards, Jaume
