Re: trying OpenRegex

2013-08-28 Thread Jaume Ortolà i Font
2013/8/28 Daniel Naber
> On 2013-08-27 19:56, Jaume Ortolà i Font wrote:
>
> > I have implemented this solution for the . It seems to work.
>
> Thanks!
>
> > Git question. Is it OK to publish my modifications with: "git push
> > origin my_local_branch"?
>
> Did that work or did you push the chang

Re: trying OpenRegex

2013-08-28 Thread Daniel Naber
On 2013-08-27 19:56, Jaume Ortolà i Font wrote:
> I have implemented this solution for the . It seems to work.

Thanks!

> Git question. Is it OK to publish my modifications with: "git push
> origin my_local_branch"?

Did that work or did you push the changes with a different command?

Regards

Re: trying OpenRegex

2013-08-27 Thread Jaume Ortolà i Font
2013/8/20 Daniel Naber
> On 2013-08-14 18:59, Marcin Miłkowski wrote:
>
> > For "or", I can see two solutions:
> >
> > (a) run-time conversion of such rules to a list of normal rules (when
> > reading the file, in the similar way as phrases are used) -- this is the
> > easiest way and it seem

Re: trying OpenRegex

2013-08-24 Thread Daniel Naber
On 2013-08-14 10:59, Daniel Naber wrote:
>> The only drawback I see is that matching will probably become slower.
>
> There's one more thing it seems: backreferences are not possible with
> OpenRegex. That is, rules like these won't be possible:

Yet another issue: OpenRegex only allows greedy ma

Re: trying OpenRegex

2013-08-21 Thread Daniel Naber
On 2013-08-14 18:59, Marcin Miłkowski wrote:

Hi Marcin,

> This, I think, requires introducing a couple of booleans. The
> pseudo-code would be something like this:

is there any chance you could give this a try? I thought I got it, but today I noticed that it's still buggy. Maybe my approach was

Re: trying OpenRegex

2013-08-20 Thread Daniel Naber
On 2013-08-14 18:59, Marcin Miłkowski wrote:
> For "or", I can see two solutions:
>
> (a) run-time conversion of such rules to a list of normal rules (when
> reading the file, in the similar way as phrases are used) -- this is the
> easiest way and it seems straightforward

I agree. Does someb
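
Option (a) quoted above, converting an "or" rule into a list of normal rules when the file is read, can be sketched roughly as follows. This is only an illustration under assumptions: it treats "or" as a set of alternatives at one token position, represents a rule as a plain list of strings, and the class name `OrRuleExpander` is hypothetical. It is not LanguageTool's actual rule loader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not LanguageTool code: a "rule" is a list of token
// positions, and each position holds one or more alternatives.
class OrRuleExpander {

    // Expands a rule with alternatives into plain rules that have exactly
    // one choice per position (the cartesian product of the alternatives).
    static List<List<String>> expand(List<List<String>> ruleWithAlternatives) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with a single empty rule
        for (List<String> alternatives : ruleWithAlternatives) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String alternative : alternatives) {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(alternative);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rule = Arrays.asList(
            Arrays.asList("a", "an"),
            Arrays.asList("big"),
            Arrays.asList("house", "home"));
        // Prints one expanded plain rule per line.
        for (List<String> plain : expand(rule)) {
            System.out.println(String.join(" ", plain));
        }
    }
}
```

The number of generated rules is the product of the alternative counts, so a rule with many "or" positions grows quickly. That matters here because the thread ties matching speed to the number of rules.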

Re: trying OpenRegex

2013-08-14 Thread Dominique Pellé
On Wed, Aug 14, 2013 at 6:59 PM, Marcin Miłkowski wrote:
>
> On 2013-08-14 17:45, Daniel Naber wrote:
> > On 2013-08-14 17:14, Marcin Miłkowski wrote:
> >
> >> Switching to OpenRegex seems like a nice idea but I noticed a really
> >> huge slowdown when you change from a language with 100 rules

Re: trying OpenRegex

2013-08-14 Thread Marcin Miłkowski
On 2013-08-14 17:45, Daniel Naber wrote:
> On 2013-08-14 17:14, Marcin Miłkowski wrote:
>
>> Switching to OpenRegex seems like a nice idea but I noticed a really
>> huge slowdown when you change from a language with 100 rules to French
>> or Polish - this is all because of the number of rules.

Re: trying OpenRegex

2013-08-14 Thread Daniel Naber
On 2013-08-13 11:17, Daniel Naber wrote:
> The only drawback I see is that matching will probably become slower.

A use case where performance is important is checking the Wikipedia dumps, as we do on our server. Actually we're checking only a tiny part of them, because it takes so long. Anyway

Re: trying OpenRegex

2013-08-14 Thread Daniel Naber
On 2013-08-14 17:14, Marcin Miłkowski wrote:
> Switching to OpenRegex seems like a nice idea but I noticed a really
> huge slowdown when you change from a language with 100 rules to French
> or Polish - this is all because of the number of rules.

My really long-term idea about OpenRegex is that o

Re: trying OpenRegex

2013-08-14 Thread Marcin Miłkowski
On 2013-08-14 12:04, Dominique Pellé wrote:
> On Wed, Aug 14, 2013 at 10:59 AM, Daniel Naber wrote:
> > On 2013-08-13 11:17, Daniel Naber wrote:
> >
> >> The only drawback I see is that matching will probably become slower.
> >
> > There's one more thing

Re: trying OpenRegex

2013-08-14 Thread Dominique Pellé
On Wed, Aug 14, 2013 at 10:59 AM, Daniel Naber wrote:
> On 2013-08-13 11:17, Daniel Naber wrote:
>
>> The only drawback I see is that matching will probably become slower.
>
> There's one more thing it seems: backreferences are not possible with
> OpenRegex. That is, rules like these won't be poss

RE: trying OpenRegex

2013-08-14 Thread Mike Unwalla
.de]
Sent: 13 August 2013 10:17
To: LanguageTool Developer List
Subject: trying OpenRegex

Hi,

OpenRegex[1] is a nice Java package that supports regular expressions on any kind of list - for example on AnalyzedTokenReadings, the objects used internally by LT. Using OpenRegex instead of our own match

Re: trying OpenRegex

2013-08-14 Thread Daniel Naber
On 2013-08-13 11:17, Daniel Naber wrote:
> The only drawback I see is that matching will probably become slower.

There's one more thing it seems: backreferences are not possible with OpenRegex. That is, rules like these won't be possible:

Of course you can still access the match in the , but

Re: trying OpenRegex

2013-08-13 Thread Dominique Pellé
Daniel Naber wrote:
> On 2013-08-13 21:26, Daniel Naber wrote:
>
>> Matching English noun phrases with LT currently seems impossible or
>> awkward (http://wiki.languagetool.org/tips-and-tricks#toc5).
>
> Looking at the example on that page:
>
> [pos="jj"]+
> is equivalent to
> postag="jj"/>

Re: trying OpenRegex

2013-08-13 Thread Daniel Naber
On 2013-08-13 21:26, Daniel Naber wrote:
> Matching English noun phrases with LT currently seems impossible or
> awkward (http://wiki.languagetool.org/tips-and-tricks#toc5).

Looking at the example on that page:

[pos="jj"]+
is equivalent to

The problem with this seems to be that the LT equivale
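
To make the `[pos="jj"]+` pattern above concrete, here is a minimal sketch of greedy repetition over a list of tagged tokens, with `+` expressed as a minimum of 1 and no upper bound. Everything in it is an assumption for illustration: `PosRepeatMatcher`, `Token`, and `matchRepeated` are hypothetical names, the tags are written in upper case, and the code uses neither the OpenRegex API nor LanguageTool's matcher.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: Token and matchRepeated are illustrative names,
// not part of OpenRegex or LanguageTool.
class PosRepeatMatcher {

    static final class Token {
        final String word;
        final String pos;

        Token(String word, String pos) {
            this.word = word;
            this.pos = pos;
        }
    }

    // Greedily consumes between min and max consecutive tokens that carry the
    // given POS tag, starting at index start. Returns the index just after the
    // last consumed token, or -1 if fewer than min tokens match.
    static int matchRepeated(List<Token> tokens, int start, String pos, int min, int max) {
        int i = start;
        while (i < tokens.size() && i - start < max && tokens.get(i).pos.equals(pos)) {
            i++;
        }
        return (i - start >= min) ? i : -1;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            new Token("a", "DT"), new Token("big", "JJ"),
            new Token("red", "JJ"), new Token("house", "NN"));
        // [pos="JJ"]+ corresponds to min = 1 and no upper bound.
        int end = matchRepeated(tokens, 1, "JJ", 1, Integer.MAX_VALUE);
        System.out.println("matched tokens 1 to " + (end - 1));
    }
}
```

Because the loop always takes as many tokens as it can, this is greedy matching only. A pattern that needs the shortest possible run would require backtracking, which this sketch leaves out.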

Re: trying OpenRegex

2013-08-13 Thread Daniel Naber
On 2013-08-13 20:24, Dominique Pellé wrote:
> But if new features of OpenRegex are used (such as skipping tokens),
> then we need to extend xml tags/attributes of grammar.xml. I'm curious
> how it could look like.

For example like this:

: match 1 to 3 tokens (basically a skip - to me, it ma

Re: trying OpenRegex

2013-08-13 Thread Andriy Rysin
I am all for using a more powerful engine if it allows us to simplify rules. Most users probably won't be checking megabytes of text, so speed is not critical unless the slowdown is tremendous. I guess it may slow down some of our regression tests but I could definitely live with it if it allows opti

Re: trying OpenRegex

2013-08-13 Thread Dominique Pellé
Daniel Naber wrote:
> Hi,
>
> OpenRegex[1] is a nice Java package that supports regular expressions
> on any kind of list - for example on AnalyzedTokenReadings, the objects
> used internally by LT. Using OpenRegex instead of our own matching
> algorithm would have some benefits:
>
> * we could e

trying OpenRegex

2013-08-13 Thread Daniel Naber
Hi,

OpenRegex[1] is a nice Java package that supports regular expressions on any kind of list - for example on AnalyzedTokenReadings, the objects used internally by LT. Using OpenRegex instead of our own matching algorithm would have some benefits:

* we could easily support min and max attribu