
Say I want to detect incorrect use of the word "a" (= has, verb)
in place of "à" (= at, preposition) in many French expressions
such as:

   a nouveau -> à nouveau
   a plein temps -> à plein temps
   a rude épreuve -> à rude épreuve
   a vol d'oiseau -> à vol d'oiseau

I wish I could write a rule pattern like this:

      <tokens>plein temps#chaque fois#rude épreuve#vol d’oiseau</tokens>

Notice the <tokens> tag, with an 's' instead of <token>.
The # character and space characters inside <tokens>...#...#...</tokens>
would be interpreted automatically, so that the above rule
is equivalent to a much more verbose set of rules, one per expression.


In other words:
* each # character inside <tokens>...#...#...</tokens> creates
  a new <rule>.
* And the spaces inside <tokens>...</tokens> cause automatic
  tokenization, so that something like <tokens>rude épreuve</tokens>
  is automatically interpreted as <token>rude</token><token>épreuve</token>.
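
To make the two points above concrete, here is a rough sketch in Python of
how such an expansion might work. It is only an illustration of the idea,
not how LanguageTool loads rules: the function name expand_tokens is made up,
the leading <token>a</token> in the usage example is my assumption about the
rest of the pattern, and the generated <rule> elements are bare skeletons
without the id, name or message that a real rule needs.

```python
import re

# Matches the hypothetical <tokens>...</tokens> shorthand (not a real LanguageTool tag).
TOKENS_RE = re.compile(r"<tokens>(.*?)</tokens>", re.DOTALL)


def expand_tokens(pattern: str) -> list[str]:
    """Expand one <tokens> shorthand into a list of skeleton <rule> strings."""
    match = TOKENS_RE.search(pattern)
    if match is None:
        # Nothing to expand: the pattern becomes a single rule as-is.
        return [f"<rule><pattern>{pattern}</pattern></rule>"]

    # Whatever surrounds <tokens>...</tokens> is repeated in every rule.
    prefix = pattern[:match.start()]
    suffix = pattern[match.end():]

    rules = []
    for alternative in match.group(1).split("#"):  # each '#' starts a new rule
        words = alternative.split()                # spaces separate tokens
        if not words:
            continue                               # ignore empty alternatives
        tokens = "".join(f"<token>{word}</token>" for word in words)
        rules.append(f"<rule><pattern>{prefix}{tokens}{suffix}</pattern></rule>")
    return rules


if __name__ == "__main__":
    shorthand = "<token>a</token><tokens>plein temps#rude épreuve</tokens>"
    for rule in expand_tokens(shorthand):
        print(rule)
```

Splitting on whitespace is the simplest reading of "automatic tokenization".
A real implementation would presumably have to reuse the language's own
tokenizer, since something like "vol d'oiseau" may not split on spaces alone.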

I'm curious whether rule maintainers would find it useful.

Languagetool-devel mailing list
