I just confirmed with some further testing that "spacebefore" treats a
zero-width space (U+200B) as a space, so my rule's spacebefore="no"
condition is false whenever a zero-width space precedes the word.
Is a Java rule then the only way to proceed? Or is there some way to add
an exception to "spacebefore" so that it counts every space except a
zero-width space (U+200B)?
The reason is that in Khmer certain conjunctions should always have a
"real" space (U+0020) before them, not just a zero-width space, so I am
trying to create a rule to detect this.
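To make the check concrete, here is a minimal Python sketch of the logic I
am after, written outside LanguageTool entirely. It is not a grammar.xml
rule and uses no LanguageTool API. The function name missing_real_space is
hypothetical, and two behaviours are my own assumptions: a conjunction at
the very start of the text is not flagged, and a real space followed by a
zero-width space counts as a real space. It does plain substring matching,
with no Khmer word tokenization.

```python
import re

ZWSP = "\u200b"  # zero-width space, normally found between Khmer words
SPACE = "\u0020"  # the "real" space the conjunctions should have before them

# Conjunctions taken from the rule in this thread.
CONJUNCTIONS = ("ដើម្បី", "ពីព្រោះ", "ហើយនិង")

_PATTERN = re.compile("|".join(re.escape(c) for c in CONJUNCTIONS))


def missing_real_space(text):
    """Return the start offsets of conjunctions not preceded by U+0020.

    Zero-width spaces directly before the conjunction are skipped, so
    "word" + ZWSP + conjunction is reported, while
    "word" + SPACE + ZWSP + conjunction is not (assumption).
    """
    offsets = []
    for match in _PATTERN.finditer(text):
        start = match.start()
        # Walk back over any zero-width spaces before the conjunction.
        pos = start
        while pos > 0 and text[pos - 1] == ZWSP:
            pos -= 1
        if pos == 0:
            # Assumption: nothing to flag at the very start of the text.
            continue
        if text[pos - 1] != SPACE:
            offsets.append(start)
    return offsets
```

Each returned offset is a place where a suggestion would insert U+0020
before the conjunction.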
Thanks,
Nathan
On Wed, May 22, 2013 at 10:35 PM, Nathan Wells <[email protected]> wrote:
> Yes, I used the "spacebefore" detection (and I could have totally used it
> in the wrong way - so that could also be the problem!). But on my test it
> seemed that a zero-width space was included as a "space" so therefore
> "spacebefore" was true (Khmer words have a zero-width space between them -
> I am trying to detect a normal space before a word). Am I correct that
> "spacebefore" will think a zero-width space is a "space" the same as a
> normal space?
>
> Is there any way to detect specifically a normal space, and ignore a
> zero-width space?
>
> Thanks,
> Nathan
>
>
>
>
> On Wed, May 22, 2013 at 10:27 PM, Marcin Miłkowski <[email protected]> wrote:
>
>> On 2013-05-22 16:00, Nathan Wells wrote:
>> > Hello Again,
>> >
>> > I am writing a rule for Khmer that tries to detect a space (U+0020)
>> > before a certain token and, if it is not present (or if only a
>> > zero-width space, U+200B, exists), suggests adding a space before the word.
>> >
>> > But it looks like rules in grammar.xml might not be able to tell the
>> > difference between a zero-width space and a space... does that have to be
>> > done in a Java rule?
>> >
>>
>> No. See here:
>>
>> http://wiki.languagetool.org/tips-and-tricks#toc13
>>
>> Best regards,
>> Marcin
>>
>> > I don't really know Java, so I would rather keep things in the
>> > grammar.xml for Khmer.
>> >
>> > I tried this, but it didn't work:
>> >
>> > <rule id="CONJUNCTION_SPACE" name="Add space before certain conjunctions">
>> > <pattern>
>> > <marker>
>> > <token spacebefore="no" regexp="yes">(ដើម្បី|ពីព្រោះ|ហើយនិង)</token>
>> > </marker>
>> > </pattern>
>> > <message>Add a full space before this word.
>> > <suggestion><match no="1" regexp_match="(ដើម្បី|ពីព្រោះ|ហើយនិង)" regexp_replace=" $1"></match></suggestion>
>> > </message>
>> > <short>Add a full space before this word.</short>
>> > <example type="correct">
>> > គាត់បានទៅ<marker> ដើម្បី</marker>មើល។
>> > </example>
>> > <example type="incorrect" correction=" ដើម្បី">
>> > គាត់បានទៅ<marker>ដើម្បី</marker>មើល។
>> > </example>
>> > </rule>
>> >
>> > Any help would be much appreciated - thanks!
>> > Nathan
>> >
>> >
>> >
>> ------------------------------------------------------------------------------
>> > Try New Relic Now & We'll Send You this Cool Shirt
>> > New Relic is the only SaaS-based application performance monitoring
>> service
>> > that delivers powerful full stack analytics. Optimize and monitor your
>> > browser, app, & servers with just a few lines of code. Try New Relic
>> > and get this awesome Nerd Life shirt!
>> http://p.sf.net/sfu/newrelic_d2d_may
>> >
>> >
>> >
>> > _______________________________________________
>> > Languagetool-devel mailing list
>> > [email protected]
>> > https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>> >
>>
>>
>
>