On 2013-05-23 13:28, Ruud Baars wrote:
> Might a change to the spacebefore-detection be an option?
> Like specifying space type instead of just No or Yes?
As this is just for one language, making massive code changes in the core
for something that would otherwise need only a very simple Java rule seems
to make no sense.

Marcin

> Ruud
>
> On 23-05-13 12:58, Marcin Miłkowski wrote:
>> On 2013-05-23 11:32, Nathan Wells wrote:
>>> So I just confirmed with some further testing that spacebefore considers
>>> a zero-width space as "spacebefore" so my rule will be false.
>>>
>>> Is the only way to proceed then a Java rule? Or is there some way I can
>>> add an exception to "spacebefore" to be all spaces except a zero-width
>>> space (U+200B)?
>> No. Unfortunately, no.
>>
>>> The reason is in Khmer there are certain conjunctions that should always
>>> have a "Real" space before them, not just a zero-width space, so I am
>>> trying to create a rule to detect this.
>> I'm afraid that in this particular case, a Java rule would be needed.
>>
>> Best,
>> Marcin
>>
>>> Thanks,
>>> Nathan
>>>
>>> On Wed, May 22, 2013 at 10:35 PM, Nathan Wells <[email protected] <mailto:[email protected]>> wrote:
>>>
>>> Yes, I used the "spacebefore" detection (and I could have totally
>>> used it in the wrong way - so that could also be the problem!). But
>>> in my test it seemed that a zero-width space was included as a
>>> "space", so "spacebefore" was true (Khmer words have a
>>> zero-width space between them - I am trying to detect a normal space
>>> before a word). Am I correct that "spacebefore" will think a
>>> zero-width space is a "space" the same as a normal space?
>>>
>>> Is there any way to detect specifically a normal space, and ignore a
>>> zero-width space?
>>>
>>> Thanks,
>>> Nathan
>>>
>>> On Wed, May 22, 2013 at 10:27 PM, Marcin Miłkowski <[email protected] <mailto:[email protected]>> wrote:
>>>
>>> On 2013-05-22 16:00, Nathan Wells wrote:
>>> > Hello Again,
>>> >
>>> > I am writing a rule trying to detect a space (U+0020) before a certain
>>> > token for Khmer.
>>> > And if it is not present (or if only a zero-width space, U+200B,
>>> > exists), to add a space before the word.
>>> >
>>> > But it looks like rules in the grammar.xml might not be able to discern the
>>> > difference between a zero-width space and a space... does that have to be
>>> > done in a Java rule?
>>> >
>>>
>>> No. See here:
>>>
>>> http://wiki.languagetool.org/tips-and-tricks#toc13
>>>
>>> Best regards,
>>> Marcin
>>>
>>> > I don't really know Java so I would rather keep things in the
>>> > grammar.xml for Khmer.
>>> >
>>> > I tried this, but it didn't work:
>>> >
>>> > <rule id="CONJUNCTION_SPACE" name="Add space before certain conjunctions">
>>> >   <pattern>
>>> >     <marker>
>>> >       <token spacebefore="no" regexp="yes">(ដើម្បី|ពីព្រោះ|ហើយនិង)</token>
>>> >     </marker>
>>> >   </pattern>
>>> >   <message>Add a full space before this word.
>>> >     <suggestion><match no="1" regexp_match="(ដើម្បី|ពីព្រោះ|ហើយនិង)" regexp_replace=" $1"></match></suggestion>
>>> >   </message>
>>> >   <short>Add a full space before this word.</short>
>>> >   <example type="correct">
>>> >     គាត់បានទៅ<marker> ដើម្បី</marker>មើល។
>>> >   </example>
>>> >   <example type="incorrect" correction=" ដើម្បី">
>>> >     គាត់បានទៅ<marker>ដើម្បី</marker>មើល។
>>> >   </example>
>>> > </rule>
>>> >
>>> > Any help would be much appreciated - thanks!
>>> > Nathan
>>> >
>>> ------------------------------------------------------------------------------
>>> > Try New Relic Now & We'll Send You this Cool Shirt
>>> > New Relic is the only SaaS-based application performance monitoring service
>>> > that delivers powerful full stack analytics. Optimize and monitor your
>>> > browser, app, & servers with just a few lines of code. Try New Relic
>>> > and get this awesome Nerd Life shirt!
>>> > http://p.sf.net/sfu/newrelic_d2d_may
>>> >
>>> > _______________________________________________
>>> > Languagetool-devel mailing list
>>> > [email protected]
>>> > https://lists.sourceforge.net/lists/listinfo/languagetool-devel
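
For anyone picking this thread up later: the check that the Java rule would have to perform can be sketched in plain Java. This is only an illustration of the character-level logic discussed above (a real U+0020 must directly precede the conjunction, and a zero-width space U+200B does not count). It deliberately does not use the LanguageTool rule API. The class name RealSpaceCheck and both method names are hypothetical, the conjunction list is copied from the quoted grammar.xml attempt, and skipping a conjunction at the very start of the text is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, NOT part of LanguageTool: it works on raw strings only.
class RealSpaceCheck {

    // Conjunctions taken from the rule quoted in this thread.
    static final String[] CONJUNCTIONS = {"ដើម្បី", "ពីព្រោះ", "ហើយនិង"};

    /**
     * Returns the start offsets (sorted ascending) of every conjunction that is
     * not directly preceded by a normal space (U+0020). A preceding zero-width
     * space (U+200B) is treated like any other non-space character.
     * Assumption: a conjunction at offset 0 is not reported.
     */
    static List<Integer> findMissingSpaces(String text) {
        List<Integer> offsets = new ArrayList<>();
        for (String word : CONJUNCTIONS) {
            int from = 0;
            int idx;
            while ((idx = text.indexOf(word, from)) >= 0) {
                if (idx > 0 && text.charAt(idx - 1) != ' ') {
                    offsets.add(idx);
                }
                from = idx + word.length();
            }
        }
        Collections.sort(offsets);
        return offsets;
    }

    /**
     * Inserts a normal space before every reported conjunction, mirroring the
     * " $1" suggestion from the quoted rule. An existing zero-width space is kept.
     */
    static String insertSpaces(String text) {
        StringBuilder sb = new StringBuilder(text);
        List<Integer> offsets = findMissingSpaces(text);
        // Walk backwards so that earlier offsets stay valid after each insert.
        for (int i = offsets.size() - 1; i >= 0; i--) {
            int offset = offsets.get(i);
            sb.insert(offset, ' ');
        }
        return sb.toString();
    }
}
```

Calling RealSpaceCheck.findMissingSpaces on a sentence gives the positions where a rule would raise a match, and RealSpaceCheck.insertSpaces gives the corrected text. Wiring this into an actual LanguageTool Java rule would still require working on the analyzed tokens of a sentence, which is left out here.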
