Francis Tyers <fty...@prompsit.com> writes:

> On Tue, 25 Mar 2014 at 12:17 +0000, Jim O'Regan wrote:
[...]
>> Also, I have a tiny feature that allows the user to specify a set of
>> characters to be ignored at runtime (motivated primarily by soft
>> hyphens, but I've left it general[1]). I sent the patch to Sergio to
>> review, but I'd really rather get it in now than wait n years until
>> the next release :)
>>
>> For the curious, I've attached the patch.
>>
>> Current behaviour is:
>> $ echo testing |lttoolbox/lt-proc
>> ~/Apertium/apertium-en-es/en-es.automorf.bin
>> ^test/test<n><sg>/test<vblex><inf>/test<vblex><pres>$^ing/*ing
>>
>> Using this as soft-hyphen.icx:
>>
>> <?xml version="1.0"?>
>> <ignored-chars>
>> <char value="­ "/>
>> </ignored-chars>
>>
>> echo testing |lttoolbox/lt-proc -i soft-hyphen.icx
>> ~/Apertium/apertium-en-es/en-es.automorf.bin
>> ^testing/test<vblex><ger>/test<vblex><pprs>/test<vblex><subs>/testing<n><sg>$
>
> Could this just be included as default ? I mean, are there any cases in
> which we would not want to skip a soft-hyphen ?

So having an icx on the command line is nice for developers, and people
who use lt-proc for non-Apertium things. But it would require changing
modes files for any pairs that want to take advantage of it …

I think maybe a hardcoded ignore-list in lttoolbox would be more helpful
to more users. Are there other use-cases than soft-hyphens? Or cases
where we want to _not_ ignore the soft-hyphen?

(Tino Didriksen noted some other possibly skippable stuff:
http://www.fileformat.info/info/unicode/category/Cf/list.htm )

--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
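To make the two options discussed above concrete, here is a minimal Python sketch of the idea. It is not the lttoolbox patch, which is not reproduced in this message. The function names `strip_ignored` and `strip_format_chars` are hypothetical: the first stands in for a user-supplied ignore list (what an .icx file would provide), the second for a hardcoded rule that skips the whole Unicode Cf (format) category mentioned above.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, general category Cf


def strip_ignored(text, ignored=frozenset({SOFT_HYPHEN})):
    """Drop every character listed in `ignored` (hypothetical helper).

    Stands in for a user-supplied ignore list; the default set holds
    only the soft hyphen.
    """
    return "".join(ch for ch in text if ch not in ignored)


def strip_format_chars(text):
    """Drop every character in Unicode general category Cf (hypothetical helper).

    Stands in for a hardcoded rule covering all format characters.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


if __name__ == "__main__":
    # Soft hyphen between "test" and "ing"; both approaches yield one token.
    print(strip_ignored("test\u00ading"))
    # Soft hyphen plus U+200D ZERO WIDTH JOINER, both in category Cf.
    print(strip_format_chars("test\u00ad\u200ding"))
```

One point to weigh for the hardcoded variant: category Cf also contains U+200C ZERO WIDTH NON-JOINER, which some orthographies use meaningfully, so skipping all of Cf may be broader than intended.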
_______________________________________________ Apertium-stuff mailing list Apertium-stuff@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/apertium-stuff