On 12 March 2013 16:58, Tino Didriksen <tino.didrik...@gmail.com> wrote:
> On Tue, Mar 12, 2013 at 12:21 PM, Francis Tyers <fty...@prompsit.com> wrote:
>>
>> On Tue, 12 Mar 2013 at 10:55 +0000, Jimmy O'Regan wrote:
>> >
>> > Sorry, I wasn't clear enough. The idea is "segmentation". I said that
>> > segmentation by itself would probably make a good project, where "by
>> > itself" was intended to mean that the project would just be
>> > segmentation.
>> >
>> > In practice, you will also have to work on a language pair where this
>> > can be used. zh_ZH-zh_TW is a perfect candidate, because segmentation
>> > is not strictly necessary for this language pair - i.e., you use it to
>> > demonstrate that segmentation is working, without _needing_ to. In
>> > that regard, you will need to also allot some time to developing that
>> > language pair, though it will not be the primary focus of the project.
>>
>> So this would be for languages where word boundaries are not written ...
>> Chinese/Thai/Lao/Khmer/Burmese etc.?
>>
>> Yes, that could be interesting. But, if it was the case that the project
>> would be for just segmentation, then ideally it would be tested on more
>> than one language.
>
> Sounds trivially done by making a thin shell over ICU's BreakIterator:
> http://userguide.icu-project.org/boundaryanalysis

Not really. That's aimed more at segmentation for display purposes -
wrapping lines and the like - where things like ambiguity in the
segmentation are not a pressing concern. We can already get something
equivalent in lttoolbox, by setting the dictionary to postblank by
default.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you