Thanks so much!

I once visited the lttoolbox repository and read through lt-proc.cc,
lt-comp.cc, lt-expand.cc, etc. At the time I was not sure whether that was
the code I needed, so I only skimmed it, but I still remember where the files
are in the repository. Now I'll look more closely and try to find the specific
code that implements tokenization and see how it relates to ICU. I think this
will help improve my proposal.

Sincerely,

Weizhe

On Mon, Mar 16, 2020 at 11:44 PM Tino Didriksen <m...@tinodidriksen.com> wrote:

> It's somewhere in https://github.com/apertium/lttoolbox - I don't know
> the exact location.
>
> The entrypoint that does tokenization is lt-proc, so start from lt-proc.cc
> and trace execution to somewhere that does tokenization. That's also a good
> way to learn the codebase.
>
> -- Tino Didriksen
>
>
> On Mon, 16 Mar 2020 at 16:00, 杨伟哲 <gavinwzma...@gmail.com> wrote:
>
>> Hi Tino and Fammie,
>>
>> I made a mistake when sending my earlier email, so I am not sure whether you
>> received it. I'm sending it again now and hope it reaches you.
>>
>> These days I have been reading the Wikipedia description of tokenization and
>> have a general idea of how it works. I am also learning some ICU syntax every
>> day. In the meantime, I'm searching for information on how to handle
>> tokenized Unicode vocabularies.
>>
>> Recently I have been reading the "further reading" link [1] from my proposed
>> project [2]; it is HFST code and a bit hard to understand. But my task is
>> "Update lttoolbox to be fully Unicode compliant with regards to medication
>> to alphabetical symbols". May I know exactly how tokenization is implemented
>> in lttoolbox, and which specific code I would be updating?
>>
>> Regards,
>>
>> Weizhe
>>
>> [1] https://github.com/hfst/hfst/blob/master/tools/src/hfst-tokenize.cc
>>
>> [2] http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Robust_tokenisation
>>
>
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
