Thanks a lot, Peter.

Attached please find a CSV table encoded in UTF-8. Each row in the file
contains a single Chinese digit character and its Latin/mathematical value.
I failed to get the value in the second column with the following RUTA script:


WORDTABLE CnDigitTable = 'gZdd.csv';
DECLARE Annotation CnD(STRING DVal);
Document{-> MARKTABLE(CnD, 1, CnDigitTable, "DVal" = 2)};


The type CnD with a feature DVal has been defined in the type descriptor XML 
file.


I have upgraded the engine to the newest version, 2.7.0, but the problem is
still not solved. Any suggestions? Thanks.


Kind regards,


Baoli


On 7/5/2019 14:11, Peter Klügl <peter.klu...@averbis.com> wrote:
Hi,


most problems with the WordTable are caused by whitespace in the
dictionary. Can you test whether this is your issue by removing all
whitespace in the relevant column?
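
One way to run this test outside of Ruta is to clean the table with a small
script before loading it. The following is only a sketch: the function name
strip_cells is made up for illustration, and the ";" delimiter is an
assumption that should be changed to whatever separator the CSV file really
uses. It removes every whitespace character (including the full-width
ideographic space) from every cell.

```python
import csv
import io


def strip_cells(csv_text: str, delimiter: str = ";") -> str:
    """Return csv_text with all whitespace removed from every cell.

    The delimiter default is an assumption; pass the separator that the
    dictionary file actually uses.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        # str.split() without arguments splits on any Unicode whitespace,
        # so joining the pieces drops spaces, tabs and U+3000 alike.
        writer.writerow(["".join(cell.split()) for cell in row])
    return out.getvalue()
```

To apply it to a file, read the file with encoding="utf-8", pass the text
through strip_cells, and write the result back with the same encoding.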

If this is the source of the problem, there is a configuration parameter
for automatically avoiding it, but I have to check in which version it
was introduced. However, upgrading the Ruta version is recommended in
any case.


If this is not the source of your problem, do you have a minimal example
for reproducing it?


Best,


Peter



On 05.07.2019 at 03:51, B. Li wrote:
Hi All,


I am trying to use a WordTable to assign several different attribute values
(taken from different columns) to some SINGLE (Chinese) characters, but I
always fail to get the correct values from the columns in the WordTable file,
although the engine correctly recognizes and marks the SINGLE characters. I
am using RUTA 2.4.0. How can I solve this problem? Any hint would be greatly
appreciated!


Thanks a lot,


Baoli LI

--
Dr. Peter Klügl
R&D Text Mining/Machine Learning

Averbis GmbH
Salzstr. 15
79098 Freiburg
Germany

Fon: +49 761 708 394 0
Fax: +49 761 708 394 10
Email: peter.klu...@averbis.com
Web: https://averbis.com

Headquarters: Freiburg im Breisgau
Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
