SenseClusters and Ngram delete special characters (ü, ö, ş) from the context.
For example, the word "müssen" appears as "m ssen" in SenseClusters, and in the n-gram output as well.
Is there a solution for this?
I know that SenseClusters has been used with Romanian.
My simple workaround is to replace each such character with an ASCII placeholder, e.g. "ü" with "xxu".
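
To make the workaround concrete, here is a rough Python sketch of the placeholder idea. Only the "ü" to "xxu" mapping comes from my description above; the other table entries and the function names encode and decode are just assumptions for illustration, and this is not part of SenseClusters or Ngram.

```python
# Hypothetical placeholder table: only "ü" -> "xxu" is from the message;
# the remaining entries are assumed by analogy.
PLACEHOLDERS = {
    "ü": "xxu",
    "ö": "xxo",
    "ş": "xxs",
}


def encode(text):
    """Replace each special character with its ASCII placeholder."""
    for char, token in PLACEHOLDERS.items():
        text = text.replace(char, token)
    return text


def decode(text):
    """Restore the special characters from their ASCII placeholders."""
    for char, token in PLACEHOLDERS.items():
        text = text.replace(token, char)
    return text


if __name__ == "__main__":
    original = "müssen"
    encoded = encode(original)   # ASCII-only text to feed to the tools
    print(encoded)
    print(decode(encoded))       # map the tool output back afterwards
```

This only works if the placeholders never occur naturally in the corpus, and uppercase letters (Ü, Ö, Ş) would need their own entries.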
--
Savas Yildirim
Eberhard Karls Universität Tübingen & Istanbul Bilgi University
Postal Address in Tuebingen:
Seminar für Sprachwissenschaft
Universität Tübingen
Wilhelmstraße 19
Room 1.07
D-72074 Tübingen
Postal Address in Istanbul:
Sisli 34440 Dolapdere Kurtulusdere cad. No:47
Istanbul / Turkey
Phone:
(0090) (212) 311 50 00
_______________________________________________
senseclusters-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/senseclusters-users