I am wondering whether it would be a good idea for there to be a list of 
numbered preset sentences adopted as an international standard. If Google 
then chose to front-end Google Translate with precise translations of that 
list, made by professional linguists who are native speakers, the result 
would be a system that produces a precise translation for any sentence on 
the list and a machine translation for everything else.
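
For concreteness, a minimal sketch of the lookup-then-fallback scheme described 
above, in Python; PRESET_TRANSLATIONS, translate_mt() and the sample entries are 
hypothetical placeholders, not any real list or API:

    PRESET_TRANSLATIONS = {
        # (sentence number, target language) -> translation vetted by a
        # native-speaker linguist
        (1, "fr"): "Où se trouve la gare la plus proche ?",
        (2, "fr"): "J'ai besoin d'un médecin.",
    }

    def translate_mt(text, target_lang):
        """Placeholder for the machine translation backend."""
        raise NotImplementedError

    def translate(text, target_lang, sentence_number=None):
        """Use the vetted preset translation when the sentence is on the
        list; fall back to machine translation otherwise."""
        key = (sentence_number, target_lang)
        if sentence_number is not None and key in PRESET_TRANSLATIONS:
            return PRESET_TRANSLATIONS[key]
        return translate_mt(text, target_lang)
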
Phrase-based machine translation already goes much further: it pairs up far 
more sentences than would fit into any standard with a limited code inventory 
such as Unicode, and it pairs up phrases as well. The fact that translations 
are not precise is a problem of context and of natural language per se.

Maybe there could then just be two special Unicode characters, one to indicate 
that the number of a preset sentence is to follow and one to indicate that the 
number has finished.
That would belong in a higher-level protocol, not in Unicode.
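
To illustrate: a higher-level protocol can mark sentence numbers with an 
ordinary character sequence and an application-level convention, with no new 
code points at all. A minimal sketch with entirely made-up delimiters:

    import re

    # Made-up application-level delimiters; ordinary characters, no new
    # Unicode code points required.
    REF_START = "{{STD-SENT:"
    REF_END = "}}"

    def encode_ref(number):
        """Embed a preset-sentence number in running text."""
        return REF_START + str(number) + REF_END

    def decode_refs(text):
        """Extract all preset-sentence numbers from running text."""
        pattern = re.escape(REF_START) + r"(\d+)" + re.escape(REF_END)
        return [int(n) for n in re.findall(pattern, text)]

    # decode_refs("Please help. " + encode_ref(42))  ->  [42]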

If that were the case, there might well not be symbols for the sentences, 
yet messages could still be conveyed precisely, as envisaged in the 
simulations.
The sentences will be as precise as the scope of the sentence inventory 
allows. Enumerating sentences or phrasal fragments (I'm hesitant to talk of 
"phrases", which for me have constituent nature, but maybe that's just me) is 
unrealistic unless you are trying to cover only a /very/ limited domain. If 
all you encode is, say, requests for meals with the 100 most frequently 
wanted combinations of nutritional restrictions, your sentence inventory will 
encode those requests precisely; but as soon as you try to adjust your 
formulaic requests (you're willing to eat /any/ vegetarian, gluten-free meal 
at any time of day and on any day of the year? of /any/ size?), the sentences 
are of no use anymore.

This is really why an approach that enumerates large text chunks is 
unworkable. (I won't say "useless", but it is of limited use; "point-at-me" 
picture books and imprecise translations are likely to do a tolerable job 
already.) The number of sentences you need grows exponentially in the number 
of ingredient options you intend to vary over.

In any case, we are all left guessing about the intended coverage of whatever 
set of sentences you have in mind. From your previous writings I'm guessing 
(as implied earlier) that you mean something like "travel and emergency 
communication", but that is already a large domain. If you try to delimit the 
coverage and come up with a finite list of sentences, you will see that you 
end up with far too many. You would also need to think about how to make 
these sentences accessible (via number/ID? that would be difficult, or would 
require training for the user, unless the number of sentences is very small).

What if you only want the inventory of a travel phrasebook? For that, you 
have the travel phrasebook (hierarchically organized, not by number), and I 
have heard of limited-domain computers/apps for crisis situations (the 
details elude me at the moment).
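
To put a number on the combinatorial point: with just ten independent yes/no 
options, a precise inventory would already need 2^10 = 1024 preset sentences. 
A toy sketch (the option names are made up):

    from itertools import product

    # Hypothetical yes/no meal options; each independent choice doubles the
    # number of distinct requests that would have to be enumerated.
    options = ["vegetarian", "gluten-free", "halal", "kosher", "nut-free",
               "dairy-free", "low-sodium", "child-sized", "hot", "takeaway"]

    combinations = list(product((False, True), repeat=len(options)))
    print(len(options), "binary options ->", len(combinations), "requests")
    # 10 binary options -> 1024 requests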

Perhaps that is the way forward for some aspects of communication across the 
language barrier.
You would need to specify precisely which problems you are attempting to 
solve, what is wrong with the approaches presently available, and why/how 
your approach does a better job.

Stephan
