My two cents: tesseract-ocr also uses ISO 639-3, so the change would be great for those who build solutions that combine tools, such as OpenNLP + tesseract.
-Oleg

On Tue, May 17, 2011 at 9:33 PM, Jason Baldridge <[email protected]> wrote:
> I think we should change to the three character convention for language
> specific materials, e.g. "eng" rather than "en" for English.
>
> http://en.wikipedia.org/wiki/List_of_ISO_639-2_codes
>
> Do others agree?
>
> --
> Jason Baldridge
> Assistant Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
> http://twitter.com/jasonbaldridge
>
