Just committed the UAX#14 changes. The commit looks much bigger than it really is because of the restructuring of the codegen directories.
One point to note: there is one file without a license header, src/codegen/unicode/data/LineBreakPairTable.txt. This file is special in that it is an input to the code generation. It is produced and maintained manually: go to http://www.unicode.org/reports/tr14/ and copy and paste Table 2 (section 7.3) from your browser. Yes, that's a bit of an unusual procedure, but at least it works :-).

Note also that the code generation is still outside the normal build and would not usually be required (see target codegen-unicode in build.xml).

Manuel