On Wed, 20 Mar 2019 at 12:27, Iain Buclaw <ibuc...@gdcproject.org> wrote:
>
> On Wed, 20 Mar 2019 at 10:57, Robin Dapp <rd...@linux.ibm.com> wrote:
> >
> > Hi,
> >
> > the unicode tables in std.internal.unicode_tables are apparently auto
> > generated and loaded at (libphobos) compile time.  They are also in
> > little endian format.  Is the tool to generate them available somewhere?
> > I wanted to start converting them to little endian before loading but
> > this will prove difficult at compile time again :)
> >
>
> Hi,
>
> I will ask if the author still has the utility available.
>
> My guess would be that the data used for input would have been
> retrieved from here.
>
> http://www.unicode.org/Public/4.1.0/ucd/
>
> Will let you know as soon as I find out more.
>

It comes from this repo:

https://github.com/DmitryOlshansky/gsoc-bench-2012

I've tested building them using the following:

    mkdir build; cd build
    sh ../get-uni.sh
    gdc -m64 -frelease ../gen_uni.d randAA.d -o gen_uni_64 &
    gdc -m32 -frelease ../gen_uni.d randAA.d -o gen_uni_32 &
    wait
    mkdir 64
    ./gen_uni_64
    mv unicode_*.d 64
    mkdir 32
    ./gen_uni_32
    mv unicode_*.d 32
    for name in 64/*.d ; do
        name32=`echo $name | sed 's/64/32/'`
        sed -n '/^static if(size_t.sizeof == 4) {$/,$p' $name32 >> $name
    done
    mv 64/*.d .
    rm -rf 64/ 32/

Then a final clean-up using dfmt
(https://github.com/dlang-community/dfmt - build with: make gdc)

    dfmt --inplace --max_line_length=80 unicode_*.d

However... it looks like upstream phobos has done some extra tweaks
and formatting since the original check-in of the sources, so any
regeneration would have to re-add those ad-hoc changes.

Are the values inside the tables the problem?  Or just some of the
helper functions/templates that interact with them to generate the
static data?  If the latter, then a rebuild of the files may not be
necessary.

Regards
--
Iain
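
For anyone unsure what the sed loop in the steps above does, here is a toy sketch of the merge step on its own. The file name unicode_demo.d and its contents are made up for illustration and are not real generator output; the sketch assumes each 32-bit file contains a block opening with the line "static if(size_t.sizeof == 4) {" that runs to the end of the file.

```shell
#!/bin/sh
# Toy demonstration of the 64/32 merge step. All file names and contents
# below are hypothetical stand-ins, not output of gen_uni.
set -e
work=$(mktemp -d)
cd "$work"
mkdir 64 32

# Stand-in for a file produced by the 64-bit generator.
cat > 64/unicode_demo.d <<'EOF'
module demo;
static if(size_t.sizeof == 8) {
enum tableWords = 64;
}
EOF

# Stand-in for the same file produced by the 32-bit generator.
cat > 32/unicode_demo.d <<'EOF'
module demo;
static if(size_t.sizeof == 4) {
enum tableWords = 32;
}
EOF

for name in 64/*.d; do
    # Map 64/foo.d to 32/foo.d (replaces the first "64" in the path).
    name32=$(echo "$name" | sed 's/64/32/')
    # Print from the 4-byte "static if" line through end of file and
    # append that range to the 64-bit file.
    sed -n '/^static if(size_t.sizeof == 4) {$/,$p' "$name32" >> "$name"
done

cat 64/unicode_demo.d
```

The range address starts at the matching line and ends at the last line, so the module header of the 32-bit file is skipped and only its size-specific block is appended to the 64-bit file.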