On Mon, Dec 21, 2015 at 4:08 PM, Christian Schudt
<christian.sch...@gmx.de> wrote:
> If you mean having a huge code point table, like in your tables.go file: I 
> think Java already has such tables internally.
> What could be improved here, is that Character.getType(cp) could only be 
> invoked once. I haven’t done any benchmark for this, but I don’t expect a 
> significant performance benefit.

Out of curiosity, I answered my own question here. I'm using Go, which
also has lots of Unicode tables in the standard library, so I
benchmarked two approaches: running the algorithm, and looking up a
value in the large pre-generated trie. (I modified the algorithm
slightly from the version in my generator to remove the NFKC step,
which is very slow, so that it more closely resembles your algorithm.)
I have no idea where the bottlenecks or optimizations in Java would
be, so these results may be meaningless to you, but, at least in Go,
the single trie lookup was much faster:

$ go test -bench . -benchmem
PASS
BenchmarkAsciiLookup-4          300000000     3.85 ns/op    0 B/op    0 allocs/op
BenchmarkFullwidthLookup-4      200000000     9.21 ns/op    0 B/op    0 allocs/op
BenchmarkAsciiCalculate-4       100000000    17.4 ns/op     0 B/op    0 allocs/op
BenchmarkFullwidthCalculate-4    20000000    71.4 ns/op     0 B/op    0 allocs/op
ok      _/home/sam/Projects/golang-x-text/unicode/precis        7.632s

Each benchmark here looks up or calculates the derived properties for
a single character: the ASCII benchmarks use 'u' and the Fullwidth
benchmarks use 'u' [full width] (which was chosen very scientifically,
I assure you). The second column is the number of iterations that Go
ran to get a stable timing, and the third is the average time per
operation.
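
For anyone who wants to try the same kind of comparison, here is a
minimal sketch of the idea, not the actual benchmark code. Everything
in it is an assumption made for illustration: the `class` type and the
`calculate`, `lookup`, and `nsPerOp` names are hypothetical, the
classification rules are a simplified stand-in rather than the PRECIS
derived property algorithm, and a flat slice covering U+0000..U+FFFF
stands in for the pre-generated trie.

```go
package main

import (
	"fmt"
	"testing"
	"unicode"
)

// class is a hypothetical stand-in for a derived property value.
type class uint8

const (
	other class = iota
	letterOrDigit
	spacePunctOrSymbol
)

// calculate derives the class from the standard library's Unicode tables
// on every call. It is a simplified illustration, not the PRECIS algorithm.
func calculate(r rune) class {
	switch {
	case unicode.IsLetter(r) || unicode.IsDigit(r):
		return letterOrDigit
	case unicode.IsSpace(r) || unicode.IsPunct(r) || unicode.IsSymbol(r):
		return spacePunctOrSymbol
	default:
		return other
	}
}

// table holds precomputed results for U+0000..U+FFFF. A flat slice stands
// in for the pre-generated trie to keep the sketch short.
var table = func() []class {
	t := make([]class, 0x10000)
	for i := range t {
		t[i] = calculate(rune(i))
	}
	return t
}()

// lookup returns the precomputed class, falling back to calculate for
// code points outside the table.
func lookup(r rune) class {
	if r >= 0 && int(r) < len(table) {
		return table[r]
	}
	return calculate(r)
}

// sink keeps the compiler from optimizing the benchmarked calls away.
var sink class

// nsPerOp times f(r) with the testing package and reports ns per call.
func nsPerOp(f func(rune) class, r rune) float64 {
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = f(r)
		}
	})
	return float64(res.T.Nanoseconds()) / float64(res.N)
}

func main() {
	for _, c := range []struct {
		name string
		r    rune
	}{{"ASCII", 'u'}, {"Fullwidth", '\uFF55'}} {
		fmt.Printf("%-10s lookup %6.2f ns/op   calculate %6.2f ns/op\n",
			c.name, nsPerOp(lookup, c.r), nsPerOp(calculate, c.r))
	}
}
```

The absolute numbers from a sketch like this will differ from the ones
above, since both the table layout and the calculation are much simpler
than the real thing; it is only meant to show the shape of the
comparison.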

For the worst case (the full-width character), the lookup is roughly
7.8 times faster than the calculation (9.21 ns vs. 71.4 ns per
operation). Whether that difference is worth pre-generating the data
is another matter, of course ☺

Best,
Sam


-- 
Sam Whited
pub 4096R/54083AE104EA7AD3
https://blog.samwhited.com

_______________________________________________
precis mailing list
precis@ietf.org
https://www.ietf.org/mailman/listinfo/precis
