Re: Language Pack size

2016-05-13 Thread kellen sunderland
That's a great idea, can we pre-sort the grammar as well?

On Fri, May 13, 2016 at 1:47 PM, Matt Post wrote:
> Quantization is also supported in the grammar packer.
>
> Another idea: since we know the model weights when we publish a language
> pack, we should pre-compute the dot

Re: Language Pack size

2016-05-13 Thread Matt Post
Quantization is also supported in the grammar packer.

Another idea: since we know the model weights when we publish a language pack, we should pre-compute the dot product of the weight vector against the grammar weights and reduce it to a single (quantized) score. (This would reduce the
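[Editor's note: to make the pre-computation idea concrete, here is a minimal sketch, assuming a named feature vector per grammar rule and a fixed weight vector known at pack time. The names ("lm", "phrase", "length"), the score range, and the helper functions are all hypothetical illustrations, not the grammar packer's actual API.]

```python
# Hypothetical sketch: collapse each rule's feature vector to one
# pre-computed score (dot product with the fixed model weights),
# then uniformly quantize that score to an 8-bit code.

# Assumed model weights, fixed when the language pack is published.
WEIGHTS = {"lm": 0.5, "phrase": 0.3, "length": -0.2}

def collapse(features, weights):
    """Dot product of the fixed weight vector with a rule's features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def quantize(score, lo=-25.0, hi=25.0, bits=8):
    """Map a score in [lo, hi] to one of 2**bits integer codes."""
    levels = (1 << bits) - 1
    clamped = min(max(score, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)

def dequantize(code, lo=-25.0, hi=25.0, bits=8):
    """Recover the approximate score from its integer code."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)
```

At decode time only the single dequantized score is needed per rule, so the per-rule storage drops from one float per feature to one small integer, at the cost of a bounded quantization error (half the bucket width).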

Re: Language Pack size

2016-05-13 Thread Matt Post
Oh, yes, of course. That's in build_binary.

> On May 13, 2016, at 4:39 PM, kellen sunderland wrote:
>
> Could we also use quantization with the language model to reduce the size?
> KenLM supports this right?
>
> On Fri, May 13, 2016 at 1:19 PM, Matt Post
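[Editor's note: the quantization build_binary performs replaces each float log-probability with an index into a small codebook. The sketch below is an illustrative stand-in for that idea, not KenLM's actual binning algorithm; the function names and the equal-frequency binning scheme are assumptions.]

```python
# Illustrative codebook quantization: store a small table of
# representative log-probabilities, then keep only a per-entry index.

def build_codebook(values, bits=2):
    """Split the sorted values into 2**bits equal-sized bins and use
    each bin's mean as its representative value."""
    k = 1 << bits
    s = sorted(values)
    n = len(s)
    book = []
    for i in range(k):
        bin_vals = s[i * n // k:(i + 1) * n // k]
        if bin_vals:
            book.append(sum(bin_vals) / len(bin_vals))
        else:
            book.append(book[-1] if book else 0.0)
    return book

def encode(value, codebook):
    """Index of the nearest codebook entry to the given value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))
```

With, say, 8-bit codes each stored probability shrinks from a 4-byte float to a single byte, which is where the size savings come from.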

Language Pack size

2016-05-13 Thread Tom Barber
Out of curiosity more than anything else, I tested XZ compression on a model instead of Gzip. It takes the Spain pack down from 1.9 GB to 1.5 GB. Not a dramatic reduction, but it does mean 400 MB+ less in remote storage and less data going over the wire. Worth considering, I guess.

Tom
--
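[Editor's note: the gzip-vs-xz tradeoff is easy to sanity-check with Python's standard library. This is a toy comparison on synthetic redundant text, not on the actual pack files; real grammar and LM files will compress differently, and xz is typically much slower to compress.]

```python
import gzip
import lzma

# Redundant text stands in for the repetitive structure of grammar files.
data = b"the grammar line repeats itself over and over " * 4000

gz = gzip.compress(data)   # DEFLATE, 32 KB window
xz = lzma.compress(data)   # LZMA2, much larger dictionary

print(f"raw: {len(data)} bytes, gzip: {len(gz)} bytes, xz: {len(xz)} bytes")
```

The larger dictionary is why xz tends to win on big, repetitive model files, at the cost of compression time and memory.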