Hi! You are spot on: with byte vectors you deal with data 4 times smaller than with floats (and, of course, able to represent only 1/4 of the information). If you are OK with that, you may achieve a lighter memory footprint (not 4 times lighter, as there are a lot of boilerplate structures as well, but still a decent improvement).
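To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. It is not Lucene code: the function names are made up for illustration, and the per-vector overhead (graph links and other bookkeeping) is an assumed figure you would need to measure on your own index.

```python
# Rough estimate only: compares float32 (4 bytes per dimension) with
# byte (1 byte per dimension) vectors. All names here are hypothetical.

def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_component: int) -> int:
    """Bytes needed for the vector values alone, ignoring any index overhead."""
    return num_vectors * dims * bytes_per_component


def footprint_ratio(num_vectors: int, dims: int, overhead_per_vector: int) -> float:
    """How many times smaller the byte-vector footprint is than the float one.

    overhead_per_vector is an ASSUMED fixed cost in bytes per vector
    (e.g. HNSW neighbour lists) that does not shrink with the encoding.
    """
    fixed = num_vectors * overhead_per_vector
    float_total = raw_vector_bytes(num_vectors, dims, 4) + fixed
    byte_total = raw_vector_bytes(num_vectors, dims, 1) + fixed
    return float_total / byte_total


if __name__ == "__main__":
    # 1M vectors of 768 dimensions, assuming 128 bytes of overhead per vector.
    print(footprint_ratio(1_000_000, 768, 128))
```

With zero overhead the ratio is exactly 4; any fixed per-vector cost pulls it below 4, which is why the saving is decent but not a full 4x.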
Cheers
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io

*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io <http://sease.io/>
LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter <https://twitter.com/seaseltd> | Youtube <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github <https://github.com/seaseltd>

On Tue, 11 Jul 2023 at 20:42, MyCoy Z <mycoy.zh...@gmail.com> wrote:

> Hi, Lucene Dev Community:
>
> I'm wondering what benefits we could get by indexing the byte-vectors to
> build an HNSW rather than using the floats.
>
> I can think of storage and performance improvements.
> However, due to some internal platform limitations, we cannot actually try
> to build such a graph on production data.
>
> So it would be great if anyone could provide some industrial experience,
> for example how much storage can be saved and how much performance can be
> improved?
>
> Thanks
>