Pulkitg64 commented on PR #15549: URL: https://github.com/apache/lucene/pull/15549#issuecomment-3881932541
I have removed the Panama implementation for now; we can add it back later once we have access to `Float16Vector` operations in the JDK. Below are the benchmark numbers with the default implementation.

**Summary**: All runs used 100k docs with force-merge. For the no-quantization case we see a high latency regression, more than 100% (which is expected), but for the quantized cases latency is comparable. On the indexing side we see a regression in indexing time across all runs, whether or not quantization is enabled. This is also expected, because for quantized runs we have to do an extra fp16-to-fp32 conversion when quantizing vectors.

| Encoding | recall | latency (ms) | netCPU | avgCpuCount | quantized | visited | index (s) | index_docs/s | force_merge (s) | index_size (MB) | vec_disk (MB) | vec_RAM (MB) |
|----------|--------|--------------|--------|-------------|-----------|---------|-----------|--------------|-----------------|-----------------|---------------|--------------|
| float16 | 0.990 | 5.739 | 5.738 | 1 | no | 6848 | 42.51 | 2352.66 | 0.00 | 207.68 | 390.625 | 390.625 |
| float16 | 0.982 | 2.681 | 2.680 | 1 | 8 bits | 6858 | 44.3 | 2257.34 | 20.51 | 306.88 | 294.495 | 99.182 |
| float16 | 0.927 | 1.919 | 1.917 | 0.999 | 4 bits | 6934 | 44.59 | 2242.55 | 0.01 | 258.09 | 245.667 | 50.354 |
| float16 | 0.835 | 1.525 | 1.524 | 0.999 | 2 bits | 7277 | 44.68 | 2237.99 | 0.01 | 234.01 | 221.062 | 25.749 |
| float16 | 0.717 | 1.146 | 1.145 | 0.999 | 1 bit | 8167 | 43.98 | 2273.92 | 0.01 | 222.96 | 208.855 | 13.542 |
| float32 | 0.990 | 2.258 | 2.257 | 0.999 | no | 6863 | 19.84 | 5039.31 | 20.62 | 403.02 | 390.625 | 390.625 |
| float32 | 0.982 | 2.756 | 2.754 | 1 | 8 bits | 6867 | 21.29 | 4697.48 | 27.93 | 502.19 | 489.807 | 99.182 |
| float32 | 0.927 | 1.91 | 1.909 | 0.999 | 4 bits | 6962 | 20.03 | 4992.01 | 22.23 | 453.4 | 440.979 | 50.354 |
| float32 | 0.835 | 1.462 | 1.461 | 0.999 | 2 bits | 7302 | 20.45 | 4890.93 | 20.92 | 429.31 | 416.374 | 25.749 |
| float32 | 0.717 | 1.174 | 1.173 | 0.999 | 1 bit | 8205 | 20.16 | 4959.33 | 17.57 | 418.29 | 404.167 | 13.542 |

### Next Steps:
If we are okay with the above performance numbers, should we go ahead with this PR, which adds float16 `VectorEncoding` support without a Panama implementation, or should we park it and wait for the JDK 27 release?

CC: @rmuir @benwtrent
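For context, the extra fp16-to-fp32 decode step mentioned in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: it assumes vectors stored as raw fp16 `short` bits and uses the JDK 20+ `Float.float16ToFloat` / `Float.floatToFloat16` methods; the `Fp16Decode` class and `toFloat32` helper are hypothetical names.

```java
// Hypothetical sketch of decoding fp16-encoded vectors to fp32 before
// scalar quantization. Uses java.lang.Float's fp16 conversions (JDK 20+).
public class Fp16Decode {

  /** Decode a vector stored as raw fp16 bits (one short per dimension) into fp32. */
  static float[] toFloat32(short[] fp16Bits) {
    float[] out = new float[fp16Bits.length];
    for (int i = 0; i < fp16Bits.length; i++) {
      out[i] = Float.float16ToFloat(fp16Bits[i]);
    }
    return out;
  }

  public static void main(String[] args) {
    // 1.0 and -0.5 are exactly representable in fp16, so they round-trip.
    short[] encoded = {Float.floatToFloat16(1.0f), Float.floatToFloat16(-0.5f)};
    float[] decoded = toFloat32(encoded);
    System.out.println(decoded[0] + " " + decoded[1]); // prints "1.0 -0.5"
  }
}
```

Each of these per-dimension conversions is scalar in the default implementation, which is one reason the quantized fp16 runs pay an indexing-time cost; a future Panama `Float16Vector` could batch them.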
