Pulkitg64 commented on PR #15549:
URL: https://github.com/apache/lucene/pull/15549#issuecomment-3881932541

   I have removed the Panama implementation for now; we can add it back later once Float16Vector operations are available in the JDK. Below are the benchmark numbers with the default (scalar) implementation:
   
   
   **Summary**: All runs used 100k docs with force-merge. For the no-quantization case we see a high latency regression of more than 100% (which is expected), but for the quantized cases latency is comparable to float32. On the indexing side we see a regression in indexing time across all runs, whether or not quantization is enabled. This is also expected: the quantized runs pay an extra fp16-to-fp32 conversion when quantizing the vectors.
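To make the extra cost concrete, here is a minimal sketch (not the PR's actual code) of the fp16-to-fp32 decode step that quantized runs pay per vector. It assumes the JDK 20+ `Float.float16ToFloat` / `Float.floatToFloat16` intrinsics, which is what the scalar path has to work with until Panama exposes `Float16Vector` operations:

```java
public class Fp16DecodeSketch {
    // Hypothetical helper: decode an fp16-encoded vector (stored as shorts)
    // into fp32 so the existing quantizer can consume it.
    static float[] toFloat32(short[] fp16Bits) {
        float[] out = new float[fp16Bits.length];
        for (int i = 0; i < fp16Bits.length; i++) {
            out[i] = Float.float16ToFloat(fp16Bits[i]); // JDK 20+ intrinsic
        }
        return out;
    }

    public static void main(String[] args) {
        // 0.5f is exactly representable in fp16, so the round-trip is lossless here.
        short half = Float.floatToFloat16(0.5f);
        float[] decoded = toFloat32(new short[] {half});
        System.out.println(decoded[0]);
    }
}
```

This per-component scalar loop is the conversion overhead showing up in the indexing numbers above; a Panama `Float16Vector` path would let it be vectorized.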
   
   | Encoding | recall | latency(ms) | netCPU | avgCpuCount | quantized | visited | index(s) | index_docs/s | force_merge(s) | index_size(MB) | vec_disk(MB) | vec_RAM(MB) |
   |----------|--------|-------------|--------|--------------|-----------|---------|----------|--------------|----------------|----------------|--------------|-------------|
   | float16  | 0.990  | 5.739       | 5.738  | 1            | no        | 6848    | 42.51    | 2352.66      | 0.00           | 207.68         | 390.625      | 390.625     |
   | float16  | 0.982  | 2.681       | 2.680  | 1            | 8 bits    | 6858    | 44.3     | 2257.34      | 20.51          | 306.88         | 294.495      | 99.182      |
   | float16  | 0.927  | 1.919       | 1.917  | 0.999        | 4 bits    | 6934    | 44.59    | 2242.55      | 0.01           | 258.09         | 245.667      | 50.354      |
   | float16  | 0.835  | 1.525       | 1.524  | 0.999        | 2 bits    | 7277    | 44.68    | 2237.99      | 0.01           | 234.01         | 221.062      | 25.749      |
   | float16  | 0.717  | 1.146       | 1.145  | 0.999        | 1 bit     | 8167    | 43.98    | 2273.92      | 0.01           | 222.96         | 208.855      | 13.542      |
   | float32  | 0.990  | 2.258       | 2.257  | 0.999        | no        | 6863    | 19.84    | 5039.31      | 20.62          | 403.02         | 390.625      | 390.625     |
   | float32  | 0.982  | 2.756       | 2.754  | 1            | 8 bits    | 6867    | 21.29    | 4697.48      | 27.93          | 502.19         | 489.807      | 99.182      |
   | float32  | 0.927  | 1.91        | 1.909  | 0.999        | 4 bits    | 6962    | 20.03    | 4992.01      | 22.23          | 453.4          | 440.979      | 50.354      |
   | float32  | 0.835  | 1.462       | 1.461  | 0.999        | 2 bits    | 7302    | 20.45    | 4890.93      | 20.92          | 429.31         | 416.374      | 25.749      |
   | float32  | 0.717  | 1.174       | 1.173  | 0.999        | 1 bit     | 8205    | 20.16    | 4959.33      | 17.57          | 418.29         | 404.167      | 13.542      |
   
   
   ### Next Steps: 
   
   If we are okay with the above performance numbers, should we go ahead with this PR, which adds float16 vectorEncoding support without a Panama implementation, or should we park this PR and wait for the JDK 27 release?
   CC: @rmuir @benwtrent 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
