[GitHub] [lucene] mayya-sharipova commented on pull request #536: Don't store graph offsets for HNSW graph

2022-01-11 Thread GitBox
mayya-sharipova commented on pull request #536: URL: https://github.com/apache/lucene/pull/536#issuecomment-1010033192 @jtibshirani Thanks for the guide on the format change, I will study it and follow it. Indeed this PR was merged into the `hnsw` branch, so we will do the format change

[GitHub] [lucene] mayya-sharipova commented on pull request #536: Don't store graph offsets for HNSW graph

2022-01-05 Thread GitBox
mayya-sharipova commented on pull request #536: URL: https://github.com/apache/lucene/pull/536#issuecomment-1005844307 I've also run the comparison on a bigger dataset: deep-image-96-angular, with 10M docs. M: 16; efConstruction: 500. Disk size before the change: 4.2G; after change: 4

[GitHub] [lucene] mayya-sharipova commented on pull request #536: Don't store graph offsets for HNSW graph

2021-12-31 Thread GitBox
mayya-sharipova commented on pull request #536: URL: https://github.com/apache/lucene/pull/536#issuecomment-1003460979 > I think it would be prudent to check the size increase/decrease from this change for some dataset/parameter choices I've checked the index sizes and the size actu

[GitHub] [lucene] mayya-sharipova commented on pull request #536: Don't store graph offsets for HNSW graph

2021-12-31 Thread GitBox
mayya-sharipova commented on pull request #536: URL: https://github.com/apache/lucene/pull/536#issuecomment-1003458911 @msokolov > I seem to remember that when I checked (you can use the -fanout parameter to KnnGraphTester, IIRC) most nodes were not fully populated; i.e., they had fewer th

[GitHub] [lucene] mayya-sharipova commented on pull request #536: Don't store graph offsets for HNSW graph

2021-12-13 Thread GitBox
mayya-sharipova commented on pull request #536: URL: https://github.com/apache/lucene/pull/536#issuecomment-993080804 @msokolov Thanks for the initial review, it is good to know that we are ok with this idea. I will do the comparison of size and also the `maxConn` numbers.