Hi,
These are impressive savings!
Out of curiosity, we don't build the index incrementally using Maven's
IndexReader [1], do we? Is that why we download the whole index?
Thanks,
Antonio
[1]
https://maven.apache.org/maven-indexer/indexer-reader/apidocs/org/apache/maven/index/reader/IndexReader.html
On 17/3/23 11:06, Michael Bien wrote:
Hello everyone,
I experimented a bit with the maven index extraction process and got
some pretty good results (I think).
There might be a way to filter the index during extraction without
noteworthy overhead, which allows the following:
- "sliding window" time filters, e.g. drop all documents older than 2
years (aka: who uses old libraries?)
- we can drop fields we don't need from the index. This is especially
interesting for fields which don't compress well (looking at you, sha1 hash)
Some results for the time cutoff filter (resulting index size):
full: 5.6 GB
2y:   2.6 GB
1y:   1.4 GB
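
To make the two filters above concrete, here is a minimal sketch in plain Java. It is not the actual extraction code: it assumes each index document can be seen as a Map<String, String>, and the field names "lastModified" (epoch milliseconds as a string) and "sha1", as well as the class and method names, are hypothetical placeholders chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; not part of NetBeans or maven-indexer.
class IndexFilterSketch {

    // Hypothetical field keys (assumptions, not the real index schema).
    static final String LAST_MODIFIED = "lastModified"; // epoch millis as a string
    static final String SHA1 = "sha1";

    /** Computes the sliding-window cutoff, e.g. now minus two years. */
    static Instant cutoff(Instant now, Duration window) {
        return now.minus(window);
    }

    /**
     * Keeps only records modified at or after the cutoff and removes
     * the given fields from every record that is kept.
     */
    static List<Map<String, String>> filter(Iterable<Map<String, String>> records,
                                            Instant cutoff,
                                            Set<String> droppedFields) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> record : records) {
            String modified = record.get(LAST_MODIFIED);
            // Records without a timestamp are kept, since their age is unknown.
            if (modified != null
                    && Instant.ofEpochMilli(Long.parseLong(modified)).isBefore(cutoff)) {
                continue; // older than the window: drop the whole document
            }
            // Copy before modifying so the input records stay untouched.
            Map<String, String> copy = new HashMap<>(record);
            copy.keySet().removeAll(droppedFields);
            kept.add(copy);
        }
        return kept;
    }
}
```

With a two-year window, a call would look like filter(records, cutoff(Instant.now(), Duration.ofDays(730)), Set.of(SHA1)). Doing both steps in a single pass over the documents is what keeps the extra cost small in this sketch; whether the same holds for the real extraction depends on how the index is read.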