Re: maven indexing tweaks

Antonio Fri, 17 Mar 2023 14:38:46 -0700

Hi,

These are impressive savings!

Out of curiosity, we don't build the index incrementally using Maven's IndexReader [1], do we? That's why we download the whole index, right?


Thanks,
Antonio


[1] https://maven.apache.org/maven-indexer/indexer-reader/apidocs/org/apache/maven/index/reader/IndexReader.html

On 17/3/23 11:06, Michael Bien wrote:

Hello everyone,
I experimented a bit with the maven index extraction process and got some pretty good results (I think).
There might be a way to filter the index during extraction without noteworthy overhead, which allows the following:
- "sliding window" time filters, e.g. drop all documents older than 2 years (aka: who uses old libraries?)
- we can drop fields we don't need from the index. Especially interesting for fields which don't compress well (looking at you, sha1 hash)
some results for the time cutoff filter:

full: 5.6 GB
2y: 2.6 GB
1y: 1.4 GB
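
A rough sketch of the two filters described above, using plain Java maps as stand-ins for index documents. The class name, the field names (`lastModified`, `sha1`) and the `filter` helper are made up for illustration; this is not the maven-indexer API and not the actual extraction code, just the idea of applying a time cutoff and dropping unwanted fields in a single pass.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IndexFilterSketch {

    // Hypothetical field names, chosen for illustration only.
    static final String LAST_MODIFIED = "lastModified"; // epoch millis, stored as a string
    static final String SHA1 = "sha1";

    /**
     * Keeps documents modified at or after the cutoff and strips the given
     * fields from every document that is kept. Input documents are not modified.
     */
    public static List<Map<String, String>> filter(
            List<Map<String, String>> docs, Instant cutoff, Set<String> droppedFields) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            String ts = doc.get(LAST_MODIFIED);
            // Assumption: documents without a timestamp are kept rather than dropped.
            if (ts != null && Instant.ofEpochMilli(Long.parseLong(ts)).isBefore(cutoff)) {
                continue; // outside the sliding window
            }
            Map<String, String> copy = new LinkedHashMap<>(doc);
            copy.keySet().removeAll(droppedFields); // e.g. the sha1 hash
            kept.add(copy);
        }
        return kept;
    }

    private static String millis(String isoInstant) {
        return String.valueOf(Instant.parse(isoInstant).toEpochMilli());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2023-03-17T00:00:00Z");
        Instant cutoff = now.minus(Duration.ofDays(2 * 365)); // roughly "2y"

        List<Map<String, String>> docs = List.of(
                Map.of("id", "old-lib", LAST_MODIFIED, millis("2019-06-01T00:00:00Z"), SHA1, "aaa"),
                Map.of("id", "new-lib", LAST_MODIFIED, millis("2022-11-01T00:00:00Z"), SHA1, "bbb"));

        for (Map<String, String> doc : filter(docs, cutoff, Set.of(SHA1))) {
            System.out.println(doc.get("id") + " hasSha1=" + doc.containsKey(SHA1));
        }
    }
}
```

Filtering and field removal happen in the same loop, so the cost per document is one timestamp comparison and one small copy, which matches the "without noteworthy overhead" idea.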


