To follow up on this, since I spent some time attempting to optimize
it - the two big performance wins are:

 1. Cache a byte[] and reuse it for every JAR entry (pass in a lambda to
    read() rather than get out a Map<String,byte[]>)
 2. Maven's DefaultScanner will pass the indexer *every single file* in the
    repository it's indexing, while NetBeans is likely uninterested in .sha1
    files and similar; filtering the list of files down to things NetBeans
    is likely to be interested in offers large benefits
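
For anyone who wants a feel for what those two changes amount to, here is
a rough sketch. It is not the code from the patch: the class name, the
EntryConsumer interface, the read() signature, the 64K starting buffer and
the .jar/.pom extension list are all made up for illustration, and what
NetBeans actually wants to see may well differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Illustrative only: none of these names come from maven-indexer or NetBeans.
final class JarScanSketch {

    /** Hypothetical callback: gets the shared buffer plus the count of valid bytes. */
    @FunctionalInterface
    interface EntryConsumer {
        void accept(String entryName, byte[] buffer, int length) throws IOException;
    }

    // One buffer reused for every entry; it only grows when an entry does not fit.
    private byte[] buffer = new byte[64 * 1024];

    /** Streams each non-directory entry to the consumer instead of building a Map. */
    void read(Path jar, EntryConsumer consumer) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                try (InputStream in = jarFile.getInputStream(entry)) {
                    int length = fill(in);
                    // The buffer contents are only valid until the next entry is read.
                    consumer.accept(entry.getName(), buffer, length);
                }
            }
        }
    }

    // Reads the whole stream into the shared buffer, doubling it when it fills up.
    private int fill(InputStream in) throws IOException {
        int total = 0;
        int n;
        while ((n = in.read(buffer, total, buffer.length - total)) != -1) {
            total += n;
            if (total == buffer.length) {
                buffer = Arrays.copyOf(buffer, buffer.length * 2);
            }
        }
        return total;
    }

    /** Assumed filter: keep only file types the IDE plausibly cares about. */
    static boolean isInteresting(Path file) {
        String name = file.getFileName().toString();
        return name.endsWith(".jar") || name.endsWith(".pom");
    }
}
```

The catch with the callback style is that the consumer must copy anything
it wants to keep, since the same array is overwritten by the next entry.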

These two optimizations cut the scanning time of my ~/.m2/repository dir
(~2700 JAR files) from 42304ms to 30100ms, which is a roughly 29%
reduction.

That said, I have some tests to get passing before this is patch-worthy, so
we'll see if those results hold up.

Of course, this only helps local indexing - whatever "Unpacking indexes" is
doing with remote repositories won't be helped by this.

Still, it seems like something that could be optimized quite a bit before
giving up on it.  If anyone's interested in poking at this:
https://timboudreau.com/files/maven-indexer.diff

-Tim
