mbien commented on PR #302:
URL: https://github.com/apache/maven-indexer/pull/302#issuecomment-1537786100

   > Just to be clear: "So the filter can be also used for removing fields in 
addition of whole documents." means that the filter can transform passed-in 
Document instances? As that would be a nasty side effect or plain misuse of the 
API IMHO. If we want to "transform" documents (and why not?) let's have a 
dedicated API for that as well IMHO.
   
   I removed that sentence so that nobody gets confused. Removing fields 
through the filter was indeed the goal at first, but it doesn't work anyway. 
I solved it by swapping out the `MinimalArtifactInfoIndexCreator` and adjusting the 
[`updateDocument`](https://github.com/apache/maven-indexer/blob/41e88f874132a6bcae3dd034547b735b6a8a4c12/indexer-core/src/main/java/org/apache/maven/index/creator/MinimalArtifactInfoIndexCreator.java#L272)
 method.
   
   > As for removing SHA1, unsure why would one do it. How to "identify" 
artifacts otherwise, or NB does not have such a use case?
   
   Those hashes alone make up >30% of the index size, since they are 
effectively noise and noise compresses badly. The idea is to use SMO for that, 
since the few current usages of hashes are all followed by subsequent downloads 
anyway (e.g. find src/doc for a dependency), so it's essentially an online use case.
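
   To illustrate the "noise compresses badly" point, here is a rough standalone sketch. It is not code from the indexer: the class name `HashCompressionDemo`, the made-up `org.example` coordinates and the use of plain DEFLATE are all assumptions for illustration, and the real index format will behave differently in detail. It compares how well a list of coordinate strings compresses against the SHA-1 hex digests of the same strings.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.Deflater;

// Hypothetical demo class, not part of maven-indexer.
public class HashCompressionDemo {

    // Number of bytes DEFLATE produces for the given input.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Compressed size divided by original size (smaller means better compression).
    static double ratio(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        return (double) deflatedSize(raw) / raw.length;
    }

    // SHA-1 of the string as 40 lowercase hex characters.
    static String sha1Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder(40);
            for (byte b : md.digest(s.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Made-up coordinates, one per line; they share long common prefixes.
    static String sampleCoordinates(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append("org.example:artifact-").append(i % 100)
              .append(":1.").append(i).append('\n');
        }
        return sb.toString();
    }

    // One SHA-1 hex digest per coordinate line.
    static String sampleHashes(int count) {
        StringBuilder sb = new StringBuilder();
        for (String line : sampleCoordinates(count).split("\n")) {
            sb.append(sha1Hex(line)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int count = 10_000;
        System.out.printf("coordinates ratio: %.2f%n", ratio(sampleCoordinates(count)));
        System.out.printf("sha1 hex ratio:    %.2f%n", ratio(sampleHashes(count)));
    }
}
```

   The coordinates shrink a lot because they repeat the same prefixes, while the digests have nothing for the compressor to reuse beyond the hex alphabet itself.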
   
   A smaller index would also slightly speed up the merge when MT is enabled.


