Claudenw commented on PR #240:
URL: https://github.com/apache/creadur-rat/pull/240#issuecomment-2081968269

   I extracted the Tika processing into its own class.
   I added the Tika `MediaType` to our metadata.
   The process now treats every media type matching `text/*` as a `STANDARD`
document. Everything else is `BINARY` unless it is listed in the
`documentTypeMap`, which now lives in the `TikaProcessor` class.
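
   As a rough illustration of the rule just described, here is a minimal
sketch. It is not the code in the PR: it uses plain strings instead of the Tika
`MediaType`, and the class name, the `classify` method, and the single map
entry are hypothetical stand-ins chosen for the example.

```java
import java.util.Locale;
import java.util.Map;

final class MediaTypeClassifier {

    /** Hypothetical stand-in for the document type; only the two values named in the comment. */
    enum DocumentType { STANDARD, BINARY }

    // Hypothetical override table for non-text media types.
    // The real entries live in TikaProcessor and may differ from this one.
    private static final Map<String, DocumentType> DOCUMENT_TYPE_MAP =
            Map.of("application/xml", DocumentType.STANDARD);

    /** Applies the rule: text/* is STANDARD, otherwise use the map, otherwise BINARY. */
    static DocumentType classify(String mediaType) {
        String normalized = mediaType.trim().toLowerCase(Locale.ROOT);
        if (normalized.startsWith("text/")) {
            return DocumentType.STANDARD;
        }
        return DOCUMENT_TYPE_MAP.getOrDefault(normalized, DocumentType.BINARY);
    }

    public static void main(String[] args) {
        for (String type : new String[] {"text/plain", "application/xml", "image/png"}) {
            System.out.println(type + " -> " + classify(type));
        }
    }
}
```

   Under these assumptions, only media types that are neither `text/*` nor
present in the map fall through to `BINARY`.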
   
   You will notice that `application/json` is listed in the `documentTypeMap`.
That was a mistake on my part, and the entry will be removed. Once it is gone,
we can also remove the `WildcardFileName` filter for `*.json` from the
`filesToIgnore` filter in `ReportConfiguration`. I will fix this soon.

