I have a pull request to switch to the Tika parser [1]. One of the changes this imposes is that Tika correctly determines that JSON files are of the text/plain type. They therefore get processed for licenses, and the change will end up listing all JSON files as having unknown licenses. This is a radical breaking change from 0.16.1, where the guessers skipped JSON files.
We have two choices:

1. Add extra filtering to convert JSON back to the "binary" type that 0.16.1 uses. I think this is the wrong choice.
2. Create a default file filter that we will always use. Over time, this filter would identify text files that, like JSON, cannot store comments.

Thoughts?

[1] https://github.com/apache/creadur-rat/pull/240
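
P.S. To make choice 2 a bit more concrete, here is a rough sketch of what such a default filter might look like. It is only an illustration: the class name NoCommentFileFilter and the extension list are hypothetical, it uses nothing but java.io.FileFilter from the JDK, and it does not reflect anything in the pull request or the existing RAT code.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical sketch of an always-on default filter.
 * Rejects text files whose format cannot store comments,
 * so they are never scanned for license headers.
 */
final class NoCommentFileFilter implements FileFilter {

    // Assumed list: only "json" comes from the discussion above;
    // other comment-less formats would be added here over time.
    private static final Set<String> NO_COMMENT_EXTENSIONS = Set.of("json");

    @Override
    public boolean accept(File file) {
        String name = file.getName().toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            // No extension: nothing to decide on, let the file through.
            return true;
        }
        // Accept the file unless its extension is a known comment-less format.
        return !NO_COMMENT_EXTENSIONS.contains(name.substring(dot + 1));
    }
}
```

With this shape, new NoCommentFileFilter().accept(new File("package.json")) would return false, while a file such as Main.java would still be passed on for license checking. Matching on the file extension is just the simplest thing to show; the real filter could equally key off the media type that Tika reports.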