ottlinger commented on code in PR #240:
URL: https://github.com/apache/creadur-rat/pull/240#discussion_r1578938426

##########
apache-rat-core/src/main/java/org/apache/rat/api/Document.java:
##########
@@ -33,47 +36,416 @@ public interface Document {
      */
     enum Type {
         /** A generated document. */
-        GENERATED,
+        GENERATED,
         /** An unknown document type. */
         UNKNOWN,
         /** An archive type document. */
-        ARCHIVE,
+        ARCHIVE,
         /** A notice document (e.g. LICENSE file) */
         NOTICE,
         /** A binary file */
         BINARY,
         /** A standard document */
-        STANDARD}
+        STANDARD;
+
+        public static Map<String, Type> documentTypeMap;
+
+        public static Type fromContentType(String documentType, Log log) {
+            Type result = documentTypeMap.get(documentType);
+            if (result == null) {
+                log.warn(String.format("Please open a Jira ticket with the subject: 'Unknown media type %s in Document.Type'", documentType));
+                return UNKNOWN;
+            }
+            return result;
+        }
+
+        /*
+         * https://tika.apache.org/3.0.0-BETA/formats.html
+         */
+        static {
+            documentTypeMap = new HashMap<>();

Review Comment:
   Seeing this long list, I wondered whether we should add a new module, apache-rat-regression-tests, that contains at least one example of every file type and can be used to measure and integration-test RAT ...
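
   A minimal sketch of what such a regression check might look like, to make the idea concrete. Everything in it is an assumption for illustration: the class name `DocumentTypeRegressionSketch`, the helper `findMismatches`, and the three map entries are hypothetical stand-ins, not the code or the mappings from this PR. It also keys samples directly by media type instead of detecting the type from real example files, which is what an actual module would presumably do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the lookup in the diff; not the real RAT classes. */
final class DocumentTypeRegressionSketch {

    enum Type { GENERATED, UNKNOWN, ARCHIVE, NOTICE, BINARY, STANDARD }

    // Illustrative entries only (assumed); the map in the PR is far longer.
    static final Map<String, Type> DOCUMENT_TYPE_MAP = new HashMap<>();
    static {
        DOCUMENT_TYPE_MAP.put("text/plain", Type.STANDARD);
        DOCUMENT_TYPE_MAP.put("application/zip", Type.ARCHIVE);
        DOCUMENT_TYPE_MAP.put("image/png", Type.BINARY);
    }

    /** Resolves a media type, falling back to UNKNOWN when it is not mapped. */
    static Type fromContentType(String mediaType) {
        return DOCUMENT_TYPE_MAP.getOrDefault(mediaType, Type.UNKNOWN);
    }

    /** Returns every sample media type whose resolved type differs from the expected one. */
    static List<String> findMismatches(Map<String, Type> expectedBySample) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, Type> sample : expectedBySample.entrySet()) {
            if (fromContentType(sample.getKey()) != sample.getValue()) {
                mismatches.add(sample.getKey());
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, Type> expected = new HashMap<>();
        expected.put("text/plain", Type.STANDARD);
        expected.put("application/zip", Type.ARCHIVE);
        expected.put("image/png", Type.BINARY);
        // Deliberately unmapped sample, to show how a gap in the map would surface.
        expected.put("application/x-made-up", Type.STANDARD);
        System.out.println("Mismatches: " + findMismatches(expected));
    }
}
```

   With one sample per file type, an empty mismatch list would mean every type resolves as expected, and any entry in the list would point at a missing or wrong mapping.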