Github user joewitt commented on the pull request:
https://github.com/apache/nifi/pull/252#issuecomment-198149231
@jskora do the attribute names that come from the media bundles have
anything special to them in terms of how Tika handles them, or are they purely
as found in the metadata of the raw entities? Just want to make sure there
isn't some special mapping/normalization to worry about as versions of Tika
evolve.
Also, I've not built this yet, but do you know how large those parsers end
up being when pulled in for the NAR? I recall that for some reason they can be
quite huge, which is why we have avoided them so far. The thinking being they are
perfectly fine once we have the registry. Might be fine now too, but curious.