Tim Allison created TIKA-3872:
---------------------------------

             Summary: Improve namespacing in metadata keys
                 Key: TIKA-3872
                 URL: https://issues.apache.org/jira/browse/TIKA-3872
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


I recently ran a group-by on the metadata keys in roughly 1 million files from our 
regression corpus.  The UTF-8 CSVs are available here: 
https://corpora.tika.apache.org/base/share/metadata-keys-1m-20221006.tgz

My gut feeling is that we should namespace everything.  I don't think we should 
make any changes in 2.x, but I'm opening this for longer-range planning.
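
For illustration only, here is a minimal sketch of the kind of group-by described 
above, counting keys per namespace prefix.  It assumes a namespace is whatever 
precedes the first ':' in a key; the class name, the helper methods, and the 
sample keys are hypothetical and are not taken from the corpus or from Tika's code.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class KeyNamespaceCount {

    // Assumption: the namespace is the text before the first ':'.
    // Keys without such a prefix are bucketed under "(none)".
    public static String namespaceOf(String key) {
        int idx = key.indexOf(':');
        return idx > 0 ? key.substring(0, idx) : "(none)";
    }

    // Group keys by namespace and count them; TreeMap keeps the output sorted.
    public static Map<String, Long> countByNamespace(List<String> keys) {
        return keys.stream().collect(Collectors.groupingBy(
                KeyNamespaceCount::namespaceOf, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Illustrative sample keys only.
        List<String> keys = List.of(
                "dc:title", "dc:creator", "xmpTPg:NPages", "Author", "Content-Type");
        countByNamespace(keys).forEach((ns, n) -> System.out.println(ns + "\t" + n));
    }
}
```

The "(none)" bucket is the interesting one here: it gives a rough measure of how 
many keys would need a namespace added.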



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
