Hi Jukka (and others who care),

The patch I had sitting in TIKA-431 (https://issues.apache.org/jira/browse/TIKA-431) both included your change to add the charset info to Metadata.CONTENT_TYPE and removed the setting of Metadata.CONTENT_ENCODING.
You made a note in Changes.txt that this was deprecated, so I'm assuming that you think we should hold off on fixing the abuse of CONTENT_ENCODING until after the 1.2 release, right?

-- Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr
