With Tika 1.13 and the code below, I'm not able to reproduce any problems with the handful of test files in our unit tests.
Exactly what code are you using? How are you doing detection?

@Test
public void testMultiThreadedEncodingDetection() throws Exception {
    Path testDocs = Paths.get(this.getClass().getResource("/test-documents").toURI());
    List<Path> paths = new ArrayList<>();
    Map<Path, String> encodings = new ConcurrentHashMap<>();
    // Record the encoding detected on a first, single-threaded pass as the baseline.
    for (File file : testDocs.toFile().listFiles()) {
        if (file.getName().endsWith(".txt") || file.getName().endsWith(".html")) {
            String encoding = getEncoding(file.toPath());
            paths.add(file.toPath());
            encodings.put(file.toPath(), encoding);
        }
    }
    // Note: Thread.run() executes the Runnable on the calling thread;
    // Thread.start() is the call that spawns a new thread.
    for (int i = 0; i < 100; i++) {
        new Thread(new EncodingDetector(paths, encodings)).run();
    }
    assertTrue("success!", true);
}

private class EncodingDetector implements Runnable {
    private final List<Path> paths;
    private final Map<Path, String> encodings;
    private final Random r = new Random();

    private EncodingDetector(List<Path> paths, Map<Path, String> encodings) {
        this.paths = paths;
        this.encodings = encodings;
    }

    @Override
    public void run() {
        // Re-detect 100 randomly chosen files and compare against the baseline.
        for (int i = 0; i < 100; i++) {
            int pInd = r.nextInt(paths.size());
            String detectedEncoding = null;
            try {
                detectedEncoding = getEncoding(paths.get(pInd));
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            String trueEncoding = encodings.get(paths.get(pInd));
            if (!detectedEncoding.equals(trueEncoding)) {
                throw new RuntimeException("detected: " + detectedEncoding
                        + " but should have been: " + trueEncoding);
            }
        }
    }
}

public String getEncoding(Path p) throws Exception {
    try (InputStream is = TikaInputStream.get(p)) {
        AutoDetectReader reader = new AutoDetectReader(is);
        String val = reader.getCharset().toString();
        if (val == null) {
            return "NULL";
        } else {
            return val;
        }
    }
}

-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Monday, July 25, 2016 9:21 PM
To: user@tika.apache.org
Subject: RE: Is Tika (especially CharsetDetector) considered thread-safe?

Charset detection _should_ be thread safe. If you can help us track down the problem (unit test?), we need to fix this. Thank you for raising this.

Best,
Tim
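
A note on the test harness, independent of Tika: a check like the one above only exercises concurrency if the workers are actually started on separate threads, and a mismatch thrown inside a worker only reaches the test if something collects it. Below is a minimal sketch of one way to do both with an ExecutorService. The names ConcurrentCheck and runConcurrently are hypothetical and not part of Tika, and the task is a stand-in for the per-thread detection loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not part of Tika): runs one task on several threads
// and reports how many of the workers threw an exception.
public class ConcurrentCheck {

    public static int runConcurrently(Runnable task, int nThreads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // submit() hands each task to a pool thread instead of running it inline.
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < nThreads; i++) {
                futures.add(pool.submit(task));
            }
            int failures = 0;
            for (Future<?> f : futures) {
                try {
                    // get() blocks until the worker finishes and rethrows its
                    // exception wrapped in an ExecutionException.
                    f.get();
                } catch (ExecutionException e) {
                    failures++;
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int ok = runConcurrently(() -> { }, 8);
        int bad = runConcurrently(() -> { throw new RuntimeException("mismatch"); }, 8);
        System.out.println(ok + " " + bad); // prints "0 8"
    }
}
```

In a JUnit test, the returned failure count could be asserted to be zero, so that a detection mismatch on any worker thread fails the test rather than only terminating that thread.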