Hi,

Did your probe reach a conclusion?

On Wed, Nov 2, 2011 at 4:40 AM, Ken Krugler <[email protected]> wrote:
> I know some of the original team members - I could ask.
>
> Are there specific questions, or just "is anybody still minding the fire"?
>
> -- Ken
>
> On Nov 1, 2011, at 2:43pm, Nick Burch wrote:
>
> > On Tue, 1 Nov 2011, Robert Muir wrote:
> >> Well as an alternative for them committing the EBCDIC detection, perhaps we could look at the Charset detection APIs and propose some API additions so that users (like Tika) can plug in custom detectors?
> >
> > In theory it should be pluggable, but I seem to recall we needed to tweak a few core bits to get the detector working (around negative matches for control characters)
> >
> > Looking at the svn version history, the ICU4J team don't appear to have done any work on their character detectors in several years. From the lack of responses when I asked on their list about extending them, I fear there may not be anyone left in their project who's interested in charset detectors any more. I'd love to be proved wrong though, if anyone has any personal contacts on the project they could prod about it?
> >
> > Nick
>
> --------------------------
> Ken Krugler
> http://bixolabs.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
