These are the situations I can think of where you would want to
implement a custom classloader:
1. You need a different hierarchy for loading classes, as in OSGi for instance.
The Hollywood principle, if you like.
2. You need to run different versions of classes or jars side by side. For example,
you want to
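A minimal sketch of case 1, a "child-first" (parent-last) classloader that inverts the usual JDK delegation order, the kind of inversion OSGi-style containers rely on. This is illustrative code, not taken from the thread:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Child-first classloader sketch: try our own URLs before delegating to the
// parent, inverting the default parent-first delegation hierarchy.
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        // Core java.* classes must always come from the parent.
        if (name.startsWith("java.")) {
            return super.loadClass(name, resolve);
        }
        // Already defined by this loader?
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                // Child first: look in our own URLs before asking the parent.
                c = findClass(name);
            } catch (ClassNotFoundException e) {
                // Not found locally; fall back to normal parent delegation.
                c = super.loadClass(name, resolve);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}
```

With a per-version set of URLs, one instance of this loader per jar version lets two versions of the same class coexist in one JVM (case 2).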
Hi,
On Sun, Sep 12, 2010 at 5:46 PM, Ken Krugler
kkrugler_li...@transpac.com wrote:
But that also seems clunky. Any other suggestions?
A simpler approach would be to pass a list of already-instantiated
Parser objects to AutoDetectParser, like this:
public AutoDetectParser(Detector
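A sketch of what the suggestion above might look like in use, assuming an `AutoDetectParser(Parser...)` style constructor and the `TXTParser` from tika-parsers; check your Tika version for the exact signatures:

```java
import java.io.ByteArrayInputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.txt.TXTParser;
import org.apache.tika.sax.BodyContentHandler;

// Sketch: build AutoDetectParser from an explicit list of parser instances
// instead of letting it discover everything on the classpath.
class ExplicitParsers {
    static String parsePlainText(byte[] data) throws Exception {
        // Only the parsers we actually need; nothing else is registered.
        // The varargs constructor is an assumption for this sketch.
        Parser parser = new AutoDetectParser(new TXTParser());
        BodyContentHandler handler = new BodyContentHandler();
        parser.parse(new ByteArrayInputStream(data), handler,
                     new Metadata(), new ParseContext());
        return handler.toString();
    }
}
```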
On Fri, Sep 10, 2010 at 10:31 PM, Nick Burch
nick.bu...@alfresco.com wrote:
Quite a lot of OfficeParser does depend on poifs code though, as
well as a
few bits that depend on some of the less common POI text extractors.
It looks like a number of our other new parsers also have direct
On Thu, 9 Sep 2010, Ken Krugler wrote:
I'm wondering how best to handle this type of configuration, in a way
that's relatively resilient to Tika configuration changes and my target
set of formats.
Would it not make more sense to use the XML-based TikaConfig constructor
(file, inputstream
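For reference, a config file for that constructor might look roughly like this; the exact element names vary across Tika versions, so treat this as an illustrative sketch:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative tika-config.xml: register only the parsers you need.
     Element names are an assumption; check your Tika version's schema. -->
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.txt.TXTParser"/>
    <parser class="org.apache.tika.parser.html.HtmlParser"/>
  </parsers>
</properties>
```

which could then be loaded with something like `new TikaConfig(new File("tika-config.xml"))`.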
Hi Jukka,
On Sep 10, 2010, at 5:35am, Jukka Zitting wrote:
Hi,
On Fri, Sep 10, 2010 at 5:22 AM, Ken Krugler
kkrugler_li...@transpac.com wrote:
With 0.8-SNAPSHOT, the TikaConfig(Classpath) constructor now finds
and
instantiates all Parser-based classes found on the classpath.
Which, as
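Classpath scanning of this kind is typically implemented with JDK-style service files: each jar lists its implementations in `META-INF/services/<interface-name>`. Whether TikaConfig uses exactly this mechanism is an assumption here, but a self-contained sketch of the idea:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Sketch of service-file discovery: collect implementation class names
// listed in META-INF/services/<interfaceName> across all jars on the
// classpath. Returns an empty list when no service files are present.
class ServiceFileScanner {

    static List<String> implementationsOf(String interfaceName) throws Exception {
        List<String> names = new ArrayList<>();
        Enumeration<URL> files = ServiceFileScanner.class.getClassLoader()
                .getResources("META-INF/services/" + interfaceName);
        while (files.hasMoreElements()) {
            URL url = files.nextElement();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // Strip trailing comments and whitespace, skip blanks.
                    String name = line.split("#", 2)[0].trim();
                    if (!name.isEmpty()) {
                        names.add(name);
                    }
                }
            }
        }
        return names;
    }
}
```

Each discovered name would then be instantiated reflectively (e.g. `Class.forName(name).newInstance()`), which is why every Parser subclass on the classpath gets picked up whether you wanted it or not.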
On Fri, 10 Sep 2010, Ken Krugler wrote:
The issue is that the definitions of the types that are supported come from
POI:
Collections.unmodifiableSet(new HashSet<MediaType>(Arrays.asList(
    POIFSDocumentType.WORKBOOK.type,
    POIFSDocumentType.OLE10_NATIVE.type,
Hi all,
In the past, we'd build our Hadoop job jars using a dependency on
tika-parsers but excluding the supporting jars for types that we know we
don't need to process (e.g. Microsoft docs, PDFs, etc). This
dramatically reduces the size of the resulting Hadoop job jar.
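A hedged sketch of that kind of build setup, assuming Maven; the artifact versions and the exact exclusion list depend on which formats you drop:

```xml
<!-- Illustrative Maven fragment: depend on tika-parsers but exclude the
     format-support jars you don't need. Versions here are placeholders. -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <version>0.8-SNAPSHOT</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.pdfbox</groupId>
      <artifactId>pdfbox</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```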
With