On 10/13/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote:
>
> Hi, all. Whenever we talk about the TikaConfig object, we talk of
> configuring it using XML.
>
> I'd like to suggest that we provide an API as well. In general, it would be
> easier to use in many cases.

and essential in some use cases :-)

here are a couple of mine (i've been meaning to jump in with this for a while
now - glad to see that keith beat me to it ;-)

1. mime guessing antlib (to allow filtering on mime types rather than just
   extension)
2. improved support for mime types in RAT

(BTW IMHO describing some typical use cases would be a good way to kick off
the documentation for tika. the code's easy to understand but i've found it
tough to see the bigger picture. use cases might be a good way in for new
developers.)

> Also:
>
> * The default configuration may be suitable most of the time, but if a user
> wants to change only 1 or 2 options, having to create a new XML file is
> overkill. It would be much easier to load the default and call a method or
> two to deviate from the default.
>
> * The desired options may not be known until runtime. Having to build a
> Document in memory, or write an XML file, seems like more work than should
> be necessary.
>
> * One might want to keep a single instance in memory and modify it over the
> life of the program as necessary.
>
> * In general, I think one shouldn't need to know about an external
> representation of an object (TikaConfig's XML representation in this case)
> if working with the object directly is simpler.

+1

IMHO IoC typically works best in the long run: lift out an interface and
switch to factories creating implementations

- robert
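
A rough sketch of how the two ideas above might fit together in Java: load defaults and override an option or two in code, behind an interface that a factory creates. Every name here (ParserConfig, MutableParserConfig, ParserConfigs, the example.* parser class names) is hypothetical and is not Tika's actual API; the default mappings are placeholders only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical read-only view of a configuration: maps a mime type to the
// name of the parser class that handles it. Not the real TikaConfig API.
interface ParserConfig {
    String parserFor(String mimeType);   // null when nothing is registered
    Map<String, String> asMap();         // unmodifiable snapshot of the mappings
}

// Hypothetical mutable variant, so callers can deviate from the defaults
// at runtime without writing an XML file.
interface MutableParserConfig extends ParserConfig {
    MutableParserConfig register(String mimeType, String parserClassName);
    MutableParserConfig unregister(String mimeType);
}

// Factory: callers depend only on the interfaces above, never on the
// implementation class or on any external (XML) representation.
final class ParserConfigs {
    private ParserConfigs() {}

    // Placeholder defaults; a real implementation might read these from XML.
    static MutableParserConfig defaults() {
        return new MapBackedConfig()
            .register("text/plain", "example.TextParser")
            .register("application/pdf", "example.PdfParser");
    }

    static MutableParserConfig empty() {
        return new MapBackedConfig();
    }

    // Simple in-memory implementation backed by a HashMap.
    private static final class MapBackedConfig implements MutableParserConfig {
        private final Map<String, String> parsers = new HashMap<>();

        public String parserFor(String mimeType) {
            return parsers.get(mimeType);
        }

        public Map<String, String> asMap() {
            // Copy first so later changes do not leak into the snapshot.
            return Collections.unmodifiableMap(new HashMap<>(parsers));
        }

        public MutableParserConfig register(String mimeType, String parserClassName) {
            parsers.put(mimeType, parserClassName);
            return this;
        }

        public MutableParserConfig unregister(String mimeType) {
            parsers.remove(mimeType);
            return this;
        }
    }
}
```

With a shape like this, changing a single option would be one chained call on ParserConfigs.defaults(), and an XML loader could be just another factory method returning the same interface.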
