On Mon, 22 Aug 2011, Tom Grant wrote:
Here's the use case that I'm attempting to solve. I have a customer
with many legacy systems, some of which are completely custom. These
systems have data files that will never be seen outside of their
environment. For example, some are XML files with their own schemas.
Some are similar to the new office documents and are zip files
containing XML and other goodies. Others are serialized objects dumped
to disk. Some are similar to EDI with a header and data body with
prescribed offsets. The choices of the past can't be undone and I'm
stuck with about 30 or 40 different file types.
Ah, so you have non-standard, custom mimetypes that you're allocating
to these documents. I think we'd tended to think of the set of
mimetypes as fixed, much like constants.
The quantity of file types means that it's going to take a few months to
complete and will happen a few at a time. So I'd like to co-locate the
mimetype definition with the parser code for maintainability.
Your best bet is probably to write a custom detector, and have it loaded by
the service loader the same way that the container-aware detector now can
be. You can put that in your code along with your custom parsers.
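
A minimal sketch of what such a detector could look like, for one of the
header-plus-body formats. To keep it self-contained it uses a stand-in
interface (SimpleDetector) rather than Tika's own
org.apache.tika.detect.Detector, whose detect method also takes a
Metadata argument. The magic header "LEGACYEDI" and the type name
application/x-example-legacy-edi are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Stand-in for the real detector contract (hypothetical, not a Tika type).
interface SimpleDetector {
    String detect(InputStream input) throws IOException;
}

final class LegacyEdiDetector implements SimpleDetector {
    // Hypothetical magic header and type name, for illustration only.
    private static final byte[] MAGIC =
            "LEGACYEDI".getBytes(StandardCharsets.US_ASCII);
    static final String TYPE = "application/x-example-legacy-edi";
    static final String FALLBACK = "application/octet-stream";

    @Override
    public String detect(InputStream input) throws IOException {
        // Without mark/reset we cannot peek safely, so decline to guess.
        if (input == null || !input.markSupported()) {
            return FALLBACK;
        }
        input.mark(MAGIC.length);
        try {
            // Reads up to MAGIC.length bytes; shorter files simply won't match.
            byte[] header = input.readNBytes(MAGIC.length);
            return Arrays.equals(header, MAGIC) ? TYPE : FALLBACK;
        } finally {
            // Leave the stream where we found it for the parser that follows.
            input.reset();
        }
    }
}
```

With the real interface, the class would be registered by listing its
fully qualified name in a META-INF/services/org.apache.tika.detect.Detector
file in the same jar as the parsers, which is what lets the service
loader find it.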
I'm not sure what the best way to support this kind of need is. Some
options that spring to mind are:
* Loading multiple mimetype files, and merging them like we do for parser
class loading
* Providing another detector that loads custom-mimetypes.xml files via the
service loader (so you can have multiple ones), which are then used for
detection
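
Either option boils down to finding several definition files and
merging them. A rough sketch of that merge step, using only the JDK:
the class CustomTypeSources and the simplified "extension -> mimetype"
tables are assumptions for illustration, and parsing the XML itself is
left out.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any existing library.
final class CustomTypeSources {

    // Finds every copy of the named resource on the classpath,
    // so each jar can ship its own definitions file.
    static List<URL> findAll(ClassLoader loader, String resourceName)
            throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    // Merges per-module "extension -> mimetype" tables into one.
    // Two modules claiming the same extension for different types
    // is treated as a configuration error and fails fast.
    static Map<String, String> merge(List<Map<String, String>> tables) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> table : tables) {
            for (Map.Entry<String, String> entry : table.entrySet()) {
                String previous =
                        merged.putIfAbsent(entry.getKey(), entry.getValue());
                if (previous != null && !previous.equals(entry.getValue())) {
                    throw new IllegalStateException(
                            "Conflicting types for ." + entry.getKey() + ": "
                            + previous + " vs " + entry.getValue());
                }
            }
        }
        return Collections.unmodifiableMap(merged);
    }
}
```

Failing on conflicts is one possible policy; letting the last file win
would also work, but it makes the result depend on classpath order.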
I guess it depends on whether you'd expect to be able to work with the
hierarchy of the custom extra types or not.
I'm not sure we should be providing ways to add a couple of extra types in
at a random point in time, as that'll potentially make things behave very
differently in a multithreaded environment. I'd rather that the extra
types were loaded once up front, in whichever way is supported.
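
To illustrate the "load once up front" idea, here is a small sketch of
a registry that is fixed at construction time and never mutated
afterwards, so every thread sees the same set of types. TypeRegistry
and its methods are hypothetical names, and keys are assumed to be
lower-case file extensions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical immutable registry, not an existing library class.
final class TypeRegistry {
    // Immutable copy held in a final field: safe to share across threads.
    private final Map<String, String> typeByExtension;

    private TypeRegistry(Map<String, String> types) {
        this.typeByExtension = Map.copyOf(types);
    }

    // Combines standard and custom types exactly once, at startup.
    // Custom entries override standard ones with the same extension.
    static TypeRegistry loadOnce(Map<String, String> standardTypes,
                                 Map<String, String> customTypes) {
        Map<String, String> all = new HashMap<>(standardTypes);
        all.putAll(customTypes);
        return new TypeRegistry(all);
    }

    // Read-only lookup; there is deliberately no method to add types later.
    String typeFor(String extension) {
        return typeByExtension.getOrDefault(
                extension.toLowerCase(Locale.ROOT),
                "application/octet-stream");
    }
}
```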
Nick