On Mon, 22 Aug 2011, Tom Grant wrote:
Here's the use case that I'm attempting to solve. I have a customer with many legacy systems, some of which are completely custom. These systems have data files that will never be seen outside of their environment. For example, some are XML files with their own schemas. Some are similar to the new office documents and are zip files containing xml and other goodies. Others are serialized-objects dumped to disk. Some are similar to EDI with a header and data body with prescribed offsets. The choices of the past can't be undone and I'm stuck with about 30 or 40 different file types.

Ah, so you have non-standard, custom and specific mimetypes that you're allocating to these documents. I think we'd tended to think of the mimetypes as always being like constants.

The quantity of file types means that it's going to take a few months to complete and will happen a few at a time. So I'd like to co-locate the mimetype definition with the parser code for maintainability.

Your best bet is probably to write a custom detector, and have it loaded by the service loader in the same way that the container-aware detector now can be. You can put that in your code alongside your custom parsers.
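As a rough illustration of the custom detector idea, here is a minimal sketch. Everything in it is an assumption made for the example: `SimpleDetector` is a hypothetical stand-in interface so that the snippet compiles on its own (in Tika you would implement the library's own detector interface instead), and the `ORDHDR` header and the `application/x-legacy-order` type name are invented.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the library's detector interface,
// declared here only so the sketch is self-contained.
interface SimpleDetector {
    String detect(InputStream input) throws IOException;
}

// Hypothetical detector for one legacy format, kept in the same
// module as that format's parser.
class LegacyOrderDetector implements SimpleDetector {
    static final String FALLBACK = "application/octet-stream";
    static final String TYPE = "application/x-legacy-order"; // assumed type name
    private static final byte[] MAGIC =
            "ORDHDR".getBytes(StandardCharsets.US_ASCII);     // assumed file header

    @Override
    public String detect(InputStream input) throws IOException {
        // Without mark/reset we cannot peek at the header safely.
        if (input == null || !input.markSupported()) {
            return FALLBACK;
        }
        input.mark(MAGIC.length);
        try {
            byte[] head = input.readNBytes(MAGIC.length);
            return Arrays.equals(head, MAGIC) ? TYPE : FALLBACK;
        } finally {
            // Rewind so the parser sees the stream from the start.
            input.reset();
        }
    }
}
```

For `java.util.ServiceLoader` to find an implementation like this, the jar would also carry a provider file under `META-INF/services/`, named after the fully qualified interface name and listing the implementing class. That file can live in the same module as the parser, which keeps the type definition and the parser code together.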


I'm not sure what the best way to support this kind of need is. Some options that spring to mind are:
* Loading multiple mimetype files, and merging them like we do for parser
  class loading
* Provide another detector that loads custom-mimetypes.xml files from the
  service loader (so you can have multiple ones) which are used for
  detection

I guess it depends on whether you'd expect to be able to work with the hierarchy of the custom extra types or not?

I'm not sure we should be providing ways to add a couple of extra types in at a random point in time, as that'll potentially make things behave very differently in a multithreaded environment. I'd rather the extra types were loaded once up front, in whichever way is supported.
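To make the "load once up front" point concrete, here is a small sketch of merging several sets of custom type definitions into one immutable lookup. It is not how Tika does it: `CustomTypeRegistry` is a hypothetical class, and plain maps from file extension to type name stand in for whatever the real definition files would contain.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry: merges every definition source once at startup
// and exposes a read-only view afterwards.
final class CustomTypeRegistry {
    private final Map<String, String> typeByExtension;

    private CustomTypeRegistry(Map<String, String> merged) {
        // Immutable copy: no thread can see the registry change later.
        this.typeByExtension = Map.copyOf(merged);
    }

    // Each source maps a file extension to a type name (assumed shape).
    static CustomTypeRegistry load(List<Map<String, String>> sources) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> source : sources) {
            for (Map.Entry<String, String> e : source.entrySet()) {
                String previous = merged.putIfAbsent(e.getKey(), e.getValue());
                // Fail fast if two sources disagree about the same extension.
                if (previous != null && !previous.equals(e.getValue())) {
                    throw new IllegalStateException(
                            "Conflicting types for extension: " + e.getKey());
                }
            }
        }
        return new CustomTypeRegistry(merged);
    }

    String typeFor(String extension) {
        return typeByExtension.getOrDefault(extension, "application/octet-stream");
    }
}
```

Because the merged map is copied into an immutable one before the registry is handed out, lookups from multiple threads need no locking, and a conflict between two definition sources shows up at startup rather than part-way through a run.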

Nick
