Jukka,

It looks to me as though TikaConfig is now almost completely immutable. It can return a ParserConfig, but that is immutable now that you have removed setContents() and made getContents() return an immutable map. The MimeTypes object, also available from the TikaConfig instance, is almost completely immutable as well, with two exceptions: 1) it has an add() method, and 2) a MimeType instance managed by MimeTypes has a setLevel() method. It looks like both of those could be removed by refactoring.
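
To illustrate the kind of refactoring that could remove those two mutabilities, here is a minimal sketch. The class names ImmutableMimeType, ImmutableMimeTypes and Builder are hypothetical stand-ins, not the actual Tika classes, and the real MimeTypes carries more state than a name and a level. The idea is only that add() moves onto a builder and the level becomes a constructor argument:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for MimeType: the level is fixed at construction, so there is no setLevel(). */
final class ImmutableMimeType {
    private final String name;
    private final int level;

    ImmutableMimeType(String name, int level) {
        this.name = name;
        this.level = level;
    }

    String getName() { return name; }

    int getLevel() { return level; }
}

/** Hypothetical stand-in for MimeTypes: it has no add(); all registration happens on the Builder. */
final class ImmutableMimeTypes {
    private final Map<String, ImmutableMimeType> types;

    private ImmutableMimeTypes(Map<String, ImmutableMimeType> types) {
        // Defensive copy, then wrap so that callers cannot modify the registry.
        this.types = Collections.unmodifiableMap(
                new LinkedHashMap<String, ImmutableMimeType>(types));
    }

    /** Returns the registered type, or null if the name is unknown. */
    ImmutableMimeType forName(String name) { return types.get(name); }

    /** Returns a read-only view of all registered types. */
    Map<String, ImmutableMimeType> getTypes() { return types; }

    /** Mutable only while the configuration is being read; discarded after build(). */
    static final class Builder {
        private final Map<String, ImmutableMimeType> pending =
                new LinkedHashMap<String, ImmutableMimeType>();

        Builder add(String name, int level) {
            pending.put(name, new ImmutableMimeType(name, level));
            return this;
        }

        ImmutableMimeTypes build() { return new ImmutableMimeTypes(pending); }
    }
}
```

With something along these lines, the code that reads the configuration would be the only place that touches the Builder, and the built registry could then be shared freely.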
That being the case, are we close to making a TikaConfig object totally reusable? Would you like me to look at refactoring MimeTypes to make it immutable?

Thanks,
- Keith


Jukka Zitting wrote:
>
> Hi,
>
> On 9/25/07, kbennett <[EMAIL PROTECTED]> wrote:
>> This means that every time a parse method that uses a default
>> configuration is used, the default configuration's XML will be
>> reparsed. This may not be a big deal for apps that only occasionally
>> do this, but for an app whose mission is to parse documents, it seems
>> kind of wasteful, especially when it can be remedied with a small
>> number of simple lines of code. Certainly I can get the default
>> configuration once, hold onto it, and then call the parse methods
>> that take it, but it seems odd to me that I would have to do that.
>> I realize it's a minor issue, though.
>
> I would argue that that's (reusing the configuration instance) the
> preferred mode of operation. Currently I wouldn't do that due to the
> mutability of Content instances, but as we get to the point of having
> stateless Parser instances, I'd even advocate instantiating the full
> set of configured parsers when your application starts and reusing
> this configuration for any number of documents.
>
> BR,
>
> Jukka Zitting
>

--
View this message in context: http://www.nabble.com/Providing-a-Default-Tika-Configuration-tf4510478.html#a12912002
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
