Hi, On 10/16/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote: > 1) Now that we have this AutoDetectParser, do you want (me) to remove the > ParseUtils method that do similar/same things?
That would probably be good, as we shouldn't have too many ways to do the same thing. In general I would prefer to avoid static methods in order to keep the codebase modular. We could even do something like this: public class ConvenienceParser extends ParserDecorator { ... public String parse(InputStream stream) throws ... { return parse(stream, new Metadata()); } public String parse(InputStream stream, Metadata metadata) throws ... { StringWriter writer = new StringWriter(); parse(stream, new WriteOutContentHandler(writer), metadata); return writer.toString(); } public String parse(File file) throws ... { Metadata metadata = new Metadata(); metadata.set("filename", file.getName()); InputStream stream = new FileInputStream(file); try { return parse(stream, metadata); } finally { stream.close(); } } ... // etc. } > 2) Also (by the way), we no longer use RereadableInputStream. Did you want > to remove that? We may still have a use for RereadableInputStream later on. See also http://www.nabble.com/RereadableInputStream-tf4609782.html for related discussion on getting similar functionality included in commons-io. > 3) Could we use a constant for "filename"? And should we rename it to > something general enough to cover URL's in addition to file names (such as > "resourceName")? Definitely. Probably. > 4) It looks like it unconditionally reads and inspects the header to > determine the MIME type, even if the CONTENT_TYPE is provided. Did you want > to allow overriding that behavior? (other than by setting a different > config object)? The Metadata.CONTENT_TYPE hint is just a hint. For example the HTTP ContentType header is quite often incorrect, so a MIME magic match (if available) is probably more accurate. > 5) The parse() method sometimes wraps the passed stream in a > BufferedInputStream. The caller (presumably) closes its stream after > parse() returns, but the BufferedInputStream never gets closed, right? As you noticed, that won't be a problem as the only resource in BufferedInputStream that needs explicit closing is the underlying stream. BR, Jukka Zitting