Hi,

On Mon, Jan 19, 2009 at 7:25 AM, Sami Siren <[email protected]> wrote:
> I like the idea, it allows us to use different strategies for detecting the
> type for individual formats or change the whole strategy used. Only thing
> that I am wondering is should we introduce some kind of confidence level to
> the guesses, perhaps part of metadata?

Good question. I'm personally not that big a fan of confidence levels, as
there's no clear definition of how they should be set and interpreted. I
also haven't seen any real-world cases where confidence levels would really
have been needed to accurately determine the type of a document. We can
always introduce confidence levels later if we need to, but for now I'd
rather skip them.

BR,

Jukka Zitting
