Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.

The "Tika2_0RoadMap" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/Tika2_0RoadMap

New page:

= Background =
This page is intended for a discussion of changes anticipated in Tika 2.0. This is only a first draft from one voice. Please contribute!

= Major Planned Changes =
 * Move from service loading to a config file for parser specification and loading. [[https://issues.apache.org/jira/browse/TIKA-1445|TIKA-1445]] raised this as an important area for improvement within Tika. The current strategy in the AutoDetectParser is to load all parsers and then pick the first parser that matches a given mime type. Tika chooses the "first" by sorting first on whether or not the class name begins with org.apache.tika and then (effectively) by reverse alphabetical order of the class name. It would be great if the user could specify the order of parser selection in the config file. We will be working towards this gradually through Tika 1.8 and 1.9, and we will remove service loading entirely in Tika 2.0.
 * Allow users to build composite parsers with configurable strategies via the config file ([[https://issues.apache.org/jira/browse/TIKA-1509|TIKA-1509]] and CompositeParserDiscussion). We will be working towards this gradually through Tika 1.8 and 1.9. By Tika 2.0, however, this will be the default.
 * Move to Java 1.7 (???)

= Minor Planned Changes =

= Wishes =
 * Allow for easily configurable parser sub-packages. The tika-app, tika-server and tika-bundle jars are now approaching or exceeding 30 MB. It would be great if users could easily specify a subset of parsers they care about, either a la carte or by category (image, common office files (MSOffice, PDF, etc.), environmental data), and only get the dependencies required for that subset of parsers.
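
To make the current parser-selection order described under "Major Planned Changes" concrete, here is a minimal sketch of that sort. It is not Tika's actual code: it works on plain class-name strings rather than loaded parsers, the `ParserOrderSketch` class and the `com.example.CustomPdfParser` name are hypothetical, and it assumes that classes outside org.apache.tika sort ahead of Tika's own (the page only says the prefix is the first sort key, not which side wins).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ParserOrderSketch {

    // Primary key: does the class name start with org.apache.tika?
    // ASSUMPTION: non-Tika classes are placed first.
    // Secondary key: reverse alphabetical order of the class name.
    static final Comparator<String> PARSER_ORDER = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            boolean aTika = a.startsWith("org.apache.tika");
            boolean bTika = b.startsWith("org.apache.tika");
            if (aTika != bTika) {
                return aTika ? 1 : -1;
            }
            return b.compareTo(a);
        }
    };

    // Returns a sorted copy; the "first" parser for a mime type would be
    // the earliest entry in this list that supports that type.
    static List<String> order(List<String> classNames) {
        List<String> sorted = new ArrayList<String>(classNames);
        Collections.sort(sorted, PARSER_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("org.apache.tika.parser.pdf.PDFParser");
        names.add("com.example.CustomPdfParser"); // hypothetical user parser
        names.add("org.apache.tika.parser.txt.TXTParser");
        for (String name : order(names)) {
            System.out.println(name);
        }
    }
}
```

Under these assumptions, the order is an accident of naming rather than something the user chose, which is the motivation for letting the config file specify it explicitly.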
