Author: nick
Date: Sat Aug 1 13:19:18 2015
New Revision: 1693705
URL: http://svn.apache.org/r1693705
Log:
TIKA-1702 start on documenting the configuration options we have
Added:
tika/site/src/site/apt/1.9/configuring.apt
Added: tika/site/src/site/apt/1.9/configuring.apt
URL:
http://svn.apache.org/viewvc/tika/site/src/site/apt/1.9/configuring.apt?rev=1693705&view=auto
==============================================================================
--- tika/site/src/site/apt/1.9/configuring.apt (added)
+++ tika/site/src/site/apt/1.9/configuring.apt Sat Aug 1 13:19:18 2015
@@ -0,0 +1,90 @@
+ ----------------
+ Configuring Tika
+ ----------------
+
+~~ Licensed to the Apache Software Foundation (ASF) under one or more
+~~ contributor license agreements. See the NOTICE file distributed with
+~~ this work for additional information regarding copyright ownership.
+~~ The ASF licenses this file to You under the Apache License, Version 2.0
+~~ (the "License"); you may not use this file except in compliance with
+~~ the License. You may obtain a copy of the License at
+~~
+~~ http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License.
+
+Configuring Tika
+
+ Out of the box, Apache Tika will attempt to start with all available
+ Detectors and Parsers, running with sensible defaults. For most users,
+ this default configuration will work well.
+
+ This page gives you information on how to configure the various
+ components of Apache Tika, such as Parsers and Detectors, if you need
+ fine-grained control over ordering, exclusions and the like.
+
+%{toc|section=1|fromDepth=1}
+
+* {Configuring Parsers}
+
+ TODO
+
+ In code, the key classes to use to build up your own custom parser
+ hierarchy are
+ {{{./api/org/apache/tika/parser/DefaultParser.html}org.apache.tika.parser.DefaultParser}},
+ {{{./api/org/apache/tika/parser/CompositeParser.html}org.apache.tika.parser.CompositeParser}}
+ and
+ {{{./api/org/apache/tika/parser/ParserDecorator.html}org.apache.tika.parser.ParserDecorator}}.
+
+* {Configuring Detectors}
+
+ TODO
+
+ In code, the key classes to use to build up your own custom detector
+ hierarchy are
+ {{{./api/org/apache/tika/detect/DefaultDetector.html}org.apache.tika.detect.DefaultDetector}}
+ and
+ {{{./api/org/apache/tika/detect/CompositeDetector.html}org.apache.tika.detect.CompositeDetector}}.
+
+* {Configuring Mime Types}
+
+ TODO
+
+* {Configuring Language Identifiers}
+
+ TODO
+
+* {Configuring Translators}
+
+ TODO
+
+* {Using a Tika Configuration XML file}
+
+ However you call Tika, the System Property of <pre>tika.config</pre> is
+ checked first, and the Environment Variable of <pre>TIKA_CONFIG</pre> is
+ tried next. Setting one of those will cause Tika to use your given
+ Tika Config XML file.
+
+ If you are calling Tika from your own code, then you can pass in the
+ location of your Tika Config XML file when you construct your
+ <pre>TikaConfig</pre> instance. From that, you can fetch your configured
+ parsers, detectors, etc.
+---
+TikaConfig config = new TikaConfig("/path/to/tika-config.xml");
+Detector detector = config.getDetector();
+Parser autoDetectParser = new AutoDetectParser(config);
+---
+
+ For users of the Tika App, in addition to the system property and the
+ environment variable, you can also use the
+ <pre>--config=<tika-config.xml></pre> option to select a different
+ Tika Config XML file to use.
+
+ For users of the Tika Server, in addition to the system property and the
+ environment variable, you can also use the <pre>-c <tika-config.xml></pre>
+ or <pre>--config <tika-config.xml></pre> options to select a different
+ Tika Config XML file to use.
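
Note on the lookup order described in the page above: the <pre>tika.config</pre> system property is checked first and the <pre>TIKA_CONFIG</pre> environment variable second. The following is a minimal sketch of that precedence using only the JDK. It is not Tika's own code: the class name TikaConfigLocator and the method resolve are hypothetical, and treating only a missing (null) value as "not set" is an assumption, since the page does not say how blank values are handled.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/**
 * Hypothetical helper (not part of Apache Tika) that mirrors the documented
 * precedence: system property "tika.config" first, then env var "TIKA_CONFIG".
 */
final class TikaConfigLocator {

    private TikaConfigLocator() {
    }

    /**
     * Resolves the config location from the supplied properties and environment.
     * Taking both as parameters keeps the precedence rule easy to exercise.
     */
    static Optional<String> resolve(Properties systemProps, Map<String, String> env) {
        // 1. The system property wins whenever it is present.
        String fromProperty = systemProps.getProperty("tika.config");
        if (fromProperty != null) {
            return Optional.of(fromProperty);
        }
        // 2. Otherwise fall back to the environment variable.
        String fromEnv = env.get("TIKA_CONFIG");
        if (fromEnv != null) {
            return Optional.of(fromEnv);
        }
        // 3. Neither is set: the caller would use the built-in defaults.
        return Optional.empty();
    }

    /** Convenience overload that reads the real JVM properties and environment. */
    static Optional<String> resolve() {
        return resolve(System.getProperties(), System.getenv());
    }
}
```

When resolve() returns a location, it could be handed to the <pre>TikaConfig</pre> constructor shown in the page. An empty result corresponds to the out-of-the-box defaults.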