Author: nick
Date: Fri Nov 18 15:01:07 2011
New Revision: 1203681
URL: http://svn.apache.org/viewvc?rev=1203681&view=rev
Log:
TIKA-784 Sample DITA task, concept and map files. (Based on some Alfresco
documentation, with content replaced with Tika info)
Added:
tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.dita
tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.ditamap
tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA2.dita
Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.dita
URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.dita?rev=1203681&view=auto
==============================================================================
--- tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.dita (added)
+++ tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.dita Fri Nov 18 15:01:07 2011
@@ -0,0 +1,34 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
+<task id="apache-tika">
+ <title>Apache Tika</title>
+ <shortdesc>Apache Tika - a content analysis toolkit.</shortdesc>
+ <prolog>
+ <author>Apache Software Foundation</author>
+ <copyright>
+ <copyryear year="2011"/>
+ <copyrholder>Apache Software Foundation</copyrholder>
+ </copyright>
+ <metadata>
+ <audience experiencelevel="expert" job="Customizing" type="Coder"/>
+ <category>Metadata</category>
+ <keywords>
+ <keyword>Tika</keyword>
+ <keyword>Content</keyword>
+ </keywords>
+ <prodinfo>
+ <prodname>Apache Tika</prodname>
+ <vrmlist>
+ <vrm version="1.x" release="Final" modification="2011/11/11"/>
+ </vrmlist>
+ </prodinfo>
+ </metadata>
+ </prolog>
+ <taskbody>
+ <context>
+ <p>The Apache Tika toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries. You can find the latest release on the download page. See the Getting Started guide for instructions on how to start using Tika.</p>
+
+ <p>Tika is a project of the Apache Software Foundation, and was formerly a subproject of Apache Lucene.</p>
+ </context>
+ </taskbody>
+</task>
Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.ditamap
URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.ditamap?rev=1203681&view=auto
==============================================================================
--- tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.ditamap (added)
+++ tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA.ditamap Fri Nov 18 15:01:07 2011
@@ -0,0 +1,23 @@
+<?xml version='1.0' encoding='UTF-8'?>
+<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "http://docs.oasis-open.org/dita/v1.1/OS/dtd/map.dtd">
+<map id="apache-tika" title="Apache Tika">
+ <topicmeta>
+ <author>Apache Tika</author>
+ <copyright>
+ <copyryear year="2011"/>
+ <copyrholder>Apache Software Foundation</copyrholder>
+ </copyright>
+ <category>Version 1.x</category>
+ <category>Tika</category>
+ <category>Mime</category>
+ <prodinfo>
+ <prodname>Apache Tika</prodname>
+ <vrmlist>
+ <vrm version="1.x" release="Final" modification="2011/11/11"/>
+ </vrmlist>
+ </prodinfo>
+ </topicmeta>
+ <topicref href="testDITA.dita">
+ <topicref href="testDITA2.dita" />
+ </topicref>
+</map>
Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA2.dita
URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA2.dita?rev=1203681&view=auto
==============================================================================
--- tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA2.dita (added)
+++ tika/trunk/tika-parsers/src/test/resources/test-documents/testDITA2.dita Fri Nov 18 15:01:07 2011
@@ -0,0 +1,33 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "http://docs.oasis-open.org/dita/v1.1/OS/dtd/concept.dtd">
+<concept id="tika-arch">
+ <title>Apache Tika Architecture</title>
+ <shortdesc>This section describes the Apache Tika architecture.</shortdesc>
+ <prolog>
+ <author>Apache Software Foundation</author>
+ <copyright>
+ <copyryear year="2011"/>
+ <copyrholder>Apache Software Foundation</copyrholder>
+ </copyright>
+ <metadata>
+ <audience experiencelevel="expert" job="Customizing" type="Coder"/>
+ <category>Metadata</category>
+ <keywords>
+ <keyword>Tika</keyword>
+ <keyword>Content</keyword>
+ </keywords>
+ <prodinfo>
+ <prodname>Apache Tika</prodname>
+ <vrmlist>
+ <vrm version="1.x" release="Final" modification="2011/11/11"/>
+ </vrmlist>
+ </prodinfo>
+ </metadata>
+ </prolog>
+ <conbody>
+ <p>The Detector Interface</p>
+
+ <p>The org.apache.tika.detect.Detector interface is the basis for most of the content type detection in Apache Tika. All the different ways of detecting content all implement the same common method:</p>
+ <image href="http://tika.apache.org/tika.png" id="tika_logo" />
+ </conbody>
+</concept>