Modified: tika/site/publish/1.7/examples.html
URL: http://svn.apache.org/viewvc/tika/site/publish/1.7/examples.html?rev=1651638&r1=1651637&r2=1651638&view=diff
==============================================================================
--- tika/site/publish/1.7/examples.html (original)
+++ tika/site/publish/1.7/examples.html Wed Jan 14 12:25:49 2015
@@ -115,41 +115,41 @@
 <p>The <a href="./apidocs/org/apache/tika/Tika.html">Tika facade</a>, provides 
a number of very quick and easy ways to have your content parsed by Tika, and 
return the resulting plain text</p><style type="text/css">
    @import url('attached-includes/css/shCoreDefault.css');
 </style>
-<div id="highlighter_355685" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number37 index0 alt2"><code class="java 
keyword">public</code> <code class="java plain">String parseToStringExample() 
</code><code class="java keyword">throws</code> <code class="java 
plain">IOException, SAXException, TikaException {</code></div><div class="line 
number38 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ParsingExample.</code><code class="java keyword">class</code><code 
class="java plain">.getResourceAsStream(</code><code class="java 
string">"test.doc"</code><code class="java plain">);</code></div><div 
class="line number39 index2 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Tika tika = 
</code><code class="java keyword">new</code> <code class="java 
plain">Tika();</code></div><div 
class="line number40 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number41 index4 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">tika.parseToString(stream);</code></div><div class="line number42 index5 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number43 index6 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number44 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number45 index8 alt2"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
+<div id="highlighter_623484" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number37 index0 alt2"><code class="java 
keyword">public</code> <code class="java plain">String parseToStringExample() 
</code><code class="java keyword">throws</code> <code class="java 
plain">IOException, SAXException, TikaException {</code></div><div class="line 
number38 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ParsingExample.</code><code class="java keyword">class</code><code 
class="java plain">.getResourceAsStream(</code><code class="java 
string">"test.doc"</code><code class="java plain">);</code></div><div 
class="line number39 index2 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Tika tika = 
</code><code class="java keyword">new</code> <code class="java 
plain">Tika();</code></div><div 
class="line number40 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number41 index4 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">tika.parseToString(stream);</code></div><div class="line number42 index5 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number43 index6 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number44 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number45 index8 alt2"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
 <div class="section">
 <h4><a name="Parsing_using_the_Auto-Detect_Parser">Parsing using the 
Auto-Detect Parser</a></h4>
-<p>For more control, you can call the <a 
href="./apidocs/org/apache/tika/parser/Parser.html">Tika Parsers</a> directly. 
Most likely, you'll want to start out using the <a 
href="./apidocs/org/apache/tika/parser/AutoDetectParser.html">Auto-Detect 
Parser</a>, which automatically figures out what kind of content you have, then 
calls the appropriate parser for you.</p><div id="highlighter_870950" 
class="syntaxhighlighter nogutter  java"><table border="0" cellpadding="0" 
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div 
class="line number66 index0 alt1"><code class="java keyword">public</code> 
<code class="java plain">String parseExample() </code><code class="java 
keyword">throws</code> <code class="java plain">IOException, SAXException, 
TikaException {</code></div><div class="line number67 index1 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">InputStream stream = ParsingExample.</code><code class="java 
keyword">class</code><code 
class="java plain">.getResourceAsStream(</code><code class="java 
string">"test.doc"</code><code class="java plain">);</code></div><div 
class="line number68 index2 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number69 index3 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">BodyContentHandler handler = </code><code class="java 
keyword">new</code> <code class="java 
plain">BodyContentHandler();</code></div><div class="line number70 index4 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number71 index5 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div class="line number72 
index6 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number73 index7 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number74 index8 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number75 index9 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number76 
index10 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div 
class="line number77 index11 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<p>For more control, you can call the <a 
href="./apidocs/org/apache/tika/parser/Parser.html">Tika Parsers</a> directly. 
Most likely, you'll want to start out using the <a 
href="./apidocs/org/apache/tika/parser/AutoDetectParser.html">Auto-Detect 
Parser</a>, which automatically figures out what kind of content you have, then 
calls the appropriate parser for you.</p><div id="highlighter_542339" 
class="syntaxhighlighter nogutter  java"><table border="0" cellpadding="0" 
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div 
class="line number66 index0 alt1"><code class="java keyword">public</code> 
<code class="java plain">String parseExample() </code><code class="java 
keyword">throws</code> <code class="java plain">IOException, SAXException, 
TikaException {</code></div><div class="line number67 index1 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">InputStream stream = ParsingExample.</code><code class="java 
keyword">class</code><code 
class="java plain">.getResourceAsStream(</code><code class="java 
string">"test.doc"</code><code class="java plain">);</code></div><div 
class="line number68 index2 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number69 index3 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">BodyContentHandler handler = </code><code class="java 
keyword">new</code> <code class="java 
plain">BodyContentHandler();</code></div><div class="line number70 index4 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number71 index5 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div class="line number72 
index6 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number73 index7 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number74 index8 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number75 index9 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number76 
index10 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div 
class="line number77 index11 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
 <div class="section">
 <h3><a name="Picking_different_output_formats">Picking different output 
formats</a></h3>
 <p>With Tika, you can get the textual content of your files returned in a 
number of different formats. These can be plain text, html, xhtml, xhtml of one 
part of the file etc. This is controlled based on the <a class="externalLink" 
href="http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html">ContentHandler</a>
 you supply to the Parser.</p>
 <div class="section">
 <h4><a name="Parsing_to_Plain_Text">Parsing to Plain Text</a></h4>
-<p>By using the <a 
href="./apidocs/org/apache/tika/sax/BodyContentHandler.html">BodyContentHandler</a>,
 you can request that Tika return only the content of the document's body as a 
plain-text string.</p><div id="highlighter_21206" class="syntaxhighlighter 
nogutter  java"><table border="0" cellpadding="0" 
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div 
class="line number46 index0 alt1"><code class="java keyword">public</code> 
<code class="java plain">String parseToPlainText() </code><code class="java 
keyword">throws</code> <code class="java plain">IOException, SAXException, 
TikaException {</code></div><div class="line number47 index1 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">BodyContentHandler handler = </code><code class="java 
keyword">new</code> <code class="java 
plain">BodyContentHandler();</code></div><div class="line number48 index2 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div 
class="line number49 index3 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test.doc"</code><code class="java 
plain">);</code></div><div class="line number50 index4 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number51 index5 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata 
metadata = </code><code class="java keyword">new</code> <code class="java 
plain">Metadata();</code></div><div class="line number52 index6 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div 
class="line number53 index7 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number54 index8 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number55 index9 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number56 index10 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number57 
index11 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number58 index12 alt1"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
+<p>By using the <a 
href="./apidocs/org/apache/tika/sax/BodyContentHandler.html">BodyContentHandler</a>,
 you can request that Tika return only the content of the document's body as a 
plain-text string.</p><div id="highlighter_923362" class="syntaxhighlighter 
nogutter  java"><table border="0" cellpadding="0" 
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div 
class="line number46 index0 alt1"><code class="java keyword">public</code> 
<code class="java plain">String parseToPlainText() </code><code class="java 
keyword">throws</code> <code class="java plain">IOException, SAXException, 
TikaException {</code></div><div class="line number47 index1 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">BodyContentHandler handler = </code><code class="java 
keyword">new</code> <code class="java 
plain">BodyContentHandler();</code></div><div class="line number48 index2 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div 
class="line number49 index3 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test.doc"</code><code class="java 
plain">);</code></div><div class="line number50 index4 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number51 index5 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata 
metadata = </code><code class="java keyword">new</code> <code class="java 
plain">Metadata();</code></div><div class="line number52 index6 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div 
class="line number53 index7 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number54 index8 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number55 index9 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number56 index10 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number57 
index11 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number58 index12 alt1"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
 <div class="section">
 <h4><a name="Parsing_to_XHTML">Parsing to XHTML</a></h4>
-<p>By using the <a 
href="./apidocs/org/apache/tika/sax/ToXMLContentHandler.html">ToXMLContentHandler</a>,
 you can get the XHTML content of the whole document as a string.</p><div 
id="highlighter_109231" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number63 index0 alt2"><code class="java 
keyword">public</code> <code class="java plain">String parseToHTML() 
</code><code class="java keyword">throws</code> <code class="java 
plain">IOException, SAXException, TikaException {</code></div><div class="line 
number64 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">ContentHandler 
handler = </code><code class="java keyword">new</code> <code class="java 
plain">ToXMLContentHandler();</code></div><div class="line number65 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div 
class="line number66 index3 alt1">
 <code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">InputStream stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test.doc"</code><code class="java 
plain">);</code></div><div class="line number67 index4 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number68 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata 
metadata = </code><code class="java keyword">new</code> <code class="java 
plain">Metadata();</code></div><div class="line number69 index6 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div class="line 
number70 index7 
 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number71 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number72 index9 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number73 index10 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number74 
index11 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number75 index12 alt2"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div>
-<p>If you just want the body of the xhtml document, without the header, you 
can chain together a <a 
href="./apidocs/org/apache/tika/sax/BodyContentHandler.html">BodyContentHandler</a>
 and a <a 
href="./apidocs/org/apache/tika/sax/ToXMLContentHandler.html">ToXMLContentHandler</a>
 as shown:</p><div id="highlighter_401330" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number81 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
parseBodyToHTML() </code><code class="java keyword">throws</code> <code 
class="java plain">IOException, SAXException, TikaException {</code></div><div 
class="line number82 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">ContentHandler 
handler = </code><code class="java keyword">new</code> <code class="java 
plain">BodyContentHandler(</code></div><div class="line number83 
index2 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java keyword">new</code> <code class="java 
plain">ToXMLContentHandler());</code></div><div class="line number84 index3 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div 
class="line number85 index4 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test.doc"</code><code class="java 
plain">);</code></div><div class="line number86 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number87 index6 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata metadata = </code><code 
class="java keyword">new</code> <code class="java 
plain">Metadata();</code></div><div class="line number88 index7 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div class="line 
number89 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number90 index9 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number91 index10 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number92 index11 alt1">
 <code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number93 
index12 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number94 index13 alt1"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
+<p>By using the <a 
href="./apidocs/org/apache/tika/sax/ToXMLContentHandler.html">ToXMLContentHandler</a>,
 you can get the XHTML content of the whole document as a string.</p><div 
id="highlighter_283484" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number63 index0 alt2"><code class="java 
keyword">public</code> <code class="java plain">String parseToHTML() 
</code><code class="java keyword">throws</code> <code class="java 
plain">IOException, SAXException, TikaException {</code></div><div class="line 
number64 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">ContentHandler 
handler = </code><code class="java keyword">new</code> <code class="java 
plain">ToXMLContentHandler();</code></div><div class="line number65 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div 
class="line number66 index3 alt1">
 <code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">InputStream stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test.doc"</code><code class="java 
plain">);</code></div><div class="line number67 index4 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number68 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata 
metadata = </code><code class="java keyword">new</code> <code class="java 
plain">Metadata();</code></div><div class="line number69 index6 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div class="line 
number70 index7 
 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number71 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number72 index9 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number73 index10 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number74 
index11 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number75 index12 alt2"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div>
+<p>If you just want the body of the xhtml document, without the header, you 
can chain together a <a 
href="./apidocs/org/apache/tika/sax/BodyContentHandler.html">BodyContentHandler</a>
 and a <a 
href="./apidocs/org/apache/tika/sax/ToXMLContentHandler.html">ToXMLContentHandler</a>
 as shown:</p><div id="highlighter_541873" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number81 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
parseBodyToHTML() </code><code class="java keyword">throws</code> <code 
class="java plain">IOException, SAXException, TikaException {</code></div><div 
class="line number82 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">ContentHandler 
handler = </code><code class="java keyword">new</code> <code class="java 
plain">BodyContentHandler(</code></div><div class="line number83 
index2 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java keyword">new</code> <code class="java 
plain">ToXMLContentHandler());</code></div><div class="line number84 index3 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div 
class="line number85 index4 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test.doc"</code><code class="java 
plain">);</code></div><div class="line number86 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number87 index6 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata metadata = </code><code 
class="java keyword">new</code> <code class="java 
plain">Metadata();</code></div><div class="line number88 index7 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">try</code> <code class="java plain">{</code></div><div class="line 
number89 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number90 index9 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number91 index10 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number92 index11 alt1">
 <code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number93 
index12 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number94 index13 alt1"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
 <div class="section">
 <h4><a name="Fetching_just_certain_bits_of_the_XHTML">Fetching just certain 
bits of the XHTML</a></h4>
-<p>It possible to execute XPath queries on the parse results, to fetch only 
certain bits of the XHTML. </p><div id="highlighter_942650" 
class="syntaxhighlighter nogutter  java"><table border="0" cellpadding="0" 
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div 
class="line number100 index0 alt1"><code class="java keyword">public</code> 
<code class="java plain">String parseOnePartToHTML() </code><code class="java 
keyword">throws</code> <code class="java plain">IOException, SAXException, 
TikaException {</code></div><div class="line number101 index1 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
comments">// Only get things under html -> body -> div 
(class=header)</code></div><div class="line number102 index2 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">XPathParser xhtmlParser = </code><code class="java keyword">new</code> 
<code class="java plain">XPathParser(</code><code class="java string">"xhtml"</code><code class="java plain">, 
XHTMLContentHandler.XHTML);</code></div><div class="line number103 index3 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Matcher divContentMatcher = 
xhtmlParser.parse(</code></div><div class="line number104 index4 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java 
string">"/xhtml:html/xhtml:body/xhtml:div/descendant::node()"</code><code 
class="java plain">);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
</code></div><div class="line number105 index5 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">ContentHandler 
handler = </code><code class="java keyword">new</code> <code class="java 
plain">MatchingContentHandler(</code></div><div class="line number106 index6 
alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java keyword">new</code> <code class="java plain">ToXMLContentHandler(), 
divContentMatcher);</code></div><div class="line number107 index7 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div class="line 
number108 index8 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test2.doc"</code><code class="java 
plain">);</code></div><div class="line number109 index9 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number110 index10 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata 
metadata = </code><code class="java keyword">new</code> <code class="java plain">Metadata();</code></div><div class="line number111 
index11 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">try</code> <code class="java plain">{</code></div><div 
class="line number112 index12 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number113 index13 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number114 index14 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number115 index15 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number116 index16 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number117 index17 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<p>It is possible to execute XPath queries on the parse results, to fetch only 
certain bits of the XHTML. </p><div id="highlighter_571894" 
class="syntaxhighlighter nogutter  java"><table border="0" cellpadding="0" 
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div 
class="line number100 index0 alt1"><code class="java keyword">public</code> 
<code class="java plain">String parseOnePartToHTML() </code><code class="java 
keyword">throws</code> <code class="java plain">IOException, SAXException, 
TikaException {</code></div><div class="line number101 index1 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
comments">// Only get things under html -> body -> div 
(class=header)</code></div><div class="line number102 index2 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">XPathParser xhtmlParser = </code><code class="java keyword">new</code> 
<code class="java plain">XPathParser(</code><code class="java string">"xhtml"</code><code class="java plain">, 
XHTMLContentHandler.XHTML);</code></div><div class="line number103 index3 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Matcher divContentMatcher = 
xhtmlParser.parse(</code></div><div class="line number104 index4 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java 
string">"/xhtml:html/xhtml:body/xhtml:div/descendant::node()"</code><code 
class="java plain">);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
</code></div><div class="line number105 index5 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">ContentHandler 
handler = </code><code class="java keyword">new</code> <code class="java 
plain">MatchingContentHandler(</code></div><div class="line number106 index6 
alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java keyword">new</code> <code class="java plain">ToXMLContentHandler(), 
divContentMatcher);</code></div><div class="line number107 index7 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div class="line 
number108 index8 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test2.doc"</code><code class="java 
plain">);</code></div><div class="line number109 index9 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">AutoDetectParser parser = </code><code class="java keyword">new</code> 
<code class="java plain">AutoDetectParser();</code></div><div class="line 
number110 index10 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Metadata 
metadata = </code><code class="java keyword">new</code> <code class="java plain">Metadata();</code></div><div class="line number111 
index11 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">try</code> <code class="java plain">{</code></div><div 
class="line number112 index12 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number113 index13 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">handler.toString();</code></div><div class="line number114 index14 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number115 index15 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number116 index16 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number117 index17 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
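<p>As a rough illustration of selecting only part of an XHTML document with XPath, the sketch below uses the JDK's own DOM and javax.xml.xpath APIs instead of Tika's streaming XPathParser and MatchingContentHandler. The class name XPathSelectionSketch, the helper textUnderFirstDiv and the un-namespaced expression are assumptions made for the example and are not part of Tika.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: JDK DOM + XPath, not Tika's XPathParser.
final class XPathSelectionSketch {

    // Returns the text found under the first html -> body -> div element.
    static String textUnderFirstDiv(String xhtml) {
        try {
            // Build an in-memory DOM tree from the markup (not namespace-aware).
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, Object) yields the string value of the first match,
            // i.e. the concatenated text of the div and its descendants.
            return xpath.evaluate("/html/body/div", doc);
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }
}
```

<p>Unlike the Tika example, this approach loads the whole document into memory before querying it, so it is better suited to small inputs.</p>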
 <div class="section">
 <h3><a name="Custom_Content_Handlers">Custom Content Handlers</a></h3>
 <p>The textual output of parsing a file with Tika is returned via the SAX <a 
class="externalLink" 
href="http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html">ContentHandler</a>
 you pass to the parse method. It is possible to customise your parsing by 
supplying your own ContentHandler which does special things.</p>
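<p>To give a feel for what a custom handler can look like, here is a minimal sketch that keeps only the text found inside p elements. It is driven by the JDK's SAX parser so that it runs without Tika on the classpath; the same handler object could, in principle, be passed to a Tika parse call instead. The class name ParagraphCollectingHandler and the collect helper are hypothetical, and matching on the qualified name "p" is an assumption about how the elements are reported.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each <p> element.
final class ParagraphCollectingHandler extends DefaultHandler {
    private final List<String> paragraphs = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean inParagraph;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("p".equals(qName)) {
            inParagraph = true;
            current.setLength(0); // start a fresh buffer for this paragraph
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several calls, so always append rather than assign.
        if (inParagraph) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(qName)) {
            paragraphs.add(current.toString());
            inParagraph = false;
        }
    }

    List<String> getParagraphs() {
        return paragraphs;
    }

    // Convenience driver using the JDK SAX parser in place of a Tika parser.
    static List<String> collect(String xhtml) {
        ParagraphCollectingHandler handler = new ParagraphCollectingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse input", e);
        }
        return handler.getParagraphs();
    }
}
```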
 <div class="section">
 <h4><a name="Extract_Phone_Numbers_from_Content_into_the_Metadata">Extract 
Phone Numbers from Content into the Metadata</a></h4>
-<p>By using the <a 
href="./apidocs/org/apache/tika/sax/PhoneExtractingContentHandler.html">PhoneExtractingContentHandler</a>,
 you can have any phone numbers found in the textual content of the document 
extracted and placed into the Metadata object for you.</p><div 
id="highlighter_314119" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number69 index0 alt2"><code class="java 
keyword">public</code> <code class="java keyword">static</code> <code 
class="java keyword">void</code> <code class="java plain">process(File file) 
</code><code class="java keyword">throws</code> <code class="java 
plain">Exception {</code></div><div class="line number70 index1 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">Parser parser = </code><code class="java keyword">new</code> <code 
class="java plain">AutoDetectParser();</code></div><div class="line number71 
 index2 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number72 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java comments">// The 
PhoneExtractingContentHandler will examine any characters for phone numbers 
before passing them</code></div><div class="line number73 index4 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
comments">// to the underlying Handler.</code></div><div class="line number74 
index5 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">PhoneExtractingContentHandler handler = </code><code 
class="java keyword">new</code> <code class="java 
plain">PhoneExtractingContentHandler(</code><code class="java 
keyword">new</code> <code class="java plain">BodyContentHandler(), 
metadata);</code></div><div class="line number75 index6 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = </code><code class="java keyword">new</code> <code class="java 
plain">FileInputStream(file);</code></div><div class="line number76 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">try</code> <code class="java plain">{</code></div><div 
class="line number77 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata, </code><code 
class="java keyword">new</code> <code class="java 
plain">ParseContext());</code></div><div class="line number78 index9 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number79 index10 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">finally</code> <code class="java plain">{
 </code></div><div class="line number80 index11 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number81 
index12 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number82 index13 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">String[] numbers = metadata.getValues(</code><code class="java 
string">"phonenumbers"</code><code class="java plain">);</code></div><div 
class="line number83 index14 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">for</code> 
<code class="java plain">(String number : numbers) {</code></div><div 
class="line number84 index15 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">phoneNumbers.add(number);</code></div><div class="line 
number85 index16 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number86 index17 alt1"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div>
+<p>By using the <a 
href="./apidocs/org/apache/tika/sax/PhoneExtractingContentHandler.html">PhoneExtractingContentHandler</a>,
 you can have any phone numbers found in the textual content of the document 
extracted and placed into the Metadata object for you.</p><div 
id="highlighter_437812" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number69 index0 alt2"><code class="java 
keyword">public</code> <code class="java keyword">static</code> <code 
class="java keyword">void</code> <code class="java plain">process(File file) 
</code><code class="java keyword">throws</code> <code class="java 
plain">Exception {</code></div><div class="line number70 index1 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">Parser parser = </code><code class="java keyword">new</code> <code 
class="java plain">AutoDetectParser();</code></div><div class="line number71 
 index2 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number72 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java comments">// The 
PhoneExtractingContentHandler will examine any characters for phone numbers 
before passing them</code></div><div class="line number73 index4 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
comments">// to the underlying Handler.</code></div><div class="line number74 
index5 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">PhoneExtractingContentHandler handler = </code><code 
class="java keyword">new</code> <code class="java 
plain">PhoneExtractingContentHandler(</code><code class="java 
keyword">new</code> <code class="java plain">BodyContentHandler(), 
metadata);</code></div><div class="line number75 index6 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = </code><code class="java keyword">new</code> <code class="java 
plain">FileInputStream(file);</code></div><div class="line number76 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">try</code> <code class="java plain">{</code></div><div 
class="line number77 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata, </code><code 
class="java keyword">new</code> <code class="java 
plain">ParseContext());</code></div><div class="line number78 index9 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number79 index10 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">finally</code> <code class="java plain">{
 </code></div><div class="line number80 index11 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number81 
index12 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number82 index13 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">String[] numbers = metadata.getValues(</code><code class="java 
string">"phonenumbers"</code><code class="java plain">);</code></div><div 
class="line number83 index14 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">for</code> 
<code class="java plain">(String number : numbers) {</code></div><div 
class="line number84 index15 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">phoneNumbers.add(number);</code></div><div class="line 
number85 index16 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number86 index17 alt1"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div>
 <div class="section">
 <h4><a name="Streaming_the_plain_text_in_chunks">Streaming the plain text in 
chunks</a></h4>
-<p>Sometimes, you want to chunk the resulting text up, perhaps to output as 
you go minimising memory use, perhaps to output to HDFS files, or any other 
reason! With a small custom content handler, you can do that.</p><div 
id="highlighter_819345" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number124 index0 alt1"><code class="java 
keyword">public</code> <code class="java plain">List&lt;String> 
parseToPlainTextChunks() </code><code class="java keyword">throws</code> <code 
class="java plain">IOException, SAXException, TikaException {</code></div><div 
class="line number125 index1 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">final</code> 
<code class="java plain">List&lt;String> chunks = </code><code class="java 
keyword">new</code> <code class="java 
plain">ArrayList&lt;String>();</code></div><div class="line number126 index2 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">chunks.add(</code><code class="java string">""</code><code class="java 
plain">);</code></div><div class="line number127 index3 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">ContentHandlerDecorator handler = </code><code class="java 
keyword">new</code> <code class="java plain">ContentHandlerDecorator() 
{</code></div><div class="line number128 index4 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java color1">@Override</code></div><div class="line number129 index5 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">public</code> <code class="java keyword">void</code> <code 
class="java plain">characters(</code><code class="java 
keyword">char</code><code class="java plain">[] ch, </code><code class="java 
keyword">int</code> <code class="java plain">start, </code><code class="java keyword">int</code> <code class="java 
plain">length) {</code></div><div class="line number130 index6 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">String lastChunk = chunks.get(chunks.size()-</code><code 
class="java value">1</code><code class="java plain">);</code></div><div 
class="line number131 index7 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">String thisStr = </code><code class="java 
keyword">new</code> <code class="java plain">String(ch, start, 
length);</code></div><div class="line number132 index8 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div
 class="line number133 index9 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java keyword">if</code> <code class="java plain">(lastChunk.length()+length 
> MAXIMUM_TEXT_CHUNK_SIZE) {</code></div><div class="line number134 index10 
alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">chunks.add(thisStr);</code></div><div class="line number135 
index11 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">} </code><code class="java keyword">else</code> <code 
class="java plain">{</code></div><div class="line number136 index12 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">chunks.set(chunks.size()-</code><code class="java 
value">1</code><code class="java plain">, lastChunk+thisStr);</code></div><div 
class="line number137 index13 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number138 index14 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number139 index15 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">};</code></div><div class="line number140 index16 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div class="line 
number141 index17 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test2.doc"</code><code class="java 
plain">);</code></div><div class="line number142 index18 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Aut
 oDetectParser parser = </code><code class="java keyword">new</code> <code 
class="java plain">AutoDetectParser();</code></div><div class="line number143 
index19 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number144 index20 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number145 index21 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number146 index22 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">chunks;</code></div><div class="line number147 index23 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">} 
</code><code class="java keyword">finally</code> <code class="java 
plain">{</code></div><div class="line number148 index24 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number149 
index25 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number150 index26 alt1"><code 
class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<p>Sometimes, you want to chunk the resulting text up, perhaps to output as 
you go minimising memory use, perhaps to output to HDFS files, or any other 
reason! With a small custom content handler, you can do that.</p><div 
id="highlighter_115126" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number124 index0 alt1"><code class="java 
keyword">public</code> <code class="java plain">List&lt;String> 
parseToPlainTextChunks() </code><code class="java keyword">throws</code> <code 
class="java plain">IOException, SAXException, TikaException {</code></div><div 
class="line number125 index1 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">final</code> 
<code class="java plain">List&lt;String> chunks = </code><code class="java 
keyword">new</code> <code class="java 
plain">ArrayList&lt;String>();</code></div><div class="line number126 index2 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">chunks.add(</code><code class="java string">""</code><code class="java 
plain">);</code></div><div class="line number127 index3 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">ContentHandlerDecorator handler = </code><code class="java 
keyword">new</code> <code class="java plain">ContentHandlerDecorator() 
{</code></div><div class="line number128 index4 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java color1">@Override</code></div><div class="line number129 index5 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">public</code> <code class="java keyword">void</code> <code 
class="java plain">characters(</code><code class="java 
keyword">char</code><code class="java plain">[] ch, </code><code class="java 
keyword">int</code> <code class="java plain">start, </code><code class="java keyword">int</code> <code class="java 
plain">length) {</code></div><div class="line number130 index6 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">String lastChunk = chunks.get(chunks.size()-</code><code 
class="java value">1</code><code class="java plain">);</code></div><div 
class="line number131 index7 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">String thisStr = </code><code class="java 
keyword">new</code> <code class="java plain">String(ch, start, 
length);</code></div><div class="line number132 index8 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div
 class="line number133 index9 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java keyword">if</code> <code class="java plain">(lastChunk.length()+length 
> MAXIMUM_TEXT_CHUNK_SIZE) {</code></div><div class="line number134 index10 
alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">chunks.add(thisStr);</code></div><div class="line number135 
index11 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">} </code><code class="java keyword">else</code> <code 
class="java plain">{</code></div><div class="line number136 index12 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code
 class="java plain">chunks.set(chunks.size()-</code><code class="java 
value">1</code><code class="java plain">, lastChunk+thisStr);</code></div><div 
class="line number137 index13 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number138 index14 alt1"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number139 index15 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">};</code></div><div class="line number140 index16 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code>&nbsp;</div><div class="line 
number141 index17 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ContentHandlerExample.</code><code class="java 
keyword">class</code><code class="java plain">.getResourceAsStream(</code><code 
class="java string">"test2.doc"</code><code class="java 
plain">);</code></div><div class="line number142 index18 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Aut
 oDetectParser parser = </code><code class="java keyword">new</code> <code 
class="java plain">AutoDetectParser();</code></div><div class="line number143 
index19 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number144 index20 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number145 index21 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata);</code></div><div 
class="line number146 index22 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">chunks;</code></div><div class="line number147 index23 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">} 
</code><code class="java keyword">finally</code> <code class="java 
plain">{</code></div><div class="line number148 index24 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number149 
index25 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number150 index26 alt1"><code 
class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
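<p>The chunking rule used above (start a new chunk whenever appending the incoming text would exceed the limit) can also be sketched with a StringBuilder buffer, which avoids rebuilding the last chunk as a new String on every call. This variant extends the JDK's DefaultHandler so that it runs without Tika; the class name ChunkingHandlerSketch and its constructor parameter are hypothetical. As in the original, a single run of characters longer than the limit still ends up as one oversized chunk.</p>

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: splits incoming text into chunks of bounded size.
final class ChunkingHandlerSketch extends DefaultHandler {
    private final int maxChunkSize;
    private final List<String> chunks = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    ChunkingHandlerSketch(int maxChunkSize) {
        this.maxChunkSize = maxChunkSize;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Close the current chunk if the new text would push it over the limit.
        if (current.length() > 0 && current.length() + length > maxChunkSize) {
            flush();
        }
        current.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        // Emit whatever is still buffered once parsing has finished.
        flush();
    }

    private void flush() {
        if (current.length() > 0) {
            chunks.add(current.toString());
            current.setLength(0);
        }
    }

    List<String> getChunks() {
        return chunks;
    }
}
```

<p>Because completed chunks are only added to the list when they are closed, a caller that wants to stream them elsewhere could replace the list with a write to a file or queue inside flush().</p>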
 <div class="section">
 <h3><a name="Translation">Translation</a></h3>
 <p>Tika provides a pluggable Translation system, which allows you to send the 
results of parsing off to an external system or program to have the text 
translated into another language.</p>
 <div class="section">
 <h4><a name="Translation_using_the_Microsoft_Translation_API">Translation 
using the Microsoft Translation API</a></h4>
-<p>In order to use the Microsoft Translation API, you need to sign up for a 
Microsoft account, get an API key, then pass the key to Tika before 
translating.</p><div id="highlighter_318159" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number23 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
microsoftTranslateToFrench(String text) {</code></div><div class="line number24 
index1 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">MicrosoftTranslator translator = </code><code class="java 
keyword">new</code> <code class="java 
plain">MicrosoftTranslator();</code></div><div class="line number25 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java comments">// Change the id and secret! See <a 
href="http://msdn.microsoft.com/en-us/library/hh454950.aspx.">http://msdn.microsoft.com/en-us/library/hh454950.aspx.</a></code></div><div class="line number26 
index3 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">translator.setId(</code><code class="java 
string">"dummy-id"</code><code class="java plain">);</code></div><div 
class="line number27 index4 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">translator.setSecret(</code><code class="java 
string">"dummy-secret"</code><code class="java plain">);</code></div><div 
class="line number28 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number29 index6 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">translator.translate(text, </code><code class="java 
string">"fr"</code><code class="java plain">);</code></div><div class=
 "line number30 index7 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">} </code><code 
class="java keyword">catch</code> <code class="java plain">(Exception e) 
{</code></div><div class="line number31 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java string">"Error while 
translating."</code><code class="java plain">;</code></div><div class="line 
number32 index9 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number33 index10 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<p>In order to use the Microsoft Translation API, you need to sign up for a 
Microsoft account, get an API key, then pass the key to Tika before 
translating.</p><div id="highlighter_401204" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number23 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
microsoftTranslateToFrench(String text) {</code></div><div class="line number24 
index1 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">MicrosoftTranslator translator = </code><code class="java 
keyword">new</code> <code class="java 
plain">MicrosoftTranslator();</code></div><div class="line number25 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java comments">// Change the id and secret! See <a 
href="http://msdn.microsoft.com/en-us/library/hh454950.aspx.">http://msdn.microsoft.com/en-us/library/hh454950.aspx.</a></code></div><div class="line number26 
index3 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">translator.setId(</code><code class="java 
string">"dummy-id"</code><code class="java plain">);</code></div><div 
class="line number27 index4 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">translator.setSecret(</code><code class="java 
string">"dummy-secret"</code><code class="java plain">);</code></div><div 
class="line number28 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number29 index6 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">translator.translate(text, </code><code class="java 
string">"fr"</code><code class="java plain">);</code></div><div class=
 "line number30 index7 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">} </code><code 
class="java keyword">catch</code> <code class="java plain">(Exception e) 
{</code></div><div class="line number31 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java string">"Error while 
translating."</code><code class="java plain">;</code></div><div class="line 
number32 index9 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number33 index10 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
 <div class="section">
 <h3><a name="Language_Identification">Language Identification</a></h3>
-<p>Tika provides support for identifying the language of text, through the <a 
href="./apidocs/org/apache/tika/language/LanguageIdentifier.html">LanguageIdentifier</a>
 class.</p><div id="highlighter_471964" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number23 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
identifyLanguage(String text) {</code></div><div class="line number24 index1 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">LanguageIdentifier identifier = </code><code class="java 
keyword">new</code> <code class="java 
plain">LanguageIdentifier(text);</code></div><div class="line number25 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">identifier.getLanguage();</code></div><div class="line number26 index3
  alt1"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<p>Tika provides support for identifying the language of text, through the <a 
href="./apidocs/org/apache/tika/language/LanguageIdentifier.html">LanguageIdentifier</a>
 class.</p><div id="highlighter_707425" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number23 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
identifyLanguage(String text) {</code></div><div class="line number24 index1 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">LanguageIdentifier identifier = </code><code class="java 
keyword">new</code> <code class="java 
plain">LanguageIdentifier(text);</code></div><div class="line number25 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">identifier.getLanguage();</code></div><div class="line number26 index3
  alt1"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
       </div>
       <div id="sidebar">
         <div id="navigation">
@@ -332,7 +332,7 @@
       </div>
       <div id="footer">
         <p>
-          Copyright &#169; 2014
+          Copyright &#169; 2015
           <a href="http://www.apache.org/">The Apache Software Foundation</a>.
           Site powered by <a href="http://maven.apache.org/">Apache Maven</a>. 
           Search powered by

Modified: tika/site/publish/1.7/formats.html
URL: 
http://svn.apache.org/viewvc/tika/site/publish/1.7/formats.html?rev=1651638&r1=1651637&r2=1651638&view=diff
==============================================================================
--- tika/site/publish/1.7/formats.html (original)
+++ tika/site/publish/1.7/formats.html Wed Jan 14 12:25:49 2015
@@ -194,7 +194,390 @@
 <p>The <a 
href="./api/org/apache/tika/parser/crypto/Pkcs7Parser.html">Pkcs7Parser</a> is 
able to parse the contents of PKCS7 signed messages, but doesn't include any 
information from the outer PKCS7 wrapper.</p></div></div>
 <div class="section">
 <h2>Full list of supported formats:<a 
name="Full_list_of_supported_formats:"></a></h2>
-<p>TODO Populate this at release time</p></div>
+<ul>
+<li>org.apache.tika.parser.asm.<a 
href="./api/org/apache/tika/parser/asm/ClassParser">ClassParser</a>
+<ul>
+<li>application/java-vm</li></ul></li>
+<li>org.apache.tika.parser.audio.<a 
href="./api/org/apache/tika/parser/audio/AudioParser">AudioParser</a>
+<ul>
+<li>audio/x-wav</li>
+<li>audio/x-aiff</li>
+<li>audio/basic</li></ul></li>
+<li>org.apache.tika.parser.audio.<a 
href="./api/org/apache/tika/parser/audio/MidiParser">MidiParser</a>
+<ul>
+<li>application/x-midi</li>
+<li>audio/midi</li></ul></li>
+<li>org.apache.tika.parser.chm.<a 
href="./api/org/apache/tika/parser/chm/ChmParser">ChmParser</a>
+<ul>
+<li>application/vnd.ms-htmlhelp</li>
+<li>application/chm</li>
+<li>application/x-chm</li></ul></li>
+<li>org.apache.tika.parser.code.<a 
href="./api/org/apache/tika/parser/code/SourceCodeParser">SourceCodeParser</a>
+<ul>
+<li>text/x-java-source</li>
+<li>text/x-c++src</li>
+<li>text/x-groovy</li></ul></li>
+<li>org.apache.tika.parser.crypto.<a 
href="./api/org/apache/tika/parser/crypto/Pkcs7Parser">Pkcs7Parser</a>
+<ul>
+<li>application/pkcs7-signature</li>
+<li>application/pkcs7-mime</li></ul></li>
+<li>org.apache.tika.parser.dwg.<a 
href="./api/org/apache/tika/parser/dwg/DWGParser">DWGParser</a>
+<ul>
+<li>image/vnd.dwg</li></ul></li>
+<li>org.apache.tika.parser.epub.<a 
href="./api/org/apache/tika/parser/epub/EpubParser">EpubParser</a>
+<ul>
+<li>application/x-ibooks+zip</li>
+<li>application/epub+zip</li></ul></li>
+<li>org.apache.tika.parser.executable.<a 
href="./api/org/apache/tika/parser/executable/ExecutableParser">ExecutableParser</a>
+<ul>
+<li>application/x-elf</li>
+<li>application/x-sharedlib</li>
+<li>application/x-executable</li>
+<li>application/x-msdownload</li>
+<li>application/x-coredump</li>
+<li>application/x-object</li></ul></li>
+<li>org.apache.tika.parser.feed.<a 
href="./api/org/apache/tika/parser/feed/FeedParser">FeedParser</a>
+<ul>
+<li>application/atom+xml</li>
+<li>application/rss+xml</li></ul></li>
+<li>org.apache.tika.parser.font.<a 
href="./api/org/apache/tika/parser/font/AdobeFontMetricParser">AdobeFontMetricParser</a>
+<ul>
+<li>application/x-font-adobe-metric</li></ul></li>
+<li>org.apache.tika.parser.font.<a 
href="./api/org/apache/tika/parser/font/TrueTypeParser">TrueTypeParser</a>
+<ul>
+<li>application/x-font-ttf</li></ul></li>
+<li>org.apache.tika.parser.gdal.<a 
href="./api/org/apache/tika/parser/gdal/GDALParser">GDALParser</a>
+<ul>
+<li>image/x-ozi</li>
+<li>application/x-snodas</li>
+<li>application/x-ecrg-toc</li>
+<li>image/envisat</li>
+<li>application/x-doq2</li>
+<li>application/x-rs2</li>
+<li>application/x-gsag</li>
+<li>application/x-ers</li>
+<li>application/fits</li>
+<li>application/x-pnm</li>
+<li>image/adrg</li>
+<li>image/gif</li>
+<li>application/x-generic-bin</li>
+<li>application/x-bt</li>
+<li>application/x-zmap</li>
+<li>application/x-hdf</li>
+<li>image/eir</li>
+<li>application/x-ace2</li>
+<li>application/grass-ascii-grid</li>
+<li>application/x-l1b</li>
+<li>application/x-gsc</li>
+<li>image/jp2</li>
+<li>image/hfa</li>
+<li>image/fits</li>
+<li>image/raster</li>
+<li>application/x-epsilon</li>
+<li>image/x-srp</li>
+<li>application/x-envi-hdr</li>
+<li>application/x-ctable2</li>
+<li>application/x-srtmhgt</li>
+<li>application/jaxa-pal-sar</li>
+<li>application/x-ndf</li>
+<li>application/sdts-raster</li>
+<li>application/x-gtx</li>
+<li>application/x-rst</li>
+<li>application/x-xyz</li>
+<li>application/terragen</li>
+<li>application/x-gs7bg</li>
+<li>image/arg</li>
+<li>application/elas</li>
+<li>image/big-gif</li>
+<li>application/x-geo-pdf</li>
+<li>application/x-ctg</li>
+<li>application/aaigrid</li>
+<li>application/x-lcp</li>
+<li>application/x-nwt-grc</li>
+<li>application/x-fast</li>
+<li>application/x-usgs-dem</li>
+<li>application/x-nwt-grd</li>
+<li>application/x-ingr</li>
+<li>application/x-envi</li>
+<li>application/x-rik</li>
+<li>application/x-blx</li>
+<li>application/x-wcs</li>
+<li>image/ceos</li>
+<li>application/x-ngs-geoid</li>
+<li>application/x-r</li>
+<li>image/bmp</li>
+<li>application/x-til</li>
+<li>application/x-pds</li>
+<li>application/x-http</li>
+<li>application/x-rasterlite</li>
+<li>application/x-gmt</li>
+<li>application/x-msgn</li>
+<li>image/ilwis</li>
+<li>application/aig</li>
+<li>application/x-rmf</li>
+<li>image/x-hdf5-image</li>
+<li>image/sar-ceos</li>
+<li>application/x-kro</li>
+<li>application/vrt</li>
+<li>application/x-netcdf</li>
+<li>image/png</li>
+<li>image/geotiff</li>
+<li>image/x-mff2</li>
+<li>application/x-webp</li>
+<li>image/ida</li>
+<li>application/x-gsbg</li>
+<li>application/x-ntv2</li>
+<li>application/x-coasp</li>
+<li>application/x-los-las</li>
+<li>application/x-tsx</li>
+<li>application/x-bag</li>
+<li>image/fit</li>
+<li>application/x-lan</li>
+<li>application/x-map</li>
+<li>image/jpeg</li>
+<li>application/x-dods</li>
+<li>application/jdem</li>
+<li>application/gff</li>
+<li>application/x-isis2</li>
+<li>application/x-isis3</li>
+<li>application/xpm</li>
+<li>application/x-pcidsk</li>
+<li>application/x-gxf</li>
+<li>image/ntif</li>
+<li>application/x-wms</li>
+<li>application/x-cosar</li>
+<li>image/bsb</li>
+<li>application/x-grib</li>
+<li>application/x-mbtiles</li>
+<li>application/x-cappi</li>
+<li>application/x-rpf-toc</li>
+<li>image/x-mff</li>
+<li>image/x-dimap</li>
+<li>image/x-pcraster</li>
+<li>application/x-ppi</li>
+<li>application/x-sdat</li>
+<li>application/pcisdk</li>
+<li>application/x-cpg</li>
+<li>application/leveller</li>
+<li>image/sgi</li>
+<li>image/x-fujibas</li>
+<li>image/x-airsar</li>
+<li>application/x-e00-grid</li>
+<li>application/x-kml</li>
+<li>application/x-p-aux</li>
+<li>application/x-doq1</li>
+<li>application/dted</li>
+<li>application/x-dipex</li></ul></li>
+<li>org.apache.tika.parser.hdf.<a 
href="./api/org/apache/tika/parser/hdf/HDFParser">HDFParser</a>
+<ul>
+<li>application/x-hdf</li></ul></li>
+<li>org.apache.tika.parser.html.<a 
href="./api/org/apache/tika/parser/html/HtmlParser">HtmlParser</a>
+<ul>
+<li>application/x-asp</li>
+<li>application/xhtml+xml</li>
+<li>application/vnd.wap.xhtml+xml</li>
+<li>text/html</li></ul></li>
+<li>org.apache.tika.parser.image.<a 
href="./api/org/apache/tika/parser/image/BPGParser">BPGParser</a>
+<ul>
+<li>image/bpg</li>
+<li>image/x-bpg</li></ul></li>
+<li>org.apache.tika.parser.image.<a 
href="./api/org/apache/tika/parser/image/ImageParser">ImageParser</a>
+<ul>
+<li>image/x-ms-bmp</li>
+<li>image/png</li>
+<li>image/x-icon</li>
+<li>image/vnd.wap.wbmp</li>
+<li>image/gif</li>
+<li>image/bmp</li>
+<li>image/x-xcf</li></ul></li>
+<li>org.apache.tika.parser.image.<a 
href="./api/org/apache/tika/parser/image/PSDParser">PSDParser</a>
+<ul>
+<li>image/vnd.adobe.photoshop</li></ul></li>
+<li>org.apache.tika.parser.iptc.<a 
href="./api/org/apache/tika/parser/iptc/IptcAnpaParser">IptcAnpaParser</a>
+<ul>
+<li>text/vnd.iptc.anpa</li></ul></li>
+<li>org.apache.tika.parser.iwork.<a 
href="./api/org/apache/tika/parser/iwork/IWorkPackageParser">IWorkPackageParser</a>
+<ul>
+<li>application/vnd.apple.iwork</li>
+<li>application/vnd.apple.numbers</li>
+<li>application/vnd.apple.keynote</li>
+<li>application/vnd.apple.pages</li></ul></li>
+<li>org.apache.tika.parser.mail.<a 
href="./api/org/apache/tika/parser/mail/RFC822Parser">RFC822Parser</a>
+<ul>
+<li>message/rfc822</li></ul></li>
+<li>org.apache.tika.parser.mat.<a 
href="./api/org/apache/tika/parser/mat/MatParser">MatParser</a>
+<ul>
+<li>application/x-matlab-data</li></ul></li>
+<li>org.apache.tika.parser.mbox.<a 
href="./api/org/apache/tika/parser/mbox/MboxParser">MboxParser</a>
+<ul>
+<li>application/mbox</li></ul></li>
+<li>org.apache.tika.parser.mbox.<a 
href="./api/org/apache/tika/parser/mbox/OutlookPSTParser">OutlookPSTParser</a>
+<ul>
+<li>application/vnd.ms-outlook-pst</li></ul></li>
+<li>org.apache.tika.parser.microsoft.<a 
href="./api/org/apache/tika/parser/microsoft/OfficeParser">OfficeParser</a>
+<ul>
+<li>application/x-mspublisher</li>
+<li>application/x-tika-msoffice</li>
+<li>application/vnd.ms-excel</li>
+<li>application/sldworks</li>
+<li>application/x-tika-msworks-spreadsheet</li>
+<li>application/vnd.ms-powerpoint</li>
+<li>application/x-tika-msoffice-embedded; format=ole10_native</li>
+<li>application/vnd.ms-project</li>
+<li>application/x-tika-ooxml-protected</li>
+<li>application/msword</li>
+<li>application/vnd.ms-outlook</li>
+<li>application/vnd.visio</li></ul></li>
+<li>org.apache.tika.parser.microsoft.<a 
href="./api/org/apache/tika/parser/microsoft/OldExcelParser">OldExcelParser</a>
+<ul>
+<li>application/vnd.ms-excel.sheet.3</li>
+<li>application/vnd.ms-excel.sheet.2</li>
+<li>application/vnd.ms-excel.workspace.3</li>
+<li>application/vnd.ms-excel.workspace.4</li>
+<li>application/vnd.ms-excel.sheet.4</li></ul></li>
+<li>org.apache.tika.parser.microsoft.<a 
href="./api/org/apache/tika/parser/microsoft/TNEFParser">TNEFParser</a>
+<ul>
+<li>application/x-tnef</li>
+<li>application/ms-tnef</li>
+<li>application/vnd.ms-tnef</li></ul></li>
+<li>org.apache.tika.parser.microsoft.ooxml.<a 
href="./api/org/apache/tika/parser/microsoft/ooxml/OOXMLParser">OOXMLParser</a>
+<ul>
+<li>application/vnd.ms-excel.sheet.macroenabled.12</li>
+<li>application/vnd.ms-powerpoint.presentation.macroenabled.12</li>
+<li>application/vnd.openxmlformats-officedocument.spreadsheetml.template</li>
+<li>application/vnd.openxmlformats-officedocument.wordprocessingml.document</li>
+<li>application/vnd.openxmlformats-officedocument.presentationml.template</li>
+<li>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</li>
+<li>application/vnd.openxmlformats-officedocument.presentationml.presentation</li>
+<li>application/vnd.ms-excel.addin.macroenabled.12</li>
+<li>application/vnd.ms-word.document.macroenabled.12</li>
+<li>application/vnd.ms-excel.template.macroenabled.12</li>
+<li>application/vnd.openxmlformats-officedocument.wordprocessingml.template</li>
+<li>application/vnd.ms-powerpoint.slideshow.macroenabled.12</li>
+<li>application/vnd.ms-powerpoint.addin.macroenabled.12</li>
+<li>application/vnd.ms-word.template.macroenabled.12</li>
+<li>application/x-tika-ooxml</li>
+<li>application/vnd.openxmlformats-officedocument.presentationml.slideshow</li></ul></li>
+<li>org.apache.tika.parser.mp3.<a 
href="./api/org/apache/tika/parser/mp3/Mp3Parser">Mp3Parser</a>
+<ul>
+<li>audio/mpeg</li></ul></li>
+<li>org.apache.tika.parser.mp4.<a 
href="./api/org/apache/tika/parser/mp4/MP4Parser">MP4Parser</a>
+<ul>
+<li>video/3gpp2</li>
+<li>video/mp4</li>
+<li>video/quicktime</li>
+<li>audio/mp4</li>
+<li>application/mp4</li>
+<li>video/x-m4v</li>
+<li>video/3gpp</li></ul></li>
+<li>org.apache.tika.parser.netcdf.<a 
href="./api/org/apache/tika/parser/netcdf/NetCDFParser">NetCDFParser</a>
+<ul>
+<li>application/x-netcdf</li></ul></li>
+<li>org.apache.tika.parser.ocr.<a 
href="./api/org/apache/tika/parser/ocr/TesseractOCRParser">TesseractOCRParser</a>
+<ul>
+<li>image/x-ms-bmp</li>
+<li>image/jpeg</li>
+<li>image/png</li>
+<li>image/tiff</li>
+<li>image/gif</li></ul></li>
+<li>org.apache.tika.parser.odf.<a 
href="./api/org/apache/tika/parser/odf/OpenDocumentParser">OpenDocumentParser</a>
+<ul>
+<li>application/x-vnd.oasis.opendocument.graphics-template</li>
+<li>application/vnd.sun.xml.writer</li>
+<li>application/x-vnd.oasis.opendocument.text</li>
+<li>application/x-vnd.oasis.opendocument.text-web</li>
+<li>application/x-vnd.oasis.opendocument.spreadsheet-template</li>
+<li>application/vnd.oasis.opendocument.formula-template</li>
+<li>application/vnd.oasis.opendocument.presentation</li>
+<li>application/vnd.oasis.opendocument.image-template</li>
+<li>application/x-vnd.oasis.opendocument.graphics</li>
+<li>application/vnd.oasis.opendocument.chart-template</li>
+<li>application/vnd.oasis.opendocument.presentation-template</li>
+<li>application/x-vnd.oasis.opendocument.image-template</li>
+<li>application/vnd.oasis.opendocument.formula</li>
+<li>application/x-vnd.oasis.opendocument.image</li>
+<li>application/vnd.oasis.opendocument.spreadsheet-template</li>
+<li>application/x-vnd.oasis.opendocument.chart-template</li>
+<li>application/x-vnd.oasis.opendocument.formula</li>
+<li>application/vnd.oasis.opendocument.spreadsheet</li>
+<li>application/vnd.oasis.opendocument.text-web</li>
+<li>application/vnd.oasis.opendocument.text-template</li>
+<li>application/vnd.oasis.opendocument.text</li>
+<li>application/x-vnd.oasis.opendocument.formula-template</li>
+<li>application/x-vnd.oasis.opendocument.spreadsheet</li>
+<li>application/x-vnd.oasis.opendocument.chart</li>
+<li>application/vnd.oasis.opendocument.text-master</li>
+<li>application/x-vnd.oasis.opendocument.text-master</li>
+<li>application/x-vnd.oasis.opendocument.text-template</li>
+<li>application/vnd.oasis.opendocument.graphics</li>
+<li>application/vnd.oasis.opendocument.graphics-template</li>
+<li>application/x-vnd.oasis.opendocument.presentation</li>
+<li>application/vnd.oasis.opendocument.image</li>
+<li>application/x-vnd.oasis.opendocument.presentation-template</li>
+<li>application/vnd.oasis.opendocument.chart</li></ul></li>
+<li>org.apache.tika.parser.pdf.<a 
href="./api/org/apache/tika/parser/pdf/PDFParser">PDFParser</a>
+<ul>
+<li>application/pdf</li></ul></li>
+<li>org.apache.tika.parser.pkg.<a 
href="./api/org/apache/tika/parser/pkg/CompressorParser">CompressorParser</a>
+<ul>
+<li>application/x-bzip</li>
+<li>application/x-bzip2</li>
+<li>application/gzip</li>
+<li>application/x-gzip</li>
+<li>application/x-xz</li></ul></li>
+<li>org.apache.tika.parser.pkg.<a 
href="./api/org/apache/tika/parser/pkg/PackageParser">PackageParser</a>
+<ul>
+<li>application/x-tar</li>
+<li>application/x-tika-unix-dump</li>
+<li>application/java-archive</li>
+<li>application/x-7z-compressed</li>
+<li>application/x-archive</li>
+<li>application/x-cpio</li>
+<li>application/zip</li></ul></li>
+<li>org.apache.tika.parser.rtf.<a 
href="./api/org/apache/tika/parser/rtf/RTFParser">RTFParser</a>
+<ul>
+<li>application/rtf</li></ul></li>
+<li>org.apache.tika.parser.txt.<a 
href="./api/org/apache/tika/parser/txt/TXTParser">TXTParser</a>
+<ul>
+<li>text/plain</li></ul></li>
+<li>org.apache.tika.parser.video.<a 
href="./api/org/apache/tika/parser/video/FLVParser">FLVParser</a>
+<ul>
+<li>video/x-flv</li></ul></li>
+<li>org.apache.tika.parser.xml.<a 
href="./api/org/apache/tika/parser/xml/DcXMLParser">DcXMLParser</a>
+<ul>
+<li>application/xml</li>
+<li>image/svg+xml</li></ul></li>
+<li>org.apache.tika.parser.xml.<a 
href="./api/org/apache/tika/parser/xml/FictionBookParser">FictionBookParser</a>
+<ul>
+<li>application/x-fictionbook+xml</li></ul></li>
+<li>org.gagravarr.tika.<a 
href="./api/org/gagravarr/tika/FlacParser">FlacParser</a>
+<ul>
+<li>audio/x-oggflac</li>
+<li>audio/x-flac</li></ul></li>
+<li>org.gagravarr.tika.<a 
href="./api/org/gagravarr/tika/OggParser">OggParser</a>
+<ul>
+<li>application/kate</li>
+<li>application/ogg</li>
+<li>audio/x-oggpcm</li>
+<li>video/x-oggyuv</li>
+<li>video/x-dirac</li>
+<li>video/x-ogm</li>
+<li>audio/ogg</li>
+<li>video/x-ogguvs</li>
+<li>video/theora</li>
+<li>video/x-oggrgb</li>
+<li>video/ogg</li></ul></li>
+<li>org.gagravarr.tika.<a 
href="./api/org/gagravarr/tika/OpusParser">OpusParser</a>
+<ul>
+<li>audio/opus</li>
+<li>audio/ogg; codecs=opus</li></ul></li>
+<li>org.gagravarr.tika.<a 
href="./api/org/gagravarr/tika/SpeexParser">SpeexParser</a>
+<ul>
+<li>audio/speex</li>
+<li>audio/ogg; codecs=speex</li></ul></li>
+<li>org.gagravarr.tika.<a 
href="./api/org/gagravarr/tika/VorbisParser">VorbisParser</a>
+<ul>
+<li>audio/vorbis</li></ul></li></ul></div>
       </div>
       <div id="sidebar">
         <div id="navigation">
@@ -377,7 +760,7 @@
       </div>
       <div id="footer">
         <p>
-          Copyright &#169; 2014
+          Copyright &#169; 2015
           <a href="http://www.apache.org/">The Apache Software Foundation</a>.
           Site powered by <a href="http://maven.apache.org/">Apache Maven</a>. 
           Search powered by
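
The list added to formats.html in the hunk above has a regular shape: one item per parser class, linking to its API page, with a nested list of MIME types. The following is a minimal sketch of how markup of that shape could be rendered from a map of class names to types. It is not the script used to generate the site; the class name FormatListSketch, the method render, and the sample data are hypothetical, and no Tika classes are used.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only: not part of Tika or its site build.
public class FormatListSketch {

    // Renders "fully qualified parser class -> MIME types" as nested <ul> markup,
    // mirroring the shape of the list in the formats.html hunk.
    // Assumes the inputs contain no characters that need HTML escaping.
    static String render(Map<String, List<String>> typesByParser) {
        StringBuilder html = new StringBuilder("<ul>\n");
        for (Map.Entry<String, List<String>> entry : typesByParser.entrySet()) {
            String className = entry.getKey();
            int lastDot = className.lastIndexOf('.');
            String packagePrefix = className.substring(0, lastDot + 1);
            String simpleName = className.substring(lastDot + 1);

            // The package prefix stays outside the link; only the simple name is linked.
            html.append("<li>").append(packagePrefix)
                .append("<a href=\"./api/").append(className.replace('.', '/')).append("\">")
                .append(simpleName).append("</a>\n<ul>\n");
            for (String type : entry.getValue()) {
                html.append("<li>").append(type).append("</li>\n");
            }
            html.append("</ul></li>\n");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        // Sample entries copied from the hunk; LinkedHashMap keeps insertion order.
        Map<String, List<String>> sample = new LinkedHashMap<>();
        sample.put("org.apache.tika.parser.feed.FeedParser",
                Arrays.asList("application/atom+xml", "application/rss+xml"));
        sample.put("org.apache.tika.parser.pdf.PDFParser",
                Arrays.asList("application/pdf"));
        System.out.println(render(sample));
    }
}
```

In a real build the map would presumably be filled from the parsers' own declared types rather than by hand; that part is left out here because it depends on the Tika API.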

