Author: nick
Date: Sat Dec 20 06:52:15 2014
New Revision: 1646920

URL: http://svn.apache.org/r1646920
Log:
Include more examples in the website TIKA-1390

Modified:
    tika/site/publish/1.7/examples.html
    tika/site/src/site/apt/1.7/examples.apt

Modified: tika/site/publish/1.7/examples.html
URL: 
http://svn.apache.org/viewvc/tika/site/publish/1.7/examples.html?rev=1646920&r1=1646919&r2=1646920&view=diff
==============================================================================
--- tika/site/publish/1.7/examples.html (original)
+++ tika/site/publish/1.7/examples.html Sat Dec 20 06:52:15 2014
@@ -93,7 +93,13 @@
 <ul>
 <li><a href="#Parsing">Parsing</a>
 <ul>
-<li><a href="#Parsing_using_the_Tika_Facade">Parsing using the Tika 
Facade</a></li></ul></li></ul></li></ul>
+<li><a href="#Parsing_using_the_Tika_Facade">Parsing using the Tika 
Facade</a></li></ul></li>
+<li><a href="#Custom_Content_Handlers">Custom Content Handlers</a>
+<ul>
+<li><a href="#Extract_Phone_Numbers_from_Content_into_the_Metadata">Extract 
Phone Numbers from Content into the Metadata</a></li></ul></li>
+<li><a href="#Translation">Translation</a>
+<ul>
+<li><a href="#Translation_using_the_Microsoft_Translation_API">Translation 
using the Microsoft Translation API</a></li></ul></li></ul></li></ul>
 <div class="section">
 <h3><a name="Parsing">Parsing</a></h3>
 <p>TODO Explain the options</p>
@@ -102,7 +108,19 @@
 <p>TODO Explain about using this</p><style type="text/css">
    @import url('attached-includes/css/shCoreDefault.css');
 </style>
-<div id="highlighter_260411" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number37 index0 alt2"><code class="java 
keyword">public</code> <code class="java plain">String parseToStringExample() 
</code><code class="java keyword">throws</code> <code class="java 
plain">IOException, SAXException, TikaException {</code></div><div class="line 
number38 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ParsingExample.</code><code class="java keyword">class</code><code 
class="java plain">.getResourceAsStream(</code><code class="java 
string">"test.doc"</code><code class="java plain">);</code></div><div 
class="line number39 index2 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Tika tika = 
</code><code class="java keyword">new</code> <code class="java 
plain">Tika();</code></div><div class="line number40 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number41 index4 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">tika.parseToString(stream);</code></div><div class="line number42 index5 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number43 index6 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number44 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number45 index8 alt2"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div></div></div>
+<div id="highlighter_882359" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number37 index0 alt2"><code class="java 
keyword">public</code> <code class="java plain">String parseToStringExample() 
</code><code class="java keyword">throws</code> <code class="java 
plain">IOException, SAXException, TikaException {</code></div><div class="line 
number38 index1 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = ParsingExample.</code><code class="java keyword">class</code><code 
class="java plain">.getResourceAsStream(</code><code class="java 
string">"test.doc"</code><code class="java plain">);</code></div><div 
class="line number39 index2 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">Tika tika = 
</code><code class="java keyword">new</code> <code class="java 
plain">Tika();</code></div><div class="line number40 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number41 index4 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">tika.parseToString(stream);</code></div><div class="line number42 index5 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">} </code><code class="java keyword">finally</code> <code 
class="java plain">{</code></div><div class="line number43 index6 alt2"><code 
class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number44 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number45 index8 alt2"><code 
class="java plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<div class="section">
+<h3><a name="Custom_Content_Handlers">Custom Content Handlers</a></h3>
+<p>The textual output of parsing a file with Tika is returned via the SAX <a 
class="externalLink" 
href="http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html">ContentHandler</a>
 you pass to the parse method. It is possible to customise your parsing by 
supplying your own ContentHandler that performs extra processing on the content as it is parsed.</p>
+<div class="section">
+<h4><a name="Extract_Phone_Numbers_from_Content_into_the_Metadata">Extract 
Phone Numbers from Content into the Metadata</a></h4>
+<p>By using the <a 
href="./apidocs/org/apache/tika/sax/PhoneExtractingContentHandler.html">PhoneExtractingContentHandler</a>,
 you can have any phone numbers found in the textual content of the document 
extracted and placed into the Metadata object for you.</p><div 
id="highlighter_956409" class="syntaxhighlighter nogutter  java"><table 
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div 
class="container"><div class="line number69 index0 alt2"><code class="java 
keyword">public</code> <code class="java keyword">static</code> <code 
class="java keyword">void</code> <code class="java plain">process(File file) 
</code><code class="java keyword">throws</code> <code class="java 
plain">Exception {</code></div><div class="line number70 index1 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">Parser parser = </code><code class="java keyword">new</code> <code 
class="java plain">AutoDetectParser();</code></div><div class="line number71 
 index2 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">Metadata metadata = </code><code class="java 
keyword">new</code> <code class="java plain">Metadata();</code></div><div 
class="line number72 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java comments">// The 
PhoneExtractingContentHandler will examine any characters for phone numbers 
before passing them</code></div><div class="line number73 index4 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
comments">// to the underlying Handler.</code></div><div class="line number74 
index5 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">PhoneExtractingContentHandler handler = </code><code 
class="java keyword">new</code> <code class="java 
plain">PhoneExtractingContentHandler(</code><code class="java 
keyword">new</code> <code class="java plain">BodyContentHandler(), 
metadata);</code></div><div class="line number75 index6 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">InputStream 
stream = </code><code class="java keyword">new</code> <code class="java 
plain">FileInputStream(file);</code></div><div class="line number76 index7 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">try</code> <code class="java plain">{</code></div><div 
class="line number77 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">parser.parse(stream, handler, metadata, </code><code 
class="java keyword">new</code> <code class="java 
plain">ParseContext());</code></div><div class="line number78 index9 
alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number79 index10 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
keyword">finally</code> <code class="java plain">{</code></div><div class="line number80 index11 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">stream.close();</code></div><div class="line number81 
index12 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">}</code></div><div class="line number82 index13 alt1"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">String[] numbers = metadata.getValues(</code><code class="java 
string">"phonenumbers"</code><code class="java plain">);</code></div><div 
class="line number83 index14 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">for</code> 
<code class="java plain">(String number : numbers) {</code></div><div 
class="line number84 index15 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">phoneNumbers.add(number);</code></div><div class="line 
number85 index16 alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number86 index17 alt1"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<div class="section">
+<h3><a name="Translation">Translation</a></h3>
+<p>Tika provides a pluggable Translation system, which allows you to send the 
results of parsing off to an external system or program to have the text 
translated into another language.</p>
+<div class="section">
+<h4><a name="Translation_using_the_Microsoft_Translation_API">Translation 
using the Microsoft Translation API</a></h4>
+<p>In order to use the Microsoft Translation API, you need to sign up for an 
account and get a client id and secret, then pass them to Tika before requesting 
the translation.</p><div id="highlighter_451559" class="syntaxhighlighter nogutter  
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td 
class="code"><div class="container"><div class="line number23 index0 
alt2"><code class="java keyword">public</code> <code class="java plain">String 
microsoftTranslateToFrench(String text) {</code></div><div class="line number24 
index1 alt1"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java plain">MicrosoftTranslator translator = </code><code class="java 
keyword">new</code> <code class="java 
plain">MicrosoftTranslator();</code></div><div class="line number25 index2 
alt2"><code class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java comments">// Change the id and secret! See <a 
href="http://msdn.microsoft.com/en-us/library/hh454950.aspx.">http://msdn.microsoft.com/en-us/library/hh454950.aspx.</a></code></div><div class="line 
number26 index3 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">translator.setId(</code><code class="java string">"dummy-id"</code><code 
class="java plain">);</code></div><div class="line number27 index4 alt2"><code 
class="java spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">translator.setSecret(</code><code class="java 
string">"dummy-secret"</code><code class="java plain">);</code></div><div 
class="line number28 index5 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java keyword">try</code> 
<code class="java plain">{</code></div><div class="line number29 index6 
alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java 
plain">translator.translate(text, </code><code class="java 
string">"fr"</code><code class="java plain">);</code></div><div class="line number30 index7 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java plain">} </code><code 
class="java keyword">catch</code> <code class="java plain">(Exception e) 
{</code></div><div class="line number31 index8 alt2"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code 
class="java keyword">return</code> <code class="java string">"Error while 
translating."</code><code class="java plain">;</code></div><div class="line 
number32 index9 alt1"><code class="java 
spaces">&nbsp;&nbsp;&nbsp;&nbsp;</code><code class="java 
plain">}</code></div><div class="line number33 index10 alt2"><code class="java 
plain">}</code></div></div></td></tr></tbody></table></div></div></div></div>
       </div>
       <div id="sidebar">
         <div id="navigation">

Modified: tika/site/src/site/apt/1.7/examples.apt
URL: 
http://svn.apache.org/viewvc/tika/site/src/site/apt/1.7/examples.apt?rev=1646920&r1=1646919&r2=1646920&view=diff
==============================================================================
--- tika/site/src/site/apt/1.7/examples.apt (original)
+++ tika/site/src/site/apt/1.7/examples.apt Sat Dec 20 06:52:15 2014
@@ -37,3 +37,32 @@ Apache Tika API Usage Examples
    TODO Explain about using this
 
 
%{include|source=src/examples-src/main/java/org/apache/tika/example/ParsingExample.java|snippet=aj:..parseToStringExample()|show-gutter=false}
+
+* {Custom Content Handlers}
+
+   The textual output of parsing a file with Tika is returned via the SAX
+   {{{http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html}ContentHandler}}
+   you pass to the parse method. It is possible to customise your parsing by supplying your
+   own ContentHandler that performs extra processing on the content as it is parsed.
+
+** {Extract Phone Numbers from Content into the Metadata}
+
+   By using the 
+   
{{{./apidocs/org/apache/tika/sax/PhoneExtractingContentHandler.html}PhoneExtractingContentHandler}},
+   you can have any phone numbers found in the textual content of the document 
extracted and placed
+   into the Metadata object for you.
+
+%{include|source=src/examples-src/main/java/org/apache/tika/example/GrabPhoneNumbersExample.java|snippet=aj:..process(..File)|show-gutter=false}
+
+* {Translation}
+
+   Tika provides a pluggable Translation system, which allows you to send the results of
+   parsing off to an external system or program to have the text translated into another
+   language.
+
+** {Translation using the Microsoft Translation API}
+
+   In order to use the Microsoft Translation API, you need to sign up for an account
+   and get a client id and secret, then pass them to Tika before requesting the translation.
+
+%{include|source=src/examples-src/main/java/org/apache/tika/example/TranslatorExample.java|snippet=aj:..microsoftTranslateToFrench(..String)|show-gutter=false}
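
Note on the Custom Content Handlers text in this commit: the idea of supplying your own ContentHandler can be tried without Tika on the classpath. The following is a minimal sketch using only the JDK's SAX classes. The class name CharCountingHandler and the helper countIn are hypothetical and are not part of Tika. The handler counts non-whitespace characters as SAX events arrive. An instance of such a handler could presumably be passed as the handler argument of parser.parse(stream, handler, metadata, context), as in the phone number example above.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of Tika: a ContentHandler that
// counts the non-whitespace characters it receives.
public class CharCountingHandler extends DefaultHandler {
    private long count = 0;

    // SAX may deliver text in several chunks; each chunk is added to the total.
    @Override
    public void characters(char[] ch, int start, int length) {
        for (int i = start; i < start + length; i++) {
            if (!Character.isWhitespace(ch[i])) {
                count++;
            }
        }
    }

    public long getCount() {
        return count;
    }

    // Hypothetical helper: drive the handler with the JDK's own SAX parser,
    // standing in for a Tika parser in this sketch.
    public static long countIn(String xml) throws Exception {
        CharCountingHandler handler = new CharCountingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getCount();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countIn("<p>Call <b>555 0100</b> now</p>"));
    }
}
```

Extending DefaultHandler keeps the sketch short, since only the characters callback needs overriding; the other ContentHandler methods keep their no-op defaults.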

