Author: jukka
Date: Wed Oct 5 13:17:51 2011
New Revision: 1179211
URL: http://svn.apache.org/viewvc?rev=1179211&view=rev
Log:
site: Update 0.10 instructions
Modified:
tika/site/pom.xml
tika/site/publish/0.10/gettingstarted.html
tika/site/publish/download.html
tika/site/publish/index.html
tika/site/src/site/apt/0.10/gettingstarted.apt
tika/site/src/site/apt/download.apt
tika/site/src/site/apt/index.apt
Modified: tika/site/pom.xml
URL:
http://svn.apache.org/viewvc/tika/site/pom.xml?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/pom.xml (original)
+++ tika/site/pom.xml Wed Oct 5 13:17:51 2011
@@ -28,7 +28,7 @@
<parent>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parent</artifactId>
- <version>0.9</version>
+ <version>0.10</version>
</parent>
<artifactId>tika-site</artifactId>
Modified: tika/site/publish/0.10/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.10/gettingstarted.html?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/publish/0.10/gettingstarted.html (original)
+++ tika/site/publish/0.10/gettingstarted.html Wed Oct 5 13:17:51 2011
@@ -122,33 +122,61 @@
... <!-- your other classpath entries -->
<pathelement location="path/to/tika-core-0.10.jar"/>
<pathelement location="path/to/tika-parsers-0.10.jar"/>
+ <pathelement location="path/to/netcdf-4.2-min.jar"/>
+ <pathelement location="path/to/slf4j-api-1.5.6.jar"/>
+ <pathelement location="path/to/apache-mime4j-core-0.7.jar"/>
+ <pathelement location="path/to/apache-mime4j-dom-0.7.jar"/>
+ <pathelement location="path/to/commons-compress-1.1.jar"/>
+ <pathelement location="path/to/commons-codec-1.4.jar"/>
+ <pathelement location="path/to/pdfbox-1.6.0.jar"/>
+ <pathelement location="path/to/fontbox-1.6.0.jar"/>
+ <pathelement location="path/to/jempbox-1.6.0.jar"/>
<pathelement location="path/to/commons-logging-1.1.1.jar"/>
- <pathelement location="path/to/commons-compress-1.0.jar"/>
- <pathelement
location="path/to/pdfbox-0.10.0-incubating.jar"/>
- <pathelement
location="path/to/fontbox-0.10.0-incubator.jar"/>
- <pathelement
location="path/to/jempbox-0.10.0-incubator.jar"/>
- <pathelement location="path/to/poi-3.6.jar"/>
- <pathelement location="path/to/poi-scratchpad-3.6.jar"/>
- <pathelement location="path/to/poi-ooxml-3.6.jar"/>
- <pathelement location="path/to/poi-ooxml-schemas-3.6.jar"/>
+ <pathelement location="path/to/poi-3.8-beta4.jar"/>
+ <pathelement
location="path/to/poi-scratchpad-3.8-beta4.jar"/>
+ <pathelement location="path/to/poi-ooxml-3.8-beta4.jar"/>
+ <pathelement
location="path/to/poi-ooxml-schemas-3.8-beta4.jar"/>
<pathelement location="path/to/xmlbeans-2.3.0.jar"/>
<pathelement location="path/to/dom4j-1.6.1.jar"/>
- <pathelement location="path/to/xml-apis-1.0.b2.jar"/>
<pathelement
location="path/to/geronimo-stax-api_1.0_spec-1.0.jar"/>
- <pathelement location="path/to/tagsoup-1.2.jar"/>
+ <pathelement location="path/to/tagsoup-1.2.1.jar"/>
<pathelement location="path/to/asm-3.1.jar"/>
- <pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement
location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
+ <pathelement location="path/to/boilerpipe-1.1.0.jar"/>
+ <pathelement location="path/to/rome-0.9.jar"/>
+ <pathelement location="path/to/jdom-1.0.jar"/>
</classpath></pre></div><p>An easy way to gather all these libraries is
to run "mvn dependency:copy-dependencies" in the tika-parsers source
directory. This will copy all Tika dependencies to the
<tt>target/dependencies</tt> directory.</p><p>Alternatively you can simply drop
the entire tika-app jar to your classpath to get all of the above dependencies
in a single archive.</p></div><div class="section"><h2>Using Tika as a command
line utility<a name="Using_Tika_as_a_command_line_utility"></a></h2><p>The Tika
application jar (tika-app-0.10.jar) can be used as a command line utility for
extracting text content and metadata from all sorts of files. This runnable jar
contains all the dependencies it needs, so you don't need to worry about
classpath settings to run it.</p><p>The usage instructions are shown
below.</p><div><pre>usage: java -jar tika-app-0.10.jar [option] [file]
Options:
- -? or --help Print this usage message
- -v or --verbose Print debug level messages
- -g or --gui Start the Apache Tika GUI
- -x or --xml Output XHTML content (default)
- -h or --html Output HTML content
- -t or --text Output plain text content
- -m or --metadata Output only metadata
+ -? or --help Print this usage message
+ -v or --verbose Print debug level messages
+
+ -g or --gui Start the Apache Tika GUI
+ -s or --server Start the Apache Tika server
+
+ -x or --xml Output XHTML content (default)
+ -h or --html Output HTML content
+ -j or --json Output JSON content
+ -t or --text Output plain text content
+ -T or --text-main Output plain text content (main content only)
+ -m or --metadata Output only metadata
+ -l or --language Output only language
+ -d or --detect Detect document type
+ -eX or --encoding=X Use output encoding X
+ -z or --extract Extract all attachements into current directory
+ -r or --pretty-print For XML and XHTML outputs, adds newlines and
+ whitespace, for better readability
+
+ --create-profile=X
+ Create NGram profile, where X is a profile name
+ --list-parsers
+ List the available document parsers
+ --list-parser-details
+ List the available document parsers, and their supported mime types
+ --list-met-models
+ List the available metadata models, and their supported keys
+ --list-supported-types
+ List all known media types and related information
Description:
Apache Tika will parse the file(s) specified on the
@@ -160,12 +188,21 @@ Description:
If no file name or URL is specified (or the special
name "-" is used), then the standard input stream
- is parsed.
+ is parsed. If no arguments were given and no input
+ data is available, the GUI is started instead.
+
+- GUI mode
+
+ Use the "--gui" (or "-g") option to start the
+ Apache Tika GUI. You can drag and drop files from
+ a normal file explorer to the GUI window to extract
+ text content and metadata from the files.
+
+- Server mode
- Use the "--gui" (or "-g") option to start
- the Apache Tika GUI. You can drag and drop files
- from a normal file explorer to the GUI window to
- extract text content and metadata from the files.</pre></div><p>You can
also use the jar as a component in a Unix pipeline or as an external tool in
many scripting languages.</p><div><pre># Check if an Internet resource contains
a specific keyword
+ Use the "-server" (or "-s") option to start the
+ Apache Tika server. The server will listen to the
+ ports you specify as one or more arguments.</pre></div><p>You can also use
the jar as a component in a Unix pipeline or as an external tool in many
scripting languages.</p><div><pre># Check if an Internet resource contains a
specific keyword
curl http://.../document.doc \
| java -jar tika-app-0.10.jar --text \
| grep -q keyword</pre></div></div>
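The pipeline above can also be driven from a Java program. The following is only a minimal sketch of that idea, not code from the Tika site: the class name TikaCli and its methods are hypothetical, the jar path and file are supplied by the caller, and it assumes that "--text" and "--encoding=UTF-8" (both taken from the usage text above) can be combined in one invocation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper (not part of Tika): runs the tika-app jar as an external tool.
class TikaCli {

    // Builds "java -jar <jar> --text --encoding=UTF-8 <file>".
    // Assumption: the two options listed in the usage text can be given together.
    static List<String> textCommand(String jarPath, Path file) {
        return List.of("java", "-jar", jarPath, "--text", "--encoding=UTF-8",
                file.toString());
    }

    // Runs the command and returns whatever the process wrote to standard output.
    static String extractText(String jarPath, Path file)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(textCommand(jarPath, file));
        // Let diagnostics from the child process go straight to our own stderr.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tika-app exited with status " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to tika-app-0.10.jar, args[1]: document, args[2]: keyword
        String text = extractText(args[0], Path.of(args[1]));
        System.out.println(text.contains(args[2]) ? "keyword found" : "keyword not found");
    }
}
```

Starting a separate JVM keeps the caller's classpath free of the parser dependencies listed earlier, at the cost of process startup time for every document.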
Modified: tika/site/publish/download.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/download.html?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/publish/download.html (original)
+++ tika/site/publish/download.html Wed Oct 5 13:17:51 2011
@@ -84,7 +84,7 @@
width="387" height="100"/></a>
</div>
<div id="content">
- <!-- Licensed to the Apache Software Foundation (ASF) under one or
more --><!-- contributor license agreements. See the NOTICE file distributed
with --><!-- this work for additional information regarding copyright
ownership. --><!-- The ASF licenses this file to You under the Apache License,
Version 2.0 --><!-- (the "License"); you may not use this file except in
compliance with --><!-- the License. You may obtain a copy of the License at
--><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!--
Unless required by applicable law or agreed to in writing, software --><!--
distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
--><!-- See the License for the specific language governing permissions and
--><!-- limitations under the License. --><div class="section"><h2>Download
Apache Tika<a name="Download_Apache_Tika"></a></h2><p>Apache Tika 0.10 is now
available. See the <a class="externalLink"
href="http://www.apache.org/dist/tika/CHANGES-0.10.txt">CHANGES.txt</a> file
for more information on the list of updates in this initial
release.</p><ul><li><a class="externalLink"
href="http://www.apache.org/dyn/closer.cgi/tika/apache-tika-0.10-src.zip">apache-tika-0.10-src.zip</a>
(source archive, <a class="externalLink"
href="http://www.apache.org/dist/tika/apache-tika-0.10-src.zip.asc">PGP
signature</a>)<br />SHA1: <tt>355d0b2fa0de232672e4760941ea0dcf641a82ad</tt><br
/>MD5: <tt>96fb7db1b0c93d1e958a2ee52c4bd02f</tt></li><li><a
class="externalLink"
href="http://www.apache.org/dyn/closer.cgi/tika/0.10/tika-app-0.10.jar">tika-app-0.10.jar</a>
(runnable jar, <a class="externalLink"
href="http://www.apache.org/dist/tika/0.10/tika-app-0.10.jar.asc">PGP
signature</a>)<br />SHA1: <tt>e1ad4e6cc4601c1c1367c646b2fbc57788664bed</tt><br
/>MD5: <tt>d4b1136ddedc3ae2f9af778cea42c219</tt></li></ul><p>Apache Tika
releases are available under the <a class="externalLink"
href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>. See the NOTICE.txt file contained in each release
artifact for applicable copyright attribution notices.</p><p>If you are looking
for previous releases of Apache Tika, have a look in the <a
class="externalLink"
href="http://archive.apache.org/dist/tika/">archives</a>.</p><p>If you are
looking for releases of Apache Tika from the Apache Lucene project (pre-0.8
releases), have a look in the <a class="externalLink"
href="http://archive.apache.org/dist/lucene/tika/">lucene archives</a>. If you
are looking for releases of ApacheTika from the Apache Incubator (pre-0.2
releases), have a look in the <a class="externalLink"
href="http://archive.apache.org/dist/incubator/tika/">incubator
archives</a>.</p></div><div class="section"><h2>Export control<a
name="Export_control"></a></h2><p>Apache Tika includes cryptographic software.
The country in which you currently reside may have restrictions on the
import, possession, use, and/or re-export to another country, of
encryption software. BEFORE using any encryption software, please check your
country's laws, regulations and policies concerning the import, possession, or
use, and re-export of encryption software, to see if this is permitted. See
&lt;<a class="externalLink"
href="http://www.wassenaar.org/">http://www.wassenaar.org/</a>&gt; for more
information.</p><p>The U.S. Government Department of Commerce, Bureau of
Industry and Security (BIS), has classified this software as Export Commodity
Control Number (ECCN) 5D002.C.1, which includes information security software
using or performing cryptographic functions with asymmetric algorithms. The
form and manner of this Apache Software Foundation distribution makes it
eligible for export under the License Exception ENC Technology Software
Unrestricted (TSU) exception (see the BIS Export Administration Regulations,
Section 740.13) for both object code and source
code.</p><p>The following provides more details on the included cryptographic
software:</p><ul><li>Apache Tika uses the Bouncy Castle generic encryption
libraries for extracting text content and metadata from encrypted PDF files.
See <a class="externalLink"
href="http://www.bouncycastle.org/">http://www.bouncycastle.org/</a> for more
details on Bouncy Castle.</li></ul></div><div class="section"><h2>Verify<a
name="Verify"></a></h2><p>It is essential that you verify the integrity of the
downloaded files using the PGP signatures. Please read <a class="externalLink"
href="http://httpd.apache.org/dev/verification.html">Verifying Apache HTTP
Server Releases</a> for more information on why you should verify our
releases.</p><p>The PGP signatures can be verified using PGP or GPG. First
download the KEYS file as well as the .asc signature files for the relevant
release packages. Make sure you get these files from the main distribution
directory, rather than from a mirror. Then verify
the signatures using</p><div class="source"><pre>% pgpk -a KEYS
+ <!-- Licensed to the Apache Software Foundation (ASF) under one or
more --><!-- contributor license agreements. See the NOTICE file distributed
with --><!-- this work for additional information regarding copyright
ownership. --><!-- The ASF licenses this file to You under the Apache License,
Version 2.0 --><!-- (the "License"); you may not use this file except in
compliance with --><!-- the License. You may obtain a copy of the License at
--><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!--
Unless required by applicable law or agreed to in writing, software --><!--
distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
--><!-- See the License for the specific language governing permissions and
--><!-- limitations under the License. --><div class="section"><h2>Download
Apache Tika<a name="Download_Apache_Tika"></a></h2><p>Apache Tika 0.10 is now
available. See the <a class="externalLink"
href="http://www.apache.org/dist/tika/CHANGES-0.10.txt">CHANGES.txt</a> file
for more information on the list of updates in this initial
release.</p><ul><li><a class="externalLink"
href="http://www.apache.org/dyn/closer.cgi/tika/apache-tika-0.10-src.zip">apache-tika-0.10-src.zip</a>
(source archive, <a class="externalLink"
href="http://www.apache.org/dist/tika/apache-tika-0.10-src.zip.asc">PGP
signature</a>)<br />SHA1: <tt>355d0b2fa0de232672e4760941ea0dcf641a82ad</tt><br
/>MD5: <tt>96fb7db1b0c93d1e958a2ee52c4bd02f</tt></li><li><a
class="externalLink"
href="http://www.apache.org/dyn/closer.cgi/tika/tika-app-0.10.jar">tika-app-0.10.jar</a>
(runnable jar, <a class="externalLink"
href="http://www.apache.org/dist/tika/tika-app-0.10.jar.asc">PGP
signature</a>)<br />SHA1: <tt>e1ad4e6cc4601c1c1367c646b2fbc57788664bed</tt><br
/>MD5: <tt>d4b1136ddedc3ae2f9af778cea42c219</tt></li></ul><p>Apache Tika
releases are available under the <a class="externalLink"
href="http://www.apache.org/licenses/LICENSE-2.0">Apache License,
Version 2.0</a>. See the NOTICE.txt file contained in each release artifact for
applicable copyright attribution notices.</p><p>If you are looking for previous
releases of Apache Tika, have a look in the <a class="externalLink"
href="http://archive.apache.org/dist/tika/">archives</a>.</p><p>If you are
looking for releases of Apache Tika from the Apache Lucene project (pre-0.8
releases), have a look in the <a class="externalLink"
href="http://archive.apache.org/dist/lucene/tika/">lucene archives</a>. If you
are looking for releases of ApacheTika from the Apache Incubator (pre-0.2
releases), have a look in the <a class="externalLink"
href="http://archive.apache.org/dist/incubator/tika/">incubator
archives</a>.</p></div><div class="section"><h2>Export control<a
name="Export_control"></a></h2><p>Apache Tika includes cryptographic software.
The country in which you currently reside may have restrictions on the
import, possession, use, and/or re-export to another country, of encryption
software. BEFORE using any encryption software, please check your country's
laws, regulations and policies concerning the import, possession, or use, and
re-export of encryption software, to see if this is permitted. See &lt;<a
class="externalLink"
href="http://www.wassenaar.org/">http://www.wassenaar.org/</a>&gt; for more
information.</p><p>The U.S. Government Department of Commerce, Bureau of
Industry and Security (BIS), has classified this software as Export Commodity
Control Number (ECCN) 5D002.C.1, which includes information security software
using or performing cryptographic functions with asymmetric algorithms. The
form and manner of this Apache Software Foundation distribution makes it
eligible for export under the License Exception ENC Technology Software
Unrestricted (TSU) exception (see the BIS Export Administration Regulations,
Section 740.13) for both object code and source code.</p>
<p>The following provides more details on the included cryptographic
software:</p><ul><li>Apache Tika uses the Bouncy Castle generic encryption
libraries for extracting text content and metadata from encrypted PDF files.
See <a class="externalLink"
href="http://www.bouncycastle.org/">http://www.bouncycastle.org/</a> for more
details on Bouncy Castle.</li></ul></div><div class="section"><h2>Verify<a
name="Verify"></a></h2><p>It is essential that you verify the integrity of the
downloaded files using the PGP signatures. Please read <a class="externalLink"
href="http://httpd.apache.org/dev/verification.html">Verifying Apache HTTP
Server Releases</a> for more information on why you should verify our
releases.</p><p>The PGP signatures can be verified using PGP or GPG. First
download the KEYS file as well as the .asc signature files for the relevant
release packages. Make sure you get these files from the main distribution
directory, rather than from a mirror. Then verify the signatures
using</p><div class="source"><pre>% pgpk -a KEYS
% pgpv apache-tika-X.Y.Z.tar.gz.asc</pre></div><p>or</p><div
class="source"><pre>% pgp -ka KEYS
% pgp apache-tika-X.Y.Z.tar.gz.asc</pre></div><p>or</p><div
class="source"><pre>% gpg --import KEYS
% gpg --verify apache-tika-X.Y.Z.tar.gz.asc</pre></div></div>
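The download page also publishes SHA1 and MD5 digests next to each artifact. As a rough illustration of how such a digest could be compared against a downloaded file, here is a minimal Java sketch. The class name Sha1Check and its methods are hypothetical, and a matching checksum only detects a corrupted download; it is not a substitute for the PGP signature check described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Tika): compares a file against a published SHA1.
class Sha1Check {

    // Returns the SHA-1 digest of the given bytes as lowercase hex.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException(e);
        }
    }

    // Reads the whole file into memory and compares its digest with the expected value.
    static boolean matches(Path file, String expectedHex) throws IOException {
        return sha1Hex(Files.readAllBytes(file)).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) throws IOException {
        // args[0]: downloaded file, args[1]: SHA1 value copied from the download page
        System.out.println(matches(Path.of(args[0]), args[1]) ? "SHA1 matches" : "SHA1 MISMATCH");
    }
}
```

For example, the program could be given tika-app-0.10.jar together with the value e1ad4e6cc4601c1c1367c646b2fbc57788664bed listed on the download page.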
Modified: tika/site/publish/index.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/index.html?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/publish/index.html (original)
+++ tika/site/publish/index.html Wed Oct 5 13:17:51 2011
@@ -84,7 +84,7 @@
width="387" height="100"/></a>
</div>
<div id="content">
- <!-- Licensed to the Apache Software Foundation (ASF) under one or
more --><!-- contributor license agreements. See the NOTICE file distributed
with --><!-- this work for additional information regarding copyright
ownership. --><!-- The ASF licenses this file to You under the Apache License,
Version 2.0 --><!-- (the "License"); you may not use this file except in
compliance with --><!-- the License. You may obtain a copy of the License at
--><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!--
Unless required by applicable law or agreed to in writing, software --><!--
distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
--><!-- See the License for the specific language governing permissions and
--><!-- limitations under the License. --><div class="section"><h2>Apache Tika
- a content analysis toolkit<a
name="Apache_Tika_-_a_content_analysis_toolkit"></a></h2><p>The Apache
Tika™ toolkit detects and extracts metadata and
structured text content from various documents using existing parser libraries.
You can find the latest release on the <a href="./download.html">download
page</a>. See the <a href="./0.9/gettingstarted.html">Getting Started</a> guide
for instructions on how to start using Tika.</p><p>Tika is a project of the <a
class="externalLink" href="http://www.apache.org/">Apache Software
Foundation</a>, and was formerly a subproject of <a class="externalLink"
href="http://lucene.apache.org/">Apache Lucene</a>.</p></div><div
class="section"><h2>Latest News<a name="Latest_News"></a></h2><dl><dt>7-11
November 2011 - Tika at US ApacheCon</dt><dd> ApacheCon NA is coming to
Vancouver, British Columbia, at the Westin Bayshore, and Chris Mattmann will be
giving a <a class="externalLink"
href="http://na11.apachecon.com/talks/19391">talk</a> on the forthcoming 1.0
release of Tika as part of the <a class="externalLink"
href="http://na11.apachecon.com/talk/by_track/1400">Content Technologies
track</a> on Thursday November 10th, 2011. The talk will cover the history of
Tika, its genesis, its inception as a top-level project, and where it's headed
1.0 and beyond. Come out and support Tika by attending the talk! </dd><dt>30
September 2011: Apache Tika Release</dt><dd> Apache Tika 0.10 has been
released. This release includes new parser support for CHM files, bugfixes to
RTF parsing, an improved GUI and more. Please see the download page for more
details. </dd><dt>16 February 2011: Apache Tika Release</dt><dd> Apache Tika
0.9 has been released. This release includes several important bugfixes and new
features. Please see the download page for more details. </dd><dt>12 November
2010: Apache Tika Release</dt><dd> Apache Tika 0.8 has been released. Please
see the download page for more details. This is our first release as a TLP.
We're excited!</dd><dt>1-5 November 2010 - Tika at US ApacheCon</dt>
<dd> ApacheCon NA is coming to Atlanta, Georgia, at the Westin Peachtree, and
Tika is being repped as part of the <a class="externalLink"
href="http://us.apachecon.com/c/acna2010/schedule/2010/11/05">Lucene and
friends track</a> on Friday, November 5th, 2010. Chris Mattmann will give a
talk on how Tika is being used at NASA and in the context of other projects in
the Apache ecosystem.<p>Friday, Nov. 5th, 2010:</p><ul><li><a
class="externalLink"
href="http://us.apachecon.com/c/acna2010/sessions/538">Scientific data curation
and processing with Apache Tika</a> - Chris Mattmann @
9:00am</li></ul></dd><dt>April 2010: Tika Graduates to TLP</dt><dd> Apache Tika
was voted into TLP status by a resolution submitted to the Apache Board. We are
in the process of updating the site and moving things around. If you notice
anything out of place, let us know.</dd><dt>April 2010: Apache Tika
Release</dt><dd> Apache Tika 0.7 has been released. Please see the download
page for more details.</dd>
<dt>January 2010: Apache Tika Release</dt><dd> Apache Tika 0.6 has been
released. Please see the download page for more details.</dd><dt>November 2009:
Apache Tika Release</dt><dd> Apache Tika 0.5 has been released. Please see the
download page for more details.</dd><dt>14 August 2009 - Lucene at US
ApacheCon</dt><dd> ApacheCon US is once again in the Bay Area and Lucene is
coming along for the ride! The Lucene community has planned two full days of
talks, plus a meetup and the usual bevy of training. With a well-balanced mix
of first time and veteran ApacheCon speakers, the <a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/schedule#lucene">Lucene track</a>
at ApacheCon US promises to have something for everyone. Be sure not to
miss:<p>Training:</p><ul><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/437">Lucene Boot Camp</a>
- A two day training session, Nov. 2nd &amp; 3rd</li><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/375">Solr Day</a> - A one day training
session, Nov. 2nd</li></ul><p>Thursday, Nov. 5th:</p><ul><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/428">Introduction to the
Lucene Ecosystem</a> - Grant Ingersoll @ 9:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/461">Lucene Basics and
New Features</a> - Michael Busch @ 10:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/331">Apache Solr: Out of
the Box</a> - Chris Hostetter @ 14:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/427">Introduction to
Nutch</a> - Andrzej Bialecki @ 15:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/430">Lucene and Solr
Performance Tuning</a> - Mark Miller @ 16:30</li></ul><p>Friday, Nov.
6th:</p><ul><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/332">Implementing an
Information Retrieval Framework for an
Organizational Repository</a> - Sithu D Sudarsan @ 9:00</li><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/333">Apache Mahout -
Going from raw data to Information</a> - Isabel Drost @ 10:00</li><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/334">MIME Magic with
Apache Tika</a> - Jukka Zitting @ 11:30</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/335">Building Intelligent
Search Applications with the Lucene Ecosystem</a> - Ted Dunning @
14:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/462">Realtime Search</a>
- Jason Rutherglen @ 15:00</li></ul></dd><dt>July 2009: Apache Tika
Release</dt><dd> Apache Tika 0.4 has been released. Please see the download
page for more details.</dd><dt>March 2009: Apache Tika Release</dt><dd> Apache
Tika 0.3 has been released. Please see the download page for more
details.</dd><dt>February 2009: Lucene at
ApacheCon Europe 2009 in Amsterdam</dt><dd> Lucene will be extremely well
represented at <a class="externalLink"
href="http://www.eu.apachecon.com/c/aceu2009/">ApacheCon EU 2009</a> in
Amsterdam, Netherlands this March 23-27, 2009:<ul><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/197">Lucene Boot Camp</a> - A
two day training session, March 23 &amp; 24th</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/201">Solr Boot Camp</a> - A
one day training session, March 24th</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/136">Introducing Apache
Mahout</a> - Grant Ingersoll. March 25th @ 10:30</li><li><a
class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/137">Lucene/Solr Case
Studies</a> - Erik Hatcher. March 25th @ 11:30</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/138">Advanced Indexing
Techniques with Apache Lucene</a> -
Michael Busch. March 25th @ 14:00</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/251">Apache Solr - A Case
Study</a> - Uri Boness. March 26th @ 17:30</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/250">Best of breed - httpd,
forrest, solr and droids</a> - Thorsten Scherler. March 27th @ 17:30</li><li><a
class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/165">Apache Droids - an
intelligent standalone robot framework</a> - Thorsten Scherler. March 26th @
15:00</li></ul></dd><dt>December 2008: Apache Tika Release</dt><dd> Apache Tika
0.2 has been released. Please see the download page for more
details.</dd><dt>November 2008: User mailing list created</dt><dd> A new
mailing list, [email protected], has been created for discussion
about the use of the Tika toolkit. You can subscribe this mailing list by
sending a message to [email protected].</dd><dt>October 2008:
Tika graduates to a Lucene subproject</dt><dd> Tika has graduated form the
Incubator to become a subproject of Apache Lucene. The project infrastructure
will be migrated from incubator.apache.org to
lucene.apache.org.</dd><dt>October 2008: Apache Tika status report</dt><dd>
Dave Meikle was just voted in as a new committer.<p>Paolo Mottadelli will
present Tika at ApacheCon US.</p><p>Tika 0.2 should be released
soon.</p><p>Usage documentation has been added to the website.</p></dd><dt>July
2008: Apache Tika status report</dt><dd> Tika community remains relatively
small, with just a handful of active members<p>Work towards Tika 0.2 continues,
Chris Mattman has volunteered to be the release manager</p></dd><dt>April 2008:
Apache Tika status report</dt><dd> Niall Pemberton joined the project as a
committer and PPMC member<p>The number of issues reported by external
contributors is growing gradually.</p>
<p>There was a Fast Feather Talk on Tika in ApacheCon EU 2008</p><p>We have good
contacts especially with Apache POI and PDFBox</p><p>We are working towards
Tika 0.2</p><p>Metadata handling improvements are being
discussed</p></dd><dt>January 2008: Apache Tika status report</dt><dd> No new
committers since the last report, activity has been moderate but steady,
leading to the 0.1 release.<p>Tika 0.1 (incubating) has just been
released.</p><p>Chris Mattmann intends to use that release in Nutch, That's
good progress towards Tika's goal of providing data extraction functionality
to other projects.</p><p>A new Tika logo was created by Google Highly Open
Participation student, hasn't been integrated yet.</p></dd><dt>December 27th,
2007: Tika 0.1-incubating Released!</dt><dd> Tika has made its first official
release, titled 0.1-incubating. See the <a class="externalLink"
href="http://www.apache.org/dist/incubator/tika/CHANGES-0.1-incubating.txt">CHANGES.txt</a>
file for more information on the list of updates in this initial release.
Thanks to all who
contributed! You can download the official source tarball <a
class="externalLink"
href="http://www.apache.org/dyn/closer.cgi/incubator/tika">here</a>.</dd><dt>October
8th, 2007: Welcome Keith Bennett!</dt><dd> The Tika PPMC has <a
class="externalLink"
href="http://www.nabble.com/Please-welcome-Keith-Bennett-as-a-Tika-committer%21-tf4586151.html#a13107428">elected</a>
Keith Bennett as our new committer. Welcome!</dd><dt>March 22nd, 2007: Apache
Tika project started</dt><dd> The Apache Tika project was formally started when
the <a class="externalLink"
href="http://wiki.apache.org/incubator/TikaProposal">Tika proposal</a> was <a
class="externalLink"
href="http://mail-archives.apache.org/mod_mbox/incubator-general/200703.mbox/%[email protected]%3e">accepted</a>
by the <a class="externalLink" href="http://incubator.apache.org/">Apache
Incubator PMC</a>. </dd></dl></div>
+ <!-- Licensed to the Apache Software Foundation (ASF) under one or
more --><!-- contributor license agreements. See the NOTICE file distributed
with --><!-- this work for additional information regarding copyright
ownership. --><!-- The ASF licenses this file to You under the Apache License,
Version 2.0 --><!-- (the "License"); you may not use this file except in
compliance with --><!-- the License. You may obtain a copy of the License at
--><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!--
Unless required by applicable law or agreed to in writing, software --><!--
distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
--><!-- See the License for the specific language governing permissions and
--><!-- limitations under the License. --><div class="section"><h2>Apache Tika
- a content analysis toolkit<a
name="Apache_Tika_-_a_content_analysis_toolkit"></a></h2><p>The Apache
Tika™ toolkit detects and extracts metadata and
structured text content from various documents using existing parser libraries.
You can find the latest release on the <a href="./download.html">download
page</a>. See the <a href="./0.10/gettingstarted.html">Getting Started</a>
guide for instructions on how to start using Tika.</p><p>Tika is a project of
the <a class="externalLink" href="http://www.apache.org/">Apache Software
Foundation</a>, and was formerly a subproject of <a class="externalLink"
href="http://lucene.apache.org/">Apache Lucene</a>.</p></div><div
class="section"><h2>Latest News<a name="Latest_News"></a></h2><dl><dt>7-11
November 2011 - Tika at US ApacheCon</dt><dd> ApacheCon NA is coming to
Vancouver, British Columbia, at the Westin Bayshore, and Chris Mattmann will be
giving a <a class="externalLink"
href="http://na11.apachecon.com/talks/19391">talk</a> on the forthcoming 1.0
release of Tika as part of the <a class="externalLink"
href="http://na11.apachecon.com/talk/by_track/1400">Content Technologies
track</a> on Thursday November 10th, 2011. The talk will cover the history of
Tika, its genesis, its inception as a top-level project, and where it's headed
in 1.0 and beyond. Come out and support Tika by attending the talk! </dd><dt>30
September 2011: Apache Tika Release</dt><dd> Apache Tika 0.10 has been
released. This release includes new parser support for CHM files, bugfixes to
RTF parsing, an improved GUI and more. Please see the download page for more
details. </dd><dt>16 February 2011: Apache Tika Release</dt><dd> Apache Tika
0.9 has been released. This release includes several important bugfixes and new
features. Please see the download page for more details. </dd><dt>12 November
2010: Apache Tika Release</dt><dd> Apache Tika 0.8 has been released. Please
see the download page for more details. This is our first release as a TLP.
We're excited!</dd><dt>1-5 November 2010 - Tika at US ApacheCon</dt>
<dd> ApacheCon NA is coming to Atlanta, Georgia, at the Westin Peachtree, and
Tika is being repped as part of the <a class="externalLink"
href="http://us.apachecon.com/c/acna2010/schedule/2010/11/05">Lucene and
friends track</a> on Friday, November 5th, 2010. Chris Mattmann will give a
talk on how Tika is being used at NASA and in the context of other projects in
the Apache ecosystem.<p>Friday, Nov. 5th, 2010:</p><ul><li><a
class="externalLink"
href="http://us.apachecon.com/c/acna2010/sessions/538">Scientific data curation
and processing with Apache Tika</a> - Chris Mattmann @
9:00am</li></ul></dd><dt>April 2010: Tika Graduates to TLP</dt><dd> Apache Tika
was voted into TLP status by a resolution submitted to the Apache Board. We are
in the process of updating the site and moving things around. If you notice
anything out of place, let us know.</dd><dt>April 2010: Apache Tika
Release</dt><dd> Apache Tika 0.7 has been released. Please see the download
page for more details.</dd>
<dt>January 2010: Apache Tika Release</dt><dd> Apache Tika 0.6 has been
released. Please see the download page for more details.</dd><dt>November
2009: Apache Tika Release</dt><dd> Apache Tika 0.5 has been released. Please
see the download page for more details.</dd><dt>14 August 2009 - Lucene at US
ApacheCon</dt><dd> ApacheCon US is once again in the Bay Area and Lucene is
coming along for the ride! The Lucene community has planned two full days of
talks, plus a meetup and the usual bevy of training. With a well-balanced mix
of first time and veteran ApacheCon speakers, the <a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/schedule#lucene">Lucene
track</a> at ApacheCon US promises to have something for everyone. Be sure
not to miss:<p>Training:</p><ul><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/437">Lucene Boot
Camp</a> - A two day training session, Nov. 2nd & 3rd</li><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/375">Solr Day</a> - A one day training
session, Nov. 2nd</li></ul><p>Thursday, Nov. 5th:</p><ul><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/428">Introduction to the
Lucene Ecosystem</a> - Grant Ingersoll @ 9:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/461">Lucene Basics and
New Features</a> - Michael Busch @ 10:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/331">Apache Solr: Out of
the Box</a> - Chris Hostetter @ 14:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/427">Introduction to
Nutch</a> - Andrzej Bialecki @ 15:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/430">Lucene and Solr
Performance Tuning</a> - Mark Miller @ 16:30</li></ul><p>Friday, Nov.
6th:</p><ul><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/332">Implementing an
Information Retrieval Framework for an
Organizational Repository</a> - Sithu D Sudarsan @ 9:00</li><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/333">Apache Mahout -
Going from raw data to Information</a> - Isabel Drost @ 10:00</li><li><a
class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/334">MIME Magic with
Apache Tika</a> - Jukka Zitting @ 11:30</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/335">Building Intelligent
Search Applications with the Lucene Ecosystem</a> - Ted Dunning @
14:00</li><li><a class="externalLink"
href="http://www.us.apachecon.com/c/acus2009/sessions/462">Realtime Search</a>
- Jason Rutherglen @ 15:00</li></ul></dd><dt>July 2009: Apache Tika
Release</dt><dd> Apache Tika 0.4 has been released. Please see the download
page for more details.</dd><dt>March 2009: Apache Tika Release</dt><dd> Apache
Tika 0.3 has been released.
Please see the download page for more details.</dd><dt>February 2009: Lucene
at ApacheCon Europe 2009 in Amsterdam</dt><dd> Lucene will be extremely well
represented at <a class="externalLink"
href="http://www.eu.apachecon.com/c/aceu2009/">ApacheCon EU 2009</a> in
Amsterdam, Netherlands this March 23-27, 2009:<ul><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/197">Lucene Boot Camp</a> - A
two day training session, March 23 & 24th</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/201">Solr Boot Camp</a> - A
one day training session, March 24th</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/136">Introducing Apache
Mahout</a> - Grant Ingersoll. March 25th @ 10:30</li><li><a
class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/137">Lucene/Solr Case
Studies</a> - Erik Hatcher. March 25th @ 11:30</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/138">Advanced Indexing
Techniques with Apache Lucene</a> -
Michael Busch. March 25th @ 14:00</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/251">Apache Solr - A Case
Study</a> - Uri Boness. March 26th @ 17:30</li><li><a class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/250">Best of breed - httpd,
forrest, solr and droids</a> - Thorsten Scherler. March 27th @ 17:30</li><li><a
class="externalLink"
href="http://eu.apachecon.com/c/aceu2009/sessions/165">Apache Droids - an
intelligent standalone robot framework</a> - Thorsten Scherler. March 26th @
15:00</li></ul></dd><dt>December 2008: Apache Tika Release</dt><dd> Apache Tika
0.2 has been released. Please see the download page for more
details.</dd><dt>November 2008: User mailing list created</dt><dd> A new
mailing list, [email protected], has been created for discussion
about the use of the Tika toolkit. You can subscribe to this mailing list by
sending a message to [email protected].</dd><dt>October 2008:
Tika graduates to a Lucene subproject</dt><dd> Tika has graduated from the
Incubator to become a subproject of Apache Lucene. The project infrastructure
will be migrated from incubator.apache.org to
lucene.apache.org.</dd><dt>October 2008: Apache Tika status report</dt><dd>
Dave Meikle was just voted in as a new committer.<p>Paolo Mottadelli will
present Tika at ApacheCon US.</p><p>Tika 0.2 should be released
soon.</p><p>Usage documentation has been added to the website.</p></dd><dt>July
2008: Apache Tika status report</dt><dd> The Tika community remains relatively
small, with just a handful of active members.<p>Work towards Tika 0.2 continues;
Chris Mattmann has volunteered to be the release manager.</p></dd><dt>April 2008:
Apache Tika status report</dt><dd> Niall Pemberton joined the project as a
committer and PPMC member<p>The number of issues reported by external
contributors is growing gradually.</p>
<p>There was a Fast Feather Talk on Tika in ApacheCon EU 2008</p><p>We have
good contacts especially with Apache POI and PDFBox</p><p>We are working
towards Tika 0.2</p><p>Metadata handling improvements are being
discussed</p></dd><dt>January 2008: Apache Tika status report</dt><dd> No new
committers since the last report, activity has been moderate but steady,
leading to the 0.1 release.<p>Tika 0.1 (incubating) has just been
released.</p><p>Chris Mattmann intends to use that release in Nutch. That's
good progress towards Tika's goal of providing data extraction functionality to
other projects.</p><p>A new Tika logo was created by a Google Highly Open
Participation student, but it hasn't been integrated yet.</p></dd><dt>December 27th,
2007: Tika 0.1-incubating Released!</dt><dd> Tika has made its first official
release, titled 0.1-incubating. See the <a class="externalLink"
href="http://www.apache.org/dist/incubator/tika/CHANGES-0.1-incubating.txt">CHANGES.txt</a>
file for more information on the list of updates in this initial release.
Thanks to all who
contributed! You can download the official source tarball <a
class="externalLink"
href="http://www.apache.org/dyn/closer.cgi/incubator/tika">here</a>.</dd><dt>October
8th, 2007: Welcome Keith Bennett!</dt><dd> The Tika PPMC has <a
class="externalLink"
href="http://www.nabble.com/Please-welcome-Keith-Bennett-as-a-Tika-committer%21-tf4586151.html#a13107428">elected</a>
Keith Bennett as our new committer. Welcome!</dd><dt>March 22nd, 2007: Apache
Tika project started</dt><dd> The Apache Tika project was formally started when
the <a class="externalLink"
href="http://wiki.apache.org/incubator/TikaProposal">Tika proposal</a> was <a
class="externalLink"
href="http://mail-archives.apache.org/mod_mbox/incubator-general/200703.mbox/%[email protected]%3e">accepted</a>
by the <a class="externalLink" href="http://incubator.apache.org/">Apache
Incubator PMC</a>. </dd></dl></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/src/site/apt/0.10/gettingstarted.apt
URL:
http://svn.apache.org/viewvc/tika/site/src/site/apt/0.10/gettingstarted.apt?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/src/site/apt/0.10/gettingstarted.apt (original)
+++ tika/site/src/site/apt/0.10/gettingstarted.apt Wed Oct 5 13:17:51 2011
@@ -137,23 +137,29 @@ Using Tika in an Ant project
... <!-- your other classpath entries -->
<pathelement location="path/to/tika-core-0.10.jar"/>
<pathelement location="path/to/tika-parsers-0.10.jar"/>
+ <pathelement location="path/to/netcdf-4.2-min.jar"/>
+ <pathelement location="path/to/slf4j-api-1.5.6.jar"/>
+ <pathelement location="path/to/apache-mime4j-core-0.7.jar"/>
+ <pathelement location="path/to/apache-mime4j-dom-0.7.jar"/>
+ <pathelement location="path/to/commons-compress-1.1.jar"/>
+ <pathelement location="path/to/commons-codec-1.4.jar"/>
+ <pathelement location="path/to/pdfbox-1.6.0.jar"/>
+ <pathelement location="path/to/fontbox-1.6.0.jar"/>
+ <pathelement location="path/to/jempbox-1.6.0.jar"/>
<pathelement location="path/to/commons-logging-1.1.1.jar"/>
- <pathelement location="path/to/commons-compress-1.0.jar"/>
- <pathelement location="path/to/pdfbox-0.10.0-incubating.jar"/>
- <pathelement location="path/to/fontbox-0.10.0-incubator.jar"/>
- <pathelement location="path/to/jempbox-0.10.0-incubator.jar"/>
- <pathelement location="path/to/poi-3.6.jar"/>
- <pathelement location="path/to/poi-scratchpad-3.6.jar"/>
- <pathelement location="path/to/poi-ooxml-3.6.jar"/>
- <pathelement location="path/to/poi-ooxml-schemas-3.6.jar"/>
+ <pathelement location="path/to/poi-3.8-beta4.jar"/>
+ <pathelement location="path/to/poi-scratchpad-3.8-beta4.jar"/>
+ <pathelement location="path/to/poi-ooxml-3.8-beta4.jar"/>
+ <pathelement location="path/to/poi-ooxml-schemas-3.8-beta4.jar"/>
<pathelement location="path/to/xmlbeans-2.3.0.jar"/>
<pathelement location="path/to/dom4j-1.6.1.jar"/>
- <pathelement location="path/to/xml-apis-1.0.b2.jar"/>
<pathelement location="path/to/geronimo-stax-api_1.0_spec-1.0.jar"/>
- <pathelement location="path/to/tagsoup-1.2.jar"/>
+ <pathelement location="path/to/tagsoup-1.2.1.jar"/>
<pathelement location="path/to/asm-3.1.jar"/>
- <pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
+ <pathelement location="path/to/boilerpipe-1.1.0.jar"/>
+ <pathelement location="path/to/rome-0.9.jar"/>
+ <pathelement location="path/to/jdom-1.0.jar"/>
</classpath>
---
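The explicit pathelement list above can also be written as an Ant fileset. This is only a sketch, assuming every jar named in the list has been copied into a single directory; the directory name lib/tika is hypothetical and does not come from the Tika instructions.

```xml
<classpath>
  <!-- your other classpath entries go here -->
  <!-- Assumption: tika-core, tika-parsers and all dependency jars
       listed above have been copied into lib/tika -->
  <fileset dir="lib/tika">
    <include name="*.jar"/>
  </fileset>
</classpath>
```

With this layout, a version bump means swapping the jars in that directory instead of editing each pathelement. The trade-off is that the build file no longer shows the exact dependency versions.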
@@ -178,13 +184,35 @@ Using Tika as a command line utility
usage: java -jar tika-app-0.10.jar [option] [file]
Options:
- -? or --help Print this usage message
- -v or --verbose Print debug level messages
- -g or --gui Start the Apache Tika GUI
- -x or --xml Output XHTML content (default)
- -h or --html Output HTML content
- -t or --text Output plain text content
- -m or --metadata Output only metadata
+ -? or --help Print this usage message
+ -v or --verbose Print debug level messages
+
+ -g or --gui Start the Apache Tika GUI
+ -s or --server Start the Apache Tika server
+
+ -x or --xml Output XHTML content (default)
+ -h or --html Output HTML content
+ -j or --json Output JSON content
+ -t or --text Output plain text content
+ -T or --text-main Output plain text content (main content only)
+ -m or --metadata Output only metadata
+ -l or --language Output only language
+ -d or --detect Detect document type
+ -eX or --encoding=X Use output encoding X
+ -z or --extract Extract all attachments into current directory
+ -r or --pretty-print For XML and XHTML outputs, adds newlines and
+ whitespace, for better readability
+
+ --create-profile=X
+ Create NGram profile, where X is a profile name
+ --list-parsers
+ List the available document parsers
+ --list-parser-details
+ List the available document parsers, and their supported mime types
+ --list-met-models
+ List the available metadata models, and their supported keys
+ --list-supported-types
+ List all known media types and related information
Description:
Apache Tika will parse the file(s) specified on the
@@ -196,12 +224,21 @@ Description:
If no file name or URL is specified (or the special
name "-" is used), then the standard input stream
- is parsed.
+ is parsed. If no arguments were given and no input
+ data is available, the GUI is started instead.
+
+- GUI mode
+
+ Use the "--gui" (or "-g") option to start the
+ Apache Tika GUI. You can drag and drop files from
+ a normal file explorer to the GUI window to extract
+ text content and metadata from the files.
+
+- Server mode
- Use the "--gui" (or "-g") option to start
- the Apache Tika GUI. You can drag and drop files
- from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
+ Use the "--server" (or "-s") option to start the
+ Apache Tika server. The server will listen to the
+ ports you specify as one or more arguments.
---
You can also use the jar as a component in a Unix pipeline or
Modified: tika/site/src/site/apt/download.apt
URL:
http://svn.apache.org/viewvc/tika/site/src/site/apt/download.apt?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/src/site/apt/download.apt (original)
+++ tika/site/src/site/apt/download.apt Wed Oct 5 13:17:51 2011
@@ -28,8 +28,8 @@ Download Apache Tika
SHA1: <<<355d0b2fa0de232672e4760941ea0dcf641a82ad>>>\
MD5: <<<96fb7db1b0c93d1e958a2ee52c4bd02f>>>
- * {{{http://www.apache.org/dyn/closer.cgi/tika/0.10/tika-app-0.10.jar}tika-app-0.10.jar}}
- (runnable jar, {{{http://www.apache.org/dist/tika/0.10/tika-app-0.10.jar.asc}PGP signature}})\
+ * {{{http://www.apache.org/dyn/closer.cgi/tika/tika-app-0.10.jar}tika-app-0.10.jar}}
+ (runnable jar, {{{http://www.apache.org/dist/tika/tika-app-0.10.jar.asc}PGP signature}})\
SHA1: <<<e1ad4e6cc4601c1c1367c646b2fbc57788664bed>>>\
MD5: <<<d4b1136ddedc3ae2f9af778cea42c219>>>
Modified: tika/site/src/site/apt/index.apt
URL:
http://svn.apache.org/viewvc/tika/site/src/site/apt/index.apt?rev=1179211&r1=1179210&r2=1179211&view=diff
==============================================================================
--- tika/site/src/site/apt/index.apt (original)
+++ tika/site/src/site/apt/index.apt Wed Oct 5 13:17:51 2011
@@ -23,7 +23,7 @@ Apache Tika - a content analysis toolkit
structured text content from various documents using existing parser
libraries. You can find the latest release on the
{{{./download.html}download page}}. See the
- {{{./0.9/gettingstarted.html}Getting Started}} guide for instructions on
+ {{{./0.10/gettingstarted.html}Getting Started}} guide for instructions on
how to start using Tika.
Tika is a project of the