= new ParseContext();
context.set(Parser.class, parser);
parser.parse(is, handler, metadata, context);
return new TikaBox(metadata, handler);{
At 2014-10-14 17:56:20, "Nick Burch" wrote:
>On Tue, 14 Oct 2014, imyuka wrote:
Hi all,
I get a 'more than 10 characters' exception while processing a
document. To avoid it, I can either use the abridged text or increase the
maximum limit. How can I increase the limit, or retrieve only
the first 10 characters of the document without the exception being thrown?
I have no idea about the Java implementation. Are there any
instructions or tutorials I can refer to?
Thanks!
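One possible approach to the question above, sketched under assumptions rather than taken from the thread: Tika's BodyContentHandler accepts a write limit in its constructor, and new BodyContentHandler(-1) disables the limit entirely. To keep only the first N characters without any exception being thrown, a custom SAX handler could be passed to the parser instead. The class name FirstCharsHandler below is hypothetical and not part of Tika. It relies only on the JDK's org.xml.sax classes and, unlike BodyContentHandler, it does not restrict itself to the document body.

```java
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical handler (not a Tika class): collects at most `limit`
 * characters of text and silently ignores everything after that,
 * so no exception is raised when the limit is reached.
 */
class FirstCharsHandler extends DefaultHandler {
    private final int limit;
    private final StringBuilder text = new StringBuilder();

    FirstCharsHandler(int limit) {
        this.limit = limit;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append only as much as still fits under the limit.
        int room = limit - text.length();
        if (room > 0) {
            text.append(ch, start, Math.min(room, length));
        }
    }

    @Override
    public String toString() {
        return text.toString();
    }
}
```

Assuming the parser accepts any SAX ContentHandler, an instance such as new FirstCharsHandler(10) would take the place of handler in the parser.parse(is, handler, metadata, context) call, and toString() would return the truncated text afterwards.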
At 2014-10-09 20:46:01, "Nick Burch" wrote:
>On Thu, 9 Oct 2014, imyuka wrote:
>> Here is my problem: I have extracted plain texts from a s
Hi all,
Here is my problem: I have extracted the plain text from a series of doc(x)
documents, along with their titles via the "dc:title" metadata key, but I'm not
sure this is the right way to obtain a document's title. In many cases, a
title inside a document could be of the largest font-s