Could you create a new JIRA issue at
https://issues.apache.org/jira/browse/TIKA and upload the file
(checking the box for "Grant license to ASF for inclusion in ASF
works")?

Checking this box is really important: if there is a bug in
Tika/PDFBox with your Persian document, it would allow those projects
to add the PDF file to their regression tests.

On Mon, Sep 12, 2011 at 3:47 AM, ahmad ajiloo <ahmad.aji...@gmail.com> wrote:
> Yes, of course!
> Please find the attachment.
>
> On Mon, Sep 12, 2011 at 9:42 AM, Robert Muir <rcm...@gmail.com> wrote:
>>
>> 2011/9/12 ahmad ajiloo <ahmad.aji...@gmail.com>:
>> > Hello,
>> > I used Tika (within Nutch, of course) to parse some Persian PDF files.
>> > Some of the files were converted to plain text correctly, but for
>> > others the output was corrupted. I used the ICU4J v4 library and the
>> > text changed to right-to-left mode, but that did not resolve the
>> > problem: Tika still cannot understand any character of the input
>> > Persian PDF file!
>>
>> Maybe you can upload one of your PDF files to a Tika or PDFBox JIRA
>> issue so they can investigate the problem?
>>
>> --
>> lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>



-- 
lucidimagination.com
