Extracting content from CJK PDFs with PDFBox and indexing it
with Lucene fails on Solaris: CJK searches return no results.
----------------------------------------------------------------------------------------------------------------
Key: PDFBOX-929
URL: https://issues.apache.org/jira/browse/PDFBOX-929
Project: PDFBox
Issue Type: Bug
Components: Lucene, PDFReader, Text extraction
Affects Versions: 1.3.1
Environment: Solaris
Reporter: gomathy s
Fix For: 1.3.1
On Solaris, we use PDFBox to extract the content of a PDF, set a few
lines of that content as a description, and index the content with
Lucene. Searching with CJK characters returns no results, whereas
searching with English words retrieves results as expected. I am using
the correct analyzer during both indexing and searching. This happens
only on Solaris; on Windows it works fine. Please advise, as this is a
major issue for me.
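
One thing that may be worth ruling out (an assumption, not something
confirmed in this report) is a difference in the JVM default character
encoding between the Solaris and Windows hosts. The sketch below uses only
the Java standard library, not the PDFBox or Lucene APIs; the class name
EncodingCheck, the method survivesRoundTrip, and the sample string are
hypothetical and chosen for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic: checks whether CJK text survives a round trip
// through a given charset. It does not touch PDFBox or Lucene.
public class EncodingCheck {

    // Encodes the text to bytes and decodes it again with the same charset.
    // Returns true only if the decoded text equals the original.
    static boolean survivesRoundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset).equals(text);
    }

    public static void main(String[] args) {
        // Sample CJK string written with Unicode escapes so that the
        // source file encoding cannot affect the test itself.
        String sample = "\u65E5\u672C\u8A9E";

        Charset platformDefault = Charset.defaultCharset();
        System.out.println("file.encoding property: "
                + System.getProperty("file.encoding"));
        System.out.println("Default charset: " + platformDefault.name());
        System.out.println("CJK survives default charset: "
                + survivesRoundTrip(sample, platformDefault));
    }
}
```

If the last line prints false on the Solaris host but true on Windows,
any step that converts the extracted text to bytes and back without an
explicit charset could be losing the CJK characters before they reach the
index. Whether that is what happens here is not established.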
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.