https://bz.apache.org/bugzilla/show_bug.cgi?id=60556
Bug ID: 60556
Summary: IllegalArgumentException: The end () must not be
before the start ()
Product: POI
Version: 3.15-FINAL
Hardware: PC
Status: NEW
Severity: major
Priority: P2
Component: HWPF
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
Created attachment 34596
--> https://bz.apache.org/bugzilla/attachment.cgi?id=34596&action=edit
File on which the code fails
I'm extracting text with the WordExtractor class (Apache POI), but I get an
error for some .doc files. Here is the code:
"
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.poifs.filesystem.OfficeXmlFileException;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class [class name]
{
    public static void main(String... args) throws FileNotFoundException,
            IOException, NullPointerException, OfficeXmlFileException {
        File[] files = new File("[input path]").listFiles();
        showFiles(files);
    }

    // Walks the directory tree and appends one "path <TAB> text" row per document.
    public static void showFiles(File[] files) throws
            FileNotFoundException, IOException, NullPointerException,
            OfficeXmlFileException {
        File log = new File("[output name]/out.tsv");
        for (File file : files) {
            if (file.isDirectory()) {
                //System.out.println("Directory/" + file.getName());
                showFiles(file.listFiles()); // Calls same method again.
            } else {
                String N = file.getName();
                // Case: .docx (skip temporary files starting with "~")
                if (N.toLowerCase().endsWith(".docx") &&
                        !N.toLowerCase().startsWith("~"))
                {
                    System.out.println(file.getAbsolutePath());
                    XWPFDocument docx = new XWPFDocument(new FileInputStream(file));
                    XWPFWordExtractor we = new XWPFWordExtractor(docx);
                    String T = we.getText().replaceAll("\\n", " ").replaceAll("\\r", " ");
                    // Write to the output file
                    try {
                        // if(!log.exists()){
                        //     System.out.println("We had to make a new file.");
                        //     log.createNewFile();
                        // }
                        FileWriter fileWriter = new FileWriter(log, true);
                        BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);
                        bufferedWriter.write(file.getAbsolutePath() + "\t" + T + "\n");
                        bufferedWriter.close();
                    } catch (IOException e) {
                        System.err.println("Problem writing .DOCX to the file out.txt "
                                + e.getMessage());
                    }
                }
                else {
                    // Case: .doc (skip temporary files starting with "~")
                    if (N.toLowerCase().endsWith(".doc") &&
                            !N.toLowerCase().startsWith("~"))
                    {
                        System.out.println(file.getAbsolutePath());
                        HWPFDocument doc = new HWPFDocument(new FileInputStream(file));
                        WordExtractor we = new WordExtractor(doc);
                        //WordExtractor we = new WordExtractor(new FileInputStream(file));
                        String T = we.getText().replaceAll("\\n", " ").replaceAll("\\r", " ");
                        // Write to the output file
                        try {
                            // if(!log.exists()){
                            //     log.createNewFile();
                            // }
                            FileWriter fileWriter = new FileWriter(log, true);
                            BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);
                            bufferedWriter.write(file.getAbsolutePath() + "\t" + T + "\n");
                            bufferedWriter.close();
                        } catch (IOException e) {
                            System.err.println("Problem writing .DOC to the file out.txt "
                                    + e.getMessage());
                        }
                    }
                }
            }
        }
    }
}
"
For most .docx and .doc files it works fine.
The error message is:
Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: The end (4958) must not be before the start (4990)
How can I fix it?
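
Until the cause is found, one possible stopgap is to isolate the failure per file so that the remaining documents are still processed. The following is only a minimal sketch under stated assumptions: `SafeExtraction` and `extractOrNull` are hypothetical names, and the lambda in `main` is a stand-in for the real `we.getText()` call (it simulates the exception rather than calling POI). It does not fix the underlying problem with the attached file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of Apache POI.
class SafeExtraction {

    // Runs the extractor on one source. If it throws a RuntimeException
    // (such as the wrapped IllegalArgumentException reported above),
    // the failure is recorded and null is returned instead of aborting.
    static <T> String extractOrNull(T source,
                                    Function<T, String> extractor,
                                    List<String> failures) {
        try {
            return extractor.apply(source);
        } catch (RuntimeException e) {
            failures.add(source + ": " + e);
            return null;
        }
    }

    public static void main(String[] args) {
        List<String> failures = new ArrayList<>();

        // Stand-in for the real extraction; in the reported code this would
        // build the HWPFDocument and call WordExtractor.getText().
        Function<String, String> fakeExtractor = name -> {
            if (name.startsWith("bad")) {
                throw new RuntimeException(new IllegalArgumentException(
                        "The end (4958) must not be before the start (4990)"));
            }
            return "text of " + name;
        };

        for (String name : new String[] {"good.doc", "bad.doc"}) {
            String text = extractOrNull(name, fakeExtractor, failures);
            System.out.println(name + " -> " + (text == null ? "skipped" : text));
        }
        System.out.println("failed files: " + failures.size());
    }
}
```

In the code above, the same pattern would wrap the two `getText()` calls inside `showFiles`, so a single problematic .doc file is logged and skipped rather than stopping the whole run.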