I am using POI-HSLF to extract the text from the slides of a ppt file.
My code is very simple:
  PowerPointExtractor ppe = new PowerPointExtractor("c:\\4.ppt");
  String s = ppe.getText(true,true);
  System.out.println(s);
  ppe.close();

But it fails with this error:
  java.lang.ClassCastException: org.apache.poi.hslf.record.Slide cannot be cast to org.apache.poi.hslf.record.MainMaster
I traced the bug and found the following:
  For a ppt that has 2 or more MainMasters, the SlideShow constructor does
not retrieve the master records correctly:


  SlideListWithText masterSLWT = _documentRecord.getMasterSlideListWithText();
  SlideListWithText slidesSLWT = _documentRecord.getSlideSlideListWithText();
  SlideListWithText notesSLWT  = _documentRecord.getNotesSlideListWithText();

  //find master slides
  SlideAtomsSet[] masterSets = new SlideAtomsSet[0];
  org.apache.poi.hslf.record.MainMaster[] masterRecords = null;
  if (masterSLWT != null) {
      masterSets = masterSLWT.getSlideAtomsSets();
      masterRecords = new org.apache.poi.hslf.record.MainMaster[masterSets.length];
      for (int i = 0; i < masterRecords.length; i++) {
          masterRecords[i] = (org.apache.poi.hslf.record.MainMaster) getCoreRecordForSAS(masterSets[i]);
      }
  }

E.g. masterSets.length is 2, but there is only 1 master record in the
MostRecentCoreRecords, so the cast to MainMaster fails.
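
One possible defensive approach, sketched here with stand-in classes rather than the real HSLF ones, is to check the type of each core record before casting, so that a record which is not a MainMaster is skipped instead of raising a ClassCastException. The names StubRecord, StubMainMaster, StubSlide and collectMainMasters are hypothetical and exist only for this illustration. Whether skipping such records is the right handling inside HSLF is an assumption, not something confirmed by the POI source.

```java
import java.util.ArrayList;
import java.util.List;

class MasterRecordFilter {
    // Stand-in record types for illustration only; these are NOT the POI classes.
    static class StubRecord {}
    static class StubMainMaster extends StubRecord {}
    static class StubSlide extends StubRecord {}

    // Collect only the records that really are main masters.
    // Anything else (for example a slide record) is skipped rather than cast.
    static List<StubMainMaster> collectMainMasters(StubRecord[] coreRecords) {
        List<StubMainMaster> masters = new ArrayList<>();
        for (StubRecord r : coreRecords) {
            if (r instanceof StubMainMaster) {
                masters.add((StubMainMaster) r);
            }
        }
        return masters;
    }

    public static void main(String[] args) {
        // Two entries in the master list, but only one of them is a main master,
        // mirroring the situation described above.
        StubRecord[] coreRecords = { new StubMainMaster(), new StubSlide() };
        System.out.println(collectMainMasters(coreRecords).size()); // prints 1
    }
}
```

With an unchecked cast, the second entry in this example would throw the same kind of ClassCastException as in the report; with the instanceof check it is simply left out of the result.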

I also noticed that the ppt files in the test data in the src dir are all
simple; every one of them has only 1 master slide.

Has anyone tested files that have 2 or more master slides?
--
With respects,
  Mingfan.Lu
