[ 
https://issues.apache.org/jira/browse/TIKA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112801#comment-13112801
 ] 

Michael McCandless commented on TIKA-712:
-----------------------------------------

Good idea!  Nice how approachable OOXML is...

In theory the answer is here:
http://www.ecma-international.org/publications/standards/Ecma-376.htm
but I have not tried to dig.
 
So, here's a boilerplate-only chunk from the master slide (PowerPoint does not 
display this on the slide):

{noformat}
      <p:sp>
        <p:nvSpPr>
          <p:cNvPr id="2" name="Title Placeholder 1"/>
          <p:cNvSpPr>
            <a:spLocks noGrp="1"/>
          </p:cNvSpPr>
          <p:nvPr>
            <p:ph type="title"/>
          </p:nvPr>
        </p:nvSpPr>
        <p:spPr>
          <a:xfrm>
            <a:off x="457200" y="274638"/>
            <a:ext cx="8229600" cy="1143000"/>
          </a:xfrm>
          <a:prstGeom prst="rect">
            <a:avLst/>
          </a:prstGeom>
        </p:spPr>
        <p:txBody>
          <a:bodyPr vert="horz" lIns="91440" tIns="45720" rIns="91440" bIns="45720" rtlCol="0" anchor="ctr">
            <a:normAutofit/>
          </a:bodyPr>
          <a:lstStyle/>
          <a:p>
            <a:r>
              <a:rPr lang="en-US" smtClean="0"/>
              <a:t>Click to edit Master title style
              </a:t>
            </a:r>
            <a:endParaRPr lang="en-US"/>
          </a:p>
        </p:txBody>
      </p:sp>
{noformat}

And here's the footer I edited (PowerPoint does display this on the slide):

{noformat}
      <p:sp>
        <p:nvSpPr>
          <p:cNvPr id="5" name="Footer Placeholder 4"/>
          <p:cNvSpPr>
            <a:spLocks noGrp="1"/>
          </p:cNvSpPr>
          <p:nvPr>
            <p:ph type="ftr" sz="quarter" idx="3"/>
          </p:nvPr>
        </p:nvSpPr>
        <p:spPr>
          <a:xfrm>
            <a:off x="3124200" y="6356350"/>
            <a:ext cx="2895600" cy="365125"/>
          </a:xfrm>
          <a:prstGeom prst="rect">
            <a:avLst/>
          </a:prstGeom>
        </p:spPr>
        <p:txBody>
          <a:bodyPr vert="horz" lIns="91440" tIns="45720" rIns="91440" bIns="45720" rtlCol="0" anchor="ctr"/>
          <a:lstStyle>
            <a:lvl1pPr algn="ctr">
              <a:defRPr sz="1200">
                <a:solidFill>
                  <a:schemeClr val="tx1">
                    <a:tint val="75000"/>
                  </a:schemeClr>
                </a:solidFill>
              </a:defRPr>
            </a:lvl1pPr>
          </a:lstStyle>
          <a:p>
            <a:r>
              <a:rPr lang="en-US" smtClean="0"/>
              <a:t>Slide footer is right here
              </a:t>
            </a:r>
            <a:endParaRPr lang="en-US"/>
          </a:p>
        </p:txBody>
      </p:sp>
{noformat}

I can't spot any obvious difference at a quick glance... I'll attach the full
master slide XML (there's lots of other stuff); the difference could be
elsewhere in there.
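
One way to compare the shapes side by side is to list each placeholder's name, type, and text. This is a rough sketch, not what Tika does: it assumes the master slide XML has been unzipped from the .pptx, and the function name list_placeholders and the trimmed SAMPLE string are made up for illustration (only the element names and namespaces come from the XML above).

```python
import xml.etree.ElementTree as ET

# Standard OOXML namespaces for PresentationML (p:) and DrawingML (a:).
NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}

# Trimmed-down copy of the two shapes quoted above (illustrative only).
SAMPLE = """<p:spTree
    xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
    xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <p:sp>
    <p:nvSpPr>
      <p:cNvPr id="2" name="Title Placeholder 1"/>
      <p:nvPr><p:ph type="title"/></p:nvPr>
    </p:nvSpPr>
    <p:txBody>
      <a:p><a:r><a:t>Click to edit Master title style
      </a:t></a:r></a:p>
    </p:txBody>
  </p:sp>
  <p:sp>
    <p:nvSpPr>
      <p:cNvPr id="5" name="Footer Placeholder 4"/>
      <p:nvPr><p:ph type="ftr" sz="quarter" idx="3"/></p:nvPr>
    </p:nvSpPr>
    <p:txBody>
      <a:p><a:r><a:t>Slide footer is right here
      </a:t></a:r></a:p>
    </p:txBody>
  </p:sp>
</p:spTree>"""


def list_placeholders(xml_text):
    """Return (name, placeholder type, text) for every <p:sp> in xml_text."""
    root = ET.fromstring(xml_text)
    rows = []
    for sp in root.iter("{%s}sp" % NS["p"]):
        c_nv_pr = sp.find("p:nvSpPr/p:cNvPr", NS)
        ph = sp.find("p:nvSpPr/p:nvPr/p:ph", NS)
        name = c_nv_pr.get("name") if c_nv_pr is not None else None
        ph_type = ph.get("type") if ph is not None else None
        # Concatenate all text runs in the shape and trim the line wrapping.
        text = "".join(t.text or "" for t in sp.iter("{%s}t" % NS["a"])).strip()
        rows.append((name, ph_type, text))
    return rows


if __name__ == "__main__":
    for row in list_placeholders(SAMPLE):
        print(row)
```

For the two shapes above this reports type "title" for the boilerplate text and type "ftr" for the edited footer; running it over the full master slide XML would show whether any other attribute separates displayed text from boilerplate.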



> Master slide text isn't extracted
> ---------------------------------
>
>                 Key: TIKA-712
>                 URL: https://issues.apache.org/jira/browse/TIKA-712
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Michael McCandless
>         Attachments: TIKA-712.patch, testPPT_masterFooter.ppt, 
> testPPT_masterFooter.pptx, testPPT_masterFooter2.ppt, 
> testPPT_masterFooter2.pptx
>
>
> It looks like we are not getting text from the master slide for PPT
> and PPTX.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
