On Fri, May 30, 2014 at 2:41 AM, Aaron Misquith <aaronmisqu...@gmail.com> wrote:
> Like pypdf is used to convert pdf to text; is there any library that is
> used in converting .ppt files to .txt? Even some sample programs will be
> helpful.
>

I suspect you'd need to use PowerPoint itself to do that cleanly; you
can definitely drive PowerPoint from Python if you so desire, though:

http://www.s-anand.net/blog/automating-powerpoint-with-python/

If anybody's written a package to brute-force the text out of a .ppt
file without using PowerPoint, though, I'm unaware of it. That way lies
madness, I suspect. (The new MS Office formats - .docx, .xlsx, .pptx -
are XML files inside of a renamed ZIP container; it should be fairly
easy to get the text out of a .pptx file using any one of Python's XML
libraries. But the older format is proprietary and extremely scary.)
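
For the .pptx case, here is a rough sketch of what "get the text out
with an XML library" could look like, using only the standard library
(zipfile and xml.etree.ElementTree). It assumes the usual .pptx layout,
where each slide is stored as ppt/slides/slideN.xml and the visible text
sits in DrawingML <a:t> elements. The function name pptx_to_text is just
something made up for this example, and the slides come back in
file-name order, which is not guaranteed to match the presentation
order. It does nothing for the old binary .ppt format.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs on a slide are stored in <a:t> elements.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"

# Assumed layout: slides are stored as ppt/slides/slide1.xml, slide2.xml, ...
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def pptx_to_text(path):
    """Return a list with one string of text per slide in a .pptx file."""
    slides = []
    with zipfile.ZipFile(path) as zf:
        names = [n for n in zf.namelist() if SLIDE_RE.match(n)]
        # Sort numerically so that slide10 comes after slide9.
        names.sort(key=lambda n: int(SLIDE_RE.match(n).group(1)))
        for name in names:
            root = ET.fromstring(zf.read(name))
            paragraphs = []
            # Each <a:p> is a paragraph; join the text of its <a:t> runs.
            for p in root.iter(A_NS + "p"):
                text = "".join(t.text or "" for t in p.iter(A_NS + "t"))
                if text:
                    paragraphs.append(text)
            slides.append("\n".join(paragraphs))
    return slides
```

Something like "\n\n".join(pptx_to_text("slides.pptx")) would then give
you one block of text to write to a .txt file ("slides.pptx" being a
placeholder file name). Speaker notes and text inside embedded charts
live in other parts of the ZIP and are not picked up by this.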
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor