Hi there, (+sorry for long post) A little while ago I've added and -just a few minutes ago- updated ODS read support for Octave in the IO package (in SVN) (ODS = spreadsheet format used by OpenOffice.org Calc, it's the spreadsheet subtype of Open Document Format). While this was motivated mainly by an informal question by Alois Schloegl, I could use octave ods support for my own needs as well.
As it stands, reading from ODS is fairly reliable (but admittedly a bit slow) - though more testing is surely welcome. For me it works on Windows (XP SP3) and on Linux (Mandriva 2009.0). Writing ODS is still a bridge too far (see Developers's Notes below). Relevant m-files: odsread / odsopen / ods2oct / odsclose / odsfinfo (/ calccelladdress) For this ODS support the java package is needed + one or both of (odfdom.jar + xercesImpl.jar) or (jopendocument.jar), and in all cases, these jars + rt.jar from your java jre / java jdk should be in your javaclasspath. The former (odfdom + xercesImpl) is presently the preferred option. Get the jars from: http://odftoolkit.org/projects/odfdom/downloads/directory/current-version http://xerces.apache.org/mirrors.cgi or www.jopendocument.org Usage =============== Usage is easy and modeled after the Excel xls stuff I committed earlier: Use odsfinfo() to explore ods-files with unknown content; then just do: [<outputdata>] = odsread (<spreadsheet/worksheet/...>) ----or (= more flexible)---- ods = odsread (filename.ods) # Get filepointer; [<data>] = ods2oct (ods, worksheet/range) # Get actual raw data; [.. = parsecell (...) ] # Separate numbers and text; : # Optionally get more data : # from same file; ods = odsclose (ods) # Close file pointer. (see the help from each m-file) Like in the xls java stuff I contributed a while ago, my scripts also return the outer column and row numbers where the data came from (AFAIK Excel nor Matlab can do this), an IMO handy feature for exploring spreadsheets with unknown content. Contributors note for package maintainer ======================================== Although ODS support doesn't work without java, I left the requirements for the java package as "suggested" as lacking java support is (should be) gracefully catched & the user informed appropriately. Developer's notes (as far as I'm "developer" - more "Hobby programmer"): ======================================================================= ODS (a zipped xml format) seems easy to parse (after unzipping you can read it) so I once had the impression that simply unzipping the ods file and then processing what's inside would do. But it turns out that things quickly and steeply become complex far beyond basic needs (= just simple ODS data exchange). Unzipping is the first obstacle - the current tools in the miscellaneous package do not allow direct unzipping into memory (as e.g. a char array). Fiddling around with temporary disk files is not very elegant IMO and might induce frustrations if your USB stick is almost full (or when your IT dept has particularly restrictive write policies). A way around might be possible (e.g. IIRC V7 mat files are compressed and that functionality is in core octave) but that's beyond me currently. What's needed is unzipping into a stream and directly parsing that stream into an XML tree (a la pipe). After having explored various unzipped ODS files, having read the discussions in nabble on XML, and having stumbled over the absolute lack of usable (for me) documentation for the XML stuff in the miscellaneous package, I decided to definitely go for a java-based solution: - First of all we need plain data exchange. Post-processing, formatting and worksheet overhaul like inserting columns & merging cells can wait. - Octave <-> spreadsheet data exchange primarily revolves around rectangular arrays that don't map easily onto tree-like XML data structures and vice versa. - So, to make it simple, we need something that hides the gory XML details and presents a "rectangular" (let's say table) model to octave. - Such interfaces must have been written many many times. Why reinvent the wheel? - I'm not quite a java fanboy (too much bloat I think) but it *does* help platform independence and -once you get the hang of it- it *does* speed up development. - Matlab is heavily dependent on java. Typing 'javaclasspath' (or the equivalent) in Matlab returns a LONG list of java classes installed by default. If I'm not mistaken Matlab may even add a java JDK during installation - our IT dept installs all our SW so I can't tell for sure. - I did find a number of (at first sight!) promising java solutions. - Java adds a dependency; but IMO that is outweighed by the advantages: (1) octave now has ODS (read) support and (2) even a mere hobby programmer like me could make it with a reasonable time investment. The currently available java solutions I found & selected have their limitations: --- jOpenDocument is the most promising, but currently lacks key features (some vital methods are protected and therefore unusable, some obviously needed methods are lacking, and of course there are bugs) while -as stated in their support forums- development is primarily driven by paying customers. Maybe a next version next year works better... (I really hope so - jOpenDocument opens the way for easy ODS write support). But: jOpenDocument is *much* faster than ODFtoolkit. I've left jOpenDocument support in because (1) it *does* work, and (2) it should be developed further once the time is right. --- ODFtoolkit is big and IMO leans too much to the ODF (xml) format itself rather than abstracting away from it to make access easy. E.g., spreadsheet support mainly goes through Tables; in the xml domain that is an obvious choice for developers but less so for what octave ODS support needs (although it goes some way). Once I was handed a table-cell I simply resorted to octave's native string processing for the final bits - admittedly clumsy but it works and it's faster than preparing and executing another java_invoke(). This also avoids potential problems as java (.util.Date) apparently has epoch at 1-1-1970 (so earlier java dates might not be represented properly) while OO.o Calc happily fits them in. Write support ============= This whole state of things is the main reason that I've only glanced at ODS write support but decided to let that rest for now. If someone desperately needs it, well, the beauty of open source is .... Best wishes, Philip ------------------------------------------------------------------------------ This SF.Net email is sponsored by the Verizon Developer Community Take advantage of Verizon's best-in-class app development support A streamlined, 14 day to market process makes app distribution fast and easy Join now and get one step closer to millions of Verizon customers http://p.sf.net/sfu/verizon-dev2dev _______________________________________________ Octave-dev mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/octave-dev
