Hi Annie,

Can you please create a JIRA issue for this, and also please create a diff against the Tika trunk by doing the following:

0. create JIRA issue for Matlab parser
1. svn co http://svn.apache.org/repos/asf/tika/trunk tika
2. cd tika
3. drop your Matlab parser files in e.g., tika-parsers/src/main/java/org/apache/tika/parser/matlab
4. update file packages, etc.
5. svn status (files look ok?)
6. svn diff > TIKA-xxx.aburgess.yyMMdd.patch.txt
   (where xxx is the JIRA issue id from 0.)

Then if you attach the diff to ReviewBoard I can annotate the lines etc. with comments.

Thanks! Also, once you create the JIRA issue I will help get it into the sources.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Ann Burgess <anniebry...@gmail.com>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>, "Bryant, Ann C (398J-Affiliate)" <anniebry...@gmail.com>
Date: Thursday, June 5, 2014 11:37 AM
To: Chris Mattmann <mattm...@apache.org>
Cc: Matthias Krueger <c...@mkr.io>, tika <dev@tika.apache.org>, "Bryant, Ann C (398J-Affiliate)" <anniebry...@gmail.com>, Nick Burch <n...@apache.org>
Subject: Re: Review Request 22246: New parser for Matlab .mat files

>
>
>> On June 4, 2014, 11:25 p.m., Matthias Krueger wrote:
>> > The Matlab MIME types used seem to be application/x-matlab-data or application/matlab-mat.
>> >
>> > Would it make sense to add them to the mime XML for detection?
>> >
>> >   <mime-type type="application/x-matlab-data">
>> >     <comment>MATLAB data file</comment>
>> >     <alias type="application/matlab-mat"/>
>> >     <magic priority="50">
>> >       <match value="MATLAB" type="string" offset="0"/>
>> >     </magic>
>> >     <glob pattern="*.mat"/>
>> >   </mime-type>
>> >
>> >
>>
>> Chris Mattmann wrote:
>>     +1 this makes a ton of sense to add IMO.
>>
>> Nick Burch wrote:
>>     There's some odd whitespace going on - we normally use 4 spaces and no tabs.
>>
>>     When outputting the variables, it would probably make sense to put each one into either a paragraph or a list, so that we get helpful output in html mode as well as text mode
>>
>>     With that in place, it would then be possible to have a unit test that checked the html output, as well as the current text one
>>
>>     Also on testing, I think at least some of the tests have an implementation of assertContains, which generally gives a more helpful failure message than assertTrue(s.contains(...)) does, might be worth looking into that?
>
>Great input - thank you! I will integrate both and upload the diff.
>
>
>- Ann
>
>
>-----------------------------------------------------------
>This is an automatically generated e-mail. To reply, visit:
>https://reviews.apache.org/r/22246/#review44773
>-----------------------------------------------------------
>
>
>On June 4, 2014, 10:23 p.m., Ann Burgess wrote:
>>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/22246/
>> -----------------------------------------------------------
>>
>> (Updated June 4, 2014, 10:23 p.m.)
>>
>>
>> Review request for tika and Chris Mattmann.
>>
>>
>> Repository: tika
>>
>>
>> Description
>> -------
>>
>> This is a new parser for Matlab .mat files. The parser utilizes the JmatIO, Matlab's MAT-file I/O API in JAVA. JmatIO is available through Maven Central.
>> The text output from this parser provides variable names and dimensions that are both inside and outside of data structures, but does NOT provide the actual data values within each .mat file.
>>
>>
>> Diffs
>> -----
>>
>>
>> Diff: https://reviews.apache.org/r/22246/diff/
>>
>>
>> Testing
>> -------
>>
>> Successfully run a basic unit test that checks both --text and --metadata parser output.
>>
>>
>> File Attachments
>> ----------------
>>
>> Parser File
>>   https://reviews.apache.org/media/uploaded/files/2014/06/04/cb39636d-ec53-4fbc-b348-6a4db8907f6b__MatParser.java
>> Unit Test
>>   https://reviews.apache.org/media/uploaded/files/2014/06/04/bbff8c6b-caa1-4830-b441-532c28c3c78e__MatParserTest.java
>>
>>
>> Thanks,
>>
>> Ann Burgess
>>
>>
>
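
P.S. On Nick's point in the quoted review about assertContains versus assertTrue(s.contains(...)): below is a rough sketch of why the former tends to give a more helpful failure message. This is only an illustration. The class name ContainsAssertions and the sample output string are made up for this example, and this is not the helper that already exists in the Tika tests, so please check the existing test classes for the real one before copying anything.

```java
// Hypothetical helper for illustration only; NOT Tika's own assertContains.
final class ContainsAssertions {
    private ContainsAssertions() {}

    // Fails with a message that shows both the expected substring and the
    // full text that was searched, unlike assertTrue(s.contains(...)),
    // which by default reports nothing about what was being compared.
    static void assertContains(String needle, String haystack) {
        if (needle == null || haystack == null || !haystack.contains(needle)) {
            throw new AssertionError(
                "Expected to find '" + needle + "' in:\n" + haystack);
        }
    }

    public static void main(String[] args) {
        // Made-up sample text, standing in for parser output.
        String output = "myVar double:[2x2]";

        // Passes silently: the substring is present.
        assertContains("double", output);

        // Fails: print the message to see what a test failure would report.
        try {
            assertContains("int8", output);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The same pattern should carry over to an html-mode test: run the parser into an XHTML handler, then call the helper on the resulting string.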