If you push the poi version to 3.6 in your maven configuration, do you still get the error?
Mark

On Fri, Oct 8, 2010 at 9:47 AM, Keith Gilbertson <keith.gilbert...@library.gatech.edu> wrote:
> Mark - Thank you. It's in our maven repository. Graham had mentioned there
> would be some work to get this going, but I didn't know what it involved.
> Everything built and installed with some minor code changes, which was very
> nifty. I still got an error in the word filter.
>
> Hardy Pottinger had sent me a link to this notice:
> http://code.google.com/p/text-mining/issues/detail?id=5
>
> I didn't know what "rejar" meant, but I found this to work:
>
> 1. Get source for this version of text-mining utils with 'svn checkout
>    http://text-mining.googlecode.com/svn/trunk/ text-mining-read-only' command
> 2. From this tree, delete lib/poi-3.0.1-FINAL-20070705.jar and replace with
>    poi-3.6.jar
> 3. Rebuild with 'ant' command
> 4. Copy build/bin/tm-extractors-1.0.jar to
>    lib/dspace-tm-extractors-1.0.0.jar in my dspace deployment directory
>
> Then filter-media works fine with the new PowerPoint filter and the
> WordFilter.
>
> So, could we rebuild the dspace-tm-extractors-1.0.0.jar against poi-3.6 and
> put that in our maven repository? I suppose now would also be a good
> opportunity for me to learn about the unit testing framework and use it to
> make sure filtering still works as well as it did before the change!
>
> Ryan Ackley, the developer for these tm-extractors, also worked on the POI
> project for a while. Presumably he's very busy, but I'll contact him and
> ask if POI now has the full capability of the tm-extractors and hope for an
> answer - because maybe we don't even need the tm-extractors library if the
> POI extractors were rewritten by Ryan.
>
> It looks like the current WordFilter doesn't handle the new Microsoft Word
> XML formats - so that may be another small project for someone to take on
> soon.
>
> --keith
>
> On Oct 7, 2010, at 3:35 AM, Mark Diggory wrote:
>
> > As it's not in the maven central repository.
> > We would need to release it ourselves under org.dspace.dependencies or
> > see if someone else can push out a new version of tm-extractors for
> > maven central.
> >
> > To release into our repository, we just need to author a pom.xml file
> > for the tm-extractors and package the jar... I set this up, but had
> > some issues with sonatype failing to let me see the staged release on
> > their side. I did release to the central repository. Still waiting to
> > see it show up here:
> >
> > http://repo2.maven.org/maven2/org/dspace/dependencies/dspace-tm-extractors
> >
> > Once available, give it a try and see if it fixes your issues.
> >
> > Mark
> >
> > On Wed, Oct 6, 2010 at 11:11 AM, Keith Gilbertson
> > <keith.gilbert...@library.gatech.edu> wrote:
> >
> > Thanks Graham and Tim. I hadn't seen that.
> >
> > On Oct 6, 2010, at 11:52 AM, Graham Triggs wrote:
> >
> > That version of tm-extractors is quite old.
> > There is a newer version on the Google site
> > - http://code.google.com/p/text-mining/ - but it will take a bit of work
> > wrapping things up for general use.
> > It has dependencies on newer versions of POI than 0.4, and some distinct
> > improvements to its robustness.
> > G
> >
> > On 6 October 2010 16:39, Tim Donohue <tdono...@duraspace.org> wrote:
> >
> > Ugh -- sounds like you've entered dependency hell.
> >
> > Though, I think the one shred of good news here is that it seems to only
> > have a dependency conflict in one place in our codebase.
> >
> > It looks like (at a glance) if our WordFilter can be re-written to no
> > longer need the org.textmining project, you *might* be OK (i.e.
> > hopefully it wouldn't snowball on you). But, that would require finding
> > a Word document text extractor that is as good as (or better than) that
> > 'org.textmining' one, and then hoping it doesn't cause another
> > dependency conflict. Not sure of any alternative Word text extractors,
> > off the top of my head, but maybe others know of one?
> > - Tim

-- 
Mark R. Diggory
Head of U.S. Operations - @mire

http://www.atmire.com - Institutional Repository Solutions
http://www.togather.eu - Before getting together, get t...@ther

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
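
For anyone following the thread later: "pushing the poi version to 3.6 in your maven configuration" might look roughly like the pom.xml fragment below. This is only a sketch, not the actual DSpace pom. It assumes the standard Apache POI coordinates on Maven Central (org.apache.poi:poi and org.apache.poi:poi-scratchpad, the latter being where the Word support lives), and it guesses version 1.0.0 for the repackaged tm-extractors from the jar name mentioned above. Which pom.xml in your build declares POI, and whether it still uses older coordinates, is something to check locally.

```xml
<!-- Sketch only: merge into the <dependencies> section of whichever
     pom.xml declares POI in your build. -->
<dependencies>
  <!-- Core POI at the version discussed in the thread -->
  <dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.6</version>
  </dependency>
  <!-- Scratchpad module (HWPF/HSLF), used for Word and PowerPoint files -->
  <dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.6</version>
  </dependency>
  <!-- Repackaged tm-extractors; groupId and artifactId are taken from the
       repository URL in the thread, the version is an ASSUMPTION based on
       the jar name dspace-tm-extractors-1.0.0.jar -->
  <dependency>
    <groupId>org.dspace.dependencies</groupId>
    <artifactId>dspace-tm-extractors</artifactId>
    <version>1.0.0</version>
  </dependency>
</dependencies>
```

After changing the versions, running 'mvn dependency:tree' is one way to confirm that only a single POI version ends up on the classpath.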