I believe that in OpenNLP 1.5 and above, the model metadata is part of the model/zip file itself, so I was thinking of making it even simpler. It should be as simple as:

    SentenceModel model = new SentenceModel(inputStream);

--Pei

> -----Original Message-----
> From: William Karl Thompson [mailto:[email protected]]
> Sent: Tuesday, June 11, 2013 2:08 PM
> To: [email protected]
> Subject: RE: InputSteam instead of java.io.File
>
> Issue (1) is something I've encountered too, in the
> SuffixMaxentModelResourceImpl class. There is a call to
> DataResource.getUrl() which doesn't work if the resource is located in a jar
> file. Replacing this with the following code (starting on line 55) fixed the
> problem:
>
>     //File modelFile = new File(dr.getUri());
>     InputStream is = dr.getInputStream();
>     DataReader dataReader = new PlainTextFileDataReader(is);
>     GISModelReader modelReader = new GISModelReader(dataReader);
>     iv_maxentModel = modelReader.getModel();
>     is.close();
>
> -----Original Message-----
> From: Chen, Pei [mailto:[email protected]]
> Sent: Tuesday, June 11, 2013 12:50 PM
> To: [email protected]
> Subject: InputSteam instead of java.io.File
>
> While working on the test cases in cTAKES, I've encountered a couple of
> issues and have some suggestions:
>
> 1) File or Url.getRawPath() becomes problematic when resources are read
>    from jars on the classpath, since they cannot be resolved to a physical
>    File.
>
>    a. Suggestion: Wherever possible, replace loading of resources via
>       java.io.File with InputStream instead. We can add a new method in the
>       FileLocator util and deprecate the old File method.
>
> 2) Sentence Detector is still using the OpenNLP 1.4 mechanism of loading
>    its model file.
>
>    a. Suggestion: Let's update it to use the new 1.5 way, similar to
>       POSTagger. (Remove classes that are no longer required:
>       SuffixMaxentModelResourceImpl, MaxentModelResource,
>       SuffixSensitiveGISModelReader, etc.)
>
> Background:
> Certain unit tests fail because their resources can't be resolved from jars
> on the classpath: the code explicitly looks for a File on disk instead of
> an input stream. Solving this properly had a cascading effect and required
> a lot more changes, but it's probably a good time to update those projects
> anyhow.
>
> --Pei
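
For anyone who wants to reproduce issue (1) outside of cTAKES, here is a minimal JDK-only sketch. The class name JarResourceDemo, its helper methods, and the entry name models/sd.txt are made up for illustration and are not part of cTAKES, UIMA, or OpenNLP; it also assumes Java 9+ for InputStream.readAllBytes(). It builds a throwaway jar, then compares the File-based lookup with the stream-based one for a resource inside that jar.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class JarResourceDemo {

    /** Writes a one-entry jar to a temp file; stands in for a model packaged on the classpath. */
    static Path buildJar(String entryName, String content) throws IOException {
        Path jar = Files.createTempFile("demo-models", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(content.getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    /** File-based lookup: a jar: URL is an opaque URI, so it cannot become a java.io.File. */
    static boolean resolvesToFile(URL url) {
        try {
            return new File(url.toURI()).exists();
        } catch (IllegalArgumentException | URISyntaxException e) {
            return false;
        }
    }

    /** Stream-based lookup: works the same for directories and jars. */
    static String readViaStream(ClassLoader loader, String name) throws IOException {
        try (InputStream is = loader.getResourceAsStream(name)) {
            if (is == null) {
                throw new IOException("Resource not found: " + name);
            }
            return new String(is.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        Path jar = buildJar("models/sd.txt", "dummy model bytes");
        // Parent is null so the resource can only come from the temp jar.
        try (URLClassLoader loader = new URLClassLoader(new URL[] {jar.toUri().toURL()}, null)) {
            URL url = loader.getResource("models/sd.txt");
            System.out.println("resolves to File: " + resolvesToFile(url));
            System.out.println("via stream: " + readViaStream(loader, "models/sd.txt"));
        } finally {
            Files.deleteIfExists(jar);
        }
    }
}
```

The File-based lookup fails for the jar entry while the stream read succeeds, which is the same behavior the failing unit tests run into. Any loader method that accepts an InputStream avoids the problem regardless of how the resource is packaged.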
