I noticed that the patch changes a public API (populateCASfromURL). This will break backward compatibility for anyone whose code depends on that API.
If there is a convenient way to implement the fix without changing the API, I think our users may prefer that :-)

-Marshall

On 9/20/2010 1:52 AM, Tommaso Teofili (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/UIMA-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Tommaso Teofili resolved UIMA-1878.
> -----------------------------------
>
> Resolution: Fixed
>
>> TikaAnnotator doesn't handle spaces in path string
>> --------------------------------------------------
>>
>> Key: UIMA-1878
>> URL: https://issues.apache.org/jira/browse/UIMA-1878
>> Project: UIMA
>> Issue Type: Bug
>> Components: Sandbox-TikaAnnotator
>> Affects Versions: 2.3
>> Environment: Windows
>> Reporter: Greg Holmberg
>> Attachments: TikaAnnotator-patch.txt
>>
>>
>> If you give a value for InputDirectory that contains a space, then
>> TikaAnnotator silently does nothing.
>> This is because File objects are converted directly to a URL, and
>> openStream() fails because the space character wasn't converted to %20.
>> When this happens, the exception is ignored and the CAS text is set to "".
>> It would be better to convert the File object to a URI and the URI to a URL.
>> This will convert the space character correctly.
>> Secondly, it would be better to throw an exception rather than silently
>> ignore it.
>> A suggested patch is attached.
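For reference, the File -> URI -> URL conversion described in the issue could look roughly like the sketch below. It is only an illustration and not the attached patch: the class and method names (SpaceSafeUrl, toUrl, readText) are hypothetical, and it does not touch populateCASfromURL. It assumes Java 9 or later for InputStream.readAllBytes().

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of TikaAnnotator.
public class SpaceSafeUrl {

    // Convert a File to a URL by way of a URI, so that characters such as
    // spaces are percent-encoded (" " becomes "%20"). The deprecated
    // File.toURL() does not perform this escaping.
    static URL toUrl(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            // Not expected for file: URIs; rethrown rather than ignored.
            throw new UncheckedIOException(e);
        }
    }

    // Read the content behind the URL as UTF-8 text. Failures propagate as
    // IOException instead of being swallowed and replaced by "".
    static String readText(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        // Example path only; the file does not need to exist for the conversion.
        URL url = toUrl(new File("input dir/some file.txt"));
        System.out.println(url); // the spaces appear as %20 in the printed URL
    }
}
```

A helper of this kind could be called internally before the stream is opened, which might allow the existing public method signatures to stay as they are.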
