[jira] Created: (UIMA-1603) Missing doc describing service spec syntax
Missing doc describing service spec syntax
--
Key: UIMA-1603
URL: https://issues.apache.org/jira/browse/UIMA-1603
Project: UIMA
Issue Type: Improvement
Components: Sandbox-SimpleServer
Affects Versions: 2.2.2
Reporter: Olivier Terrier
Priority: Minor

The SimpleServer is a great contribution; unfortunately, the documentation is quite incomplete. In particular, there are no guidelines (not even basic ones) on how to write a result spec XML file to customize the output. Even a simple README file would be of great help.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
RE: [jira] Updated: (UIMA-1506) update Bean Scripting Framework Annotator with info about licenses and documentation
One question: The POM for this project specifies that the Jar have included resources, which then become available on the class path. The resources included are:

  BeanshellTestAnnotator.xml
  BSFAggregatedAE.xml
  BSFAnnotator.xml
  NICKNAMES.bsh
  RhinoTestAnnotator.xml
  TEST.bsh
  TEST.js

I see these are needed for running the tests. I'm moving them to the src/test/resources files. That way, maven will use them for running the tests, but they won't be packaged in the Jar file that maven builds. Let me know if I'm missing something here...

Yes, they are here for testing purposes so I guess moving them to the test/resources dir is OK

O.
[jira] Updated: (UIMA-1506) update Bean Scripting Framework Annotator with info about licenses and documentation
[ https://issues.apache.org/jira/browse/UIMA-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olivier Terrier updated UIMA-1506:
--
Attachment: BSFAnnotator.zip

Marshall: I did a quick review of the code to:
1. Fix the NullPointerException when running the testAnnotatorAggregated
2. Migrate the BSFAnnotator from the old-way JTextAnnotator_ImplBase to JCasAnnotator_ImplBase
3. Add a few generics to avoid warnings

Attached you will find a zip that contains the files that have been modified.

Cheers
Olivier

update Bean Scripting Framework Annotator with info about licenses and documentation
Key: UIMA-1506
URL: https://issues.apache.org/jira/browse/UIMA-1506
Project: UIMA
Issue Type: Improvement
Components: Sandbox-BSFAnnotator
Reporter: Marshall Schor
Assignee: Marshall Schor
Priority: Minor
Fix For: 2.3AS
Attachments: BSFAnnotator.zip

update with info received in email: http://markmail.org/thread/e3q5gk6h42jggg5c
RE: possibly including the BSFAnnotator in the sandbox release 2.3.0
Hi Marshall

When I donated the BSFAnnotator about two years ago, I was not very familiar with the maven-based Apache UIMA build process... And I'm afraid I still am not.

While working on including the BSFAnnotator in the sandbox release, I noticed it has 6 jar files in its lib directory. The NOTICE file says that the project includes some components under MPL and SPL; however, the information needed to associate each of these included 3rd-party jar files with the proper license is missing. Can someone who knows identify the source of the following jars:

bsf.jar
  Apache 2 license
  Comes from http://apache.crihan.fr/dist/jakarta/bsf/binaries/bsf-bin-2.4.0.zip

bsh-bsf.jar
  No longer needed: can be removed

bsh.jar
  Dual licensed under both the SPL and LGPL => SPL is in the list of acceptable licenses available here: http://www.apache.org/legal/3party.html
  Can be replaced by http://www.beanshell.org/bsh-1.3.0.jar (which now contains the bsh-bsf.jar classes)

commons-logging-api-1.1.jar
  Apache 2 license
  Comes from http://archive.apache.org/dist/commons/logging/binaries/commons-logging-1.1.zip

js.jar
  Dual licensed under both the MPL 1.1 and GPL => MPL is in the list of acceptable licenses available here: http://www.apache.org/legal/3party.html
  Comes from ftp://ftp.mozilla.org/pub/mozilla.org/js/rhino1_6R7.zip

log4j-1.2.15.jar
  Apache 2 license
  Comes from http://www.apache.org/dyn/closer.cgi/logging/log4j/1.2.15/apache-log4j-1.2.15.zip

Is there a better way to package / distribute this in binary form?

Probably using maven dependencies, but I don't have enough maven skills to do it myself.

Also, I noticed that the POM section for building the documentation is commented out - because there is no docbook style documentation. The documentation seems limited to a README and some examples. Is there additional documentation available for this?
No, just the README, sorry. I guess that a real revamp should use JSR-223 (javax.script), which is included in Java 1.6, and add support for newer popular scripting languages like Groovy, BeanShell 2.0, etc.

-Marshall
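As a rough illustration of the JSR-223 route mentioned above, the following minimal sketch evaluates a script through javax.script. It is not part of the BSFAnnotator; the class and method names (Jsr223Sketch, evalOrNull) are made up for this example. Which engines are actually available depends on the JDK and on the jars on the classpath, so the sketch returns null when no engine is registered under the requested name.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class Jsr223Sketch {

    /**
     * Evaluates a script with the JSR-223 engine registered under engineName.
     * Returns null when no such engine is available in this JVM.
     * (Hypothetical helper for illustration only.)
     */
    public static Object evalOrNull(String engineName, String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return null; // no engine registered under that name
        }
        try {
            return engine.eval(script);
        } catch (ScriptException e) {
            // Wrap the checked exception so callers do not have to declare it.
            throw new IllegalStateException("script failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Prints a result only if a "javascript" engine happens to be installed.
        System.out.println(evalOrNull("javascript", "1 + 1"));
    }
}
```

With this API the annotator would only need an engine name and a script source; adding a language would then be a matter of putting its JSR-223 engine jar on the classpath.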
RE: possibly including the BSFAnnotator in the sandbox release 2.3.0
OK, I'll have a look.

Cheers
Olivier

-----Original Message-----
From: Marshall Schor [mailto:m...@schor.com]
Sent: Thursday, 20 August 2009 16:52
To: uima-dev@incubator.apache.org
Subject: Re: possibly including the BSFAnnotator in the sandbox release 2.3.0

Hi Olivier -

I updated the project:
  removed the bsf-bsh.jar
  updated the bsh.jar to bsh-1.3.0.jar

I changed the POM so it didn't depend on the bsf-bsh.jar and changed the dependency of the bsh.jar to bsh-1.3.0.jar. When I tried to build it, it failed when running the testAnnotatorAggregated. (The tests ran before I did these updates.)

The error was:

caused by org.apache.bsf.BSFException: the application script threw an exception: java.lang.NullPointerException: Null Pointer in Method Invocation
  at bsh.util.BeanShellBSFEngine.call(Unknown Source)
  at org.apache.uima.annotator.bsf.BSFAnnotator.initialize(BSFAnnotator.java:161)

Can you take a look? I've checked in the updated project - you should be able to just check it out. Thanks.

-Marshall
[jira] Updated: (UIMA-1304) Error handling parameters in CPE with a Vinci processor
[ https://issues.apache.org/jira/browse/UIMA-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olivier Terrier updated UIMA-1304:
--
Attachment: UIMA-1304.clr.patch

Error handling parameters in CPE with a Vinci processor
---
Key: UIMA-1304
URL: https://issues.apache.org/jira/browse/UIMA-1304
Project: UIMA
Issue Type: Bug
Components: Collection Processing
Affects Versions: 2.2.2
Reporter: Olivier Terrier
Priority: Minor
Attachments: UIMA-1304.clr.patch

The error handling parameters of a CPE that has a Vinci remote CAS processor with its service-access deployment parameter set to "random" are not handled correctly.

If you set the error parameters to the following values:

  <errorHandling>
    <errorRateThreshold action="continue" value="10/1000"/>
    <maxConsecutiveRestarts action="continue" value="10" waitTimeBetweenRetries="1"/>
    <timeout max="60" default="-1"/>
  </errorHandling>

it looks like, when the Vinci processor fails for some reason, the CPE gracefully attempts to reconnect up to N times (N=10, which is the value of the maxConsecutiveRestarts parameter), which is the expected behaviour. But the waitTimeBetweenRetries delay is not used at all.

Apparently, in the implementation of the method

  private int attachToServices(boolean redeploy, String aServiceUri, int howMany, ProcessingContainer aProcessingContainer) throws Exception;

of the class org.apache.uima.collection.impl.cpm.container.deployer.vinci.VinciCasProcessorDeployer, the sleepBetweenRetries only occurs if the Vinci CAS processor is in exclusive mode. Otherwise (random mode), the method directly calls the method

  private synchronized boolean activateProcessor(CasProcessorConfiguration aCasProcessorConfig, String aService, ProcessingContainer aProcessingContainer, boolean redeploy);

which uses a hard-coded timeout of 1 sec (SLEEP_TIME) between retries instead of the waitTimeBetweenRetries.

The bug has been confirmed by Jerry Cwiklik, and he proposed the attached patch, which solves the problem.
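The behaviour the report expects can be sketched in a few lines: a reconnect loop that pauses for the configured delay rather than a fixed constant. This is only an illustration and not the attached patch; the names RetrySketch and connectWithRetries are made up, and the delay is taken in milliseconds as an assumption, since the report does not state the unit of waitTimeBetweenRetries.

```java
import java.util.function.BooleanSupplier;

public class RetrySketch {

    /**
     * Tries to connect up to maxRestarts times, sleeping waitMillis between
     * failed attempts (the configured delay, not a hard-coded constant).
     * Returns the number of attempts used, or -1 if all attempts failed
     * or the thread was interrupted while waiting.
     */
    public static int connectWithRetries(BooleanSupplier attempt, int maxRestarts, long waitMillis) {
        for (int i = 1; i <= maxRestarts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
            if (i < maxRestarts) {
                try {
                    Thread.sleep(waitMillis); // honour the configured wait
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated service that becomes reachable on the third attempt.
        int used = connectWithRetries(() -> ++calls[0] == 3, 10, 5);
        System.out.println("connected after " + used + " attempts");
    }
}
```

The point of the sketch is that both deployment modes would go through the same loop, so the delay between retries comes from one place.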
[jira] Updated: (UIMA-1304) Error handling parameters in CPE with a Vinci processor
[ https://issues.apache.org/jira/browse/UIMA-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olivier Terrier updated UIMA-1304:
--
Description: [...] The bug has been confirmed by Jerry Cwiklik and he proposed the attached patch which solves the problem
was: [...] The bug has been confirmed by Jerry Cwillick and he proposed the attached patch which solves the problem
[jira] Created: (UIMA-1304) Error handling parameters in CPE with a Vinci processor
Error handling parameters in CPE with a Vinci processor
---
Key: UIMA-1304
URL: https://issues.apache.org/jira/browse/UIMA-1304
Project: UIMA
Issue Type: Bug
Components: Collection Processing
Affects Versions: 2.2.2
Reporter: Olivier Terrier
Priority: Minor
RE: UIMA chunking
Hi Burn,

Thanks for your suggestion, which makes a lot of sense. Do you think that such a CAS multiplier can be used in a CPE? I guess not because, as far as I know, the way the CPE manages its CAS pool makes it impossible to have components that produce CASes in the flow of a CPE.

Olivier

-----Original Message-----
From: Burn Lewis [mailto:[EMAIL PROTECTED]
Sent: Monday, 21 July 2008 20:23
To: uima-dev@incubator.apache.org
Subject: Re: UIMA chunking

One approach is to use another CAS Multiplier for the merge ... it would take in the N chunks and produce an output CAS only when the N-th has been processed. Any later processing would be independent of the chunking that preceded it. This merging CM could also handle any out-of-order segments that can occur if you scale out your annotators. The CasCopier class makes it relatively easy to copy all FeatureStructures and update their offsets as necessary.

Burn.
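The merging logic Burn describes (buffer the parts, tolerate out-of-order arrival, emit only when the N-th part is in) can be sketched without any UIMA classes. Everything below is hypothetical: ChunkMerger and Span are illustrative names, annotations are reduced to begin/end pairs, and parts are kept in memory. A real merging CAS Multiplier would copy feature structures (for example with CasCopier) instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ChunkMerger {

    /** A begin/end pair relative to some text (illustrative, not a UIMA type). */
    public static final class Span {
        public final int begin;
        public final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Parts received so far: documentId -> (partNumber -> spans in full-document offsets).
    private final Map<String, TreeMap<Integer, List<Span>>> pending = new HashMap<>();

    /**
     * Buffers one processed part. Returns null until all totalParts parts of the
     * document have arrived (in any order), then returns every span shifted to
     * full-document offsets, ordered by part number.
     */
    public List<Span> add(String documentId, int partNumber, int totalParts,
                          int textOffset, List<Span> spans) {
        TreeMap<Integer, List<Span>> parts =
                pending.computeIfAbsent(documentId, k -> new TreeMap<>());
        List<Span> shifted = new ArrayList<>();
        for (Span s : spans) {
            shifted.add(new Span(s.begin + textOffset, s.end + textOffset));
        }
        parts.put(partNumber, shifted);
        if (parts.size() < totalParts) {
            return null; // still waiting for other parts
        }
        pending.remove(documentId);
        List<Span> merged = new ArrayList<>();
        for (List<Span> part : parts.values()) { // TreeMap iterates in part order
            merged.addAll(part);
        }
        return merged;
    }
}
```

The sketch leaves open what to do when a part never arrives (for instance after an error upstream); a real implementation would need a timeout or an explicit failure signal to avoid holding parts forever.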
UIMA chunking
Hi all,

Sometimes we face the problem of processing collections of big documents. This may lead to instability in the processing chain: out-of-memory errors, timeouts, etc. Moreover, this is not very efficient in terms of load balancing (we use CPEs with analysis engines deployed as Vinci remote services on several machines).

We would like to solve this problem by implementing a kind of UIMA document chunking, where big documents would be split into reasonable chunks (according to a given block size, for example) at the beginning of the processing chain and merged back into one CAS at the end.

In our view, the splitting phase is quite straightforward: a CAS multiplier splits the input document into N text blocks and produces N CASes. Chunking information such as:
- document identifier
- current part number
- total number of parts
- text offset
is stored in the CAS.

The merging phase is much more complicated: a CAS consumer is responsible for intercepting each part and storing it somewhere (in memory or serialized on the filesystem). When the last part of the document comes in, all the annotations of the CAS parts are merged back, taking the offsets into account.

As we use a CPE, the merger CAS consumer can't produce a new CAS. What we have in mind is to create a new Sofa, fullDocumentView, in the last CAS part to store the text of the full document along with its associated annotations. Another idea is to use sofa mappings so that our existing CAS consumers (which are sofa-unaware) that come after the merger in the CPE flow can stay unchanged.

CPE flow:

CAS SPLITTER
  _InitialView: text part_i
  fullDocumentView: empty
|
AE1
  _InitialView: text part_i + annotations AE1
  fullDocumentView: empty
|
...
|
AEn
  _InitialView: text part_i + annotations AE1+...+AEn
  fullDocumentView: empty
|
CAS MERGER
  _InitialView: text part_i + annotations AE1+...+AEn
  fullDocumentView:
    if not last part = empty
    if last part = text + annotations merged part1+...+partN
|
CONSUMER (sofa-unaware)
  MAPPING cpe sofa : fullDocumentView = component sofa : _InitialView
  _InitialView: text + annotations merged part1+...+partN

The tricky operations are:
- caching/storing the CAS 'parts' in the merger: how (XCAS, XMI, etc.)? where (memory, disk, ...)?
- merging the annotations of the CAS 'parts' into the full document CAS.
- error management: what happens in case of errors on some parts?

We would like to hear the thoughts and opinions of the UIMA community regarding this problem and the possible solutions. Do you think our approach is a good one? Has anybody already faced a similar problem? As far as possible, we don't want to reinvent the wheel, and we would give priority to a generic and ideally UIMA built-in implementation. We are of course ready to contribute to this development if the community finds a generic solution.

Regards
Olivier Terrier - TEMIS
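To make the splitting phase concrete, here is a minimal sketch that cuts a text into fixed-size blocks and attaches the four pieces of chunking information listed above. It uses plain Java objects instead of CASes; the names ChunkSplitter and Chunk and their fields are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkSplitter {

    /** Chunking information carried with each part (illustrative field names). */
    public static final class Chunk {
        public final String documentId;
        public final int partNumber; // 1-based
        public final int totalParts;
        public final int textOffset; // position of this part in the full text
        public final String text;

        Chunk(String documentId, int partNumber, int totalParts, int textOffset, String text) {
            this.documentId = documentId;
            this.partNumber = partNumber;
            this.totalParts = totalParts;
            this.textOffset = textOffset;
            this.text = text;
        }
    }

    /** Splits text into blocks of at most blockSize characters. */
    public static List<Chunk> split(String documentId, String text, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        // Ceiling division; an empty document still yields one (empty) part.
        int total = Math.max(1, (text.length() + blockSize - 1) / blockSize);
        List<Chunk> chunks = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int start = i * blockSize;
            int end = Math.min(text.length(), start + blockSize);
            chunks.add(new Chunk(documentId, i + 1, total, start, text.substring(start, end)));
        }
        return chunks;
    }
}
```

A fixed block size can cut through a word or sentence, which may affect annotators near chunk boundaries; choosing split points at sentence or paragraph boundaries would avoid that at the cost of uneven chunk sizes.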
[jira] Commented: (UIMA-1105) CPE is stuck trying to retrieve a free CAS from the pool
[ https://issues.apache.org/jira/browse/UIMA-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612391#action_12612391 ]

Olivier Terrier commented on UIMA-1105:
--
It works like a charm now! Thanks.

CPE is stuck trying to retrieve a free CAS from the pool
Key: UIMA-1105
URL: https://issues.apache.org/jira/browse/UIMA-1105
Project: UIMA
Issue Type: Bug
Components: Collection Processing
Affects Versions: 2.2.1
Environment: Windows XP 32 bits
Reporter: Olivier Terrier
Assignee: Marshall Schor
Fix For: 2.3
Attachments: cpe.xml, ProcessingUnit-patch.txt, uima-cpe.jar, uima.zip

The buggy scenario is a CPE with a first remote processor deployed as a Vinci service and an integrated CAS consumer that throws a ResourceProcessException in its process method. It is quite easy to reproduce with a dummy consumer with this implementation:

  public void processCas(CAS aCAS) throws ResourceProcessException {
    throw new ResourceProcessException(new FileNotFoundException("file not found"));
  }

It looks like the CPE is stuck trying to retrieve a CAS from the CAS pool, which is apparently empty at some point. My feeling is that when a ResourceProcessException is thrown in the last component of the CPE, the code that is supposed to release the CAS back to the CAS pool is not properly called...

If I suspend the process in Eclipse, I can see that the CasConsumer and the Collection Reader pipeline threads are waiting in the CPECasPool.getCas(long) method.

I attach the uima.log set to the FINEST level.
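The suspected failure mode, a pooled object that is never handed back when the consumer throws, and the usual remedy of releasing it in a finally block can be illustrated with a small self-contained sketch. This is not the CPE code or the ProcessingUnit patch attached to the issue; PoolReleaseSketch and its Work interface are hypothetical, and a StringBuilder stands in for a CAS.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class PoolReleaseSketch {

    /** Hypothetical stand-in for a consumer's processing callback. */
    public interface Work {
        void process(StringBuilder cas) throws Exception;
    }

    private final BlockingQueue<StringBuilder> pool;

    public PoolReleaseSketch(int size) {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            pool.add(new StringBuilder());
        }
    }

    /** Number of pooled objects currently free. */
    public int available() {
        return pool.size();
    }

    /**
     * Borrows an object, runs the work on it, and always returns the object
     * to the pool, even when the work throws. Returns true on success.
     */
    public boolean processOne(Work work, long timeoutMillis) {
        StringBuilder cas;
        try {
            cas = pool.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (cas == null) {
            return false; // pool exhausted: this is where a caller would hang without a timeout
        }
        try {
            work.process(cas);
            return true;
        } catch (Exception e) {
            return false; // a failing consumer must not leak the pooled object
        } finally {
            cas.setLength(0); // reset before reuse
            pool.add(cas);    // release happens on every path
        }
    }
}
```

Without the finally block, each failing call would shrink the pool by one until every thread blocks waiting for a free object, which matches the symptom described in the report.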
[jira] Created: (UIMA-1105) CPE is stuck trying to retrieve a free CAS from the pool
CPE is stuck trying to retrieve a free CAS from the pool
Key: UIMA-1105
URL: https://issues.apache.org/jira/browse/UIMA-1105
Project: UIMA
Issue Type: Bug
Components: Collection Processing
Affects Versions: 2.2.1
Environment: Windows XP 32 bits
Reporter: Olivier Terrier
[jira] Updated: (UIMA-1105) CPE is stuck trying to retrieve a free CAS from the pool
[ https://issues.apache.org/jira/browse/UIMA-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olivier Terrier updated UIMA-1105:
--
Attachment: uima.zip
[jira] Updated: (UIMA-1105) CPE is stuck trying to retrieve a free CAS from the pool
[ https://issues.apache.org/jira/browse/UIMA-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olivier Terrier updated UIMA-1105:
--
Attachment: cpe.xml
[jira] Commented: (UIMA-1105) CPE is stuck trying to retrieve a free CAS from the pool
[ https://issues.apache.org/jira/browse/UIMA-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611608#action_12611608 ]

Olivier Terrier commented on UIMA-1105:
--
No, it is not (see the attached CPE descriptor).
[jira] Commented: (UIMA-1105) CPE is stuck trying to retrieve a free CAS from the pool
[ https://issues.apache.org/jira/browse/UIMA-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611656#action_12611656 ]

Olivier Terrier commented on UIMA-1105:
--
No, it does not change anything; the CPE is still stuck even with dropCasOnException=true.
RE: [jira] Closed: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
Fine with me!

Olivier

-----Original Message-----
From: Thilo Goetz [mailto:[EMAIL PROTECTED]
Sent: Friday, 23 November 2007 17:09
To: uima-dev@incubator.apache.org
Subject: Re: [jira] Closed: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop

As we can't expect any input from the native speakers this week (because of Thanksgiving), here are a couple of minor changes I would suggest.

--Thilo

BSF Annotator

The Bean Scripting Framework (BSF) Annotator is an Apache UIMA analysis engine that provides a link between the UIMA framework and the scripting languages that are supported by Apache BSF (http://jakarta.apache.org/bsf). The current implementation comes with examples in Beanshell (http://www.beanshell.org) and Rhino Javascript (http://www.mozilla.org/rhino). Simple tests have also been conducted successfully with Jython (http://jython.sourceforge.net/Project/index.html) and JRuby (http://jruby.codehaus.org). The annotator takes as parameter the source file containing the script. The script is supposed to implement the initialize and process functions of the analysis engine. Using a scripting language can be very handy to do quick prototyping, pre/post processing, CAS cleaning tasks or type system conversion/adaptation. The Java source of the annotator can be accessed from the SVN repository at http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/BSFAnnotator.

Olivier Terrier wrote:

Hi Michael

I'm not familiar at all with the website update process, so I appreciate your help. Here is a small abstract (please review the English as it is not my native language!)

-
BSF Annotator

The Bean Scripting Framework (BSF) Annotator is an Apache UIMA analysis engine that makes a linkage between the UIMA framework and the scripting languages that are supported by Apache BSF (http://jakarta.apache.org/bsf). The current implementation comes with examples in Beanshell (http://www.beanshell.org) and Rhino Javascript (http://www.mozilla.org/rhino). Simple tests have also been conducted with success with Jython (http://jython.sourceforge.net/Project/index.html) and JRuby (http://jruby.codehaus.org). The annotator takes as parameter the source file containing the script. The script is supposed to implement the initialize and process functions of the analysis engine. Using a scripting language can be very handy to do quick prototyping, pre/post processing, CAS cleaning tasks or type system conversion/adaptation. The Java source of the annotator can be accessed in the SVN repository at http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/BSFAnnotator.
-

Best
Olivier

-----Original Message-----
From: Michael Baessler [mailto:[EMAIL PROTECTED]
Sent: Friday, 23 November 2007 11:50
To: uima-dev@incubator.apache.org
Subject: Re: [jira] Closed: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop

Hi Olivier,

since the BSF annotator code is now in the Sandbox, can you please make a short update of the Sandbox website content? http://incubator.apache.org/uima/sandbox.html

If you are currently not familiar with the website update process and you also do not have the time to get familiar with it, you can also send me an abstract about the BSF annotator and I will do the update. A short website update HowTo is available here: http://svn.apache.org/repos/asf/incubator/uima/site/trunk/uima-website/HOWTO

Thanks
Michael

Thilo Goetz (JIRA) wrote:

[ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thilo Goetz closed UIMA-624.
Resolution: Fixed
Assignee: Thilo Goetz

Checked in as submitted. Only change: adapted NOTICE and LICENSE file to Apache conventions. Thanks!
UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox-BSFAnnotator Reporter: Olivier Terrier Assignee: Thilo Goetz Priority: Minor Attachments: BSFAnnotator.zip, BSFAnnotator.zip, jruby-COPYING.CPL Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
RE: [jira] Closed: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
Hi Michael I'm not familiar at all with the website update process so I appreciate your help. Here is a small abstract (please review the English as it is not my native language!) - BSF Annotator The Bean Scripting Framework (BSF) Annotator is an Apache UIMA analysis engine that makes a linkage between the UIMA framework and the scripting languages that are supported by Apache BSF (http://jakarta.apache.org/bsf). The current implementation comes with examples in Beanshell (http://www.beanshell.org) and Rhino Javascript (http://www.mozilla.org/rhino). Simple tests have also been conducted with success with Jython (http://jython.sourceforge.net/Project/index.html) and JRuby (http://jruby.codehaus.org). The annotator takes as parameter the source file containing the script. The script is supposed to implement the initialize and process functions of the analysis engine. Using a scripting language can be very handy to do quick prototyping, pre/post processing, CAS cleaning tasks or typesystem conversion/adaptation. The Java source of the annotator can be accessed in the SVN repository at http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/BSFAnnotator. - Best Olivier -----Original Message----- From: Michael Baessler [mailto:[EMAIL PROTECTED] Sent: Friday, 23 November 2007 11:50 To: uima-dev@incubator.apache.org Subject: Re: [jira] Closed: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop Hi Olivier, since the BSF annotator code is now in the Sandbox, can you please make a short update of the Sandbox website content? http://incubator.apache.org/uima/sandbox.html If you are currently not familiar with the website update process and you also do not have the time to get familiar with it, you can also send me an abstract about the BSF annotator and I will do the update. 
A short website update HowTo is available here: http://svn.apache.org/repos/asf/incubator/uima/site/trunk/uima-website/HOWTO Thanks Michael Thilo Goetz (JIRA) wrote: [ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thilo Goetz closed UIMA-624. Resolution: Fixed Assignee: Thilo Goetz Checked in as submitted. Only change: adapted NOTICE and LICENSE file to Apache conventions. Thanks! UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox-BSFAnnotator Reporter: Olivier Terrier Assignee: Thilo Goetz Priority: Minor Attachments: BSFAnnotator.zip, BSFAnnotator.zip, jruby-COPYING.CPL Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
[jira] Updated: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
[ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olivier Terrier updated UIMA-624: - Attachment: BSFAnnotator.zip New version of the package without any reference to ruby. Use Mozilla Rhino (javascript) instead UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox-BSFAnnotator Reporter: Olivier Terrier Priority: Minor Attachments: BSFAnnotator.zip, BSFAnnotator.zip, jruby-COPYING.CPL Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
RE: [jira] Commented: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
No worries, I can remove all the JRuby stuff and just add a note telling people how to install it if they want to give it a try, as you suggested. The original purpose was to illustrate that several scripting languages can be used by the annotator (not just Beanshell, which is my preferred one). What I can do is add Rhino Javascript support (quite easy, I already have it here), which is perfectly eligible in terms of licensing (Mozilla), and adapt the tests accordingly. What do you think? Olivier -----Original Message----- From: Thilo Goetz (JIRA) [mailto:[EMAIL PROTECTED] Sent: Wednesday, 21 November 2007 13:17 To: Olivier Terrier Subject: [jira] Commented: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop [ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544433 ] Thilo Goetz commented on UIMA-624: -- Folks, I spent some more time on this, trying to straighten out the legal stuff. Unfortunately, it looks like JRuby is an assembly of variously licensed open source components without proper attribution or licensing information in the JRuby distribution. Just for example, the attribution and license required by the ASM distribution is missing. We can't put this in our SVN, nor do we want to distribute it. So what are our options? Leaving it out and telling people where to get it and how to install it would be my suggestion. Olivier, what do you think? I've checked the other 3rd party components, they seem to be in order. UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox Reporter: Olivier Terrier Priority: Minor Attachments: BSFAnnotator.zip, jruby-COPYING.CPL Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
RE: Java 1.5 Generics
Hi Jörn Yes, you're right, the command: javac -source 1.5 -target jsr14 does exactly what you suggest. Best Olivier -----Original Message----- From: Jörn Kottmann [mailto:[EMAIL PROTECTED] Sent: Friday, 9 November 2007 10:10 To: uima-dev@incubator.apache.org Subject: Re: Java 1.5 Generics None yet that I know of. Feel free to draft some kind of proposal, and post here or perhaps on our wiki. If I got it right, it's possible to just insert generics everywhere in our code without breaking any code which does not use generics. The Java compiler does a source-to-source translation and replaces the generics with explicit casts, like it's done now in our code. After the source-to-source translation the code should be the same as now. Eclipse has a refactoring tool which can guess generics but does not always work correctly. Someone must dig through the code and change all the generics in a semi-automatic manner with the tool. Maybe it will help if all interfaces are converted first. I would offer to do this job for the uimaj-core project. I already planned to take a closer look at the source code, so I can do both together. Jörn
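The point made above about generics can be illustrated with a small, self-contained Java sketch. It is not taken from the UIMA code base; the class and method names (GenericsErasureSketch, firstRaw, firstGeneric, newNameList, legacyCaller, sameRuntimeClass) are made up for illustration. It puts a raw-typed method with a hand-written cast next to its generic equivalent, and shows a raw-typed caller that still compiles and runs against a generified method. Strictly speaking the compiler achieves this through type erasure rather than by rewriting source text, but the practical effect is the one described in the thread.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericsErasureSketch {

    // Pre-generics style: raw List plus a hand-written cast on every read.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstRaw() {
        List names = new ArrayList();
        names.add("Annotation");
        return (String) names.get(0);
    }

    // Generics style: no cast in the source; the compiler inserts it.
    static String firstGeneric() {
        List<String> names = new ArrayList<String>();
        names.add("Annotation");
        return names.get(0);
    }

    // Hypothetical library method after it has been "generified".
    static List<String> newNameList() {
        return new ArrayList<String>();
    }

    // Hypothetical legacy caller that was never updated: it still uses the
    // raw type, and keeps working (with an unchecked warning at most).
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int legacyCaller() {
        List names = newNameList();
        names.add("Token");
        return names.size();
    }

    // Differently parameterized lists share a single runtime class,
    // because the type arguments are erased during compilation.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(firstRaw().equals(firstGeneric()));
        System.out.println(legacyCaller());
        System.out.println(sameRuntimeClass());
    }
}
```

Converting interfaces first, as suggested in the message, would correspond to changing methods like newNameList before touching their callers: existing raw-typed callers such as legacyCaller keep compiling in the meantime.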
[jira] Updated: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
[ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olivier Terrier updated UIMA-624: - Attachment: BSFAnnotator.zip UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox Reporter: Olivier Terrier Priority: Minor Attachments: BSFAnnotator.zip Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
RE: [jira] Commented: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
Sorry, I didn't know that. Actually Beanshell is dual-licensed under both the SPL and the LGPL, and the SPL is acceptable according to the list. JRuby is distributed under a tri-license (CPL/GPL/LGPL) and I think the CPL is OK too. So maybe I only need to replace jruby.COPYING.LGPL with jruby.COPYING.CPL. What do you think? Olivier -----Original Message----- From: Thilo Goetz (JIRA) [mailto:[EMAIL PROTECTED] Sent: Thursday, 8 November 2007 11:47 To: uima-dev@incubator.apache.org Subject: [jira] Commented: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop [ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541017 ] Thilo Goetz commented on UIMA-624: -- Sorry, but we can't accept LGPL software. I know it's a pain that these open source licenses are mutually incompatible, but there's nothing we can do about it. Please see http://people.apache.org/~cliffs/3party.html for an overview of the (un)acceptable licenses. If there are licenses that are not explicitly listed there, we need to decide on a case-by-case basis. What we can do is document for users where they can acquire the GPLed software, and what they need to do to integrate it with your annotator. Does that make sense in this case? --Thilo UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox Reporter: Olivier Terrier Priority: Minor Attachments: BSFAnnotator.zip Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
[jira] Updated: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
[ https://issues.apache.org/jira/browse/UIMA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olivier Terrier updated UIMA-624: - Attachment: jruby-COPYING.CPL UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox Reporter: Olivier Terrier Priority: Minor Attachments: BSFAnnotator.zip, jruby-COPYING.CPL Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
[jira] Created: (UIMA-624) UIMA Sandbox BSF Annotator initial code drop
UIMA Sandbox BSF Annotator initial code drop Key: UIMA-624 URL: https://issues.apache.org/jira/browse/UIMA-624 Project: UIMA Issue Type: New Feature Components: Sandbox Reporter: Olivier Terrier Priority: Minor Here is the BSF Scripting Annotator as discussed in the uima-dev list. Comments are welcome Olivier
RE: Scripting annotator
Hi Thilo I sent my signed ICLA by fax a week ago and Temis (Pascal Coupet) has also sent a Corporate CLA. Should I have received any confirmation of that? Regarding the contribution, I wouldn't call it a substantial piece of work. Anyway, I will check with Pascal about a code grant by Temis. Thanks Olivier -----Original Message----- From: Thilo Goetz [mailto:[EMAIL PROTECTED] Sent: Wednesday, 31 October 2007 09:03 To: uima-dev@incubator.apache.org Subject: Re: Scripting annotator Marshall Schor wrote: This sounds like a great addition to the framework - the ability to have annotators that are based on the Bean Scripting Framework, especially for quick tasks. We have some capability for this in the framework already for Perl and Python, but this sounds like it could be a smoother integration. +1 from me - what do others think? -Marshall P.S. - sorry for the slow response, have been inundated Olivier Terrier wrote: Hi We (at Temis) have been invited by Marshall to contribute more actively to UIMA. As a first contribution I would like to share with the community a little component that we have developed and found very useful: a scripting annotator based on Apache Bean Scripting Framework (BSF). We use it very frequently to do quick modifications in CASes within an annotation chain (preprocessing, typesystem adaptation, etc.). Currently the BSF Annotator has been tested with Beanshell and JRuby as scripting languages, but it should be extensible to most BSF-enabled languages, including Javascript, Jython, etc. Prior to contributing the source as a patch in Jira I would like to have your opinion on the usefulness of such components Olivier Terrier Core Products Software Development Manager TEMIS S.A. Text Intelligence in your daily business 5, Rue du Tour de l'Eau 38400 Saint Martin d'Hères - France Direct: +33 (0)4 56 38 24 06 Mobile: +33 (0)6 60 36 93 11 Visit our website! 
http://www.temis-group.com/ +1 Before you start contributing, you should file an ICLA (individual contributor's license agreement) with the ASF. See http://www.apache.org/licenses/#clas. If the annotator is a substantial piece of work, and you haven't written it all yourself, we'll need a code grant from Temis for it. See http://www.apache.org/licenses/#grants. If it's just a small handful of files, this won't be necessary. --Thilo
Scripting annotator
Hi We (at Temis) have been invited by Marshall to contribute more actively to UIMA. As a first contribution I would like to share with the community a little component that we have developed and found very useful: a scripting annotator based on Apache Bean Scripting Framework (BSF). We use it very frequently to do quick modifications in CASes within an annotation chain (preprocessing, typesystem adaptation, etc.). Currently the BSF Annotator has been tested with Beanshell and JRuby as scripting languages, but it should be extensible to most BSF-enabled languages, including Javascript, Jython, etc. Prior to contributing the source as a patch in Jira I would like to have your opinion on the usefulness of such components Olivier Terrier Core Products Software Development Manager TEMIS S.A. Text Intelligence in your daily business 5, Rue du Tour de l'Eau 38400 Saint Martin d'Hères - France Direct: +33 (0)4 56 38 24 06 Mobile: +33 (0)6 60 36 93 11 Visit our website! http://www.temis-group.com/