sounds good. please go for it :-) -Marshall

On 6/24/2010 9:45 AM, Eddie Epstein (JIRA) wrote:
>     [ 
> https://issues.apache.org/jira/browse/UIMA-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882153#action_12882153
>  ] 
>
> Eddie Epstein commented on UIMA-1818:
> -------------------------------------
>
> bq. Is the idea that every CAS that comes thru a particular specified 
> annotator would be saved to the file system?
> Yes.
>
> bq. if so - maybe some parameter to control how many, or how frequently to 
> sample, etc.?
> Implementing JMX control to dynamically turn on/off CAS logging would 
> accomplish this.
>
> {quote}The "COMPONENT_ARRAY" delegate keys need the x/y/z syntax for non 
> UIMA-AS cases - where an aggregate contains another aggregate, etc. This is 
> already a convention in UIMA. So it would be good to just continue using it 
> both for UIMA-AS cases and non-UIMA-AS cases.
> {quote}
> Right. The same syntax should work for UIMA CASes. To clarify, the code to 
> implement this is in the aggregate controller, of which there is one for UIMA 
> AS and another for core UIMA. The UIMA AS controller only sees asynchronous 
> delegates and vice versa for the core UIMA controller. This issue is only 
> covering implementation for asynchronous delegates.
>
> {quote}Would it be valuable to have a spec to say if the logging was to be 
> before or after the AnalysisEngine, for each delegate? For instance, the spec 
> could be e.g., someAggName/somePrimName:before:after (showing both). "before" 
> could be the default.{quote}
> To me, much less valuable to capture output CASes, and more complicated to 
> implement. The main use of capturing CASes going into a delegate is to be 
> able to later run the delegate stand-alone in a debug environment. In my 
> case, a scaled out delegate is hanging on one or more CASes and timing out. 
> This utility will allow one to easily capture all the CASes sent to the 
> queue, find the problem CAS and ultimately the cause.
>
> bq. Would it be valuable to dump only the changed data (a la "delta cas")? 
> (possible syntax: add modifier :delta)
> This sounds more appropriately handled by CAS journaling, where all CAS 
> modifications can be attributed to specific annotators.
>
> bq. It would be good if the output was consumable by the CAS Viewer, too
> Interesting. The XmiCASes will be, but only if the CAS typesystem is 
> available. The typesystem description should be written into the directory 
> along with the CAS files.
>
>
>   
>> Provide simple mechanism to capture all CASes input to specified delegate
>> -------------------------------------------------------------------------
>>
>>                 Key: UIMA-1818
>>                 URL: https://issues.apache.org/jira/browse/UIMA-1818
>>             Project: UIMA
>>          Issue Type: New Feature
>>          Components: Async Scaleout
>>            Reporter: Eddie Epstein
>>            Assignee: Eddie Epstein
>>
>> The existing approach to capturing CASes sent to a component is to insert a 
>> new CAS-serializer-annotator just before it in the flow, or modify the 
>> component itself to serialize CASes. Both of these approaches require 
>> modifications to existing code and/or component descriptors, and are 
>> somewhat time-consuming and error-prone.
>> A much simpler approach is to just "turn on" CAS logging for a particular 
>> component using Java properties before starting the process, or to turn CAS 
>> logging on/off for an already running process using JMX operations.
>> This issue covers using Java properties to turn on CAS logging for any 
>> delegate of an asynchronous aggregate.
>> CAS logging would be controlled by the following properties:
>> UIMA_CASLOG_BASE_DIRECTORY - optional; this is the directory under which 
>> other directories with XmiCas files will be created. If not specified, the 
>> process's current directory will be the base.
>> UIMA_CASLOG_COMPONENT_ARRAY - This is a space-separated list of delegate 
>> keys. If a delegate is nested inside a co-located async aggregate, the name 
>> would include the key name of the aggregate, e.g. "someAggName/someDelName". 
>> The XmiCas files will then be written into 
>> $UIMA_CASLOG_BASE_DIRECTORY/someAggName/someDelName/
>> UIMA_CASLOG_TYPE_NAME - optional; this is the name of a FeatureStructure in 
>> the CAS containing a unique string to use to name each XmiCas file. If not 
>> specified, the XmiCas file name will be NNN.xmi, where NNN is the time in 
>> microseconds since the component was initialized.
>> UIMA_CASLOG_FEATURE_NAME - optional unless TYPE_NAME is specified; 
>> this parameter gives the string feature to use. An example of type and 
>> feature names to use would be 
>> "org.apache.uima.examples.SourceDocumentInformation" and "uri".
>>     
>   
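
A minimal sketch of how the properties quoted above could map to output
locations. The class and method names (CasLogPaths, delegateKeys,
logDirectory, defaultFileName) are hypothetical and are not part of UIMA or
UIMA AS; only the property names, the "someAggName/someDelName" key syntax,
and the NNN.xmi default come from the issue text.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, for illustration only; not the UIMA AS implementation.
final class CasLogPaths {

    // Split UIMA_CASLOG_COMPONENT_ARRAY on whitespace into delegate keys.
    static List<String> delegateKeys(Properties props) {
        String raw = props.getProperty("UIMA_CASLOG_COMPONENT_ARRAY", "").trim();
        List<String> keys = new ArrayList<>();
        if (!raw.isEmpty()) {
            for (String key : raw.split("\\s+")) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Resolve <base>/<aggregate>/.../<delegate>; the base defaults to the
    // process's current directory when UIMA_CASLOG_BASE_DIRECTORY is unset.
    static Path logDirectory(Properties props, String delegateKey) {
        String base = props.getProperty("UIMA_CASLOG_BASE_DIRECTORY",
                                        System.getProperty("user.dir"));
        Path dir = Paths.get(base);
        for (String part : delegateKey.split("/")) {
            dir = dir.resolve(part);
        }
        return dir;
    }

    // Default file name when no type/feature name is configured: NNN.xmi,
    // where NNN is the time in microseconds since component initialization.
    static String defaultFileName(long microsSinceInit) {
        return microsSinceInit + ".xmi";
    }
}
```

Assuming the properties are passed as JVM system properties, calling
logDirectory(System.getProperties(), "someAggName/someDelName") with
-DUIMA_CASLOG_BASE_DIRECTORY=caslogs would resolve to
caslogs/someAggName/someDelName.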
