[ https://issues.apache.org/jira/browse/UIMA-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881925#action_12881925 ]
Marshall Schor commented on UIMA-1818:
--------------------------------------

Sounds like a valuable debugging aid.

Is the idea that *every* CAS that comes through a particular specified annotator would be saved to the file system?
  * If so, maybe add some parameter to control how many, or how frequently to sample, etc.?

The "COMPONENT_ARRAY" delegate keys need the x/y/z syntax for non-UIMA-AS cases, where an aggregate contains another aggregate, etc. This is already a convention in UIMA, so it would be good to just continue using it for both UIMA-AS and non-UIMA-AS cases.

Would it be valuable to have a spec to say whether the logging should happen before or after the AnalysisEngine, for each delegate? For instance, the spec could be, e.g., someAggName/somePrimName:before:after (showing both). "before" could be the default.

Would it be valuable to dump only the changed data (à la "delta CAS")? (Possible syntax: add the modifier :delta.)

It would be good if the output were consumable by the CAS Viewer, too :-).

> Provide simple mechanism to capture all CASes input to specified delegate
> -------------------------------------------------------------------------
>
>                 Key: UIMA-1818
>                 URL: https://issues.apache.org/jira/browse/UIMA-1818
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Async Scaleout
>            Reporter: Eddie Epstein
>            Assignee: Eddie Epstein
>
> The existing approach to capturing CASes sent to a component is to insert a
> new CAS-serializer-annotator just before it in the flow, or modify the
> component itself to serialize CASes. Both of these approaches require
> modifications to existing code and/or component descriptors, and are somewhat
> time-consuming and error-prone.
> A much simpler approach is to just "turn on" CAS logging for a particular
> component using Java properties before starting the process, or to turn CAS
> logging on/off for an already running process using JMX operations.
> This issue covers using Java properties to turn on CAS logging for any
> delegate of an asynchronous aggregate.
> CAS logging would be controlled by the following properties:
> UIMA_CASLOG_BASE_DIRECTORY - optional; this is the directory under which
> other directories with XmiCas files will be created. If not specified, the
> process's current directory will be the base.
> UIMA_CASLOG_COMPONENT_ARRAY - This is a space-separated list of delegate
> keys. If a delegate is nested inside a co-located async aggregate, the name
> would include the key name of the aggregate, e.g. "someAggName/someDelName".
> The XmiCas files will then be written into
> $UIMA_CASLOG_BASE_DIRECTORY/someAggName/someDelName/
> UIMA_CASLOG_TYPE_NAME - optional; this is the name of a FeatureStructure in
> the CAS containing a unique string to use to name each XmiCas file. If not
> specified, the XmiCas file name will be NNN.xmi, where NNN is the time in
> microseconds since the component was initialized.
> UIMA_CASLOG_FEATURE_NAME - optional unless TYPE_NAME is specified;
> this parameter gives the string feature to use. An example of type and
> feature names to use would be
> "org.apache.uima.examples.SourceDocumentInformation" and "uri".
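
To make the property semantics above concrete, here is a minimal sketch of how the four properties might be read and turned into output directories. It is an illustration only, not UIMA-AS code: the class `CasLogConfig`, its nested `Target` type, and the `parse` method are hypothetical names. The `:before`, `:after`, and `:delta` modifiers are the syntax proposed in the comment, not an existing feature, and treating "before" as the default is likewise only the suggestion made there.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

/** Hypothetical helper; not part of UIMA or UIMA-AS. */
final class CasLogConfig {

    /** One entry of UIMA_CASLOG_COMPONENT_ARRAY. */
    static final class Target {
        final String delegateKeyPath; // e.g. "someAggName/someDelName"
        final Set<String> modifiers;  // proposed syntax only: before, after, delta
        final Path outputDir;         // base directory plus one level per key

        Target(String delegateKeyPath, Set<String> modifiers, Path outputDir) {
            this.delegateKeyPath = delegateKeyPath;
            this.modifiers = modifiers;
            this.outputDir = outputDir;
        }
    }

    static List<Target> parse(Properties props) {
        // The base directory defaults to the process's current directory.
        String base = props.getProperty("UIMA_CASLOG_BASE_DIRECTORY",
                System.getProperty("user.dir"));
        String components = props.getProperty("UIMA_CASLOG_COMPONENT_ARRAY", "").trim();
        String typeName = props.getProperty("UIMA_CASLOG_TYPE_NAME");
        String featureName = props.getProperty("UIMA_CASLOG_FEATURE_NAME");

        // The feature name is needed whenever a type name is given.
        if (typeName != null && featureName == null) {
            throw new IllegalArgumentException(
                    "UIMA_CASLOG_FEATURE_NAME is required when UIMA_CASLOG_TYPE_NAME is set");
        }

        List<Target> targets = new ArrayList<>();
        if (components.isEmpty()) {
            return targets; // no delegates listed: CAS logging stays off
        }
        for (String spec : components.split("\\s+")) {
            // "key/path:mod1:mod2" -> key path followed by optional modifiers
            String[] parts = spec.split(":");
            Set<String> modifiers =
                    new LinkedHashSet<>(Arrays.asList(parts).subList(1, parts.length));
            // Assumption: "before" applies when neither position is requested.
            if (!modifiers.contains("before") && !modifiers.contains("after")) {
                modifiers.add("before");
            }
            // Each '/'-separated key becomes one directory level under the base.
            Path dir = Paths.get(base, parts[0].split("/"));
            targets.add(new Target(parts[0], modifiers, dir));
        }
        return targets;
    }
}
```

Under these assumptions, starting the JVM with `-DUIMA_CASLOG_COMPONENT_ARRAY="someAggName/someDelName"` and calling `CasLogConfig.parse(System.getProperties())` would yield a single target whose output directory is `someAggName/someDelName` under the base directory.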