Chris A. Mattmann created OODT-667:
--------------------------------------

             Summary: CAS-PGE no longer respects writers and file tags from earlier pgeConfig.xml files
                 Key: OODT-667
                 URL: https://issues.apache.org/jira/browse/OODT-667
             Project: OODT
          Issue Type: Bug
          Components: pge wrapper framework
    Affects Versions: 0.6, 0.5, 0.4
            Reporter: Chris A. Mattmann
            Assignee: Chris A. Mattmann
             Fix For: 0.7


It has been a long-standing bug since Apache OODT 0.3 (that is, in 0.4 and 
beyond) that the updates to CAS-PGE, which simplified its crawling system for 
met extraction based on files and regExp tags and unified it with the 
AutoDetectProductCrawler, have caused CAS-PGE to no longer honor the following 
blocks from pgeConfig.xml files:

{code:xml}
<output>
  <dir>
    <files regExp="someRegExp" metWriter="some.class" args="some args"/>
    <!--...-->
  </dir>
</output>
{code}

This was a conscious decision, discussed by Brian Foster, myself, and others 
on several occasions:

https://issues.apache.org/jira/browse/OODT-426
http://markmail.org/message/oe5tmutu374wqldb

I support Brian's implementation, but I think we took a step back in not 
offering backwards compatibility that simply:

1. still reads the pgeConfig.xml file tags above; and then
2. constructs the appropriate AutoDetectProductCrawler and RenamingConventions 
and other plumbing behind the scenes.
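
As a rough illustration of step 1 only, here is a minimal sketch of reading the 
legacy <files/> entries with the JDK's DOM parser. The class names 
LegacyFilesRule and LegacyPgeConfigReader are hypothetical and are not part of 
OODT; how the parsed rules would then be mapped onto the crawler and naming 
conventions in step 2 is not shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical holder for one legacy <files .../> entry; not an OODT class. */
final class LegacyFilesRule {
  final String regExp;
  final String metWriter;
  final String args;

  LegacyFilesRule(String regExp, String metWriter, String args) {
    this.regExp = regExp;
    this.metWriter = metWriter;
    this.args = args;
  }

  /** True when the whole file name matches this rule's regular expression. */
  boolean matches(String fileName) {
    return fileName.matches(regExp);
  }
}

/** Hypothetical reader for the legacy output/dir/files block. */
final class LegacyPgeConfigReader {
  static List<LegacyFilesRule> read(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      // Collect every <files> element, wherever it sits in the document.
      NodeList files = doc.getElementsByTagName("files");
      List<LegacyFilesRule> rules = new ArrayList<>();
      for (int i = 0; i < files.getLength(); i++) {
        Element e = (Element) files.item(i);
        rules.add(new LegacyFilesRule(
            e.getAttribute("regExp"),
            e.getAttribute("metWriter"),
            e.getAttribute("args")));
      }
      return rules;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse pgeConfig XML", e);
    }
  }
}
```

Each returned rule carries the regExp, metWriter, and args attributes 
unchanged, so a compatibility layer could decide per output file which writer 
to apply.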

Note that one of the key features that becomes important in these situations 
is having CAS-PGE job directories contain the serialized metadata files for 
offline inspection in case there are errors. We have currently lost support 
for that (as evidenced by the removal of the met key MET_FILE_EXT). I am also 
going to add that back in by subclassing AutoDetectProductCrawler in cas-pge 
and overriding its crawling step to also serialize the met files it generates.
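
The subclass-and-override idea can be sketched roughly as follows. This is an 
assumption-laden illustration, not the real crawler API: SketchCrawler is a 
hypothetical stand-in for AutoDetectProductCrawler, handleProduct is an 
invented hook name, and java.util.Properties XML is used in place of the OODT 
met file format. The metFileExt field only plays the role that MET_FILE_EXT 
had.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;

/** Hypothetical stand-in for a crawler base class; NOT the real OODT API. */
abstract class SketchCrawler {
  /** Called once per product file with the metadata extracted for it. */
  protected void handleProduct(Path product, Map<String, String> met) {
    // A real crawler would ingest the product here.
  }
}

/** Overrides the per-file step to also write the metadata next to the product. */
final class MetSerializingCrawler extends SketchCrawler {
  private final String metFileExt; // assumed to play the role of MET_FILE_EXT

  MetSerializingCrawler(String metFileExt) {
    this.metFileExt = metFileExt;
  }

  @Override
  protected void handleProduct(Path product, Map<String, String> met) {
    // Write <product name>.<ext> into the same (job) directory as the product.
    Path metFile = product.resolveSibling(product.getFileName() + "." + metFileExt);
    Properties props = new Properties();
    props.putAll(met);
    try (OutputStream out = Files.newOutputStream(metFile)) {
      props.storeToXML(out, "metadata for " + product.getFileName());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Continue with the normal crawling behaviour.
    super.handleProduct(product, met);
  }
}
```

Writing the met file before delegating to the parent means the file is still 
left in the job directory for inspection if the later ingest step fails.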

That will get us back to full forwards and backwards compatibility starting in 
0.7 for *all* versions of CAS-PGE pgeConfig.xml files. Wish me luck!



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
