Manifest processing has a very limited use case. Why can't the manifest
be processed with a PlainTextEntityProcessor, plus a Transformer written
to read the lines using a regex?
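
Roughly, and only as an untested sketch: PlainTextEntityProcessor reads the
whole file into a single field called plainText, so the transformer only has
to split that value into lines and keep the ones that match a regex. The class
name ManifestLineSplitter, the fileName column and the "ADD <path>" line
format below are assumptions made up for illustration. A real DIH transformer
would extend org.apache.solr.handler.dataimport.Transformer and do this work
in transformRow(row, context); the Solr dependency is left out here so the
example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ManifestLineSplitter {
    // Hypothetical manifest format: "ADD <path>"; group 1 captures the path.
    private static final Pattern ADD_LINE = Pattern.compile("^ADD\\s+(.*)$");

    // Mirrors the shape of a DIH transformRow(): one input row in, many rows out.
    public static List<Map<String, Object>> transformRow(Map<String, Object> row) {
        List<Map<String, Object>> rows = new ArrayList<>();
        Object text = row.get("plainText");
        if (text == null) {
            return rows; // nothing was read, so emit no rows
        }
        for (String line : text.toString().split("\\r?\\n")) {
            Matcher m = ADD_LINE.matcher(line.trim());
            if (m.matches()) {
                Map<String, Object> out = new HashMap<>();
                out.put("fileName", m.group(1)); // hypothetical column name
                rows.add(out);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("plainText", "ADD docs/a.xml\nskipped line\nADD docs/b.xml\n");
        for (Map<String, Object> r : transformRow(row)) {
            System.out.println(r.get("fileName"));
        }
    }
}
```

Each emitted row could then drive a nested entity that indexes the named file.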



--Noble

On Mon, Mar 9, 2009 at 8:30 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:
> Hello,
>
> I have almost finished a new DIH EntityProcessor which
> I am calling the ManifestEntityProcessor. It is designed
> around the idea that whatever daemon is used to maintain
> your set of a few hundred thousand XML documents, it is
> likely to drop a report or log file explaining what has
> been changed within your content store. This assumes a
> file-based content repository.
>
> The ManifestEntityProcessor is used as follows:
>
>       <entity name="jc"
>               processor="ManifestEntityProcessor"
>               baseDir="/Volumes/Techmore/ts/aaa/schema/data"
>               rootEntity="false"
>               dataSource="null"
>
>               allowRegex="^.*\.xml$"
>               manifestFileName="/Volumes/ts/man-find.txt"
>               manifestAddRegex="(.*)$"
>               >
>
> The idea is you have a log file or other report, perhaps
> from tar or zip, and you wish to use this to control the
> indexing of the new content. The new entity fields are as
> follows.
>
> manifestFileName is the name of the manifest file. If
>                 this value is relative, it is assumed to
>                 be relative to baseDir. Required.
>
> manifestAddRegex is a required regex to identify lines
>                 which when matched should cause docs to
>                 be added to the index.
>
> manifestDelRegex is an optional regex to identify lines
>                 which when matched should cause docs to
>                 be deleted from the index **PLANNED**
>
> allowRegex       a required regex to identify the portion
>                 of the ADD/DELete line identified above
>                 which contains the file or pathname to be
>                 ADDed or DELeted. If the resulting value
>                 is relative, it is assumed to be relative
>                 to baseDir.
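
One reading of the "relative to baseDir" rule above, as a sketch only and not
the actual ManifestEntityProcessor code: an absolute value is used as-is, and
anything else is resolved against baseDir. ManifestPaths and resolve() are
hypothetical names, and the example paths are invented.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ManifestPaths {
    // Resolve a manifest file name, or a path taken from a manifest line.
    // Relative values are assumed to be relative to baseDir.
    public static Path resolve(String baseDir, String value) {
        Path p = Paths.get(value);
        return p.isAbsolute() ? p : Paths.get(baseDir).resolve(p).normalize();
    }

    public static void main(String[] args) {
        System.out.println(resolve("/data/base", "sub/doc.xml"));
        System.out.println(resolve("/data/base", "/other/x.xml"));
    }
}
```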
>
> What do I do next?
>   Raise a JIRA issue and add the code?
>   Is DIH the right place to add this?
>   Suggestions for a different name?
>   Suggestions on how to do the delete bitty from within an entity?
>
> Regards Fergus.
>
>
>
> --
>
> ===============================================================
> Fergus McMenemie               Email:fer...@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
>
> Unix/Mac/Intranets             Analyst Programmer
> ===============================================================
>



-- 
--Noble Paul
