Hello Andreas,
I updated blends_metadata_gatherer.py.
> From first intuition I would think it might make sense to add single
> paragraphs to the configfile, like
>
> blends-all
> blend-med
> blend-edu
> blend-gis
> blend-...
>
I added the above paragraphs inside config-ullman.yaml.
When run with blends-all, the gatherer processes every available Blend;
otherwise it processes only the selected Blend.
I created the single blend paragraphs using <<: *blends-conf in case we
need to override any of the blends-all attributes.
Each Blend now has its own log file, named:
blends_metadata_gatherer-BLEND.log
If the gatherer fails before it updates any Blend, it logs to
blends_metadata_gatherer-default.log instead.
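To illustrate the log file naming, here is a minimal sketch. It is not the
actual gatherer code: the helper name get_blend_logger and the logdir
parameter are made up for this example.

```python
import logging
import os


def get_blend_logger(blend=None, logdir="."):
    """Return a logger writing to blends_metadata_gatherer-<blend>.log.

    Hypothetical helper: falls back to the "default" log file when no
    Blend has been selected yet.
    """
    name = blend or "default"
    logger = logging.getLogger("blends_metadata_gatherer." + name)
    if not logger.handlers:  # do not attach a second handler on repeated calls
        path = os.path.join(logdir, "blends_metadata_gatherer-%s.log" % name)
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Calling get_blend_logger("debian-med") would then write to
blends_metadata_gatherer-debian-med.log, and get_blend_logger() to the
default log.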
To check whether a task file has changed, I added a "hashkey" column to
blends_tasks. When a task is imported I save an md5 hash in
blends_tasks. Before I delete a task file and add it again from scratch, I
check whether its hashkey has changed. So once you have run the new
gatherer a first time to store the initial hashkeys, later runs will only
delete and re-add the changed tasks.
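The idea behind the hashkey check can be sketched as follows. This is only
an illustration, not the gatherer's code: the function names are made up,
and hashing the whole task file text as UTF-8 is an assumption.

```python
import hashlib


def task_hashkey(task_text):
    """Return the md5 hex digest of a task file's content (assumed UTF-8)."""
    return hashlib.md5(task_text.encode("utf-8")).hexdigest()


def task_changed(stored_hashkey, task_text):
    """Return True if the task is new (no stored hashkey) or has changed."""
    return stored_hashkey != task_hashkey(task_text)
```

A task whose stored hashkey equals the freshly computed one can be skipped;
only tasks for which task_changed() returns True need to be deleted and
imported again.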
With this approach I could not delete and re-add the Blend entry in the
blends_metadata table (because of the references from blends_tasks etc.),
so I check whether the Blend already exists. If it does, I update the entry
to save any changes; otherwise I use blends_metadata_insert to create a new
entry.
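As a rough sketch of this update-or-insert logic (using an in-memory
sqlite3 database as a stand-in for UDD, with a simplified, hypothetical
two-column blends_metadata table and a made-up save_blend helper):

```python
import sqlite3


def save_blend(cur, blend, title):
    """Update the row for `blend` if it exists, otherwise insert a new one."""
    cur.execute("SELECT 1 FROM blends_metadata WHERE blend = ?", (blend,))
    if cur.fetchone():
        # The row is kept, so rows referencing it stay valid.
        cur.execute("UPDATE blends_metadata SET title = ? WHERE blend = ?",
                    (title, blend))
    else:
        cur.execute("INSERT INTO blends_metadata (blend, title) VALUES (?, ?)",
                    (blend, title))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical schema, much smaller than the real table.
cur.execute("CREATE TABLE blends_metadata (blend TEXT PRIMARY KEY, title TEXT)")
save_blend(cur, "debian-med", "Debian Med")        # first call inserts
save_blend(cur, "debian-med", "Debian Med Blend")  # second call updates in place
```

Because the existing row is never deleted, the references from blends_tasks
are not disturbed.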
You can test the gatherer. Any feedback or comments are more than welcome :-).
I will now check on the following (quoting from a previous mail of yours):
c) try to make the insertion procedure itself more efficient by for
instance:
- check, whether we could speed up the check for a package that
just exists in UDD
- inject all packages in one rush
Kind regards
Emmanouil