Re: Data updates in debian packages

2016-10-31 Thread Russ Allbery
Ole Streicher  writes:

> We need it to put correct time on astronomical registrations, so it is
> most important to have them once they are effective. Having them in
> advance would be an additional plus, however, since e.g. a computer may
> be disconnected during/after the observation, if that happens in a place
> without an internet connection.

Christian's data is excellent.  I can also add that, having followed the
project for a while, I think it's pretty safe to assume that tzdata will
have the updated leap seconds in a released version and therefore in a
Debian stable package before the leap second takes effect with a high
level of reliability, and probably at least several months before.

I was mostly worried if you needed the data ASAP after an IERS
announcement, since the leap second data is sufficiently ancillary to the
tz project that it probably wouldn't, by itself, trigger a new release.
So the update would wait for some other time zone change to be rolled into
a release.

-- 
Russ Allbery (r...@debian.org)   



Re: Data updates in debian packages

2016-10-31 Thread Christian Seiler
On 10/31/2016 10:30 AM, Ole Streicher wrote:
> ---8<-- debian/triggers --
> interest /usr/share/zoneinfo/leap-seconds.list
> ---8<--
> 
> However, I now get the following error when I try to update tzdata:
> 
> dpkg: cycle found while processing triggers:
>  chain of packages whose triggers are or may be responsible:
>   casacore-data-tai-utc -> casacore-data-tai-utc
>  packages' pending triggers which are or may be unresolvable:
>   casacore-data-tai-utc: /usr/share/zoneinfo/leap-seconds.list
> dpkg: error processing package casacore-data-tai-utc (--configure):
>  triggers looping, abandoned
> Errors were encountered while processing:
>  casacore-data-tai-utc
> 
> What is my mistake here?

Well, if your package Depends: on tzdata, then you created a cycle:
tzdata wants to trigger your package, but your package depends on
tzdata.

What you'll want to do is

interest-noawait ...

instead of

interest ...

A detailed explanation is in man 5 deb-triggers together with
/usr/share/doc/dpkg-dev/triggers.txt.gz, but it's not easy to grok.
However, the recommendation in man 5 deb-triggers is something you
should follow, i.e. use -noawait triggers unless you really need
-await triggers for some reason.
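
As a rough sketch of that change (an illustration, not something tested
against the casacore packaging): the only edit needed in debian/triggers is
the directive name. A small shell helper can also flag files that still use
the awaiting form. The helper name find_await_triggers is made up for this
example; interest-await is the explicit spelling of plain interest described
in deb-triggers(5).

```shell
#!/bin/sh
# Sketch only. The intended content of debian/triggers after the change is:
#   interest-noawait /usr/share/zoneinfo/leap-seconds.list

# Print any directives in a triggers file that still use the awaiting form
# ("interest" or its explicit synonym "interest-await").
# Exit status is 0 if at least one such line was found, non-zero otherwise.
find_await_triggers() {
    grep -E '^interest(-await)?[[:space:]]' "$1"
}

# Example use from the top of a source package:
#   find_await_triggers debian/triggers && echo "consider interest-noawait" >&2
```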

Regards,
Christian



Re: Data updates in debian packages

2016-10-31 Thread Ole Streicher
Paul Wise  writes:
> On Sat, Oct 29, 2016 at 8:45 PM, Ole Streicher wrote:
>> The package in question (casacore) wants them in a specific format "CASA
>> table" (which is uniformly used within that package), and dependent
>> packages access this in that specific format. The only way would be to
>> create this table from another leap second table (instead of our current
>> source usno.navy.mil), and to update this every time the original table
>> is updated (which I would have to learn how to do).
>
> You can use dpkg triggers to update files in response to packages
> updating other files.

I tried this, namely (the source package has only one binary
package):

---8<-- debian/triggers --
interest /usr/share/zoneinfo/leap-seconds.list
---8<--

---8<-- debian/postinst --
#!/bin/sh

set -e

case "$1" in
    triggered|configure)
        casacore-update-tai_utc
        ;;
    abort-upgrade|abort-remove|abort-deconfigure)
        ;;
    *)
        echo "postinst called with unknown argument \`$1'" >&2
        exit 1
        ;;
esac

#DEBHELPER#
---8<--

However, I now get the following error when I try to update tzdata:

dpkg: cycle found while processing triggers:
 chain of packages whose triggers are or may be responsible:
  casacore-data-tai-utc -> casacore-data-tai-utc
 packages' pending triggers which are or may be unresolvable:
  casacore-data-tai-utc: /usr/share/zoneinfo/leap-seconds.list
dpkg: error processing package casacore-data-tai-utc (--configure):
 triggers looping, abandoned
Errors were encountered while processing:
 casacore-data-tai-utc

What is my mistake here?

Best regards

Ole



Re: Data updates in debian packages

2016-10-31 Thread Ole Streicher
Christian Seiler  writes:
> On 10/31/2016 09:07 AM, Ole Streicher wrote:
[leap seconds]
>> We need it to put correct time on astronomical registrations, so it is
>> most important to have them once they are effective. Having them in
>> advance would be an additional plus, however, since e.g. a computer may
>> be disconnected during/after the observation, if that happens in a place
>> without an internet connection.
>
> Data might help here, so I've looked at the past 3 leap seconds that
> were introduced [...]
>
> What this does say is that stable/updates and oldstable (LTS) had
> updated leap seconds information slightly less than 3 months before
> the leap second, in some cases even a bit earlier. [...]
>
> Hope this information helps you in evaluating this.

Thank you very much for this detailed information! This helps a lot with
the decision (we will depend on tzdata), and it also gives a good
argument for the discussion upstream.

Best regards

Ole



Re: Data updates in debian packages

2016-10-31 Thread Christian Seiler
On 10/31/2016 09:07 AM, Ole Streicher wrote:
> Russ Allbery  writes:
>> The required timeliness depends a lot on what you're using leap seconds
>> for, and in particular if you need to know about them far in advance, or
>> if it's only necessary to have an updated table before the leap second
>> itself arrives.
> 
> We need it to put correct time on astronomical registrations, so it is
> most important to have them once they are effective. Having them in
> advance would be an additional plus, however, since e.g. a computer may
> be disconnected during/after the observation, if that happens in a place
> without an internet connection.

Data might help here, so I've looked at the past 3 leap seconds that
were introduced (I don't think it makes sense to go further back,
because the one before that was 2009, and that's probably too long
ago to draw conclusions):

Leap second | Jun 2012         | Jun 2015         | Dec 2016
------------+------------------+------------------+-----------------
IERS ann.   |       2012-01-05 |       2015-01-05 |       2016-07-06
tzdata rel. | 2012a 2012-03-01 | 2015a 2015-01-29 | 2016g 2016-09-13
sid         | 2012b 2012-03-06 | 2015a 2015-01-31 | 2016g 2016-09-28
stable      | 2012c 2012-05-05 | 2015a 2015-02-01 | 2016g 2016-10-03
stable PR   |       2012-05-12 |       2015-09-05 |       not yet
            |                  |  (now oldstable) |
oldstable   | (Lenny EOL)      | 2015c 2015-04-17 | 2016h 2016-10-26

"stable" means stable/updates (former volatile), "stable PR" means
the stable point release that gathered up all of stable/updates,
stable-security and stable/proposed-updates, and "oldstable" means
squeeze-lts and wheezy-security. (In both cases they were already LTS,
no leap second in the last 6 years has fallen into a window where we
had oldstable not being LTS.)

Note that the "stable PR" metric just shows you that you don't want
to run a system that needs up to date leap seconds data without
having stable/updates enabled, just because point releases are too
infrequent. (But that would apply to a new package tracking just
the leap seconds data from IERS as well.)

What this does say is that stable/updates and oldstable (LTS) had
updated leap seconds information slightly less than 3 months before
the leap second, in some cases even a bit earlier. If we are going
to assume that in a perfect storm this might be a bit worse, then I
think one can say that roughly 2 months in advance of a leap second
any officially supported Debian version will have an updated tzdata
package. (If you enable the proper repositories.)
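
The lead times in the table can be checked with a little date arithmetic.
The following is only a sketch: it assumes GNU date (for the -d option), and
the function name days_between is invented for the example.

```shell
#!/bin/sh
# Whole days from DATE1 to DATE2 (both YYYY-MM-DD), computed in UTC so that
# daylight saving changes cannot shift the result. Requires GNU date.
days_between() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( (end - start) / 86400 ))
}

# Lead time from the 2016g upload to stable/updates (2016-10-03) to the first
# day after the leap second at the end of December 2016.
days_between 2016-10-03 2017-01-01   # prints 90
```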

(Btw. leap-seconds.list was only introduced upstream in 2013, and
packaged in the binary package in 2015; before that only the binary
rules files for each time zone contained the leap second info. See
. However, since this is used by
DSA, this is going to be kept around.)

Hope this information helps you in evaluating this.

Regards,
Christian



Re: Data updates in debian packages

2016-10-31 Thread Ole Streicher
Russ Allbery  writes:
> The required timeliness depends a lot on what you're using leap seconds
> for, and in particular if you need to know about them far in advance, or
> if it's only necessary to have an updated table before the leap second
> itself arrives.

We need it to put correct time on astronomical registrations, so it is
most important to have them once they are effective. Having them in
advance would be an additional plus, however, since e.g. a computer may
be disconnected during/after the observation, if that happens in a place
without an internet connection.

Best regards

Ole



Re: Data updates in debian packages

2016-10-30 Thread Russ Allbery
Christian Seiler  writes:
> On 10/30/2016 10:20 AM, Ole Streicher wrote:

>> IETF is responsible for internet standards, not for leap seconds. They
>> will take the leap seconds from IERS. I would assume that this
>> connection is well enough established to rely on. I was not so much
>> questioning upstream here, but I worry a bit about the Debian package
>> for tzdata: how sure can I be that tzdata is up to date (wrt upstream)?

> Regular stable updates (via stable/updates, not only point releases)
> happen for that package, in addition to regular uploads to unstable.
> See the timeline in:
> https://tracker.debian.org/pkg/tzdata

> From what I can tell, this is probably the package that's updated in
> stable most consistently in the entirety of Debian. I would really
> recommend that you rely on tzdata directly, this will also save the
> release team a lot of work. (It's much easier for them to approve just a
> single package than 100 packages that need the time zone and/or leap
> second information.)

Speaking as a long-time lurker of the tz mailing list, I would recommend
confirming with upstream that they intend to be a timely (enough) source
for leap second information, as I believe that has been a bit
controversial.  Note that leap second information is used in the tzdata
package for a fairly ancillary purpose (the maintenance of the "right"
time zones, which almost no one uses), and is not a primary goal of the
project.

tzdata definitely just takes a copy of leap second information from IERS
(actually, I think they may pull the data from NIST, which gets it from
IERS).  IERS is the correct upstream source.

If many Debian packages want a high-quality, timely source of leap
seconds, it might be better to have a separate package devoted to that, so
that any update timeliness is not entangled with issues with tzdata.  That
said, tzdata is, as mentioned, a very reliably updated package in Debian
stable releases, so if upstream is willing, maybe it's fine to rely on
that.

The required timeliness depends a lot on what you're using leap seconds
for, and in particular if you need to know about them far in advance, or
if it's only necessary to have an updated table before the leap second
itself arrives.

-- 
Russ Allbery (r...@debian.org)   



Re: Data updates in debian packages

2016-10-30 Thread Ole Streicher
On 30.10.2016 04:38, Paul Wise wrote:
> On Sat, Oct 29, 2016 at 8:45 PM, Ole Streicher wrote:
>> The update script itself could even be distributed with the casacore
>> package itself. And for simplicity I would make
>> casacore-data-autoupdater a binary package within the casacore source
>> package (since this is the main dependency anyway).
>>
>> Comments on that? What would be the best dependency specification then?
>> casacore-data-autoupdater "suggests" casacore-data-XXX and/or vice-versa?
> 
> casacore-data-autoupdater Enhances: casacore-data-XXX

Isn't this redundant? I always thought that "Enhances" is just the
reverse of "Suggests" ("A enhances B  <=> B suggests A").
The disadvantage of "Enhances" would be that it would need to know which
packages there are -- so every time a new data package is added, we
would need to update the updater package.

> casacore-data-XXX Recommends: casacore-data-autoupdater

This would raise privacy concerns, since recommended packages are
installed by default, and this one would connect to some .mil domain
servers. Why not "suggests"?

>>> Make sure that any security/privacy consequences of the non-apt update
>>> method are dealt with.
>>
>> If you have comments on my proposal, please comment.
> 
> I don't know enough about the formats and the download processes to comment.

Formats, download processes and further processing are data dependent
(and therefore part of the casacore-data-XXX package). The autoupdater
would just execute the update scripts that need to be provided by the
individual packages.
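
A minimal sketch of such a dispatcher, under the assumption that each data
package drops an executable update script into a shared hook directory. The
function name and the directory path are hypothetical; nothing in this
thread fixes them.

```shell
#!/bin/sh
# Run every executable update script in a hook directory, one after another.
# A failing script is reported but does not stop the remaining ones.
# Returns 0 only if all scripts succeeded.
run_updaters() {
    hook_dir=$1
    failed=0
    for script in "$hook_dir"/*; do
        # Skips non-executables, and the unexpanded pattern if the
        # directory is empty.
        [ -x "$script" ] || continue
        if ! "$script"; then
            echo "update script failed: $script" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Hypothetical location for the per-package scripts:
#   run_updaters /usr/share/casacore-data/update.d
```

Keeping the format-specific logic in the scripts means the updater package
does not need to know which data packages exist.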

Best regards

Ole



Re: Data updates in debian packages

2016-10-30 Thread Christian Seiler
On 10/30/2016 10:20 AM, Ole Streicher wrote:
> IETF is responsible for internet standards, not for leap seconds. They
> will take the leap seconds from IERS. I would assume that this
> connection is well enough established to rely on. I was not so much
> questioning upstream here, but I worry a bit about the Debian package
> for tzdata: how sure can I be that tzdata is up to date (wrt upstream)?

Regular stable updates (via stable/updates, not only point releases)
happen for that package, in addition to regular uploads to unstable.
See the timeline in:
https://tracker.debian.org/pkg/tzdata

From what I can tell, this is probably the package that's updated in
stable most consistently in the entirety of Debian. I would really
recommend that you rely on tzdata directly, this will also save the
release team a lot of work. (It's much easier for them to approve
just a single package than 100 packages that need the time zone
and/or leap second information.)

Regards,
Christian



Re: Data updates in debian packages

2016-10-30 Thread Ole Streicher
On 30.10.2016 04:42, Paul Wise wrote:
> On Sun, Oct 30, 2016 at 4:36 AM, Ole Streicher wrote:
> 
>> The canonical source for leap seconds is the IERS. Our current plan was
>> to take the leap second list from there and build our package from this
>> (as it is done in the casacore-data upstream). This guaranteed that we
>> always have the current definition (... as long as we release our updated
>> package ASAP).
>>
>> When we switch that to tzdata, then we get the leap second from a place
>> that is not strictly the original source, but may have some delay: first
>> the tzdata upstream package needs to be updated, and then it needs to be
>> packaged (... and possibly backported).
>>
>> So my question is: how safe is it to assume that this whole process is
>> quick (let's say: a few weeks)? If someone works later on Stretch and
>> has an outdated leap second, this could cause problems. Especially if he
>> has no direct information about how current the leap second
>> definition is (which he would have in the case of a leap second package
>> taking the value directly from IERS -- we could use the date of the
>> announcement as version number there).
> 
> Where does the IERS data come from?

IERS is the body which actually decides about leap seconds and
announces them in this file:

ftp://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat

I couldn't find the original source now, but see e.g. Wikipedia: "Among
its other functions, the IERS is responsible for announcing leap seconds."

> I think the tzdata version of the data comes from the IETF:

IETF is responsible for internet standards, not for leap seconds. They
will take the leap seconds from IERS. I would assume that this
connection is well enough established to rely on. I was not so much
questioning upstream here, but I worry a bit about the Debian package
for tzdata: how sure can I be that tzdata is up to date (wrt upstream)?

Best regards

Ole



Re: Data updates in debian packages

2016-10-29 Thread Paul Wise
On Sun, Oct 30, 2016 at 4:36 AM, Ole Streicher wrote:

> The canonical source for leap seconds is the IERS. Our current plan was
> to take the leap second list from there and build our package from this
> (as it is done in the casacore-data upstream). This guaranteed that we
> always have the current definition (... as long as we release our updated
> package ASAP).
>
> When we switch that to tzdata, then we get the leap second from a place
> that is not strictly the original source, but may have some delay: first
> the tzdata upstream package needs to be updated, and then it needs to be
> packaged (... and possibly backported).
>
> So my question is: how safe is it to assume that this whole process is
> quick (let's say: a few weeks)? If someone works later on Stretch and
> has an outdated leap second, this could cause problems. Especially if he
> has no direct information about how current the leap second
> definition is (which he would have in the case of a leap second package
> taking the value directly from IERS -- we could use the date of the
> announcement as version number there).

Where does the IERS data come from?

I think the tzdata version of the data comes from the IETF:

https://www.ietf.org/timezones/data/leap-seconds.list

I would suggest discussing it with the tzdata maintainer and tzdata
upstream. It may be that you end up packaging the leap seconds data in
a new package, or it may be that you end up leaving them in the tzdata
package.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise



Re: Data updates in debian packages

2016-10-29 Thread Paul Wise
On Sat, Oct 29, 2016 at 8:45 PM, Ole Streicher wrote:

> The package in question (casacore) wants them in a specific format "CASA
> table" (which is uniformly used within that package), and dependent
> packages access this in that specific format. The only way would be to
> create this table from another leap second table (instead of our current
> source usno.navy.mil), and to update this every time the original table
> is updated (which I would have to learn how to do).

You can use dpkg triggers to update files in response to packages
updating other files.

> however it worries me a bit that leap seconds are not directly mentioned
> there. How sure can one be that they will be installed in-time?

They are in the tzdata package, you can use Depends to ensure it is installed.

> What would be the default place?
> /var/lib/casacore/data?

Sounds good.

> What I would do here is a separate package "casacore-data-autoupdater"
> that provides that service for all installed casacore-data-XXX packages.
> That package would install itself into /etc/cron.daily and, when called,

Please ensure that all the systems that install this package do not do
the download at the same time, to spread the load out a bit. Check out
these files in the apt package for a good way to do that, which avoids
cron jobs sleeping for ages on systemd-based systems and uses random
sleeps with cron on other systems:

/etc/cron.daily/apt-compat
/lib/systemd/system/apt-daily.service
/lib/systemd/system/apt-daily.timer
/usr/lib/apt/apt.systemd.daily
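
This is not a copy of what the apt files do, just a sketch of the
random-sleep idea for the cron case. The function name random_delay and the
1800-second window are arbitrary choices for illustration.

```shell
#!/bin/sh
# Print a pseudo-random integer in [0, max), taken from /dev/urandom.
# Only two bytes are read, so max should be at most 65536.
random_delay() {
    max=$1
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( n % max ))
}

# In a cron.daily script one might then delay the download like this:
#   sleep "$(random_delay 1800)"
```

On systemd-based systems a timer unit with a randomized delay avoids keeping
a sleeping cron job around, which is the point of the apt files listed above.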

> The update script itself could even be distributed with the casacore
> package itself. And for simplicity I would make
> casacore-data-autoupdater a binary package within the casacore source
> package (since this is the main dependency anyway).
>
> Comments on that? What would be the best dependency specification then?
> casacore-data-autoupdater "suggests" casacore-data-XXX and/or vice-versa?

casacore-data-autoupdater Enhances: casacore-data-XXX
casacore-data-XXX Recommends: casacore-data-autoupdater

>> Make sure that any security/privacy consequences of the non-apt update
>> method are dealt with.
>
> If you have comments on my proposal, please comment.

I don't know enough about the formats and the download processes to comment.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise



Re: Data updates in debian packages

2016-10-29 Thread Ole Streicher
Ben Finney  writes:
> Ole Streicher  writes:
>> How sure can one be that they will be installed in-time?
>
> This confuses me too. If the file is installed, you have the
> leap-seconds data for the installed version of ‘tzdata’.
>
> So I think I don't understand. What specific concern do you have about
> the leap seconds data from the ‘tzdata’ package?

The canonical source for leap seconds is the IERS. Our current plan was
to take the leap second list from there and build our package from this
(as it is done in the casacore-data upstream). This guaranteed that we
always have the current definition (... as long as we release our updated
package ASAP).

When we switch that to tzdata, then we get the leap second from a place
that is not strictly the original source, but may have some delay: first
the tzdata upstream package needs to be updated, and then it needs to be
packaged (... and possibly backported).

So my question is: how safe is it to assume that this whole process is
quick (let's say: a few weeks)? If someone works later on Stretch and
has an outdated leap second, this could cause problems. Especially if he
has no direct information about how current the leap second
definition is (which he would have in the case of a leap second package
taking the value directly from IERS -- we could use the date of the
announcement as version number there).
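
For what it is worth, the leap-seconds.list file itself carries an expiry
date on its "#@" line, given in seconds since 1900-01-01, which gives some
direct information about how current the installed data is. A sketch of
reading it follows; the function names are made up for this example.

```shell
#!/bin/sh
# Print the expiry time of a leap-seconds.list file as seconds since the
# Unix epoch. The file stores it on the "#@" line as seconds since
# 1900-01-01, so the 2208988800 seconds between the two epochs are subtracted.
list_expiry_epoch() {
    awk '/^#@/ { printf "%.0f\n", $2 - 2208988800; exit }' "$1"
}

# Succeed if the list has an expiry line and, according to the local clock,
# has not yet expired.
list_is_current() {
    expiry=$(list_expiry_epoch "$1")
    [ -n "$expiry" ] && [ "$(date -u +%s)" -lt "$expiry" ]
}

# Example:
#   list_is_current /usr/share/zoneinfo/leap-seconds.list || echo "stale" >&2
```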

Best regards

Ole



Re: Data updates in debian packages

2016-10-29 Thread Ben Finney
Ole Streicher  writes:

> Probably the canonical source would be:
>
> > tzdata: /usr/share/zoneinfo/leap-seconds.list

I agree, it would be good for a package to build its custom leap-seconds
data starting from that canonical source.

> however it worries me a bit that leap seconds are not directly mentioned
> there.

I don't understand that statement. Reading that file (from ‘tzdata’
version “2016h-1”), I see extensive discussion of leap seconds in the
large header of the file.

What mention are you expecting but not seeing, and how does the header
block in that file not constitute “direct mention of leap seconds”?

> How sure can one be that they will be installed in-time?

This confuses me too. If the file is installed, you have the
leap-seconds data for the installed version of ‘tzdata’.

So I think I don't understand. What specific concern do you have about
the leap seconds data from the ‘tzdata’ package?

-- 
 \ “Nature hath given men one tongue but two ears, that we may |
  `\  hear from others twice as much as we speak.” —Epictetus, |
_o__)  _Fragments_ |
Ben Finney



Re: Data updates in debian packages

2016-10-29 Thread Ole Streicher
Hi Paul,

On 29.10.2016 03:37, Paul Wise wrote:
> On Fri, Oct 28, 2016 at 6:38 PM, Ole Streicher wrote:
>> We have the problem (I am not sure whether I posted about this already),
>> that the "casacore" package needs additional "casacore-data-XXX"
>> packages, providing the basic data to work with casacore. Some of the
>> data are almost immutable, others (for example leap seconds) are
>> changing every year or so, and others change quite rapidly (high
>> precision ephemerides forecasts). They all can be downloaded from some FTP
>> servers.
> 
> FYI leap seconds are already packaged multiple times in Debian, so
> please do not add another copy of them.

The package in question (casacore) wants them in a specific format "CASA
table" (which is uniformly used within that package), and dependent
packages access this in that specific format. The only way would be to
create this table from another leap second table (instead of our current
source usno.navy.mil), and to update this every time the original table
is updated (which I would have to learn how to do).
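
A possible first step for such a conversion, as a sketch only: the data lines
of leap-seconds.list hold a timestamp in seconds since 1900-01-01 followed by
the TAI-UTC offset, so they can be turned into (MJD, offset) pairs with awk.
Writing the result into an actual CASA table is not shown here, and the
function name leap_table is invented for the example.

```shell
#!/bin/sh
# Print "MJD TAI-UTC" pairs from a leap-seconds.list style file.
# 1900-01-01 is MJD 15020, so MJD = seconds_since_1900 / 86400 + 15020.
leap_table() {
    awk '
        /^#/ || NF < 2 { next }   # skip comment lines and blank lines
        { printf "%d %d\n", $1 / 86400 + 15020, $2 }
    ' "$1"
}

# Example:
#   leap_table /usr/share/zoneinfo/leap-seconds.list
```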

Probably the canonical source would be:

> tzdata: /usr/share/zoneinfo/leap-seconds.list

however it worries me a bit that leap seconds are not directly mentioned
there. How sure can one be that they will be installed in-time?

>> How should the update service work? Can it just overwrite the existing
>> files? How should one handle the case where an update (with possibly older
>> data) is installed, so as not to downgrade the data?
> 
> Check out pciutils/usbutils and similar.
> 
> Essentially:
> 
> Make the applications look in /var by default.
> 
> Put the packaged data in /usr/share.
> 
> Have the postinst symlink from /var to /usr/share when the /var
> location is missing or older than the /usr/share location.

Looks like a plan ;-) I'll start there. What would be the default place?
/var/lib/casacore/data?

> Have an update script that can be run by the sysadmin or from cron
> that downloads the latest version and atomically replaces the data in
> the /var location.

What I would do here is a separate package "casacore-data-autoupdater"
that provides that service for all installed casacore-data-XXX packages.
That package would install itself into /etc/cron.daily and, when called,
check the age of each installed data table and update if necessary.
Having this service centralized would avoid having a debconf script in
each package, each asking the user separately whether he wants to
auto-update that table.
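
The age check could look roughly like this. It is a sketch that relies on
find's -mtime test; the function name needs_update and the 7-day threshold in
the usage line are arbitrary.

```shell
#!/bin/sh
# Succeed if FILE is missing or was last modified more than MAX_DAYS days ago.
needs_update() {
    file=$1
    max_days=$2
    [ ! -e "$file" ] && return 0
    # find prints the name only if the age test matches.
    [ -n "$(find "$file" -prune -mtime +"$max_days")" ]
}

# Example with the directory discussed earlier in this thread:
#   needs_update /var/lib/casacore/data/TAI_UTC 7 && echo "refresh needed"
```

The table name TAI_UTC in the usage line is only a placeholder.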

The name and the description of the package would make it clear that it
will access the data via net.

The update script itself could even be distributed with the casacore
package itself. And for simplicity I would make
casacore-data-autoupdater a binary package within the casacore source
package (since this is the main dependency anyway).

Comments on that? What would be the best dependency specification then?
casacore-data-autoupdater "suggests" casacore-data-XXX and/or vice-versa?

> Make sure that any security/privacy consequences of the non-apt update
> method are dealt with.

If you have comments on my proposal, please comment.

Best regards

Ole



Re: Data updates in debian packages

2016-10-28 Thread Paul Wise
On Fri, Oct 28, 2016 at 8:09 PM, Alec Leamas wrote:

> I wonder if the very idea to package this data is the correct one. What if
> you instead package an update service which is able to download all required
> data and make it available somewhere under /var/lib, i.e., not in any
> package at all? And then add some packages from which you can seed this
> service in situations where it's needed: off-line, tests, persistent data
> etc...

This leads to a package that is completely broken when installed on an
offline system from a mirror on DVD or external hard drive.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise



Re: Data updates in debian packages

2016-10-28 Thread Paul Wise
On Fri, Oct 28, 2016 at 6:38 PM, Ole Streicher wrote:

> We have the problem (I am not sure whether I posted about this already),
> that the "casacore" package needs additional "casacore-data-XXX"
> packages, providing the basic data to work with casacore. Some of the
> data are almost immutable, others (for example leap seconds) are
> changing every year or so, and others change quite rapidly (high
> precision ephemerides forecasts). They all can be downloaded from some FTP
> servers.

FYI leap seconds are already packaged multiple times in Debian, so
please do not add another copy of them.

tzdata: /usr/share/zoneinfo/leap-seconds.list
swe-basic-data: /usr/share/libswe/ephe/seleapsec.txt
ptpd: /usr/share/ptpd/leap-seconds.list.28dec2015
pike7.8-core: /usr/lib/pike7.8/modules/Calendar.pmod/tzdata/leapseconds
pike8.0-core: /usr/lib/pike8.0/modules/Calendar.pmod/tzdata/leap-seconds.list
pike8.0-core: /usr/lib/pike8.0/modules/Calendar.pmod/tzdata/leapseconds
libjs-jquery-flot-docs: /usr/share/doc/libjs-jquery-flot-docs/examples/axes-time-zones/tz/leapseconds
mariadb-test-data: /usr/share/mysql/mysql-test/std_data/mysql5613mysql/time_zone_leap_second.MYD
mariadb-test-data: /usr/share/mysql/mysql-test/std_data/mysql5613mysql/time_zone_leap_second.MYI
mariadb-test-data: /usr/share/mysql/mysql-test/std_data/mysql5613mysql/time_zone_leap_second.frm

> How should the update service work? Can it just overwrite the existing
> files? How should one handle the case where an update (with possibly older
> data) is installed, so as not to downgrade the data?

Check out pciutils/usbutils and similar.

Essentially:

Make the applications look in /var by default.

Put the packaged data in /usr/share.

Have the postinst symlink from /var to /usr/share when the /var
location is missing or older than the /usr/share location.

Have an update script that can be run by the sysadmin or from cron
that downloads the latest version and atomically replaces the data in
the /var location.

Make sure that any security/privacy consequences of the non-apt update
method are dealt with.
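
The two file operations in that list could be sketched as follows. The
function names are invented, the "-nt" test is not in POSIX but is supported
by dash and bash, and mktemp with a template argument is assumed to be the
coreutils version.

```shell
#!/bin/sh
# Point VAR_FILE at the packaged copy if VAR_FILE is missing or older.
seed_from_package() {
    pkg_file=$1
    var_file=$2
    if [ ! -e "$var_file" ] || [ "$pkg_file" -nt "$var_file" ]; then
        mkdir -p "$(dirname "$var_file")"
        ln -sf "$pkg_file" "$var_file"
    fi
}

# Replace VAR_FILE with NEW_FILE atomically: copy next to the target, then
# rename over it, so readers see either the old or the new data, never a
# partially written file. The packaged copy is left untouched.
atomic_replace() {
    new_file=$1
    var_file=$2
    tmp=$(mktemp "$var_file.XXXXXX")
    cp "$new_file" "$tmp"
    chmod 644 "$tmp"
    mv -f "$tmp" "$var_file"
}
```

seed_from_package would be called from the postinst, atomic_replace from the
update script after a successful download.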

-- 
bye,
pabs

https://wiki.debian.org/PaulWise



Re: Data updates in debian packages

2016-10-28 Thread Alec Leamas



On 28/10/16 12:38, Ole Streicher wrote:

> Hi,
>
> My question is now how to provide a good and consistent packaging:
> Usually, one would just put the data into a package. This works nicely
> for the immutable data, and reasonably for the slowly changing data. The
> fast changing data shall be available for all people, but not everyone
> needs a daily update. So, for consistency, and to have them available in
> CI and build time tests, I would like to also package them directly, but
> then to provide an (optional) update service.
>
> How should the update service work? Can it just overwrite the existing
> files? How should one handle the case where an update (with possibly
> older data) is installed, so as not to downgrade the data?



I wonder if the very idea to package this data is the correct one. What
if you instead package an update service which is able to download all
required data and make it available somewhere under /var/lib, i.e., not
in any package at all? And then add some packages from which you can
seed this service in situations where it's needed: off-line, tests,
persistent data etc...



Cheers!

--alec