Re: [Rpm-ecosystem] Some points about zchunk

2018-07-09 Thread Jonathan Dieter
On Mon, 2018-07-09 at 08:59 +, Michael Schroeder wrote:
> I thought about this a bit more over the weekend, and maybe we
> should do this in a bit more general way. Basically zchunk is
> just another compression format, like "xz" or "zstd". If we
> want to support yet another compression format, we probably wouldn't
> want to add new attributes to the existing elements, but instead
> add new elements. E.g.
> 
> 
>   
>   ...
> 
> 
>   
>   ...
> 
> 
> We might also want to add a "format" attribute in case we want
> to switch from "xml" to something that can be parsed faster,
> like "json".
> 
> The zchunk compression format would be the same, but with added
> "header-size" and "header-checksum" elements (so back to what
> you had earlier):
> 
> 
>   
>   ...
>   ...
>   ...
>   ...
>   ...
>   ...
>   ...
> 
> 
> The problem with all this is that we don't know how all the
> repomd.xml parsers behave when there are multiple <data> elements
> with the same type, so we might need to annotate the "type" with
> the compression/format, e.g. "primary@zchunk".

I had originally planned to do something along these lines (I think I
used primary-zck rather than primary@zchunk), but realized that this
pushed the "choose best format" code into the top-level tools, rather
than leaving the decision in librepo.

I suppose if librepo grew the ability to understand that primary@zchunk
matches primary, it could work, but that would take some work, I
think.
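
To make the matching idea concrete, here is a minimal sketch of how a library could treat "primary@zchunk" as a variant of "primary" and pick the best variant itself, so the top-level tools never see the choice. The helper names (split_type, pick_best) and the preference order are assumptions made up for illustration; none of this is librepo code.

```python
# Hypothetical helpers for illustration only; not part of librepo.
PREFERENCE = ["zchunk", "zstd", "xz", "gz"]  # assumed ranking, best first


def split_type(annotated):
    """Split 'primary@zchunk' into ('primary', 'zchunk').

    A plain type such as 'primary' yields ('primary', None).
    """
    base, sep, fmt = annotated.partition("@")
    return base, (fmt if sep else None)


def pick_best(types, wanted, supported):
    """Return the annotated type for `wanted` with the best supported format.

    Unannotated entries are kept as a last-resort fallback.
    Returns None when nothing matches.
    """
    candidates = []
    for t in types:
        base, fmt = split_type(t)
        if base != wanted:
            continue
        if fmt is None:
            # No annotation: rank below every known format.
            candidates.append((len(PREFERENCE), t))
        elif fmt in supported and fmt in PREFERENCE:
            candidates.append((PREFERENCE.index(fmt), t))
    return min(candidates)[1] if candidates else None
```

A caller that asks for "primary" would get "primary@zchunk" when zchunk is supported and plain "primary" otherwise.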

What would be worth the effort would be switching back to header-size
and header-checksum, and making sure that createrepo can create zchunk-
only metadata as well as the current plan of zchunk+gz metadata.  In
other words, we only use zck-loc and zck-timestamp if it's zchunk+gz.
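
For reference, the check that header-size and header-checksum enable can be sketched roughly as follows: take the first header-size bytes of the file and compare their digest with the advertised checksum before downloading anything else. This is a simplification under stated assumptions: it treats the checksum as a plain digest over those bytes, which may not match how the zchunk format actually computes it, and verify_header is a made-up name.

```python
import hashlib


def verify_header(prefix, header_size, expected_hex, algorithm="sha256"):
    """Check the first `header_size` bytes of `prefix` against a hex digest.

    Assumption: the checksum is a plain digest of exactly those bytes.
    Returns False if fewer than `header_size` bytes are available.
    """
    if len(prefix) < header_size:
        return False
    digest = hashlib.new(algorithm, prefix[:header_size]).hexdigest()
    return digest == expected_hex.lower()
```

Only if this returns True would the client go on to work out which chunks it still needs.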

Jonathan
___
Rpm-ecosystem mailing list
Rpm-ecosystem@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-ecosystem


Re: [Rpm-ecosystem] Is there anything I can do to help zchunk reviews along?

2018-07-09 Thread Florian Festi
On 06/29/2018 01:09 PM, Jonathan Dieter wrote:
> Ok, I've put together an initial proposal at
> https://fedoraproject.org/wiki/Changes/Zchunk_Metadata.
In case you need another argument why this is important:

Fedora is still growing at a linear or maybe slightly above linear
rate, so the amount of metadata is going to continue to increase.

I updated my "Growth of Fedora" document, which I created in 2011 after
the release of Fedora 14. IIRC my estimate back then was that we need to
be able to handle the metadata of around 50,000 packages to be safe for
the next 5 years. This lines up pretty nicely with the plot, which you
can find on sheet 2 of the document.

Florian

-- 

Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Paul Argiry, Charles Cachera, Michael Cunningham,
Michael O'Neill


Fedora-Statistics.ods
Description: application/vnd.oasis.opendocument.spreadsheet


Re: [Rpm-ecosystem] Some points about zchunk

2018-07-09 Thread Michael Schroeder
On Sun, Jul 08, 2018 at 07:45:36PM +0100, Jonathan Dieter wrote:
> On Fri, 2018-07-06 at 11:48 +, Michael Schroeder wrote:
> > On Thu, Jul 05, 2018 at 08:07:58PM +0300, Jonathan Dieter wrote:
> > > My proposal is here:
> > > https://www.jdieter.net/downloads/zchunk/repomd.dtd
> > > 
> > > In summary, I'm just adding extra zchunk attributes to the main file
> > > element:
> > > zck-location
> > > header-checksum
> > > header-size
> > > zck-timestamp
> > > 
> > > librepo first downloads header-size of the file and then verifies that
> > > the header checksum matches and is valid.
> > 
> > Please use zck-header-checksum and zck-header-size instead.
> 
> Ok, will do.

I thought about this a bit more over the weekend, and maybe we
should do this in a bit more general way. Basically zchunk is
just another compression format, like "xz" or "zstd". If we
want to support yet another compression format, we probably wouldn't
want to add new attributes to the existing elements, but instead
add new elements. E.g.


  
  ...


  
  ...


We might also want to add a "format" attribute in case we want
to switch from "xml" to something that can be parsed faster,
like "json".

The zchunk compression format would be the same, but with added
"header-size" and "header-checksum" elements (so back to what
you had earlier):


  
  ...
  ...
  ...
  ...
  ...
  ...
  ...


The problem with all this is that we don't know how all the
repomd.xml parsers behave when there are multiple <data> elements
with the same type, so we might need to annotate the "type" with
the compression/format, e.g. "primary@zchunk".
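
To illustrate the parser concern, here is a small sketch of what can go wrong. The sample markup is invented for this example (no namespace, and a made-up "compression" attribute) and is not the markup proposed above. A parser that indexes entries by type alone silently keeps only the last one, while one that groups by type sees both.

```python
import xml.etree.ElementTree as ET

# Invented sample: the "compression" attribute is hypothetical.
SAMPLE = """<repomd>
  <data type="primary" compression="gz">
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="primary" compression="zchunk">
    <location href="repodata/primary.xml.zck"/>
  </data>
</repomd>"""


def naive_index(xml_text):
    """Index by type only: a later duplicate replaces an earlier one."""
    root = ET.fromstring(xml_text)
    return {d.get("type"): d.find("location").get("href")
            for d in root.findall("data")}


def grouped_index(xml_text):
    """Index by type, then by the hypothetical compression attribute."""
    root = ET.fromstring(xml_text)
    out = {}
    for d in root.findall("data"):
        variants = out.setdefault(d.get("type"), {})
        variants[d.get("compression")] = d.find("location").get("href")
    return out
```

With the sample above, naive_index ends up pointing "primary" at the .zck file only, which an older client that cannot read zchunk would then fail on; other parsers might instead keep the first entry or reject the file.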

Cheers,
  Michael.

-- 
Michael Schroeder   m...@suse.de
SUSE LINUX GmbH,   GF Jeff Hawn, HRB 16746 AG Nuernberg
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}