Re: [pacman-dev] [PATCH] repo-add: don't break if delta package sources contain epoch

2019-01-04 Thread Allan McRae
On 5/1/19 9:23 am, Dave Reisner wrote:
> On Fri, Jan 04, 2019 at 11:20:04AM +1000, Allan McRae wrote:
>> On 28/12/18 9:15 am, Eli Schwartz wrote:
>>> Our sed parser for xdelta3 headers will greedily match on ":" which
>>> coincidentally is also the character we use to define a version with an
>>> epoch.
>>>
>>> While we are at it, simply use sed for the whole pipeline, rather than
>>> using both grep and sed.
>>>
>>> Fixes FS#61195
>>>
>>> Signed-off-by: Eli Schwartz 
>>
>> OK.
>>
>> As an aside, the fact we are rewriting package version parsing code in
>> bash highlights again that repo-add really should be converted to a
>> libalpm program.
> 
> Not sure I understand this -- isn't the problem here the crappy output
> format of xdelta3? We're not really parsing versions, just contending
> with the human readable, colon-delimited table that we're forced to use.
> Rewriting against libalpm won't fix anything here, though if we were to
> rewrite in C, we could link to the "hidden" xdelta3 library or implement
> our own crappy vcdiff parser (eww, please no).

Yes - my initial through was this was fixing parsing of versions from
nicer output, but after I sent that emails I saw that was confounded
with xdelta output "goodness"...

> I don't want to open a can of worms (maybe I do), but I think delta
> handling is the least of the problems with repo-add. I'm aware of many
> users (myself previously included) that would either replace repo-add
> entirely, or wrap it substantially in order to make it truly useful. One
> such example is respose [0]. If we want to consider a rewrite, maybe we
> should rethink our approach to repo management and make a better tool,
> not just an equivalent one.

I agree.  I was going to come back to the design of the new replacement
after the writer had been added to libalpm.  I like repose, but I found
it does not really work if you wanted more than one repo in a package
directory.  It is very focused to a single workflow.  Saying that, I
think we can do better than the current repo-add which is far from great...

Allan


Re: [pacman-dev] [PATCH] Add [ignored] to -Qu output for packages in repos that are not Usage = Upgrade

2019-01-04 Thread Andrew Gregory
On 01/04/19 at 02:21pm, Allan McRae wrote:
> The behaviour of "pacman -Qu" was very strange...  It would only consider
> packages from repos with Usage = Search (or All), and ignore those with
> Usage = Sync, Install or Upgrade.
> 
> This is because alpm_sync_newversion() used ALPM_DB_USAGE_SEARCH for its
> filtering. Given this function is documented (at least in the source) to
> "Check for new version of pkg in sync repos", I would expect that to look at
> all repos. However, just changing this parameter, would result in a fairly
> silent change in behaviour of this function.  To counter that, add a parameter
> to the function that tells it which databases usage levels to consider.
> 
> Finally, list all available updates in -Qu output, but include [ignored] 
> beside
> those that will not be updated in a -Su operation due to thier repo Usage
> value (in addition to those that are Ignored).
> 
> Fixes FS#59854.
> 
> With thanks to the following who provided initial patches to print [ignored] 
> on
> -Qu operations, which highlighted the larger problem:
> morganamilo 
> Michael Straube 
> 
> Signed-off-by: Allan McRae 
> ---
> 
> In comments on earlier patches, there was debate about what the
> alpm_sync_newversion() is doing.   Although I expect users of this function
> to likely always pass ALPM_DB_USAGE_ALL, I think changing the function
> signature is the safest way to handle this change.
> 
> Comments?

I was initially on board with this approach, but the more I think
about it, the more convinced I am that having a function that takes an
explicit list of db's filter them based on usage is a fundamentally
bad design.  As I said in the initial discussion, these are low-level
functions that may be used in different contexts, in which case
different usage filters would be appropriate.  For example, this patch
uses newversion with USAGE_ALL because it's being used for purely
informational purposes.  Manjaro's syncfirst patch, however, uses it
to actually prepare an upgrade, making USAGE_UPGRADE the correct
filter.

But, requiring the caller to provide a usage everywhere would defeat
the purpose of making repo usages a back-end feature.  We might as
well just have the caller do the filtering.  Given that there are only
a handful of usage levels, the filtered lists could even be cached,
completely saving us the hassle of iterating over db's that won't get
used.

I think the best way to fulfill the original goal of adding repo
usages would be to move the usage filtering to higher level functions
that have more narrowly defined purposes and take a handle rather than
a db list.  This would allow callers that just want to "do the right
thing" to not have to worry about usages while allowing callers with
more specific needs to use the low-level functions without having to
worry about what extra filtering alpm might be doing behind the
scenes.  It would also result in a cleaner API that doesn't have
functions ignoring databases they were explicitly told to use.