[pacman-dev] [PATCH v2] Merge expac into src/pacman

2018-07-10 Thread Dave Reisner
This introduces code which has long been maintained at:

https://github.com/falconindy/expac

>From the README:

expac is a data extraction tool for alpm databases. It features
printf-like flexibility and aims to be used as a simple tool for other
pacman based utilities which don't link against the library. It uses
pacman.conf as a config file for locating and loading your local and
sync databases.
---
v2:

* use alpm_pkgfrom_t instead of homegrown enum
* silently skip output when strftime fails
* split input from stdin on newlines only

 doc/Makefile.am|   7 +-
 doc/expac.1.asciidoc   | 179 +
 src/pacman/.gitignore  |   2 +
 src/pacman/Makefile.am |   8 +-
 src/pacman/expac.c | 858 +
 5 files changed, 1051 insertions(+), 3 deletions(-)
 create mode 100644 doc/expac.1.asciidoc
 create mode 100644 src/pacman/expac.c

diff --git a/doc/Makefile.am b/doc/Makefile.am
index 2ac38cba..38f7077b 100644
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -16,7 +16,8 @@ MANPAGES = \
makepkg.conf.5 \
pacman.conf.5 \
libalpm.3 \
-   BUILDINFO.5
+   BUILDINFO.5 \
+   expac.1
 
 DOXYGEN_MANS = $(wildcard man3/*.3)
 
@@ -32,7 +33,8 @@ HTML_MANPAGES = \
PKGBUILD.5.html \
makepkg.conf.5.html \
pacman.conf.5.html \
-   libalpm.3.html
+   libalpm.3.html \
+   expac.1.html
 
 HTML_OTHER = \
index.html \
@@ -61,6 +63,7 @@ EXTRA_DIST = \
pacman.conf.5.asciidoc \
BUILDINFO.5.asciidoc \
libalpm.3.asciidoc \
+   expac.1.asciidoc \
footer.asciidoc \
index.asciidoc \
submitting-patches.asciidoc \
diff --git a/doc/expac.1.asciidoc b/doc/expac.1.asciidoc
new file mode 100644
index ..f5ec3adf
--- /dev/null
+++ b/doc/expac.1.asciidoc
@@ -0,0 +1,179 @@
+expac(1)
+=
+
+Name
+
+expac - alpm data extraction utility
+
+Synopsis
+
+'expac'   [options]
+
+Description
+---
+expac is a data extraction tool for alpm databases. It features printf-like
+flexibility and aims to be used as a simple tool for other pacman based
+utilities which don't link against the library. It uses pacman.conf as a config
+file for locating and loading your local and sync databases.
+
+Invoking expac consists of supplying a format string, which is generally
+described by one to many of the formatting tokens (see the 'FORMATTING'
+section), any relevant options and zero to many targets. The format string
+'must' be the first non-option argument. Targets can be a simple package name,
+a query string (in the case of a search), or in repo/package syntax when the
+-sync option is supplied.
+
+
+Options
+---
+*-Q, \--query*::
+   Search the local database for provided targets. This is the default 
behavior.
+
+*-S, \--sync*::
+   Search the sync databases for provided targets.
+
+*-s, --search*::
+   Search for packages matching the strings specified by targets. This is a
+   boolean AND query and regex is allowed.
+
+*-g, --group*::
+   Return packages matching the specified targets as package groups.
+
+*\--config* ::
+   Read from  for alpm initialization instead of .
+
+*-H, \--humansize* ::
+   Format package sizes in SI units according to . Valid options are:
++
+B, K, M, G, T, P, E, Z, Y
+
+*-1, \--readone*::
+   Stop searching after the first result. This only has an effect on -S 
operations
+   without -s.
+
+*-d, \--delim *::
+   Separate each package with the specified . The default value is 
a
+   newline character.
+
+*-l, \--listdelim* ::
+   Separate each list item with the specified . Lists are any 
interpreted
+   sequence specified with a capital letter. The default value is two 
spaces.
+
+*-p, \--file*::
+   Interpret targets as paths to local files.
+
+*-t, \--timefmt* ::
+   Output time described by the specified . This string is passed 
directly
+   to linkman:strftime[3]. The default format is %c.
+
+*-v, \--verbose*::
+   Output more. `Package not found' errors will be shown, and empty field 
values
+   will display as 'None'.
+
+*-h, \--help*::
+
+Display the help message.
+
+*-V, \--version*::
+
+Display version information.
+
+Formatting
+--
+
+The format argument allows the following interpreted sequences:
+
+  %Bbackup files
+
+  %Cconflicts with (no version strings)
+
+  %Ddepends on
+
+  %Edepends on (no version strings)
+
+  %Ffiles (only with -Q)
+
+  %Ggroups
+
+  %Hconflicts with
+
+  %Llicenses
+
+  %Nrequired by
+
+  %Ooptional deps
+
+  %ooptional deps (no descriptions)
+
+  %Pprovides
+
+  %Rreplaces (no version strings)
+
+  %Treplaces
+
+  %Sprovides (no version strings)
+
+  %aarchitecture
+
+  %bbuild date
+
+  %ddescription
+
+  %epackage base
+
+  %ffilename (only with -S)
+
+  %gbase64 encoded PGP signature (only with -S)
+
+  %hsha256sum

Re: [pacman-dev] [PATCH 4/4] Merge expac into src/pacman

2018-07-10 Thread Dave Reisner
On Mon, Jul 09, 2018 at 06:37:35PM -0400, Andrew Gregory wrote:
> On 07/05/18 at 10:42am, Dave Reisner wrote:
> > This introduces code which has long been maintained at:
> > 
> > https://github.com/falconindy/expac
> > 
> > From the README:
> > 
> > expac is a data extraction tool for alpm databases. It features
> > printf-like flexibility and aims to be used as a simple tool for other
> > pacman based utilities which don't link against the library. It uses
> > pacman.conf as a config file for locating and loading your local and
> > sync databases.
> > ---
> > I expect that I'll get some pushback on the use of #pragma here. I don't
> > really know how portable it is to other compilers. Open to suggestions
> > on how to better avoid this (though I'd prefer not to disable the
> > warnings entirely for the file).
> 
> I'm not thrilled about adding #pragma to our codebase, but I don't see
> a better solution.  I would think it would be at least as portable as
> the function attributes we use elsewhere.
> 
> Just looking at expac.c, I don't see anything major, although I didn't
> look all that hard because I assume expac is pretty well tested at
> this point.  Since this is a previously external project, I'm fine if
> you want to incorporate changes as further patches to keep this one as
> close to original as possible.
> 

Happy to make most of these changes up front since your suggested
changes are fairly minor and very reasonable. Thanks for the review.

> >  doc/Makefile.am|   7 +-
> >  doc/expac.1.asciidoc   | 179 +
> >  src/pacman/.gitignore  |   2 +
> >  src/pacman/Makefile.am |   8 +-
> >  src/pacman/expac.c | 864 +
> >  5 files changed, 1057 insertions(+), 3 deletions(-)
> >  create mode 100644 doc/expac.1.asciidoc
> >  create mode 100644 src/pacman/expac.c
> 
> ...
> 
> > diff --git a/src/pacman/expac.c b/src/pacman/expac.c
> > new file mode 100644
> > index ..332414a0
> > --- /dev/null
> > +++ b/src/pacman/expac.c
> > @@ -0,0 +1,864 @@
> 
> ...
> 
> > +
> > +typedef enum package_corpus_t {
> > +   CORPUS_LOCAL,
> > +   CORPUS_SYNC,
> > +   CORPUS_FILE,
> > +} package_corpus_t;
> 
> Is there a reason you don't use the PKG_FROM constants from alpm?
> 

Nope. It's fine to reuse alpm_pkgfrom_t here.

> ...
> 
> > +static int print_time(time_t timestamp) {
> > +   char buffer[64];
> > +   int out = 0;
> > +
> > +   if(!timestamp) {
> > +   if(opt_verbose) {
> > +   out += printf("None");
> > +   }
> > +   return out;
> > +   }
> > +
> > +#pragma GCC diagnostic push
> > +#pragma GCC diagnostic ignored "-Wformat-nonliteral"
> > +   /* no overflow here, strftime prints a max of 64 including null */
> > +   strftime([0], 64, opt_timefmt, localtime());
> > +#pragma GCC diagnostic pop
> > +
> > +   out += printf("%s", buffer);
> 
> I'm still trying to figure out how how the hell to correctly use
> strftime with its multiple undefined behaviors and lack of an error
> indicator, but we should at least skip the printf if it returns 0.  In
> that case, either there is nothing to print or buffer contains
> garbage.
> 
> ...
> 

Sounds good, though silently failing isn't my favorite. At least the
glibc implementation of strftime doesn't leave garbage in the buffer on
overflow. It'd also be lousy to print something like "error: timefmt
overflow" on every single error.

I've wanted for a while to do more sanity checking of the format strings
up front so that we can fail hard (and once) before gathering packages
and printing the results. I guess this is another situation that would
fix.

For now, let's just go with your suggestion and avoid printing an
undefined buffer on failure. In nearly 8 years (I can't believe it's
been that long) no one has complained that they've experienced
surprising behavior as a result of overflowing the 64 byte time format.

> > +static int read_targets_from_file(FILE *in, alpm_list_t **targets)
> > +{
> > +   char line[BUFSIZ];
> > +   int i = 0, end = 0, targets_added = 0;
> > +
> > +   while(!end) {
> > +   line[i] = fgetc(in);
> > +
> > +   if(feof(in))
> > +   end = 1;
> > +
> > +   if(isspace(line[i]) || end) {
> 
> This should be line-delimited to match pacman rather than
> space-delimited.
> 

Done.

> > +   line[i] = '\0';
> > +   /* avoid adding zero length arg, if multiple spaces 
> > separate args */
> > +   if(i > 0) {
> > +   if(!alpm_list_find_str(*targets, line)) {
> > +   *targets = alpm_list_add(*targets, 
> > strdup(line));
> > +   }
> > +   i = 0;
> > +   ++targets_added;
> > +   }
> > +   } else {
> > +   ++i;
> > +   if(i >= BUFSIZ) {
> > +   fprintf(stderr, "error: buffer