bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-19 Thread Maxim Cournoyer
Hi Ricardo!

Ricardo Wurmus  writes:

> Hi Maxim,
>
> I would very much like to see your improvements to the pypi importer to
> be merged.  Have you been able to separate the independent changes as
> suggested by Ludo?

I'm thrilled that someone has an interest in this :-)

I took my time, but finally got around to restructure the changes a
bit.  I hope it'll be easier to review this time around!

Thank you!

Maxim

From 54e44b7397f17910d95dbdb233d23e5c97c095aa Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer 
Date: Thu, 28 Mar 2019 00:26:00 -0400
Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt.  Streamline logic.
---
 guix/import/pypi.scm | 35 +--
 tests/pypi.scm   | 23 +++
 2 files changed, 24 insertions(+), 34 deletions(-)

diff --git a/guix/import/pypi.scm b/guix/import/pypi.scm
index 3a20fc4b9b..8269aa61d7 100644
--- a/guix/import/pypi.scm
+++ b/guix/import/pypi.scm
@@ -206,35 +206,26 @@ cannot determine package dependencies"))
   (call-with-temporary-directory
(lambda (dir)
  (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
-(req-files (list (string-append dirname "/requirements.txt")
- (string-append dirname "/" pypi-name ".egg-info"
-"/requires.txt")))
-(exit-codes (map (lambda (file-name)
-   (parameterize ((current-error-port (%make-void-port "rw+"))
-  (current-output-port (%make-void-port "rw+")))
- (system* "tar" "xf" tarball "-C" dir file-name)))
- req-files)))
-   ;; Only one of these files needs to exist.
-   (if (any zero? exit-codes)
-   (match (find-files dir)
- ((file . _)
-  (read-requirements file))
- (()
-  (warning (G_ "No requirements file found.\n"
+(requires.txt (string-append dirname "/" pypi-name
+ ".egg-info" "/requires.txt"))
+(exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
+  (current-output-port (%make-void-port "rw+")))
+ (system* "tar" "xf" tarball "-C" dir requires.txt
+   (if (zero? exit-code)
+   (read-requirements (string-append dir "/" requires.txt))
(begin
- (warning (G_ "Failed to extract requirements files\n"))
+ (warning
+  (G_ "Failed to extract file: ~a from source.~%")
+  requires.txt)
  '())
   '(
 
-  ;; First, try to compute the requirements using the wheel, since that is the
-  ;; most reliable option. If a wheel is not provided for this package, try
-  ;; getting them by reading either the "requirements.txt" file or the
-  ;; "requires.txt" from the egg-info directory from the source tarball. Note
-  ;; that "requirements.txt" is not mandatory, so this is likely to fail.
+  ;; First, try to compute the requirements using the wheel, else, fallback to
+  ;; reading the "requires.txt" from the egg-info directory from the source
+  ;; tarball.
   (or (guess-requirements-from-wheel)
   (guess-requirements-from-source)))
 
-
 (define (compute-inputs source-url wheel-url tarball)
   "Given the SOURCE-URL of an already downloaded TARBALL, return a list of
 name/variable pairs describing the required inputs of this package.  Also
diff --git a/tests/pypi.scm b/tests/pypi.scm
index 6daa44a6e7..335be42644 100644
--- a/tests/pypi.scm
+++ b/tests/pypi.scm
@@ -23,7 +23,7 @@
   #:use-module (gcrypt hash)
   #:use-module (guix tests)
   #:use-module (guix build-system python)
-  #:use-module ((guix build utils) #:select (delete-file-recursively which))
+  #:use-module ((guix build utils) #:select (delete-file-recursively which mkdir-p))
   #:use-module (srfi srfi-64)
   #:use-module (ice-9 match))
 
@@ -55,11 +55,10 @@
 (define test-source-hash
   "")
 
-(define test-requirements
-"# A comment
- # A comment after a space
+(define test-requires.txt "\
 bar
-baz > 13.37")
+baz > 13.37
+")
 
 (define test-metadata
   "{
@@ -107,10 +106,10 @@ baz > 13.37")
 (match url
("https://example.com/foo-1.0.0.tar.gz"
 (begin
-  (mkdir "foo-1.0.0")
-  (with-output-to-file "foo-1.0.0/requirements.txt"
+  (mkdir-p "foo-1.0.0/foo.egg-info/")
+  (wi

bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-20 Thread Ludovic Courtès
Hello!

Maxim Cournoyer  skribis:

> Ricardo Wurmus  writes:
>
>> Hi Maxim,
>>
>> I would very much like to see your improvements to the pypi importer to
>> be merged.  Have you been able to separate the independent changes as
>> suggested by Ludo?
>
> I'm thrilled that someone has an interest in this :-)
>
> I took my time, but finally got around to restructure the changes a
> bit.  I hope it'll be easier to review this time around!

I’ll let Ricardo comment.  For my part, I see that it has tests, and
that’s enough to make me happy.  :-)

Thanks for improving the importer!

Ludo’.





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-21 Thread Maxim Cournoyer
Greetings!

Ludovic Courtès  writes:

> Hello!
>
> Maxim Cournoyer  skribis:
>
>> Ricardo Wurmus  writes:
>>
>>> Hi Maxim,
>>>
>>> I would very much like to see your improvements to the pypi importer to
>>> be merged.  Have you been able to separate the independent changes as
>>> suggested by Ludo?
>>
>> I'm thrilled that someone has an interest in this :-)
>>
>> I took my time, but finally got around to restructure the changes a
>> bit.  I hope it'll be easier to review this time around!
>
> I’ll let Ricardo comment.  For my part, I see that it has tests, and
> that’s enough to make me happy.  :-)

OK!

> Thanks for improving the importer!

My pleasure! One less itch to scratch ;-)

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-27 Thread Ricardo Wurmus


Hi Maxim,

> Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.
>
> * guix/import/pypi.scm (guess-requirements): Update comment.
> [guess-requirements-from-source]: Do not attempt to parse the file
> requirements.txt.  Streamline logic.

Why remove the handling of the requirements.txt?  Is it no longer
popular enough to expect its availability in the source archives?

Please also mention in the commit message that and how you adjusted the
tests.  You removed the comments from the example requires.txt — are
comments no longer permitted in these files?  If they are, please don’t
include those changes.

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-27 Thread Ricardo Wurmus
Hi Maxim,

on to patch number 2!

> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Thu, 28 Mar 2019 00:26:00 -0400
> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
>  source.
>
> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
> and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT and COMMENT?
> functions inside the READ-REQUIREMENTS procedure.
> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
> parsing optional requirements.
>
> * tests/pypi.scm (test-requires-with-sections): New variable.
> ("parse-requires.txt, with sections"): New test.
> ("pypi->guix-package"): Mute tar output to stdout.

The commit log does not match the changes.  CLEAN-REQUIREMENT is now a
top-level procedure, not a local procedure inside of READ-REQUIREMENTS
as reported in the commit message.  Which is correct?

> +  (call-with-input-file requires.txt
> +(lambda (port)
> +  (let loop ((result '()))
> +(let ((line (read-line port)))
> +  ;; Stop when a section is encountered, as sections contains optional

Should be “contain”.

> +  ;; (extra) requirements.  Non-optional requirements must appear
> +  ;; before any section is defined.
> +  (if (or (eof-object? line) (section-header? line))
> +  (reverse result)
> +  (cond
> +   ((or (string-null? line) (comment? line))
> +(loop result))
> +   (else
> +(loop (cons (clean-requirement line)
> +result))
> +

I think it would be better to use “match” here instead of nested “let”,
“if” and “cond”.  At least you can drop the “if” and just use cond.

The loop let and the inner let can be merged.
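
To make this concrete, here is a rough sketch of the merged form, with a
single named let and a single “cond”.  READ-REQUIREMENT-LINES is a made-up
name, the two predicates are simplified stand-ins for the ones in the patch,
and the lines are returned as-is rather than cleaned.  Note that LINE becomes
a loop variable in this form.

```scheme
(use-modules (ice-9 rdelim))            ; read-line

;; Simplified stand-ins for the predicates defined in the patch.
(define (section-header? line)
  (string-prefix? "[" (string-trim line)))

(define (comment? line)
  (string-prefix? "#" (string-trim line)))

;; Hypothetical helper: collect the requirement lines that appear before
;; the first section header, skipping blank lines and comments.
(define (read-requirement-lines port)
  (let loop ((line (read-line port))
             (result '()))
    (cond
     ((or (eof-object? line) (section-header? line))
      (reverse result))
     ((or (string-null? line) (comment? line))
      (loop (read-line port) result))
     (else
      (loop (read-line port) (cons line result))))))
```

It can be tried on a string port with
(call-with-input-string "bar\n[test]\npytest\n" read-requirement-lines).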


> +(define (parse-requires.txt requires.txt)
> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> +requirement names."
> +  ;; This is a very incomplete parser, which job is to select the non-optional

“which” –> “whose”

> +  ;; dependencies and strip them out of any version information.
> +  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
> +  ;; library and the requirements grammar defined by PEP-0508
> +  ;; (https://www.python.org/dev/peps/pep-0508/).

Let’s remove the sentence starting with “Alternatively…”.  We could do
that but we didn’t :)

> +  (define (section-header? line)
> +;; Return #t if the given LINE is a section header, #f otherwise.
> +(let ((trimmed-line (string-trim line)))
> +  (and (not (string-null? trimmed-line))
> +   (eq? (string-ref trimmed-line 0) #\[
> +

How about using string-prefix? instead?  This looks more complicated
than it deserves.  You can get rid of string-null? and eq? and string-ref
and all that.

Same here:

> +  (define (comment? line)
> +;; Return #t if the given LINE is a comment, #f otherwise.
> +(eq? (string-ref (string-trim line) 0) #\#))

I’d just use string-prefix? here.
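
Something along these lines, as a sketch only.  It keeps the string-trim
from your version, and as a side effect it no longer calls string-ref on an
empty string when the line is empty or contains only whitespace.

```scheme
;; Return #t if LINE starts a section such as "[test]", #f otherwise.
(define (section-header? line)
  (string-prefix? "[" (string-trim line)))

;; Return #t if LINE is a comment, #f otherwise.
(define (comment? line)
  (string-prefix? "#" (string-trim line)))
```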

> +(define (clean-requirement s)
> +  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
> +  ;; file, remove everything other than the actual name of the required
> +  ;; package, and return it.
> +  (string-take s (or (string-index s (lambda (chr)
> +   (member chr '(#\space #\> #\= #\<
> + (string-length s

“string-take” with “string-length” is not very elegant.  The char
predicate in string-index could better be a char set:

--8<---cut here---start->8---
(define (clean-requirement s)
 (cond
  ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
  (else s)))
--8<---cut here---end--->8---


> ("pypi->guix-package"): Mute tar output to stdout.

Finally, I think it would be better to keep this separate because it’s
really orthogonal to the other changes in this patch.

What do you think?

-- 
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-27 Thread Ricardo Wurmus
Patch number 3!

> From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Thu, 28 Mar 2019 00:26:01 -0400
> Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
>  specifications.
>
> The previous solution was fragile and could leave unwanted characters in a
> requirement name, such as '[' or ']'.

Wouldn’t it be sufficient to add [ and ] to the list of forbidden
characters?  The tests pass with this implementation of
clean-requirement:

(define (clean-requirement s)
 (cond
  ((string-index s (char-set #\space #\> #\= #\< #\[ #\])) => (cut string-take s <>))
  (else s)))

> +(define %requirement-name-regexp
> +  ;; Regexp to match the requirement name in a requirement specification.
> +
> +  ;; Some grammar, taken from PEP-0508 (see:
> +  ;; https://www.python.org/dev/peps/pep-0508/).
> +
> +  ;; The unified rule can be expressed as:
> +  ;; specification = wsp* ( url_req | name_req ) wsp*
> +
> +  ;; where url_req is:
> +  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
> +
> +  ;; and where name_req is:
> +  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
> +
> +  ;; Thus, we need only matching NAME, which is expressed as:
> +  ;; identifer_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
> +  ;; identifier= letterOrDigit identifier_end*
> +  ;; name  = identifier
> +  (let* ((letter-or-digit "[A-Za-z0-9]")
> + (identifier-end (string-append "(" letter-or-digit "|"
> +"[-_.]*" letter-or-digit ")"))
> + (identifier (string-append "^" letter-or-digit identifier-end "*"))
> + (name identifier))
> +(make-regexp name)))

This seems a little too complicated.  Translating a grammar into a
regexp is probably not a good idea in general.  Since we don’t care
about anything other than the name it seems easier to just chop off
the string tail as soon as we find an invalid character.
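
For illustration, here is a self-contained sketch of that approach.  The
character set is an assumption on my part: it extends the one above with
#\! and #\~ so that specifiers such as “!=” and “~=” do not leave a stray
character behind.  %INVALID-NAME-CHARS is a made-up name.

```scheme
(use-modules (srfi srfi-26))            ; cut

;; Assumed set of characters that cannot be part of a requirement name.
(define %invalid-name-chars
  (char-set #\space #\> #\= #\< #\[ #\] #\! #\~))

;; Keep everything before the first invalid character, or the whole
;; string when there is none.
(define (clean-requirement s)
  (cond
   ((string-index s %invalid-name-chars) => (cut string-take s <>))
   (else s)))

(clean-requirement "baz > 13.37")       ; => "baz"
(clean-requirement "foo[extra]>=1.0")   ; => "foo"
(clean-requirement "pytest!=3.0")       ; => "pytest"
```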

-- 
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-27 Thread Ricardo Wurmus
> From 76e4a3150f8126e0b952c6129b6e1371afba80c0 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Thu, 28 Mar 2019 00:26:01 -0400
> Subject: [PATCH 4/9] import: pypi: Deduplicate requirements.
>
> * guix/import/pypi.scm (parse-requires.txt): Remove potential duplicates.

This looks fine to me, but it is subject to changes that I requested to
the procedure in my comments to an earlier patch.

-- 
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-28 Thread Ricardo Wurmus
On to the next:

> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Thu, 28 Mar 2019 00:26:02 -0400
> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>
> This change enables the PyPI importer to look for requirements in a source
> archive of a different type than "tar.gz" or "tar.bz2".

Okay.

> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
> archive is supported or not.

Nitpick: please use “...this.” and leave two spaces between sentences.

Typo: it should be COMPRESSED-FILE?

> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
> to extract ZIP archives.

s/method/procedure/

Please also mention that “compute-inputs” has been adjusted.

> -  (define (tarball-directory url)
> -;; Given the URL of the package's tarball, return the name of the directory
> +  (define (archive-root-directory url)
> +;; Given the URL of the package's archive, return the name of the directory
>  ;; that will be created upon decompressing it. If the filetype is not
>  ;; supported, return #f.
> -;; TODO: Support more archive formats.
> -(let ((basename (substring url (+ 1 (string-rindex url #\/)
> -  (cond
> -   ((string-suffix? ".tar.gz" basename)
> -(string-drop-right basename 7))
> -   ((string-suffix? ".tar.bz2" basename)
> -(string-drop-right basename 8))
> -   (else
> +(if (compressed-file? url)
> +(let ((root-directory (file-sans-extension (basename url
> +  (if (string=? "tar" (file-extension root-directory))
> +  (file-sans-extension root-directory)
> +  root-directory))
>  (begin
> -  (warning (G_ "Unsupported archive format: \
> -cannot determine package dependencies"))
> -  #f)
> +  (warning (G_ "Unsupported archive format (~a): \
> +cannot determine package dependencies") (file-extension url))
> +  #f)))

I think the double application of file-sans-extension and the
intermediate variable name “root-directory” for something that is a file
is a little confusing, but I don’t have a better proposal (other than to
replace file-extension and file-sans-extension with a match expression).

>(define (read-wheel-metadata wheel-archive)
>  ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
> @@ -246,16 +243,20 @@ cannot determine package dependencies"))

>(define (guess-requirements-from-source)
>  ;; Return the package's requirements by guessing them from the source.
> -(let ((dirname (tarball-directory source-url)))
> +(let ((dirname (archive-root-directory source-url))
> +  (extension (file-extension source-url)))
>(if (string? dirname)
>(call-with-temporary-directory
> (lambda (dir)
>   (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>  (requires.txt (string-append dirname "/" pypi-name
>   ".egg-info" "/requires.txt"))
> -(exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
> -  (current-output-port (%make-void-port "rw+")))
> - (system* "tar" "xf" tarball "-C" dir requires.txt
> +(exit-code
> + (parameterize ((current-error-port (%make-void-port "rw+"))
> +(current-output-port (%make-void-port "rw+")))
> +   (if (string=? "zip" extension)
> +   (system* "unzip" archive "-d" dir requires.txt)
> +   (system* "tar" "xf" archive "-C" dir requires.txt)

I guess this is why I’m not too happy with this: we’re checking in
multiple places if the format is supported but then forget about this
again until the next time we need to do something to the file.

I wonder if we could do better and answer the question just once.
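
One possible shape, purely as a sketch: decide once how the archive is to be
unpacked, and hand the caller either a procedure that builds the extraction
command or #f.  ARCHIVE-EXTRACTOR is a hypothetical name, and the list of
suffixes is only illustrative.

```scheme
;; Hypothetical helper: given the source URL, return a procedure taking
;; ARCHIVE, DIR and FILE and returning the command (a list of strings)
;; that extracts FILE from ARCHIVE into DIR.  Return #f when the archive
;; format is not supported.
(define (archive-extractor url)
  (cond
   ((string-suffix? ".zip" url)
    (lambda (archive dir file)
      (list "unzip" archive "-d" dir file)))
   ((or (string-suffix? ".tar.gz" url)
        (string-suffix? ".tar.bz2" url))
    (lambda (archive dir file)
      (list "tar" "xf" archive "-C" dir file)))
   (else #f)))
```

The caller would then test the result a single time and run the command
with (apply system* ...), instead of looking at the extension again.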

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-28 Thread Ricardo Wurmus
Patch number 6:

> From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Thu, 28 Mar 2019 00:26:03 -0400
> Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
>  metadata.json.

> With newer Wheel releases, there is no more metadata.json file; the METADATA
> file should be used instead (see: https://github.com/pypa/wheel/issues/195).

> This change updates our PyPI importer so that it uses the later.

Typo: should be “latter” instead of “later”.

> * guix/import/pypi.scm (define-module): Remove unnecessary modules and export
>   the PARSE-WHEEL-METADATA method.

Please remove the indentation here.  Also, please don’t use “method”
(because it’s not); use “procedure” instead.

> (parse-wheel-metadata): Add method.

Same here.

> +  (define (requires-dist-header? line)
> +;; Return #t if the given LINE is a Requires-Dist header.
> +(regexp-match? (string-match "^Requires-Dist: " line)))
> +
> +  (define (requires-dist-value line)
> +(string-drop line (string-length "Requires-Dist: ")))
> +
> +  (define (extra? line)
> +;; Return #t if the given LINE is an "extra" requirement.
> +(regexp-match? (string-match "extra == " line)))

The use of “regexp-match?” here isn’t strictly necessary as the return
value is true-ish anyway.

> +  (call-with-input-file metadata
> +(lambda (port)
> +  (let loop ((requirements '()))
> +(let ((line (read-line port)))
> +  ;; Stop at the first 'Provides-Extra' section: the non-optional
> +  ;; requirements appear before the optional ones.
> +  (if (eof-object? line)
> +  (reverse (delete-duplicates requirements))
> +  (cond
> +   ((and (requires-dist-header? line) (not (extra? line)))
> +(loop (cons (specification->requirement-name
> + (requires-dist-value line))
> +requirements)))
> +   (else
> +(loop requirements)
> +

As before you can simplify the nested let and merge “if” and “cond”.

>(define (read-wheel-metadata wheel-archive)
>  ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
> -;; requirements.
> +;; requirements, or #f if the metadata file contained therein couldn't be
> +;; extracted.
>  (let* ((dirname (wheel-url->extracted-directory wheel-url))
> -   (json-file (string-append dirname "/metadata.json")))
> -  (and (zero? (system* "unzip" "-q" wheel-archive json-file))
> -   (dynamic-wind
> - (const #t)
> - (lambda ()
> -   (call-with-input-file json-file
> - (lambda (port)
> -   (let* ((metadata (json->scm port))
> -  (run_requires (hash-ref metadata "run_requires"))
> -  (requirements (if run_requires
> -(hash-ref (list-ref run_requires 0)
> -   "requires")
> -'(
> - (map specification->requirement-name requirements)
> - (lambda ()
> -   (delete-file json-file)
> -   (rmdir dirname))
> +   (metadata (string-append dirname "/METADATA")))
> +  (call-with-temporary-directory
> +   (lambda (dir)
> + (if (zero? (system* "unzip" "-q" wheel-archive "-d" dir metadata))
> + (parse-wheel-metadata (string-append dir "/" metadata))
> + (begin
> +   (warning
> +(G_ "Failed to extract file: ~a from wheel.~%") metadata)
> +   #f))

The old approach took care of removing the unpacked archive no matter
what happened.  The new code doesn’t do that.

> --- a/tests/pypi.scm
> +++ b/tests/pypi.scm

Thanks for the tests!

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-28 Thread Ricardo Wurmus
Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Thu, 28 Mar 2019 23:12:26 -0400
> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
>  native-inputs.
>
> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
> (test-section?): New predicate.
> (parse-requires.txt): Collect the optional test inputs, and return them as the
> second element of the returned list.

AFAICT parse-requires.txt now returns a list of pairs, but used to
return a plain list before.  Is this correct?

>  (define (parse-requires.txt requires.txt)
> -  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
> -requirement names."
> -  ;; This is a very incomplete parser, which job is to select the non-optional
> -  ;; dependencies and strip them out of any version information.
> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the
> +second the optional test dependencies.  Note that currently, optional,
> +non-test dependencies are omitted since these can be difficult or expensive to
> +satisfy."
> +
> +  ;; This is a very incomplete parser, which job is to read in the requirement
> +  ;; specification lines, and strip them out of any version information.
>;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>;; library and the requirements grammar defined by PEP-0508
>;; (https://www.python.org/dev/peps/pep-0508/).

Does it really return a pair?  Or a list of pairs?  Or is it a
two-element list of lists?

>(call-with-input-file requires.txt
>  (lambda (port)
> -  (let loop ((result '()))
> +  (let loop ((required-deps '())
> + (test-deps '())
> + (inside-test-section? #f)
> + (optional? #f))
>  (let ((line (read-line port)))
> -  ;; Stop when a section is encountered, as sections contains optional
> -  ;; (extra) requirements.  Non-optional requirements must appear
> -  ;; before any section is defined.
> -  (if (or (eof-object? line) (section-header? line))
> +  (if (eof-object? line)
>;; Duplicates can occur, since the same requirement can be
>;; listed multiple times with different conditional markers, e.g.
>;; pytest >= 3 ; python_version >= "3.3"
>;; pytest < 3 ; python_version < "3.3"
> -  (reverse (delete-duplicates result))
> +  (map (compose reverse delete-duplicates)
> +   (list required-deps test-deps))

Looks like a list of lists to me.  “delete-duplicates” now won’t delete
a name that is in both “required-deps” as well as in “test-deps”.  Is
this acceptable?

Personally, I’m not a fan of using data structures for returning
multiple values, because we can simply return multiple values.

Or we could have more than just strings.  The meaning of these strings
is provided by the bin into which they are thrown — either
“required-deps” or “test-deps”.  It could be an option to collect tagged
values instead and have the caller deal with filtering.
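
For the multiple-values option, a minimal sketch could look like the
following.  FINALIZE-REQUIREMENTS is a made-up name standing in for the end
of the loop, and the inputs are example data.

```scheme
(use-modules (srfi srfi-1)              ; delete-duplicates
             (ice-9 receive))           ; receive

;; Hypothetical helper: return the required and the test dependencies as
;; two values instead of wrapping them in a list.
(define (finalize-requirements required-deps test-deps)
  (values (reverse (delete-duplicates required-deps))
          (reverse (delete-duplicates test-deps))))

;; The caller binds both values by name.
(receive (required test)
    (finalize-requirements '("pytest" "baz" "bar") '("mock" "pytest"))
  (list required test))
```

Whether a name that appears in both lists should be dropped from one of them
is left open here, as in the patch.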

>  (define (parse-wheel-metadata metadata)
> -  "Given METADATA, a Wheel metadata file, return a list of requirement names."
> +  "Given METADATA, a Wheel metadata file, return a pair of requirements.
> +
> +The first element of the pair contains the required dependencies while the second the optional
> +test dependencies.  Note that currently, optional, non-test dependencies are
> +omitted since these can be difficult or expensive to satisfy."
>;; METADATA is a RFC-2822-like, header based file.

This sounds like this is going to duplicate the previous procedures.

>(define (requires-dist-header? line)
>  ;; Return #t if the given LINE is a Requires-Dist header.
> -(regexp-match? (string-match "^Requires-Dist: " line)))
> +(string-match "^Requires-Dist: " line))
>
>(define (requires-dist-value line)
>  (string-drop line (string-length "Requires-Dist: ")))
>
>(define (extra? line)
>  ;; Return #t if the given LINE is an "extra" requirement.
> -(regexp-match? (string-match "extra == " line)))
> +(string-match "extra == '(.*)'" line))

These hunks should be part of the previous patch where they were
introduced.  (See my comments there about “regexp-match?”.)

> +  (define (test-requirement? line)
> +(let ((extra-label (match:substring (extra? line) 1)))
> +  (and extra-label (test-section? extra-label

You can use “and=>” instead of binding a name:

(and=> (match:substring (extra? line) 1) test-section?)
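
One caveat, if I read the code right: (extra? line) returns #f for lines
without an extra marker, and match:substring would then be handed #f.  A
sketch that applies “and=>” to the match object itself avoids that.
TEST-SECTION? below is only a stand-in for the predicate from the patch, and
its list of labels is an assumption.

```scheme
(use-modules (ice-9 regex))             ; string-match, match:substring

(define (extra? line)
  ;; Return a match object if LINE carries an "extra" marker, else #f.
  (string-match "extra == '(.*)'" line))

;; Stand-in for the real predicate; the labels are assumed.
(define (test-section? label)
  (and (member label '("test" "tests" "testing")) #t))

(define (test-requirement? line)
  ;; Only look at the extra label when there is a match at all.
  (and=> (extra? line)
         (lambda (m)
           (test-section? (match:substring m 1)))))
```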

>(call-with-input-file metadata
>  (lambda (port)
> -  (let loop ((requirements '()))
> +  (let loop ((required-deps '())
> + (test-deps '()))
>  

bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-28 Thread Ricardo Wurmus


> From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Sat, 30 Mar 2019 23:13:26 -0400
> Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
>  file.

> * guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
> (guess-requirements)[archive-root-directory]: Remove procedure.

Oh, I guess I reviewed this procedure in vain :(

Please modify the commits so that added procedures are not removed in
later commits.  This is easier on the reviewer and makes for a clearer
commit history.

>(define (guess-requirements-from-source)
>  ;; Return the package's requirements by guessing them from the source.
> -(let ((dirname (archive-root-directory source-url))
> -  (extension (file-extension source-url)))
> -  (if (string? dirname)
> -  (call-with-temporary-directory
> -   (lambda (dir)
> - (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
> -(requires.txt (string-append dirname "/" pypi-name
> - ".egg-info" "/requires.txt"))
> -(exit-code
> - (parameterize ((current-error-port (%make-void-port "rw+"))
> -(current-output-port (%make-void-port "rw+")))
> -   (if (string=? "zip" extension)
> -   (system* "unzip" archive "-d" dir requires.txt)
> -   (system* "tar" "xf" archive "-C" dir requires.txt)
> -   (if (zero? exit-code)
> -   (parse-requires.txt (string-append dir "/" requires.txt))
> -   (begin
> - (warning
> -  (G_ "Failed to extract file: ~a from source.~%")
> -  requires.txt)
> - (list '() '()))
> +(if (compressed-file? source-url)
> +(call-with-temporary-directory
> + (lambda (dir)
> +   (parameterize ((current-error-port (%make-void-port "rw+"))
> +  (current-output-port (%make-void-port "rw+")))
> + (if (string=? "zip" (file-extension source-url))
> + (invoke "unzip" archive "-d" dir)
> + (invoke "tar" "xf" archive "-C" dir)))
> +   (let ((requires.txt-files
> +  (find-files dir (lambda (abs-file-name _)
> + (string-match "\\.egg-info/requires.txt$"
> +  abs-file-name)
> + (if (> (length requires.txt-files) 0)

Let’s work on the empty list directly.  Here “match” would be better.
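
Roughly like this, as a sketch.  PICK-REQUIRES.TXT is a made-up name, and
returning #f for the empty case is only one option.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: return the first file of FILES, or #f when the
;; list is empty, without computing its length.
(define (pick-requires.txt files)
  (match files
    (() #f)
    ((file . _) file)))
```

In the patch, the first clause would emit the warning and the second would
call parse-requires.txt on FILE.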

> + (begin
> +   (parse-requires.txt (first requires.txt-files)))

No need for “begin” here.

> + (begin (warning (G_ "Cannot guess requirements from source archive:\
> + no requires.txt file found.~%"))
> +(list '() '()))

I know that this is from an earlier commit, but I don’t like the look of
“(list '() '())” at all :)

> +(begin
> +  (warning (G_ "Unsupported archive format; \
> +cannot determine package dependencies from source archive: ~a~%")
> +   (basename source-url))
>(list '() '()

Same here.  Certainly there’s a better return value.

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-28 Thread Ricardo Wurmus
And finally: Number 9!

> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer 
> Date: Sat, 30 Mar 2019 20:27:35 -0400
> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>  pypi-uri.
>
> Fixes issue: #33046.

Please change this to:

Fixes .

> * guix/build-system/python.scm (pypi-uri): Update the host URI to
> "files.pythonhosted.org".
> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
> the source URL calls for it.

Is the first change to use files.pythonhosted.org required to fix this?
Or is this unrelated?

If it is required this looks fine to me.

Thank you!

-- 
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-05-29 Thread Maxim Cournoyer
Hello Ricardo!

Ricardo Wurmus  writes:

> And finally: Number 9!
>
>> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Sat, 30 Mar 2019 20:27:35 -0400
>> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>>  pypi-uri.
>>
>> Fixes issue: #33046.
>
> Please change this to:
>
> Fixes .
>
>> * guix/build-system/python.scm (pypi-uri): Update the host URI to
>> "files.pythonhosted.org".
>> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
>> the source URL calls for it.
>
> Is the first change to use files.pythonhosted.org required to fix this?
> Or is this unrelated?
>
> If it is required this looks fine to me.
>
> Thank you!

Thank you for this thorough review!  I'll need some time
to go through it, and upcoming travel will delay it some more, so
don't fret if you don't hear back from me for a couple of days. Sorry!

I'll keep you posted.

Thanks a bunch!

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-09 Thread Maxim Cournoyer
Hello Ricardo!

Ricardo Wurmus  writes:

> Hi Maxim,
>
>> Subject: [PATCH 1/9] import: pypi: Do not consider requirements.txt files.
>>
>> * guix/import/pypi.scm (guess-requirements): Update comment.
>> [guess-requirements-from-source]: Do not attempt to parse the file
>> requirements.txt.  Streamline logic.
>
> Why remove the handling of the requirements.txt?  Is it no longer
> popular enough to expect its availability in the source archives?
>
> Please also mention in the commit message that and how you adjusted the
> tests.

The commit message now explains the above:

import: pypi: Do not consider requirements.txt files.

PyPI packages are mandated to have a setup.py file, which contains a listing
of the required dependencies.  The setuptools/distutils machinery embeds
metadata in the archives it produces, which contains this information.  There
is no need nor gain to collect the requirements from a "requirements.txt"
file, as it is not the true record of dependencies for PyPI packages and may
contain extraneous requirements or not exist at all.

* guix/import/pypi.scm (guess-requirements): Update comment.
[guess-requirements-from-source]: Do not attempt to parse the file
requirements.txt.  Streamline logic.
* tests/pypi.scm (test-requires.txt): Rename from test-requirements, to hint
at the file being tested.
("pypi->guix-package"): Adapt so that the fake package contains a requires.txt
file rather than a requirements.txt file.
("pypi->guix-package, wheels"): Likewise.

> You removed the comments from the example requires.txt — are
> comments no longer permitted in these files?  If they are, please don’t
> include those changes.

The comments are now preserved. Thank you!

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-09 Thread Maxim Cournoyer
Hello again!

Ricardo Wurmus  writes:

> Hi Maxim,
>
> on to patch number 2!

Yay!

>> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Thu, 28 Mar 2019 00:26:00 -0400
>> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from
>>  source.
>>
>> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT.
>> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top level,
>> and rename it to PARSE-REQUIRES.TXT.  Move the CLEAN-REQUIREMENT and COMMENT?
>> functions inside the READ-REQUIREMENTS procedure.
>> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prevent
>> parsing optional requirements.
>>
>> * tests/pypi.scm (test-requires-with-sections): New variable.
>> ("parse-requires.txt, with sections"): New test.
>> ("pypi->guix-package"): Mute tar output to stdout.
>
> The commit log does not match the changes.  CLEAN-REQUIREMENT is now a
> top-level procedure, not a local procedure inside of READ-REQUIREMENTS
> as reported in the commit message.  Which is correct?

Fixed.

>> +  (call-with-input-file requires.txt
>> +(lambda (port)
>> +  (let loop ((result '()))
>> +(let ((line (read-line port)))
>> +  ;; Stop when a section is encountered, as sections contains optional
>
> Should be “contain”.

Fixed.

>> +  ;; (extra) requirements.  Non-optional requirements must appear
>> +  ;; before any section is defined.
>> +  (if (or (eof-object? line) (section-header? line))
>> +  (reverse result)
>> +  (cond
>> +   ((or (string-null? line) (comment? line))
>> +(loop result))
>> +   (else
>> +(loop (cons (clean-requirement line)
>> +result))))))))))
>> +
>
> I think it would be better to use “match” here instead of nested “let”,
> “if” and “cond”.  At least you can drop the “if” and just use cond.
>
> The loop let and the inner let can be merged.

I'm not sure I understand; wouldn't merging the named let with the plain
let mean adding an extra LINE argument to my LOOP procedure?  I don't
want that.

Also, how could the above code be expressed using "match"? I'm using
predicates which test for (special) characters in a string; I don't see
how the more primitive pattern language of "match" will enable me to do
the same.

>> +(define (parse-requires.txt requires.txt)
>> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
>> +requirement names."
>> +  ;; This is a very incomplete parser, which job is to select the non-optional
>
> “which” –> “whose”

Fixed, with due diligence reading on English grammar ;-)

>> +  ;; dependencies and strip them out of any version information.
>> +  ;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>> +  ;; library and the requirements grammar defined by PEP-0508
>> +  ;; (https://www.python.org/dev/peps/pep-0508/).
>
> Let’s remove the sentence starting with “Alternatively…”.  We could do
> that but we didn’t :)

Alright; done!

>> +  (define (section-header? line)
>> +;; Return #t if the given LINE is a section header, #f otherwise.
>> +(let ((trimmed-line (string-trim line)))
>> +  (and (not (string-null? trimmed-line))
>> +   (eq? (string-ref trimmed-line 0) #\[))))
>> +
>
> How about using string-prefix? instead?  This looks more complicated
> than it deserves.  You can get rid of string-null? and eq? and string-ref
> and all that.
>
> Same here:
>
>> +  (define (comment? line)
>> +;; Return #t if the given LINE is a comment, #f otherwise.
>> +(eq? (string-ref (string-trim line) 0) #\#))
>
> I’d just use string-prefix? here.

Neat! Adjusted, for both counts.
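
For illustration, here is a minimal sketch of how the two predicates might
look once rewritten with string-prefix?.  The revised patch is not shown in
this message, so the bodies below are an assumption rather than the code
that was committed:

```scheme
;; Sketch only: assumed bodies for the two predicates after the
;; string-prefix? suggestion.

(define (section-header? line)
  ;; Return #t if the given LINE is a section header, #f otherwise.
  (string-prefix? "[" (string-trim line)))

(define (comment? line)
  ;; Return #t if the given LINE is a comment, #f otherwise.
  (string-prefix? "#" (string-trim line)))
```

Unlike the string-ref version, both predicates simply return #f on an empty
line instead of raising an out-of-range error.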

>> +(define (clean-requirement s)
>> +  ;; Given a requirement LINE, as can be found in a setuptools requires.txt
>> +  ;; file, remove everything other than the actual name of the required
>> +  ;; package, and return it.
>> +  (string-take s (or (string-index s (lambda (chr)
>> +   (member chr '(#\space #\> #\= #\<))))
>> + (string-length s))))
>
> “string-take” with “string-length” is not very elegant.  The char
> predicate in string-index could better be a char set:
>
> (define (clean-requirement s)
>  (cond
>   ((string-index s (char-set #\space #\> #\= #\<)) => (cut string-take s <>))
>   (else s)))

That's nicer, thanks!

>> ("pypi->guix-package"): Mute tar output to stdout.
>
> Finally, I think it would be better to keep this separate because it’s
> really orthogonal to the other changes in this patch.

OK, done!

> What do you think?

All good points; I just don't understand the one about using a match
and/or merging the regular "let" with the named "let".

Thanks,

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-10 Thread Maxim Cournoyer
Hello!

Ricardo Wurmus  writes:

> Patch number 3!

Yay!

>> From 0c62b541a3e8925b5ca31fe55dbe7536cf95151f Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Thu, 28 Mar 2019 00:26:01 -0400
>> Subject: [PATCH 3/9] import: pypi: Improve parsing of requirement
>>  specifications.
>>
>> The previous solution was fragile and could leave unwanted characters in a
>> requirement name, such as '[' or ']'.
>
> Wouldn’t it be sufficient to add [ and ] to the list of forbidden
> characters?  The tests pass with this implementation of
> clean-requirements:
>
> (define (clean-requirements s)
>  (cond
>   ((string-index s (char-set #\space #\> #\= #\< #\[ #\])) => (cut string-take s <>))
>   (else s)))

Indeed this would be sufficient to make the tests pass, but the tests
don't cover all the cases; as an example, consider:

--8<---cut here---start->8---
argparse;python_version<"2.7"
--8<---cut here---end--->8---

While we could make it work with the current logic by adding more
invalid characters (such as ';' here) to the character set, it seems
less error prone to use the upstream provided regex to match a package
name.  [0]

>> +(define %requirement-name-regexp
>> +  ;; Regexp to match the requirement name in a requirement specification.
>> +
>> +  ;; Some grammar, taken from PEP-0508 (see:
>> +  ;; https://www.python.org/dev/peps/pep-0508/).
>> +
>> +  ;; The unified rule can be expressed as:
>> +  ;; specification = wsp* ( url_req | name_req ) wsp*
>> +
>> +  ;; where url_req is:
>> +  ;; url_req = name wsp* extras? wsp* urlspec wsp+ quoted_marker?
>> +
>> +  ;; and where name_req is:
>> +  ;; name_req = name wsp* extras? wsp* versionspec? wsp* quoted_marker?
>> +
>> +  ;; Thus, we need only matching NAME, which is expressed as:
>> +  ;; identifier_end = letterOrDigit | (('-' | '_' | '.' )* letterOrDigit)
>> +  ;; identifier     = letterOrDigit identifier_end*
>> +  ;; name           = identifier
>> +  (let* ((letter-or-digit "[A-Za-z0-9]")
>> + (identifier-end (string-append "(" letter-or-digit "|"
>> +"[-_.]*" letter-or-digit ")"))
>> + (identifier (string-append "^" letter-or-digit identifier-end "*"))
>> + (name identifier))
>> +(make-regexp name)))
>
> This seems a little too complicated.  Translating a grammar into a
> regexp is probably not a good idea in general.  Since we don’t care
> about anything other than the name it seems easier to just chop off
> the string tail as soon as we find an invalid character.

While I agree that a regexp is a bigger hammer than basic string
manipulation, I see some merit to it here:

1) We can be assured of conformance with upstream, again, per PEP-0508.
2) It is easier to extend; we might want to add parsing for the version
spec in order to disregard dependencies specified for Python < 3, for
example.

The use of the PEP-0508 grammar to define the regexp is useful to detail
in a more human-friendly language the components of the regexp.  We
could have otherwise used the more cryptic regexp for Python
distribution names:

--8<---cut here---start->8---
^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
--8<---cut here---end--->8---

So I guess that what I'm saying is that I prefer this approach to using
string-index with invalid characters, for the reasons above.
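
To make the comparison concrete, here is a small sketch of the regexp
approach applied to the example above.  The regexp is assembled the same way
as in the quoted patch; the body of specification->requirement-name (and the
string-trim call in it) is an assumption made for this illustration, not
necessarily what the patch does:

```scheme
(use-modules (ice-9 regex))

;; Regexp for the NAME part of a PEP 508 requirement specification, built
;; from the same pieces as in the patch.
(define %requirement-name-regexp
  (let* ((letter-or-digit "[A-Za-z0-9]")
         (identifier-end (string-append "(" letter-or-digit "|"
                                        "[-_.]*" letter-or-digit ")"))
         (identifier (string-append "^" letter-or-digit identifier-end "*")))
    (make-regexp identifier)))

;; Assumed body: return the name found at the start of SPEC, or #f when
;; SPEC does not start with a valid name.
(define (specification->requirement-name spec)
  (and=> (regexp-exec %requirement-name-regexp (string-trim spec))
         match:substring))

(specification->requirement-name "argparse;python_version<\"2.7\"")
(specification->requirement-name "requests[security]>=2.0")
```

With this sketch the two calls should yield "argparse" and "requests", with
no need to enumerate ';', '[' or any other delimiter.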

[0]  https://www.python.org/dev/peps/pep-0508/

Thanks!

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-10 Thread Ricardo Wurmus


Maxim Cournoyer  writes:

> While I agree that a regexp is a bigger hammer than basic string
> manipulation, I see some merit to it here:
>
> 1) We can be assured of conformance with upstream, again, per PEP-0508.
> 2) It is easier to extend; we might want to add parsing for the version
> spec in order to disregard dependencies specified for Python < 3, for
> example.
>
> The use of the PEP-0508 grammar to define the regexp is useful to detail
> in a more human-friendly language the components of the regexp.  We
> could have otherwise used the more cryptic regexp for Python
> distribution names:
>
> --8<---cut here---start->8---
> ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
> --8<---cut here---end--->8---
>
> So I guess that what I'm saying is that I prefer this approach to using
> string-index with invalid characters, for the reasons above.
>
> [0]  https://www.python.org/dev/peps/pep-0508/

Okay, sounds good.  Please make sure to note this in a comment, so that
I won’t be asking myself this same question in a year :)

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-10 Thread Ricardo Wurmus


Maxim Cournoyer  writes:

>>> +  ;; (extra) requirements.  Non-optional requirements must appear
>>> +  ;; before any section is defined.
>>> +  (if (or (eof-object? line) (section-header? line))
>>> +  (reverse result)
>>> +  (cond
>>> +   ((or (string-null? line) (comment? line))
>>> +(loop result))
>>> +   (else
>>> +(loop (cons (clean-requirement line)
>>> +result))))))))))
>>> +
>>
>> I think it would be better to use “match” here instead of nested “let”,
>> “if” and “cond”.  At least you can drop the “if” and just use cond.
>>
>> The loop let and the inner let can be merged.
>
> I'm not sure I understand; wouldn't merging the named let with the plain
> let mean adding an extra LINE argument to my LOOP procedure?  I don't
> want that.

Let’s forget about merging the nested “let”, because you would indeed
need to change a few more things.  It’s fine to keep that as it is.  But
(if … (cond …)) is not pretty.  At least it could be done in one “cond”:

(cond
 ((or (eof-object? line) (section-header? line))
  (reverse result))
 ((or (string-null? line) (comment? line))
  (loop result))
 (else
  (loop (cons (clean-requirement line)
              result))))

> Also, how could the above code be expressed using "match"? I'm using
> predicates which test for (special) characters in a string; I don't see
> how the more primitive pattern language of "match" will enable me to do
> the same.

“match” has support for predicates, so you could do something like this:

(match line
 ((or (eof-object) (? section-header?))
  (reverse result))
 ((or '() (? comment?))
  (loop result))
 (_ (loop (cons (clean-requirement line) result))))

This allows you to match “eof-object” and '() directly.  Whenever I see
“string-null?” I think it might be better to “match” on the empty list
directly.

But really, that’s up to you.  I only feel strongly about avoiding “(if
… (cond …))”.
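
For reference, here is a self-contained sketch of the “match” variant that
can be run at the REPL.  It makes a few assumptions: the predicates are
simplified stand-ins, the lines are kept as-is instead of going through
clean-requirement, and, because read-line returns strings, the end of file
is tested with the (? eof-object?) predicate pattern and a blank line with
the literal "":

```scheme
(use-modules (ice-9 match) (ice-9 rdelim))

;; Simplified stand-ins for the predicates discussed above.
(define (section-header? line)
  (string-prefix? "[" (string-trim line)))

(define (comment? line)
  (string-prefix? "#" (string-trim line)))

;; Hypothetical helper: collect the lines of PORT that precede the first
;; section header, skipping blank lines and comments.
(define (read-required-lines port)
  (let loop ((result '()))
    (match (read-line port)
      ((or (? eof-object?) (? section-header?))
       (reverse result))
      ((or "" (? comment?))
       (loop result))
      (line
       (loop (cons line result))))))

(call-with-input-string "# comment\nfoo\n\nbar>=1.0\n[test]\npytest\n"
  read-required-lines)
```

The call at the end should return ("foo" "bar>=1.0"): parsing stops at the
[test] section, so pytest is not collected.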

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-10 Thread Maxim Cournoyer
Hello again!

Ricardo Wurmus  writes:

> On to the next:
>
>> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Thu, 28 Mar 2019 00:26:02 -0400
>> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>>
>> This change enables the PyPI importer to look for requirements in a source
>> archive of a different type than "tar.gz" or "tar.bz2".
>
> Okay.
>
>> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
>> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
>> archive is supported or not.
>
> Nitpick: please use “...this.” and leave two spaces between sentences.

Done.

> Typo: it should be COMPRESSED-FILE?

Fixed.

>> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
>> to extract ZIP archives.
>
> s/method/procedure/

Done.

> Please also mention that “compute-inputs” has been adjusted.

Done.

>> -  (define (tarball-directory url)
>> -;; Given the URL of the package's tarball, return the name of the directory
>> +  (define (archive-root-directory url)
>> +;; Given the URL of the package's archive, return the name of the directory
>>  ;; that will be created upon decompressing it. If the filetype is not
>>  ;; supported, return #f.
>> -;; TODO: Support more archive formats.
>> -(let ((basename (substring url (+ 1 (string-rindex url #\/)))))
>> -  (cond
>> -   ((string-suffix? ".tar.gz" basename)
>> -(string-drop-right basename 7))
>> -   ((string-suffix? ".tar.bz2" basename)
>> -(string-drop-right basename 8))
>> -   (else
>> +(if (compressed-file? url)
>> +(let ((root-directory (file-sans-extension (basename url))))
>> +  (if (string=? "tar" (file-extension root-directory))
>> +  (file-sans-extension root-directory)
>> +  root-directory))
>>  (begin
>> -  (warning (G_ "Unsupported archive format: \
>> -cannot determine package dependencies"))
>> -  #f)))))
>> +  (warning (G_ "Unsupported archive format (~a): \
>> +cannot determine package dependencies") (file-extension url))
>> +  #f)))
>
> I think the double application of file-sans-extension and the
> intermediate variable name “root-directory” for something that is a file
> is a little confusing, but I don’t have a better proposal (other than to
> replace file-extension and file-sans-extension with a match expression).

Done, w.r.t. using "match":

--8<---cut here---start->8---
@@ -198,10 +198,12 @@ be extracted in a temporary directory."
 ;; that will be created upon decompressing it. If the filetype is not
 ;; supported, return #f.
 (if (compressed-file? url)
-(let ((root-directory (file-sans-extension (basename url))))
-  (if (string=? "tar" (file-extension root-directory))
-  (file-sans-extension root-directory)
-  root-directory))
+(match (file-sans-extension (basename url))
+  (root-directory
+   (match (file-extension root-directory)
+ ("tar"
+  (file-sans-extension root-directory))
+ (_ root-directory))))
 (begin
   (warning (G_ "Unsupported archive format (~a): \
 cannot determine package dependencies") (file-extension url))
--8<---cut here---end--->8---


>>(define (read-wheel-metadata wheel-archive)
>>  ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> @@ -246,16 +243,20 @@ cannot determine package dependencies"))
>
>>(define (guess-requirements-from-source)
>>  ;; Return the package's requirements by guessing them from the source.
>> -(let ((dirname (tarball-directory source-url)))
>> +(let ((dirname (archive-root-directory source-url))
>> +  (extension (file-extension source-url)))
>>(if (string? dirname)
>>(call-with-temporary-directory
>> (lambda (dir)
>>   (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>>  (requires.txt (string-append dirname "/" pypi-name
>>   ".egg-info" "/requires.txt"))
>> -(exit-code (parameterize ((current-error-port (%make-void-port "rw+"))
>> -  (current-output-port (%make-void-port "rw+")))
>> - (system* "tar" "xf" tarball "-C" dir requires.txt))))
>> +(exit-code
>> + (parameterize ((current-error-port (%make-void-port "rw+"))
>> +(current-output-port (%make-void-port "rw+")))
>> +   (if (string=? "zip" extension)
>> +   (system* "unzip" archive "-d" dir requires.txt)
>> +   

bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-10 Thread Ricardo Wurmus


Hi Maxim,

thanks for your patience in addressing my comments.  I appreciate it!

>> I think the double application of file-sans-extension and the
>> intermediate variable name “root-directory” for something that is a file
>> is a little confusing, but I don’t have a better proposal (other than to
>> replace file-extension and file-sans-extension with a match expression).
>
> Done, w.r.t. using "match":
>
> --8<---cut here---start->8---
> @@ -198,10 +198,12 @@ be extracted in a temporary directory."
>  ;; that will be created upon decompressing it. If the filetype is not
>  ;; supported, return #f.
>  (if (compressed-file? url)
> -(let ((root-directory (file-sans-extension (basename url))))
> -  (if (string=? "tar" (file-extension root-directory))
> -  (file-sans-extension root-directory)
> -  root-directory))
> +(match (file-sans-extension (basename url))
> +  (root-directory
> +   (match (file-extension root-directory)
> + ("tar"
> +  (file-sans-extension root-directory))
> + (_ root-directory))))
>  (begin
>(warning (G_ "Unsupported archive format (~a): \
>  cannot determine package dependencies") (file-extension url))
> --8<---cut here---end--->8---

The first application of “match” matches anything.  What I had in mind
was really a slightly different approach, namely to split up the “url”
string at dots and then match the resulting list of strings.

Something like this:

  (match (string-split "hello.tar.gz" #\.)
   ((base "tar" (or "bz2" "gz")) base)
   ((base ext) base))
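
Real archive names usually carry a dotted version number, so splitting
"foo-1.2.3.tar.gz" at dots gives more than three elements.  One way to keep
the idea, sketched here under my own assumptions (archive-base-name is a
hypothetical helper, not part of the patch), is to match on the reversed
list so that the extensions come first:

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: return FILE-NAME without its archive extension(s).
;; Matching on the reversed list keeps the dots of the version number out
;; of the way.
(define (archive-base-name file-name)
  (match (reverse (string-split file-name #\.))
    ;; No dot at all: nothing to strip.
    ((single) single)
    ;; Compressed tarball: drop both the compression and "tar" extensions.
    (((or "gz" "bz2") "tar" . base-parts)
     (string-join (reverse base-parts) "."))
    ;; Anything else: drop the single trailing extension.
    ((extension . base-parts)
     (string-join (reverse base-parts) "."))))

(archive-base-name "foo-1.2.3.tar.gz")
(archive-base-name "foo-1.0.zip")
```

The two calls should return "foo-1.2.3" and "foo-1.0" respectively.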

> I don't see much of a problem with the current design since there are
> two questions being answered:
>
> 1) What should be the directory name of the extracted package (retrieved
>from the base name of the archive).
> 2) What extractor should be used (zip vs tar).
>
> These two questions are orthogonal, and that the same primitive get used
> to answer both is an implementation, or rather, an optimization detail.
>
>> I wonder if we could do better and answer the question just once.
>
> The questions are different :-). We could optimize, but that would be at
> the price of expressiveness (squash the two questions into one solving
> space).

Okay.  I guess I’m too picky :)
I’d be happy if we could move the checks to tiny named procedures, but
it’s probably fine the way it is.

Thanks!

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-10 Thread Maxim Cournoyer
Hello!

Ricardo Wurmus  writes:

> Patch number 6:
>
>> From fb0547ef225103c0f8355a7eccc41e0d028f6563 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Thu, 28 Mar 2019 00:26:03 -0400
>> Subject: [PATCH 6/9] import: pypi: Parse wheel METADATA instead of
>>  metadata.json.
>
>> With newer Wheel releases, there is no more metadata.json file; the METADATA
>> file should be used instead (see: https://github.com/pypa/wheel/issues/195).
>
>> This change updates our PyPI importer so that it uses the later.
>
> Typo: should be “latter” instead of “later”.

Fixed.

>> * guix/import/pypi.scm (define-module): Remove unnecessary modules and export
>>   the PARSE-WHEEL-METADATA method.
>
> Please remove the indentation here.  Also, please don’t use “method”
> (because it’s not); use “procedure” instead.

Done. Thanks for fixing my terminology :-).

>> (parse-wheel-metadata): Add method.
>
> Same here.

Done.

>> +  (define (requires-dist-header? line)
>> +;; Return #t if the given LINE is a Requires-Dist header.
>> +(regexp-match? (string-match "^Requires-Dist: " line)))
>> +
>> +  (define (requires-dist-value line)
>> +(string-drop line (string-length "Requires-Dist: ")))
>> +
>> +  (define (extra? line)
>> +;; Return #t if the given LINE is an "extra" requirement.
>> +(regexp-match? (string-match "extra == " line)))
>
> The use of “regexp-match?” here isn’t strictly necessary as the return
> value is true-ish anyway.

Done.

>> +  (call-with-input-file metadata
>> +(lambda (port)
>> +  (let loop ((requirements '()))
>> +(let ((line (read-line port)))
>> +  ;; Stop at the first 'Provides-Extra' section: the non-optional
>> +  ;; requirements appear before the optional ones.
>> +  (if (eof-object? line)
>> +  (reverse (delete-duplicates requirements))
>> +  (cond
>> +   ((and (requires-dist-header? line) (not (extra? line)))
>> +(loop (cons (specification->requirement-name
>> + (requires-dist-value line))
>> +requirements)))
>> +   (else
>> +(loop requirements)))))))))
>> +
>
> As before you can simplify the nested let and merge “if” and “cond”.

Oh, I get it now, I think:

--8<---cut here---start->8---
 
   (call-with-input-file metadata
 (lambda (port)
   (let loop ((requirements '()))
-(let ((line (read-line port)))
-  ;; Stop at the first 'Provides-Extra' section: the non-optional
-  ;; requirements appear before the optional ones.
-  (if (eof-object? line)
-  (reverse (delete-duplicates requirements))
-  (cond
-   ((and (requires-dist-header? line) (not (extra? line)))
-(loop (cons (specification->requirement-name
- (requires-dist-value line))
-requirements)))
-   (else
-(loop requirements)))))))))
+(match (read-line port)
+  (line
+   ;; Stop at the first 'Provides-Extra' section: the non-optional
+   ;; requirements appear before the optional ones.
+   (cond
+((eof-object? line)
+ (reverse (delete-duplicates requirements)))
+((and (requires-dist-header? line) (not (extra? line)))
+ (loop (cons (specification->requirement-name
+  (requires-dist-value line))
+ requirements)))
+(else
+ (loop requirements)))))))))
 
 (define (guess-requirements source-url wheel-url archive)
   "Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
--8<---cut here---end--->8---

>>(define (read-wheel-metadata wheel-archive)
>>  ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> -;; requirements.
>> +;; requirements, or #f if the metadata file contained therein couldn't be
>> +;; extracted.
>>  (let* ((dirname (wheel-url->extracted-directory wheel-url))
>> -   (json-file (string-append dirname "/metadata.json")))
>> -  (and (zero? (system* "unzip" "-q" wheel-archive json-file))
>> -   (dynamic-wind
>> - (const #t)
>> - (lambda ()
>> -   (call-with-input-file json-file
>> - (lambda (port)
>> -   (let* ((metadata (json->scm port))
>> -  (run_requires (hash-ref metadata "run_requires"))
>> -  (requirements (if run_requires
>> -(hash-ref (list-ref run_requires 0)
>> -   "requires")
>> -'()))))
>> - (map specification->requirement-name requirements)))))
>> - (lam

bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-11 Thread Ricardo Wurmus


Hi Maxim,

>>> +  (call-with-input-file metadata
>>> +(lambda (port)
>>> +  (let loop ((requirements '()))
>>> +(let ((line (read-line port)))
>>> +  ;; Stop at the first 'Provides-Extra' section: the non-optional
>>> +  ;; requirements appear before the optional ones.
>>> +  (if (eof-object? line)
>>> +  (reverse (delete-duplicates requirements))
>>> +  (cond
>>> +   ((and (requires-dist-header? line) (not (extra? line)))
>>> +(loop (cons (specification->requirement-name
>>> + (requires-dist-value line))
>>> +requirements)))
>>> +   (else
>>> +(loop requirements)))))))))
>>> +
>>
>> As before you can simplify the nested let and merge “if” and “cond”.
>
> Oh, I get it now, I think:
>
> --8<---cut here---start->8---
>
>(call-with-input-file metadata
>  (lambda (port)
>(let loop ((requirements '()))
> -(let ((line (read-line port)))
> -  ;; Stop at the first 'Provides-Extra' section: the non-optional
> -  ;; requirements appear before the optional ones.
> -  (if (eof-object? line)
> -  (reverse (delete-duplicates requirements))
> -  (cond
> -   ((and (requires-dist-header? line) (not (extra? line)))
> -(loop (cons (specification->requirement-name
> - (requires-dist-value line))
> -requirements)))
> -   (else
> -(loop requirements)))))))))
> +(match (read-line port)
> +  (line
> +   ;; Stop at the first 'Provides-Extra' section: the non-optional
> +   ;; requirements appear before the optional ones.
> +   (cond
> +((eof-object? line)
> + (reverse (delete-duplicates requirements)))
> +((and (requires-dist-header? line) (not (extra? line)))
> + (loop (cons (specification->requirement-name
> +  (requires-dist-value line))
> + requirements)))
> +(else
> + (loop requirements)))))))))
>
>  (define (guess-requirements source-url wheel-url archive)
>"Given SOURCE-URL, WHEEL-URL and a ARCHIVE of the package, return a list
> --8<---cut here---end--->8---

Not quite.  Your ‘match’ expression here doesn’t do anything that a
‘let’ wouldn’t have done.  It really just binds the return value of
(read-line port) to ‘line’; that’s the same as (let ((line (read-line
port))) …).

I gave a match example using predicate matchers in a previous reply.  In
any case, using ‘cond’ inside of a let would be just fine.  If you
wanted to go with ‘match’, though, you’d replace the ‘cond’.
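
As an illustration of the ‘cond’ inside a ‘let’ shape, here is a
self-contained sketch of the loop.  The helper bodies are simplified
assumptions (string-prefix? and string-contains instead of regexps), the
procedure reads from a port rather than a file name, and the raw header
value is kept instead of being passed to specification->requirement-name:

```scheme
(use-modules (ice-9 rdelim) (srfi srfi-1))

(define (requires-dist-header? line)
  ;; Return #t if the given LINE is a Requires-Dist header.
  (string-prefix? "Requires-Dist: " line))

(define (requires-dist-value line)
  (string-drop line (string-length "Requires-Dist: ")))

(define (extra? line)
  ;; Return a true value if the given LINE is an "extra" requirement.
  (string-contains line "extra == "))

;; Hypothetical helper: return the non-optional Requires-Dist values read
;; from PORT, without duplicates.
(define (read-required-dists port)
  (let loop ((requirements '()))
    (let ((line (read-line port)))
      (cond
       ((eof-object? line)
        (reverse (delete-duplicates requirements)))
       ((and (requires-dist-header? line) (not (extra? line)))
        (loop (cons (requires-dist-value line) requirements)))
       (else
        (loop requirements))))))
```

Fed a METADATA excerpt that lists "Requires-Dist: baz (>2)" twice and
"Requires-Dist: pytest ; extra == 'test'" once, the sketch should return
("baz (>2)").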

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-11 Thread Maxim Cournoyer
Ricardo Wurmus  writes:

> Next up: Seven of Nine, tertiary adjunct of unimatrix zero one:

Ehe! I had to look up the reference; I'm not much of a Star Trek fan
obviously :-P.

>> From 37e499d5d5d5f690aa0a065c730e13f6a31dd30d Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Thu, 28 Mar 2019 23:12:26 -0400
>> Subject: [PATCH 7/9] import: pypi: Include optional test inputs as
>>  native-inputs.
>>
>> * guix/import/pypi.scm (maybe-inputs): Add INPUT-TYPE argument, and use it.
>> (test-section?): New predicate.
>> (parse-requires.txt): Collect the optional test inputs, and return them as the
>> second element of the returned list.
>
> AFAICT parse-requires.txt now returns a list of pairs, but used to
> return a plain list before.  Is this correct?

Right, a list of two lists to be technically correct.

>>  (define (parse-requires.txt requires.txt)
>> -  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of
>> -requirement names."
>> -  ;; This is a very incomplete parser, which job is to select the non-optional
>> -  ;; dependencies and strip them out of any version information.
>> +  "Given REQUIRES.TXT, a Setuptools requires.txt file, return a pair of requirements.
>> +
>> +The first element of the pair contains the required dependencies while the
>> +second the optional test dependencies.  Note that currently, optional,
>> +non-test dependencies are omitted since these can be difficult or expensive to
>> +satisfy."
>> +
>> +  ;; This is a very incomplete parser, which job is to read in the requirement
>> +  ;; specification lines, and strip them out of any version information.
>>;; Alternatively, we could implement a PEG parser with the (ice-9 peg)
>>;; library and the requirements grammar defined by PEP-0508
>>;; (https://www.python.org/dev/peps/pep-0508/).
>
> Does it really return a pair?  Or a list of pairs?  Or is it a
> two-element list of lists?

The latter! I've fixed the docstring accordingly.

>>(call-with-input-file requires.txt
>>  (lambda (port)
>> -  (let loop ((result '()))
>> +  (let loop ((required-deps '())
>> + (test-deps '())
>> + (inside-test-section? #f)
>> + (optional? #f))
>>  (let ((line (read-line port)))
>> -  ;; Stop when a section is encountered, as sections contains optional
>> -  ;; (extra) requirements.  Non-optional requirements must appear
>> -  ;; before any section is defined.
>> -  (if (or (eof-object? line) (section-header? line))
>> +  (if (eof-object? line)
>>;; Duplicates can occur, since the same requirement can be
>>;; listed multiple times with different conditional markers, e.g.
>>;; pytest >= 3 ; python_version >= "3.3"
>>;; pytest < 3 ; python_version < "3.3"
>> -  (reverse (delete-duplicates result))
>> +  (map (compose reverse delete-duplicates)
>> +   (list required-deps test-deps))
>
> Looks like a list of lists to me.  “delete-duplicates” now won’t delete
> a name that is in both “required-deps” as well as in “test-deps”.  Is
> this acceptable?

It is acceptable, as this corner case cannot exist given the current
code (a requirement can exist in either required-deps or test-deps, but
never in both). It also doesn't make sense that a run time requirement
would also be listed as a test requirement, so that corner case is not
likely to exist in the future either.

> Personally, I’m not a fan of using data structures for returning
> multiple values, because we can simply return multiple values.

I thought Guile's support for multiple return values would be great here
as well, but I've found that for this specific case, a list of lists
worked better, since the two lists contain requirements to be processed
the same way, which "map" can readily do (i.e. less ceremony is
required).

> Or we could have more than just strings.  The meaning of these strings
> is provided by the bin into which they are thrown — either
> “required-deps” or “test-deps”.  It could be an option to collect tagged
> values instead and have the caller deal with filtering.

Sounds neat, but I'd rather punt on this one for now.

>>  (define (parse-wheel-metadata metadata)
>> -  "Given METADATA, a Wheel metadata file, return a list of requirement names."
>> +  "Given METADATA, a Wheel metadata file, return a pair of requirements.
>> +
>> +The first element of the pair contains the required dependencies while the second the optional
>> +test dependencies.  Note that currently, optional, non-test dependencies are
>> +omitted since these can be difficult or expensive to satisfy."
>>;; METADATA is a RFC-2822-like, header based file.
>
> This sounds like this is going to duplicate the previous procedures.

Instead of duplicating the docstring, I'm now referring to that of
PARSE-REQUIRES.TXT for PARSE-W

bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-11 Thread Ricardo Wurmus


Hi Maxim,

>>>(call-with-input-file requires.txt
>>>  (lambda (port)
>>> -  (let loop ((result '()))
>>> +  (let loop ((required-deps '())
>>> + (test-deps '())
>>> + (inside-test-section? #f)
>>> + (optional? #f))
>>>  (let ((line (read-line port)))
>>> -  ;; Stop when a section is encountered, as sections contains optional
>>> -  ;; (extra) requirements.  Non-optional requirements must appear
>>> -  ;; before any section is defined.
>>> -  (if (or (eof-object? line) (section-header? line))
>>> +  (if (eof-object? line)
>>>;; Duplicates can occur, since the same requirement can be
>>>;; listed multiple times with different conditional markers, e.g.
>>>;; pytest >= 3 ; python_version >= "3.3"
>>>;; pytest < 3 ; python_version < "3.3"
>>> -  (reverse (delete-duplicates result))
>>> +  (map (compose reverse delete-duplicates)
>>> +   (list required-deps test-deps))
>>
>> Looks like a list of lists to me.  “delete-duplicates” now won’t delete
>> a name that is in both “required-deps” as well as in “test-deps”.  Is
>> this acceptable?
>
> It is acceptable, as this corner case cannot exist given the current
> code (a requirement can exist in either required-deps or test-deps, but
> never in both). It also doesn't make sense that a run time requirement
> would also be listed as a test requirement, so that corner case is not
> likely to exist in the future either.

I mentioned it because I believe I’ve seen this in the past where the
importer would return some of the same inputs as both regular inputs and
test dependencies.

>> Personally, I’m not a fan of using data structures for returning
>> multiple values, because we can simply return multiple values.
>
> I thought Guile's support for multiple return values would be great here
> as well, but I've found that for this specific case, a list of lists
> worked better, since the two lists contain requirements to be processed
> the same way, which "map" can readily do (i.e. less ceremony is
> required).

“map” can also operate on more than one list at a time:

(call-with-values
  (lambda ()
    (values (list 1 2 3)
            (list 9 8 7)))
  (lambda (a b) (map + a b)))

=> (10 10 10)

Of course, it would be simpler to just use a single list of tagged
items.
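
A rough sketch of what the tagged variant could look like follows.  The
(name . kind) representation and the requirements-of-kind helper are
hypothetical, made up here only to show the idea of letting the caller do
the filtering:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical representation: each requirement is a (NAME . KIND) pair,
;; where KIND is either the symbol 'required or the symbol 'test.
(define tagged-requirements
  '(("requests" . required)
    ("six" . required)
    ("pytest" . test)))

;; Hypothetical helper: return the names in TAGGED whose tag is KIND.
(define (requirements-of-kind kind tagged)
  (filter-map (lambda (item)
                (and (eq? kind (cdr item))
                     (car item)))
              tagged))

(requirements-of-kind 'required tagged-requirements)
(requirements-of-kind 'test tagged-requirements)
```

The two calls should return ("requests" "six") and ("pytest"); the parser
itself would then only need to return one list.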

--
Ricardo





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-15 Thread Maxim Cournoyer
Hello Ricardo!

Ricardo Wurmus  writes:

>> From cfde6e09f8f8c692fe252d76ed27e8c50a9e5377 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Sat, 30 Mar 2019 23:13:26 -0400
>> Subject: [PATCH 8/9] import: pypi: Scan source archive to find requires.txt
>>  file.
>
>> * guix/import/pypi.scm (use-modules): Use invoke from (guix build utils).
>> (guess-requirements)[archive-root-directory]: Remove procedure.
>
> Oh, I guess I reviewed this procedure in vain :(
>
> Please modify the commits so that added procedures are not removed in
> later commits.  This is easier on the reviewer and makes for a clearer
> commit history.

Indeed; I'll be more careful about this in the future; sorry!

I've squashed this commit with the one enabling support for more archive
types, as this is where the modified (and later removed) procedure
originated.

>>(define (guess-requirements-from-source)
>>  ;; Return the package's requirements by guessing them from the source.
>> -(let ((dirname (archive-root-directory source-url))
>> -  (extension (file-extension source-url)))
>> -  (if (string? dirname)
>> -  (call-with-temporary-directory
>> -   (lambda (dir)
>> - (let* ((pypi-name (string-take dirname (string-rindex dirname #\-)))
>> -(requires.txt (string-append dirname "/" pypi-name
>> - ".egg-info" "/requires.txt"))
>> -(exit-code
>> - (parameterize ((current-error-port (%make-void-port "rw+"))
>> -(current-output-port (%make-void-port "rw+")))
>> -   (if (string=? "zip" extension)
>> -   (system* "unzip" archive "-d" dir requires.txt)
>> -   (system* "tar" "xf" archive "-C" dir requires.txt
>> -   (if (zero? exit-code)
>> -   (parse-requires.txt (string-append dir "/" requires.txt))
>> -   (begin
>> - (warning
>> -  (G_ "Failed to extract file: ~a from source.~%")
>> -  requires.txt)
>> - (list '() '()))
>> +(if (compressed-file? source-url)
>> +(call-with-temporary-directory
>> + (lambda (dir)
>> +   (parameterize ((current-error-port (%make-void-port "rw+"))
>> +  (current-output-port (%make-void-port "rw+")))
>> + (if (string=? "zip" (file-extension source-url))
>> + (invoke "unzip" archive "-d" dir)
>> + (invoke "tar" "xf" archive "-C" dir)))
>> +   (let ((requires.txt-files
>> +  (find-files dir (lambda (abs-file-name _)
>> +(string-match "\\.egg-info/requires.txt$"
>> +  abs-file-name)
>> + (if (> (length requires.txt-files) 0)
>
> Let’s work on the empty list directly.  Here “match” would be better.

Done, like this:

--8<---cut here---start->8---
- (if (> (length requires.txt-files) 0)
- (parse-requires.txt (first requires.txt-files))
- (begin (warning (G_ "Cannot guess requirements from source archive:\
+ (match requires.txt-files
+   (()
+(warning (G_ "Cannot guess requirements from source archive:\
  no requires.txt file found.~%"))
-'())
+'())
+   (else (parse-requires.txt (first requires.txt-files)))
--8<---cut here---end--->8---
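
As a side note, a small sketch of how "match" can also bind the first
file directly in the pattern, which avoids the separate call to "first".
The procedure name pick-requires.txt is hypothetical, and FILES merely
stands in for the list returned by find-files.

```scheme
(use-modules (ice-9 match))

(define (pick-requires.txt files)
  ;; FILES stands in for the result of find-files.  Return the first
  ;; file name, or #f when nothing was found.
  (match files
    (()
     #f)
    ((file . _)
     file)))

(pick-requires.txt '())
(pick-requires.txt '("foo.egg-info/requires.txt" "bar.egg-info/requires.txt"))
```

The first call returns #f and the second returns the first file name.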

>> + (begin
>> +   (parse-requires.txt (first requires.txt-files)))
>
> No need for “begin” here.

Fixed.

>> + (begin (warning (G_ "Cannot guess requirements from source archive:\
>> + no requires.txt file found.~%"))
>> +(list '() '()))
>
> I know that this is from an earlier commit, but I don’t like the look of
> “(list '() '())” at all :)
>
>> +(begin
>> +  (warning (G_ "Unsupported archive format; \
>> +cannot determine package dependencies from source archive: ~a~%")
>> +   (basename source-url))
>>(list '() '()
>
> Same here.  Certainly there’s a better return value.

This might look ugly, but I can't think of a better return value, since
using anything else would mean having to introduce extra logic in the
callers, while it is now a correct value that needs no special case.
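
To make that concrete, a rough sketch of a caller that destructures the
two-element result.  The procedure summarize is invented for this
example and does not exist in the importer.

```scheme
(use-modules (ice-9 match))

(define (summarize requirements)
  ;; REQUIREMENTS is a list of two lists: the required dependencies and
  ;; the test dependencies.
  (match requirements
    ((required test)
     (format #f "~a required, ~a test"
             (length required) (length test)))))

(summarize (list '("requests" "six") '("pytest")))   ; normal case
(summarize (list '() '()))                           ; failure case
```

The failure value goes through the same clause as a normal result, so
the caller needs no extra branch.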

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-15 Thread Maxim Cournoyer
Ricardo Wurmus  writes:

> And finally: Number 9!

Yay!

>> From 1290f9d1f0d594fdd4723d76b94116be25da9dd5 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer 
>> Date: Sat, 30 Mar 2019 20:27:35 -0400
>> Subject: [PATCH 9/9] import: pypi: Preserve package name case when forming
>>  pypi-uri.
>>
>> Fixes issue: #33046.
>
> Please change this to:
>
> Fixes .

Done!

>> * guix/build-system/python.scm (pypi-uri): Update the host URI to
>> "files.pythonhosted.org".
>> * guix/import/pypi.scm (make-pypi-sexp): Preserve the package name case when
>> the source URL calls for it.
>
> Is the first change to use files.pythonhosted.org required to fix this?
> Or is this unrelated?
>
> If it is required this looks fine to me.
>
> Thank you!

The permanent redirection was found while fixing the issue; but it's
better to have the fix separate.  I've separated it into its own commit.

Thank you!





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-15 Thread Maxim Cournoyer
Hi again,

Ricardo Wurmus  writes:

> Maxim Cournoyer  writes:
>
>> While I agree that a regexp is a bigger hammer than basic string
>> manipulation, I see some merit to it here:
>>
>> 1) We can be assured of conformance with upstream, again, per PEP-0508.
>> 2) It is easier to extend; we might want to add parsing for the version
>> spec in order to disregard dependencies specified for Python < 3, for
>> example.
>>
>> The use of the PEP-0508 grammar to define the regexp is useful to detail
>> in a more human-friendly language the components of the regexp.  We
>> could have otherwise used the more cryptic regexp for Python
>> distribution names:
>>
>> --8<---cut here---start->8---
>> ^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
>> --8<---cut here---end--->8---
>>
>> So I guess that what I'm saying is that I prefer this approach to using
>> string-index with invalid characters, for the reasons above.
>>
>> [0]  https://www.python.org/dev/peps/pep-0508/
>
> Okay, sounds good.  Please make sure to note this in a comment, so that
> I won’t be asking myself this same question in a year :)

Done!
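
For reference, a minimal sketch of how the distribution-name regexp
quoted above could be used from Guile.  PEP 508 states that it is to be
matched case-insensitively, hence regexp/icase.  The names
%distribution-name-rx and valid-distribution-name? are made up for this
example and are not the ones used in the patch.

```scheme
;; make-regexp and regexp-exec are part of core Guile.
(define %distribution-name-rx
  (make-regexp "^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$"
               regexp/extended regexp/icase))

(define (valid-distribution-name? name)
  ;; Return #t when NAME is a valid distribution name per the regexp
  ;; above, #f otherwise.
  (if (regexp-exec %distribution-name-rx name) #t #f))

(valid-distribution-name? "Flask-SQLAlchemy")   ; => #t
(valid-distribution-name? "-flask")             ; => #f
```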

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-16 Thread Maxim Cournoyer
Hello!

Continued feedback about your much appreciated comments! :-)

Ricardo Wurmus  writes:

> Maxim Cournoyer  writes:
>
>>>> +  ;; (extra) requirements.  Non-optional requirements must appear
>>>> +  ;; before any section is defined.
>>>> +  (if (or (eof-object? line) (section-header? line))
>>>> +  (reverse result)
>>>> +  (cond
>>>> +   ((or (string-null? line) (comment? line))
>>>> +(loop result))
>>>> +   (else
>>>> +(loop (cons (clean-requirement line)
>>>> +result))
>>>> +
>>>
>>> I think it would be better to use “match” here instead of nested “let”,
>>> “if” and “cond”.  At least you can drop the “if” and just use cond.
>>>
>>> The loop let and the inner let can be merged.
>>
>> I'm not sure I understand; wouldn't merging the named let with the plain
>> let mean adding an extra LINE argument to my LOOP procedure?  I don't
>> want that.
>
> Let’s forget about merging the nested “let”, because you would indeed
> need to change a few more things.  It’s fine to keep that as it is.  But
> (if … (cond …)) is not pretty.  At least it could be done in one “cond”:
>
> (cond
>  ((or (eof-object? line) (section-header? line))
>   (reverse result))
>  ((or (string-null? line) (comment? line))
>   (loop result))
>  (else
>   (loop (cons (clean-requirement line)
>               result))))

Agreed and fixed, thanks.

>> Also, how could the above code be expressed using "match"? I'm using
>> predicates which tests for (special) characters in a string; I don't see
>> how the more primitive pattern language of "match" will enable me to do
>> the same.
>
> “match” has support for predicates, so you could do something like this:
>
> (match line
>  ((or (eof-object) (? section-header?))
>   (reverse result))
>  ((or '() (? comment?))
>   (loop result))
>  (_ (loop (cons (clean-requirement line) result))))

Oh, that's neat! I had no idea that predicates could be used with
"match".  '() would need to be replaced by "" to match the empty
string.  Another gotcha with "match" is that the "or" seems to evaluate
every component, even when an earlier one already matched; this
resulted in the following error:

--8<---cut here---start->8---
+ (wrong-type-arg
+   "string-trim"
+   "Wrong type argument in position ~A (expecting ~A): ~S"
+   (1 "string" #)
+   (#))
result: FAIL
--8<---cut here---end--->8---

This is due to the "(or (eof-object) (? section-header?))" match clause
evaluating the section-header? predicate even though the line is an EOF
object.
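
One possible reading of this error: in (ice-9 match), a pattern written
as (eof-object) is a one-element list pattern rather than a predicate
test, so it never matches the EOF object and "or" simply moves on to the
next alternative.  Below is a sketch that tests for EOF with
(? eof-object?) instead.  The section-header? and comment? predicates
are simplified stand-ins, not the importer's definitions, and
string-trim-both stands in for clean-requirement.

```scheme
(use-modules (ice-9 match)
             (ice-9 rdelim))   ; read-line

(define (section-header? line)
  ;; Simplified stand-in: a section header starts with "[".
  (string-prefix? "[" (string-trim line)))

(define (comment? line)
  ;; Simplified stand-in: a comment starts with "#".
  (string-prefix? "#" (string-trim line)))

(define (read-requirements port)
  ;; Collect requirement lines from PORT, stopping at EOF or at the
  ;; first section header.
  (let loop ((result '()))
    (match (read-line port)
      ((or (? eof-object?) (? section-header?))
       (reverse result))
      ((or "" (? comment?))
       (loop result))
      (line
       (loop (cons (string-trim-both line) result))))))

(call-with-input-string "foo\n# a comment\n\nbar >= 1\n[test]\npytest\n"
  read-requirements)
;; => ("foo" "bar >= 1")
```

In this form the alternatives of "or" are tried in order, and
section-header? is only called when the line is not the EOF object.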

> This allows you to match “eof-object” and '() directly.  Whenever I see
> “string-null?” I think it might be better to “match” on the empty list
> directly.

string-null? and an empty list are not the same, unless I'm missing something.

> But really, that’s up to you.  I only feel strongly about avoiding “(if
> … (cond …))”.

Due to the problem mentioned above, I stayed with "cond".

Thanks!

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-16 Thread Maxim Cournoyer
Hey Ricardo :-)

Ricardo Wurmus  writes:

> Hi Maxim,
>
>>>>    (call-with-input-file requires.txt
>>>>      (lambda (port)
>>>> -  (let loop ((result '()))
>>>> +  (let loop ((required-deps '())
>>>> + (test-deps '())
>>>> + (inside-test-section? #f)
>>>> + (optional? #f))
>>>>      (let ((line (read-line port)))
>>>> -  ;; Stop when a section is encountered, as sections contains optional
>>>> -  ;; (extra) requirements.  Non-optional requirements must appear
>>>> -  ;; before any section is defined.
>>>> -  (if (or (eof-object? line) (section-header? line))
>>>> +  (if (eof-object? line)
>>>>    ;; Duplicates can occur, since the same requirement can be
>>>>    ;; listed multiple times with different conditional markers, e.g.
>>>>    ;; pytest >= 3 ; python_version >= "3.3"
>>>>    ;; pytest < 3 ; python_version < "3.3"
>>>> -  (reverse (delete-duplicates result))
>>>> +  (map (compose reverse delete-duplicates)
>>>> +   (list required-deps test-deps))
>>>
>>> Looks like a list of lists to me.  “delete-duplicates” now won’t delete
>>> a name that is in both “required-deps” as well as in “test-deps”.  Is
>>> this acceptable?
>>
>> It is acceptable, as this corner case cannot exist given the current
>> code (a requirement can exist in either required-deps or test-deps, but
>> never in both). It also doesn't make sense that a run time requirement
>> would also be listed as a test requirement, so that corner case is not
>> likely to exist in the future either.
>
> I mentioned it because I believe I’ve seen this in the past where the
> importer would return some of the same inputs as both regular inputs and
> test dependencies.

OK!

>>> Personally, I’m not a fan of using data structures for returning
>>> multiple values, because we can simply return multiple values.
>>
>> I thought the Guile supported multiple values return value would be
>> great here as well, but I've found that for this specific case here, a
>> list of lists worked better, since the two lists contain requirements to
>> be processed the same, which "map" can readily do (i.e. less ceremony is
>> required).
>
> “map” can also operate on more than one list at a time:
>
> (call-with-values
>   (lambda ()
>     (values (list 1 2 3)
>             (list 9 8 7)))
>   (lambda (a b) (map + a b)))
>
> => (10 10 10)

That's what I meant by "requires more ceremony".  I can simply apply
"map" to the return value of the function and get what I need, rather
than having to use "values" in the callee, then "call-with-values" in
the caller and establish a binding for each list.
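
A small side-by-side sketch of the two styles, to show the difference in
ceremony.  The procedures parse/list, parse/values and clean are
hypothetical stand-ins, not code from the patch.

```scheme
(define (clean names)
  ;; Stand-in for the per-list processing done by the caller.
  (map string-downcase names))

;; Variant A: the callee returns a list of two lists.
(define (parse/list)
  (list '("Requests" "Six") '("PyTest")))

(define result-a
  (map clean (parse/list)))

;; Variant B: the callee returns two values.
(define (parse/values)
  (values '("Requests" "Six") '("PyTest")))

(define result-b
  (call-with-values parse/values
    (lambda (required test)
      (list (clean required) (clean test)))))

;; Both variants yield (("requests" "six") ("pytest")).
```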
  
> Of course, it would be simpler to just use a single list of tagged
> items.

Do you feel strongly about it? I don't; I'm open to trying a tagged
list if you feel it is worth it.

Maxim





bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.

2019-06-16 Thread Ricardo Wurmus


Maxim Cournoyer  writes:

>> This allows you to match “eof-object” and '() directly.  Whenever I see
>> “string-null?” I think it might be better to “match” on the empty list
>> directly.
>
> string-null? and an empty list are not the same, unless I'm missing something.

Yes, sorry, I meant “null?”.  Using “string-null?” is equivalent to
matching the empty string, of course.

>> But really, that’s up to you.  I only feel strongly about avoiding “(if
>> … (cond …))”.
>
> Due to the problem mentioned above, I stayed with "cond".

Okay!  Thanks.

-- 
Ricardo