This is almost exactly the intuition behind the standard error-reporting
heuristic for grammars involving alternations. It is only a heuristic, but
it has to be, since on a failure it's impossible to determine the user's
intention with complete accuracy. Intuitively, the rule that managed to
parse furthest into the input is considered the one the user most likely
intended, since it's the one that matched the most. This is usually
calculated using the source offset, which is subtly different from :path,
since :path is the distance into the grammar itself. That doesn't
necessarily correspond to the distance into the input. Consider the regex
aabcd|a* when matching aaaa: the first alternative would produce a path of
length 2 if the two a's in the grammar are labeled, but the second would
produce a path of length 1 (a single rule) even though it would match more
input.
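
To make the distinction concrete, here is a tiny sketch in Clojure. It is
not a regex engine: literal-progress and star-progress are hypothetical
helpers of my own that only measure how many input characters each
alternative of aabcd|a* would consume.

```clojure
;; Hypothetical helpers, for illustration only (not a real regex engine).

(defn literal-progress
  "Number of leading characters of input that match the literal pattern."
  [pattern input]
  (count (take-while true? (map = pattern input))))

(defn star-progress
  "Number of leading characters of input equal to ch (i.e. ch*)."
  [ch input]
  (count (take-while #(= ch %) input)))

(literal-progress "aabcd" "aaaa") ;=> 2, fails at offset 2 after two labeled a's
(star-progress \a "aaaa")         ;=> 4, a single rule that consumes everything
```

So the alternative with the longer grammar path (2) got less far into the
input than the one with the shorter path (1).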

Here's how the basic algorithm works to produce an error like the one I
showed above.

   1. Parse the form - if it matches some path through your grammar, you're
   done.
   2. If not, for each failure record the path to that failure, the text
   offset of the error point, and what the rule was expecting at that point.
   The failure may be that you found something unexpected, or that you ran
   out of input and needed more.
   3. When done, find the furthest error point and collect all the failure
   info for rules which failed at that point.
   4. If there's just a single failed rule, use that as your error rule to
   report from. Otherwise, use a heuristic like "find the shortest subset of
   all failed rules and use that path as the failing rule" (since all failures
   start from that point).
   5. Report something like "Error when parsing <rule from #4>, found <what
   you found at the error point> but was expecting one of <the set of all
   elements that failed rules were expecting at that point>".
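
Steps 3 to 5 can be sketched roughly as follows. This is only an
illustration under assumptions: the failure maps with :path, :offset and
:expected keys, the sample data, and all function names are made up here
and are not spec's actual explain-data format. For step 4 I've read the
heuristic as "take the longest common prefix of the failing paths", which
is one possible interpretation.

```clojure
;; Sketch of furthest-failure error reporting over hypothetical failure maps:
;; {:path [...grammar path...] :offset <text offset> :expected <symbol>}

(defn common-prefix
  "Longest shared prefix of two paths."
  [a b]
  (mapv first (take-while (fn [[x y]] (= x y)) (map vector a b))))

(defn furthest-failures
  "Step 3: keep only the failures at the furthest error point."
  [failures]
  (when (seq failures)
    (let [max-offset (apply max (map :offset failures))]
      (filter #(= max-offset (:offset %)) failures))))

(defn error-rule
  "Step 4: a single failure is the error rule; otherwise fall back to the
  common prefix of all failing paths (an assumed reading of the heuristic)."
  [furthest]
  (if (= 1 (count furthest))
    (:path (first furthest))
    (reduce common-prefix (map :path furthest))))

(defn report
  "Step 5: build the message from the rule, what was found, and the set of
  everything the failed rules were expecting at that point."
  [failures found]
  (let [furthest (furthest-failures failures)
        expected (vec (distinct (map :expected furthest)))]
    (str "Error when parsing " (pr-str (error-rule furthest))
         ", found " (pr-str found)
         " but was expecting one of " (pr-str expected))))

;; Made-up failure data, loosely modeled on the defn example.
(def failures
  [{:path [:arg-list]                  :offset 5  :expected 'vector?}
   {:path [:arg-list :args :sym]       :offset 12 :expected 'simple-symbol?}
   {:path [:arg-list :args :map :or]   :offset 20 :expected 'simple-symbol?}
   {:path [:arg-list :args :map :keys] :offset 20 :expected 'vector?}])

(report failures 'a/b)
;=> "Error when parsing [:arg-list :args :map], found a/b but was expecting one of [simple-symbol? vector?]"
```

Note that the selection is driven by :offset, not by the length of :path,
for the reason given above.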

There are some complications here in Clojure since we're parsing forms, and
not all form elements have metadata saying where they exist in the source
(e.g. keywords). Also, when you run out of input it's really useful to be
able to indicate the end of the form you were parsing to say "I needed more
input here". Finally, if we're parsing the result of a previous
macroexpansion, macros are historically pretty bad at propagating source
metadata. I'd like to see the following improvements:

   1. Ensure that forms always get source metadata added - I believe there
   were some bugs around this.
   2. For composite forms (i.e. collection literals) add metadata
   indicating the end of the form in the source.
   3. On macroexpansion, automatically propagate the source location
   metadata from the source form to the expansion if it's not present there -
   that way at least the form itself will retain it.
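
As a rough idea of what item 3 could look like, here is a minimal sketch.
propagate-location is a hypothetical helper, not something the compiler
does today, and it assumes the location lives under the :line and :column
metadata keys.

```clojure
(defn propagate-location
  "Hypothetical helper: copy :line/:column metadata from source-form onto
  expansion, unless the expansion already carries a :line or cannot hold
  metadata at all (e.g. keywords, numbers)."
  [source-form expansion]
  (if (and (instance? clojure.lang.IObj expansion)
           (not (contains? (meta expansion) :line)))
    (vary-meta expansion
               #(merge (select-keys (meta source-form) [:line :column]) %))
    expansion))

;; Built with `list` rather than quoted literals, since literals read from
;; source would already carry their own location metadata.
(def source-form (with-meta (list 'my-macro 'x) {:line 10 :column 3}))

(meta (propagate-location source-form (list 'do 'x)))
;=> {:line 10, :column 3}

(propagate-location source-form :a-keyword)
;=> :a-keyword (keywords can't carry metadata, so nothing to do)
```

Only the top-level form of the expansion is tagged here; nested forms
created by the macro would still lack location info.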

tools.reader handles all this stuff a lot better, I believe. I think there
are also open JIRAs for some of this work.

On 25 August 2016 at 09:33, Leon Grapenthin <grapenthinl...@gmail.com>
wrote:

>
>
> On Tuesday, August 23, 2016 at 3:27:28 AM UTC+2, Alex Miller wrote:
>>
>> predicate: (cat :args (* :clojure.core.specs/binding-form) :varargs (?
>> (cat :amp #{(quote &)} :form :clojure.core.specs/binding-form))),
>>
>> the predicate that is actually failing in the spec, probably not
>> particularly helpful given the complexity (and recursiveness) of the
>> destructuring specs
>>
>>
>> Extra input
>>
>> this is the part of cat that I think could be made more explicit - could
>> be saying here that the value it had (above) was expected to match the next
>> part of the cat (binding-form). So that could say the equivalent of
>> "Expected binding-form but had non-matching value ..." and could even find
>> what parts of that value matched and maybe which didn't (the :or keys) such
>> that you'd have a more precise description. There is some more stuff Rich
>> and I have worked on around "hybrid maps" which is the case here with map
>> destructuring - it's particularly challenging to get a good error out of
>> that at the moment, but there's more that can be done.
>>
>>
> Thank you for doing the walkthrough. I observed this too and wondered
> why spec doesn't go further down the path and apparently stops at
> ::binding-form.
> I could isolate the problem a bit by changing the spec of ::arg-list and
> temporarily removing the :varargs branch.
>
> (s/def ::arg-list
>   (s/and
>     vector?
>     (s/cat :args (s/* ::binding-form)
>     ;;       :varargs (s/? (s/cat :amp #{'&} :form ::binding-form))
>            )))
>
> This leads to a much better message:
>
> (s/explain (:args (s/get-spec 'clojure.core/defn))
>            '[foo [{:or {a/b 42}}]])
>
> In: [1 0] val: {:or #:a{b 42}} fails spec: :clojure.core.specs/local-name
> at: [:bs :arity-1 :args :args :sym] predicate: simple-symbol?
> In: [1 0 0] val: ([:or #:a{b 42}]) fails spec: 
> :clojure.core.specs/seq-binding-form
> at: [:bs :arity-1 :args :args :seq] predicate: (cat :elems (*
> :clojure.core.specs/binding-form) :rest (? (cat :amp #{(quote &)} :form
> :clojure.core.specs/binding-form)) :as (? (cat :as #{:as} :sym
> :clojure.core.specs/local-name))),  Extra input
> In: [1 0 :or a/b 0] val: a/b fails spec: :clojure.core.specs/or at: [:bs
> :arity-1 :args :args :map :or 0] predicate: simple-symbol?
> In: [1 0] val: {:or #:a{b 42}} fails spec: :clojure.core.specs/arg-list
> at: [:bs :arity-n :bodies :args] predicate: vector?
>
> The third one is the desired one and very precise - it seems to be usually
> the one with the longest :in path. The length of the :in path seems a good
> sorting criterion for reporting.
>
> However I was not able to track the issue further down. I also wasn't able
> to reproduce a more minimal case of this problem. It seems like a bug in
> how spec parses and must have something to do with s/cat and the :varargs
> branch.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
