Better support for subreadings is something I would like as well. This
basically comes down to having nice ways to deal with slightly ambiguous
tokenisation.
Along with that, it would be good to have an optional "strict" version of
the formalism where all tags used in the rules must be pre-defined in
lists. This --strict option (which would ideally become the default later,
with the current behaviour becoming --lazy, though that is quite a way off)
would basically complain loudly when tag literals are used inside rules.
e.g.
LIST Foo = foo ;
LIST Bar = bar ;
SECTION
SELECT Foo IF (1 Bar) ; # Nice rule
# Bad rule: much wailing and gnashing of teeth from the compiler.
SELECT Foo IF (1 ("baz")) ;
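To make the idea concrete, here is a rough sketch of such a check, written as a standalone Python script rather than as part of the CG-3 compiler. It is a toy under loud assumptions: it only knows SELECT and REMOVE, expects one rule per line, and treats any parenthesised group that does not start with a position as an inline tag list. The names find_inline_tags, strip_comment and GRAMMAR are made up for illustration, and none of this reflects how the real parser works.

```python
import re

# Toy subset of rule keywords; real grammars use many more rule types.
RULE_KEYWORDS = {"SELECT", "REMOVE"}
# Tokens: quoted forms (with optional suffix), single parentheses, bare words.
TOKEN = re.compile(r'"[^"]*"[^\s()]*|[()]|[^\s()]+')
# Very rough test for "this token opens a contextual test" (assumption).
POSITION = re.compile(r'^(NOT|NEGATE)$|^[@*]*-?\d+')


def strip_comment(line):
    """Drop a trailing comment; a '#' inside double quotes is kept."""
    in_quote = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quote = not in_quote
        elif ch == '#' and not in_quote:
            return line[:i]
    return line


def find_inline_tags(grammar):
    """Return (line_number, tag) for each tag literal written inline in a rule."""
    problems = []
    for lineno, raw in enumerate(grammar.splitlines(), start=1):
        tokens = TOKEN.findall(strip_comment(raw))
        if not tokens or tokens[0] not in RULE_KEYWORDS:
            continue  # LIST/SET definitions, SECTION markers, blank lines
        depth = 0
        inline_depth = None  # depth at which an inline tag list was opened
        for i, tok in enumerate(tokens):
            if tok == "(":
                depth += 1
                nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
                if inline_depth is None and not POSITION.match(nxt):
                    inline_depth = depth
            elif tok == ")":
                if inline_depth == depth:
                    inline_depth = None
                depth -= 1
            elif inline_depth is not None and tok != ";":
                problems.append((lineno, tok))
    return problems


GRAMMAR = """LIST Foo = foo ;
LIST Bar = bar ;
SECTION
SELECT Foo IF (1 Bar) ; # Nice rule
SELECT Foo IF (1 ("baz")) ; # Bad rule
"""

for lineno, tag in find_inline_tags(GRAMMAR):
    print(f"line {lineno}: tag literal {tag} used inside a rule")
# prints: line 5: tag literal "baz" used inside a rule
```

In strict mode the compiler would presumably turn each reported item into an error, while the lazy mode would keep accepting them as it does today.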
Fran
On Thu, Nov 6, 2014 at 7:33 PM, Flammie Pirinen <[email protected]> wrote:
> 2014-11-05, Tino Didriksen wrote:
>
> > If we were to make a CG-3 version 1.0 release and time it for
> > NoDaLiDa 2015, what features, changes, additions, etc would you like
> > to see implemented for that?
>
> Personal wish-list:
>
> * XML based format for rule syntax instead of the bracket stuff
> * better ways to ensure consistent tagging in and out of the pipeline
> * gently nudge people to encode towards established standards like
> Google universal POS tags and Stanford dependency tags
> * probably should output CoNLL-U or whatever is easiest nowadays to
> evaluate against published state-of-the-art results
> * sub-readings should define head per case
> * release a standard spec of what is CG-3 full with parsing grammar,
> test suite and all
>
> Some rationale:
>
> while XML is quite horrible, it's already used everywhere, and if we
> need to teach people working on computational linguistics one coding
> format, XML seems like a good enough choice at the moment and in the
> long run.
>
> most of the work with cg I've needed to do so far is struggling with
> formatting of tags and rewriting inputs and outputs to match hfst tags
> and apertium stuffs, this is something that really should be done by
> computer and not by humans and unreadable one-off perl scripts
>
> like with xml, Google universal POS tags and Stanford dependencies are
> the standard of science now, it would not make sense to even publish an
> intrinsic evaluation of your system on anything else right now, unless
> you want to distance yourself from everyone (or the main audience)
> working in the field
>
> from a software engineering pov, a stable 1.0 suggests to me that it'd
> require a standard documented well enough that anyone can reimplement
> it from the docs at will and all that
>
> > One clear goal for 1.0 is binary format backwards compatibility, so
> > that 1.0 grammars will remain loadable. The binary format is pretty
> > much already stable, but so far has no guarantees that old versions
> > remain loadable.
>
> stability is an excellent goal but can also be substituted with
> well-designed upwards and downwards compatibility in the format.
>
> --
> Flammie, computer scientist bachelor + linguist master = computational
> linguist doctor, free software Finnish localiser,
> and more! <http://www.iki.fi/flammie/>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Constraint Grammar" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/constraint-grammar.
> For more options, visit https://groups.google.com/d/optout.
>