On Mon, 2013-03-04 at 11:37 +0100, Andreas Lauser wrote:
> On Monday, 4 March 2013, 07:58:34, Joakim Hove wrote:
> > > std::map<std::string, std::vector<std::string> >
> > 
> > That's not nearly enough.  By a long shot.  First of all, keywords can be
> > repeated, can specify parts of the data in one instance and the rest in
> > another.  Or overwrite existing data.  Second, keyword data for anything
> > other than mere block properties have (very) specialised formats so teasing
> > it into a list of strings requires at least *some* intelligence.  There is
> > also the 'INCLUDE' (and IMPORT) keyword as well as 'PATHS'.
> > 
> > Well; I must agree with Bård here. My experience from working with ECLIPSE
> > files (mostly binary, but also some of the .DATA content) is that the
> > number of special cases, weirdness and surprises is quite large; if you
> > start out with too low a level of abstraction in your data structures you
> > will pay for it ☹
> Hm, okay. Seems like the eclipse files I have are too simplistic. But
>  isn't there a (relatively) simple generic syntax for the file format?

I'm afraid not.  Some (most) keywords consist of (essentially) a single
data record, terminated by a '/' character.  Some keywords have no
associated data.  Other keywords have no terminator.  Some keywords have
an arbitrary number of records (e.g., 'PVTO', 'WELSPECS' or 'COMPDAT')
and are terminated by a null record (line) consisting only of '/'.  Add
to this that the number of data items in many records is generally
dependent upon descriptive metadata in the 'RUNSPEC' section and you end
up with a highly convoluted structure.
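
As a rough illustration of the multi-record case only (the 'WELSPECS'
style, ended by a line holding just '/'), a minimal C++ sketch might look
like the following.  The names Record, readRecord and readRecordList are
hypothetical and not taken from any existing parser, and the sketch
assumes comments have already been removed.  It says nothing about the
RUNSPEC-dependent item counts or about keywords without a terminator.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type: one record is a list of raw, still untyped items.
using Record = std::vector<std::string>;

// Read one record: whitespace-separated items up to a terminating '/'.
// Items in single quotes are kept whole, so a '/' inside a quoted path
// is not mistaken for the terminator.  Returns false if the input ends
// before a '/' is seen.
bool readRecord(std::istream& in, Record& rec)
{
    rec.clear();
    std::string item;
    bool quoted = false;
    char c;
    while (in.get(c)) {
        if (quoted) {
            if (c == '\'') { quoted = false; rec.push_back(item); item.clear(); }
            else           { item += c; }
        } else if (c == '\'') {
            quoted = true;
        } else if (c == '/') {
            if (!item.empty()) rec.push_back(item);
            return true;
        } else if (std::isspace(static_cast<unsigned char>(c))) {
            if (!item.empty()) { rec.push_back(item); item.clear(); }
        } else {
            item += c;
        }
    }
    return false;
}

// Read the body of a multi-record keyword: collect records until an
// empty one (a lone '/') or the end of the input is reached.
std::vector<Record> readRecordList(std::istream& in)
{
    std::vector<Record> records;
    Record rec;
    while (readRecord(in, rec) && !rec.empty())
        records.push_back(rec);
    return records;
}
```

Fed the text following a 'WELSPECS' line, readRecordList would stop at
the lone '/' and leave the stream positioned just after it.  Deciding
which keywords get this treatment, and what the items mean, still needs
a per-keyword table.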

There is really only one general statement you can make about the
structure of keywords and that is that the keyword string matches the
regular expression


Anything else is more or less special case treatment for each keyword.
Happily, many of the keywords in the 'GRID' section have the *same*
special case treatment, so there is at least *some* commonality in the
input file processing.


> My assumption is that if you do not have to deal with seemingly simple
>  stuff like comments, line continuation, include statements, etc. in the
>  code which adds semantics to the syntax, that code will be much easier
>  to write and to understand...

Maybe so, but *every* realistic .DATA file uses INCLUDE (possibly
nested several levels deep) and comments.
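
For what it is worth, a separate preprocessing pass for just those two
features could be sketched roughly as below.  This is a sketch under
stated assumptions, not a complete treatment: comments are taken to run
from "--" to the end of the line, "--" is assumed never to occur inside
a quoted string, and the INCLUDE path is assumed to be single-quoted on
one line.  The names Loader, stripComment and expand are hypothetical,
and the Loader callback stands in for real file access (including any
'PATHS' handling, which is left out).

```cpp
#include <functional>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical callback: maps an include path to that file's text.
using Loader = std::function<std::string(const std::string&)>;

// Drop everything from "--" to the end of the line.
std::string stripComment(const std::string& line)
{
    return line.substr(0, line.find("--"));
}

// Copy 'text' to 'out' without comments, replacing each INCLUDE keyword
// by the recursively expanded contents of the file it names.
void expand(const std::string& text, const Loader& load,
            std::ostream& out, int depth = 0)
{
    if (depth > 16)   // arbitrary guard against circular includes
        throw std::runtime_error("INCLUDE nesting too deep");

    std::istringstream in(text);
    std::string line;
    bool wantPath = false;   // true once an INCLUDE keyword has been seen
    while (std::getline(in, line)) {
        line = stripComment(line);
        if (wantPath) {
            const std::string::size_type b = line.find('\'');
            if (b == std::string::npos) continue;   // nothing quoted yet
            const std::string::size_type e = line.find('\'', b + 1);
            if (e == std::string::npos)
                throw std::runtime_error("unterminated INCLUDE path");
            expand(load(line.substr(b + 1, e - b - 1)), load, out, depth + 1);
            wantPath = false;
        } else {
            // Compare the keyword with trailing whitespace removed.
            const std::string kw =
                line.substr(0, line.find_last_not_of(" \t\r") + 1);
            if (kw == "INCLUDE") wantPath = true;
            else                 out << line << '\n';
        }
    }
}
```

The output of such a pass is a single comment-free stream, which is what
the keyword-level code would then see.  It does not remove the need for
per-keyword handling of the records themselves.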


