Re: [ruby.parslet] Parsing the NCBI Genetic Code Table

Stefan Rohlfing Tue, 09 Aug 2011 02:01:47 -0700

Melissa,

Thanks for your help!


However, after fixing the problems you pointed out, I got stuck again:

https://github.com/bytesource/CodonTableParser/blob/master/parser.rb

and I am realizing that I am more or less relying on trial and error here.
In other words, I still don't know how to translate a document's structure
into the Backus-Naur form that I can then feed to the parser (Parslet).

As I have no background in computer science, I would be interested in any
resources (printed or online) you have found valuable in laying the basis
for building a parser. This question is for everyone, as I am always
interested in different opinions.

Stefan


On Mon, Aug 8, 2011 at 19:49, Melissa Whittington <
[email protected]> wrote:

> Whoops, I meant "The :file rule's repeat is what is describing multiple
> lines."
>
> -mj
>
> On Mon, Aug 8, 2011 at 7:47 AM, Melissa Whittington
> <[email protected]> wrote:
> > Stefan,
> >
> > The reason you're getting that error on the last line is because there
> > will be no newline at the end of the last line, so just switch it to
> > 'newline.maybe'.
> >
> > Your :line rule also does not need the .repeat because there will only
> > be one of either a :codon or a :comment and not more. The :line rule's
> > repeat is what is describing multiple lines.
> >
> > Also, I don't know what "repeat(1)" by itself does, but you probably
> > don't mean that?
> >
> > Don't forget that 'any' only matches one character. You should
> > probably not use 'any' here, either. Your :content and :no_value rules
> > should match everything on a line (sans a possible newline). You could
> > use 'any.repeat' to parse the rest of the line, but it will try to parse
> > *anything*, including newlines and on into the following lines, which is
> > not what you want.
> >
> > So, it'll probably be helpful to be a little more descriptive.
> >
> > Hope that helps you make a little more progress!
> >
> > -mj
> >
> > On Mon, Aug 8, 2011 at 12:21 AM, Stefan Rohlfing
> > <[email protected]> wrote:
> >> Hi,
> >> I am trying to parse the NCBI genetic code table:
> >> https://github.com/bytesource/CodonTableParser/blob/master/data/codons.txt
> >> to extract those lines of each block that contain either "name", "id",
> >> "ncbieaa", or "sncbieaa".
> >> As each line either contains the content I am interested in or text that
> >> can be discarded, I started by first parsing the document on a per-line
> >> basis:
> >> https://github.com/bytesource/CodonTableParser/blob/master/parser.rb
> >> Unfortunately, parsing the file resulted in an error message that tells
> >> me Parslet failed to parse line 233, which is the very last line of the
> >> file:
> >> Expected at least 1 of LINE NEWLINE at line 1 char 1.
> >> `- Expected at least 1 of LINE NEWLINE at line 1 char 1.
> >>    `- Failed to match sequence (LINE NEWLINE) at line 233 char 1.
> >>       `- Failed to match sequence (LF CR?) at line 233 char 1.
> >>          `- Premature end of input at line 233 char 1.
> >> However, apart from knowing where the problem is located, I have
> >> difficulty finding out where my code went wrong.
> >> I already read Parslet's documentation without finding a solution, so
> >> now I hope someone on this list might help me with my problem.
> >> On a side note, I am often not sure when to use 'repeat(1)' instead of
> >> just repeat. I know the latter repeats the rule zero or more times, but
> >> how do I decide when zero is enough? Is there a rule to follow?
> >> Thanks again in advance!
> >> Stefan
> >>
> >
>
