Melissa,

Thanks for your help!
However, after fixing the problems you pointed me to, I got stuck again:

https://github.com/bytesource/CodonTableParser/blob/master/parser.rb

I am realizing that I am more or less relying on trial and error here. In other words, I still lack the knowledge needed to translate a document into its Backus-Naur form, which I can then feed to the parser (Parslet).

As I have no background in computer science, I would be interested in any resources (printed or online) that you have found valuable for laying the groundwork for building a parser.

This question is for everyone, as I am always interested in different opinions.

Stefan

On Mon, Aug 8, 2011 at 19:49, Melissa Whittington <[email protected]> wrote:
> Whoops, I meant "The :file rule's repeat is what is describing multiple
> lines."
>
> -mj
>
> On Mon, Aug 8, 2011 at 7:47 AM, Melissa Whittington
> <[email protected]> wrote:
> > Stefan,
> >
> > The reason you're getting that error on the last line is because there
> > will be no newline at the end of the last line, so just switch it to
> > 'newline.maybe'.
> >
> > Your :line rule also does not need the .repeat because there will only
> > be one of either a :codon or a :comment and not more. The :line rule's
> > repeat is what is describing multiple lines.
> >
> > Also, I don't know what "repeat(1)" by itself does, but you probably
> > don't mean that?
> >
> > Don't forget that 'any' only matches one character. You should probably
> > not use 'any', either. For your :content and :no_value rules, they should
> > be matching everything on a line (sans a possible newline). You could use
> > any.repeat to parse the rest of the line, but it will try to parse
> > *anything* including newlines and on to the next lines which is not
> > what you want.
> >
> > So, it'll probably be helpful to be a little more descriptive.
> >
> > Hope that helps you make a little more progress!
> >
> > -mj
> >
> > On Mon, Aug 8, 2011 at 12:21 AM, Stefan Rohlfing
> > <[email protected]> wrote:
> >> Hi,
> >> I am trying to parse the NCBI genetic code table:
> >> https://github.com/bytesource/CodonTableParser/blob/master/data/codons.txt
> >> to extract those lines of each block that contain either "name", "id",
> >> "ncbieaa", or "sncbieaa".
> >> As each line either contains the content I am interested in or text that
> >> can be discarded, I started by first parsing the document on a per-line
> >> basis:
> >> https://github.com/bytesource/CodonTableParser/blob/master/parser.rb
> >> Unfortunately, parsing the file resulted in an error message that tells
> >> me Parslet failed to parse line 233, which is the very last line of the
> >> file:
> >> Expected at least 1 of LINE NEWLINE at line 1 char 1.
> >> `- Expected at least 1 of LINE NEWLINE at line 1 char 1.
> >> `- Failed to match sequence (LINE NEWLINE) at line 233 char 1.
> >> `- Failed to match sequence (LF CR?) at line 233 char 1.
> >> `- Premature end of input at line 233 char 1.
> >> However, apart from knowing where the problem is located, I have
> >> difficulties finding out where my code went wrong.
> >> I already read Parslet's documentation without finding a solution, so
> >> now I hope someone on this list might help me with my problem.
> >> On a side note, I am often not sure when to use 'repeat(1)' instead of
> >> just repeat. I know the latter repeats the rule zero or more times, but
> >> how do I decide when zero is enough? Is there a rule to follow?
> >> Thanks again in advance!
> >> Stefan
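
The two points from the quoted advice (make the trailing newline optional, and match the rest of a line without running past the newline) can be tried out without Parslet. What follows is only a rough sketch in plain Ruby using the standard library's StringScanner, so it is not Parslet's API and not the code in parser.rb. The informal grammar in the comment, the method names parse_lines and wanted_line?, and the sample text are all made up for illustration; the real layout of codons.txt may differ.

```ruby
require 'strscan'

# Informal grammar being sketched (an assumption for illustration only):
#   file    ::= line*
#   line    ::= content newline?
#   content ::= zero or more characters that are neither CR nor LF
#   newline ::= CR LF | LF | CR

# Split text into lines by following the grammar above.
def parse_lines(text)
  scanner = StringScanner.new(text)
  lines = []
  until scanner.eos?
    # content: stops in front of a newline, and may be empty (blank line)
    content = scanner.scan(/[^\r\n]*/)
    # newline?: optional, so a last line without a newline still parses
    scanner.scan(/\r\n|\n|\r/)
    lines << content
  end
  lines
end

# Hypothetical filter: keep lines that start with one of the four keys.
def wanted_line?(line)
  line.match?(/\A\s*(name|id|ncbieaa|sncbieaa)\b/)
end

# Made-up sample input; note that the last line has no trailing newline.
sample = "header text\n  name \"Example\" ,\n  id 1 ,\nlast line without newline"
selected = parse_lines(sample).select { |line| wanted_line?(line) }
```

On the repeat question: content uses "zero or more" here because a blank line is still a valid line. The same reasoning applies to Parslet's repeat versus repeat(1): requiring at least one match makes sense when an empty match would be meaningless.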
