Even though empty alts are not the OP's problem... this seems like something which ANTLR could preempt with a stern warning at grammar-generation time. Or is there ever a legitimate reason to let it pass?

Sent from my Verizon Wireless BlackBerry
-----Original Message-----
From: "Jim Idle" <j...@temporal-wave.com>
Date: Thu, 25 Feb 2010 07:40:30
Cc: antlr-interest@antlr.org
Subject: Re: [antlr-interest] Bounding the token stream in the C backend

The problem is your lexer (almost 100%). Look for a rule that has an empty alt. This rule will match forever and consume no input:

    FRED : ;

Jim

> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Nick Vlassopoulos
> Sent: Thursday, February 25, 2010 7:31 AM
> To: Christopher L Conway
> Cc: antlr-interest@antlr.org
> Subject: Re: [antlr-interest] Bounding the token stream in the C backend
>
> Hi Christopher,
>
> I am not entirely sure, but you may have run into the same problem as I
> did a while ago. You may want to have a look at the discussion thread
> from back then for some advice:
> http://www.antlr.org/pipermail/antlr-interest/2009-April/034125.html
> Although I used the simple solution Jim suggested, i.e. parsed the
> headers and just used some custom code to parse the rest of the file,
> some of the advice in that thread might be helpful.
>
> Hope this helps,
>
> Nikos
>
> On Thu, Feb 25, 2010 at 6:09 AM, Christopher L Conway
> <ccon...@cs.nyu.edu> wrote:
>
> > I've got a large input file (~39MB) that I'm attempting to parse with
> > an ANTLR3-generated C parser. The parser is using a huge amount of
> > memory (~3.7GB) and seems to start thrashing without making much
> > progress towards termination. I found a thread from earlier this month
> > (http://markmail.org/message/jfngdd2ci6h7qrbo) suggesting the most
> > likely cause of such behavior is a parser bug, but I've stepped
> > through the code and it seems to be lexing just fine. Rather, it seems
> > the problem is that fillBuffer() is tokenizing the whole file in one
> > go; then, the parsing rules slow to a crawl because the token buffer
> > is sitting on all the memory.
> > I wonder if there is a way to change fillBuffer()'s behavior, so
> > that it will only lex some bounded number of tokens before allowing
> > parsing to proceed?
> >
> > Thanks,
> > Chris

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--
You received this message because you are subscribed to the Google Groups "il-antlr-interest" group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
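Jim's diagnosis above (a lexer rule with an empty alt, like `FRED : ;`, matches forever and consumes no input) can be illustrated with a small hand-rolled sketch. This is not the ANTLR runtime — the `tokenize`, `match_fn`, and rule names are hypothetical — but it shows the general failure mode: a rule that can succeed while consuming zero characters never advances the input position, so a naive lexer loop spins forever. The sketch bails out when it detects a zero-length match instead of looping:

```c
#include <assert.h>
#include <stddef.h>

/* A lexer rule, reduced to its essence: given the input and a position,
 * return how many characters it consumed.  A rule with an empty
 * alternative (FRED : ;) can legally return 0. */
typedef size_t (*match_fn)(const char *input, size_t pos);

/* Outer lexer loop: emit tokens until end of input.  If a rule matches
 * zero characters, the position never advances; we detect that and stop
 * rather than looping (and allocating tokens) forever. */
static int tokenize(const char *input, size_t len, match_fn rule,
                    int *stuck /* out: 1 if a zero-length match occurred */)
{
    size_t pos = 0;
    int tokens = 0;
    *stuck = 0;
    while (pos < len) {
        size_t consumed = rule(input, pos);
        if (consumed == 0) {   /* empty alt matched: no progress possible */
            *stuck = 1;
            return tokens;
        }
        pos += consumed;
        tokens++;
    }
    return tokens;
}

/* A rule with an empty alternative: always "matches" zero characters. */
static size_t empty_alt_rule(const char *input, size_t pos)
{
    (void)input; (void)pos;
    return 0;
}

/* A well-behaved rule: consumes one character per token. */
static size_t one_char_rule(const char *input, size_t pos)
{
    (void)input; (void)pos;
    return 1;
}
```

A real generated lexer has no such guard in its hot loop, which is why a stern warning at grammar-generation time, as suggested at the top of this thread, would be the right place to catch it.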
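Christopher's question — lexing only a bounded number of tokens instead of letting fillBuffer() tokenize the whole 39MB file up front — amounts to replacing the eager token buffer with an on-demand one. The sketch below is not the ANTLR3 C runtime API; `BoundedStream`, `bs_lt`, `bs_consume`, and the fixed `WINDOW` are hypothetical names for illustration, and a fixed window only works if the parser's lookahead is actually bounded (backtracking or rewriting parsers may need the full buffer). The idea: keep a small ring buffer, pull a token from the lexer only when the parser looks past what is buffered, and recycle slots as tokens are consumed, so memory stays O(window) rather than O(file):

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW 4              /* assumed lookahead bound, for illustration */

typedef struct { int value; } Token;          /* stand-in for a real token */
typedef Token (*lex_fn)(void *state);         /* produces the next token   */

/* A bounded token stream: at most WINDOW tokens live at any moment. */
typedef struct {
    Token buf[WINDOW];
    size_t head;              /* ring index of LT(1) */
    size_t count;             /* tokens currently buffered */
    lex_fn lex;
    void *lex_state;
} BoundedStream;

static void bs_init(BoundedStream *s, lex_fn lex, void *state)
{
    s->head = 0;
    s->count = 0;
    s->lex = lex;
    s->lex_state = state;
}

/* LT(k): lex lazily until k tokens are buffered, then return the k-th.
 * This is where fillBuffer()-style eager lexing is avoided. */
static Token bs_lt(BoundedStream *s, size_t k)
{
    assert(k >= 1 && k <= WINDOW);
    while (s->count < k) {
        s->buf[(s->head + s->count) % WINDOW] = s->lex(s->lex_state);
        s->count++;
    }
    return s->buf[(s->head + k - 1) % WINDOW];
}

/* Consume LT(1): its ring slot is recycled, so memory never grows. */
static void bs_consume(BoundedStream *s)
{
    if (s->count == 0)
        (void)bs_lt(s, 1);    /* ensure there is a token to discard */
    s->head = (s->head + 1) % WINDOW;
    s->count--;
}
```

The simple workaround Nikos mentions (parse the headers with ANTLR, handle the bulk of the file with custom code) sidesteps the same problem without touching the runtime, which is why it was the pragmatic choice in the earlier thread.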