Hi,

I'm using some lexer rules such as 'aaaa', 'bbb', ... and the generated code became too big. So I removed some of the lexer rules and moved them into grammar (parser) rules.
Ex:

    TAG_START_OPEN 'link' attrUriId='uri' ATTR_EQ attrValueUri=ATTR_VALUE
    ( (attrRoleId='role' ATTR_EQ attrValueRole=ATTR_VALUE)
    | (attrStartId='start' ATTR_EQ attrValueStart=ATTR_VALUE)
    | (attrEndId='end' ATTR_EQ attrValueEnd=ATTR_VALUE)
    )*
    TAG_EMPTY_CLOSE

Does anyone have any tips?

Thanks for the help,
Marcelo

On Wed, Oct 21, 2009 at 5:00 PM, <[email protected]> wrote:

> Send antlr-interest mailing list submissions to
>        [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://www.antlr.org/mailman/listinfo/antlr-interest
> or, via email, send a message with subject or body 'help' to
>        [email protected]
>
> You can reach the person managing the list at
>        [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of antlr-interest digest..."
>
>
> Today's Topics:
>
>   1. Re: Status of the CSharp3 target and my C# ports of ANTLR and
>      StringTemplate (Robert van der Hulst)
>   2. Re: Bytes Limit (David-Sarah Hopwood)
>   3. Re: Bytes Limit (Jim Idle)
>   4. Re: Status of the CSharp3 target and my C# ports of ANTLR and
>      StringTemplate (Jim Idle)
>   5. Re: Using multiple grammars with a single parser (Jim Idle)
>   6. [Antlr3 grammar] how to specify alpha token, numeric token
>      and mix of both (Hieu Phung)
>   7. Re: [Antlr3 grammar] how to specify alpha token, numeric
>      token and mix of both (Kaleb Pederson)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 20 Oct 2009 22:27:43 +0200
> From: Robert van der Hulst <[email protected]>
> Subject: Re: [antlr-interest] Status of the CSharp3 target and my C#
>        ports of ANTLR and StringTemplate
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> An HTML attachment was scrubbed...
> URL:
> http://www.antlr.org/pipermail/antlr-interest/attachments/20091020/65d74ab9/attachment-0001.html
>
> ------------------------------
>
> Message: 2
> Date: Wed, 21 Oct 2009 00:24:04 +0100
> From: David-Sarah Hopwood <[email protected]>
> Subject: Re: [antlr-interest] Bytes Limit
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=UTF-8
>
> Marcelo Nichele wrote:
> > Hi,
> >
> > I'm getting started in ANTLR and my grammar generated the
> > specialStateTransition method too big.
> >
> > The error message is *The code of method specialStateTransition(int,
> > IntStream) is exceeding the 65535 bytes limit.*
> >
> > The method signature is:
> > *public int specialStateTransition(int s, IntStream _input) throws
> > NoViableAltException*
>
> Workaround:
>
> Look at the code for that method in the generated parser source
> (note that there may be multiple DFA inner classes each with a
> specialStateTransition method; the full error message should say which
> one, or just look at the largest such methods).
> Probably the code for specialStateTransition will include code copied
> from predicates in your grammar, duplicated many times. Try to simplify
> the code that is being duplicated.
>
> For example, you could declare a boolean variable in the parser class
> using @parser::members, set it to the predicate condition in an @init
> block of the relevant rule(s), and reference that variable in place of
> the original condition. (Be careful that you aren't changing the behaviour
> of the rule by moving the predicate evaluation to the @init block.)
>
>
> Suggested longer-term improvement:
>
> The size of the generated specialStateTransition methods would be
> considerably reduced if ANTLR were to automatically create temporary
> variables for predicate conditions, rather than duplicating their code.
> Since the DFA object is an instance of an inner class of the parser,
> the workaround above requires the Java compiler to generate references
> to outer class variables, which is more code than would be needed if
> ANTLR were to create such temporaries as local variables of
> specialStateTransition. Since there is no guarantee as to how often
> predicates are evaluated, that change would not affect correctness.
>
> --
> David-Sarah Hopwood  http://davidsarah.livejournal.com
>
> ------------------------------
>
> Message: 3
> Date: Wed, 21 Oct 2009 13:27:29 +0530
> From: "Jim Idle" <[email protected]>
> Subject: Re: [antlr-interest] Bytes Limit
> To: "[email protected]" <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> This is also quite often caused by a poorly specified grammar (especially
> lexers) causing lots of lookahead and states etc. A good way to determine
> this is to find the DFA in question in the generated source code and see
> what decisions/rules it is handling. This should help you pin down where
> things are getting so big and then you can look at the why.
>
> Jim
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Horst Dehmer
> Sent: Tuesday, October 20, 2009 12:13 PM
> To: Marcelo Nichele; [email protected]
> Subject: Re: [antlr-interest] Bytes Limit
>
> Hello Marcelo,
>
> I'm afraid you have hit a hard limit of Java binary class files. What does
> your grammar look like? Is it unusually big?
> Also have a look at
> http://groups.google.com/group/comp.lang.java.machine/browse_thread/thread/b0cf268515f1ef55
>
> Good luck,
> Horst
>
> On 20.10.09 06:59, "Marcelo Nichele" <[email protected]> wrote:
>
> The code of method specialStateTransition(int, IntStream) is exceeding the
> 65535 bytes limit
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://www.antlr.org/pipermail/antlr-interest/attachments/20091021/733c4727/attachment-0001.html
>
> ------------------------------
>
> Message: 4
> Date: Wed, 21 Oct 2009 13:47:59 +0530
> From: "Jim Idle" <[email protected]>
> Subject: Re: [antlr-interest] Status of the CSharp3 target and my C#
>        ports of ANTLR and StringTemplate
> To: "[email protected]" <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> I think we can expose the public/private stuff easily. I will talk to Ter
> in case there is some reason it is not right now, but I don't think so as
> the ability to do this is part of the current v2 grammar and I just coded it
> in my v3 grammar. We should probably move this to ANTLR dev list.
>
> Jim
>
> From: Sam Harwell [mailto:[email protected]]
> Sent: Tuesday, October 20, 2009 10:26 PM
> To: Jim Idle; [email protected]; [email protected]
> Subject: RE: [antlr-interest] Status of the CSharp3 target and my C#
>        ports of ANTLR and StringTemplate
>
> The code should be nearly the same as that of the CSharp2 target. Here is
> the C# port of the CSharp2 target and CSharp3 target so you can see how the
> CSharp3 one differs. Clearly it should be easy to make it work.
>
> public class CSharp2Target : Target
> {
>     public override string EncodeIntAsCharEscape(int v)
>     {
>         return "\\x" + v.ToString("X");
>     }
> }
>
> public class CSharp3Target : Target
> {
>     public override string EncodeIntAsCharEscape(int v)
>     {
>         return "\\x" + v.ToString("X");
>     }
>
>     public override string GetTarget64BitStringFromValue(ulong word)
>     {
>         return "0x" + word.ToString("X");
>     }
> }
>
> Something to note: I'm not sure the Java version of the tool exposes the
> property required to support marking rules as public/protected/private.
> We'll have to check that out too, but it should be straightforward.
>
> Sam
>
> From: Jim Idle [mailto:[email protected]]
> Sent: Tuesday, October 20, 2009 12:08 AM
> To: Sam Harwell; [email protected]; [email protected]
> Subject: RE: [antlr-interest] Status of the CSharp3 target and my C#
>        ports of ANTLR and StringTemplate
>
> OK - well we can add that easily enough :-) Why don't we try it?
>
> Jim
>
> From: Sam Harwell [mailto:[email protected]]
> Sent: Tuesday, October 20, 2009 7:46 AM
> To: Jim Idle; [email protected]; [email protected]
> Subject: RE: [antlr-interest] Status of the CSharp3 target and my C#
>        ports of ANTLR and StringTemplate
>
> I think the only thing missing is the Java class required for the Java
> version to know the CSharp3 target exists.
>
> Sam
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Jim Idle
> Sent: Tuesday, October 20, 2009 2:51 AM
> To: [email protected]; [email protected]
> Subject: Re: [antlr-interest] Status of the CSharp3 target and my C#
>        ports of ANTLR and StringTemplate
>
> Top posting for Sam's benefit ;-)
>
> Not being able to use the CSharp3 target from the standard version of the
> tool is going to be a turn off for many I think :-( What is it that your
> port of the tool has that the standard version does not? I know you have
> posted some of that, but perhaps we can summarize this and see if such
> things can be absorbed into the standard Java tool? Nobody minds you having
> your own version of anything because it is open source, but most will want
> to use the 'official' Java version of the tool even if they are targeting C#.
>
> Thanks for the updates,
>
> Jim
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Sam Harwell
> Sent: Monday, October 19, 2009 1:05 AM
> To: [email protected]; [email protected]
> Subject: [antlr-interest] Status of the CSharp3 target and my C# ports of
>        ANTLR and StringTemplate
>
> Hi everyone,
>
> Here's a status update that I know many people are asking for. For each
> portion, I'll talk about the status of the code in Perforce. At the end,
> I'll talk about the status of the posted binaries.
>
> Basic Status
>
> StringTemplate and the ANTLR Tool: Up-to-date with the Java version for all
> targets as of August 4, 2009, which covers all of the changes made earlier
> in the year and over the summer.
>
> CSharp3 Target: working and extensively used in the ANTLR Tool,
> StringTemplate, and the commercial projects I use ANTLR for. I haven't
> tested the -profile and -debug modes because I don't use them, however the
> templates should be "close to working". Currently, the CSharp3 target can
> only be used when generating grammars from the C# port of the tool.
>
> Design Changes
>
> 1. Rather than package the target templates as resources in the
>    tool's executable, I've chosen a flat file layout. That way, the
>    templates for a target can be updated without recompiling the tool. The
>    targets themselves are also implemented as individual DLL's.
>
> 2. The CSharp3 target declares rules as private methods by default.
>    Rules can be made public by simply marking them as such in the grammar:
>    "public compile_unit : declaration*;" I have updated the Java target's
>    code generation to support this as well, but it's not checked in.
>
> 3. StringTemplate has code for a high speed dynamically compiled
>    interpreter. By default, the build doesn't enable this mode, but when
>    it's turned on the output appears to work correctly. I need to do
>    another round of tests, but at this point the C# ports of the ANTLR Tool
>    and StringTemplate should be significantly faster than the Java version.
>    We've hit a brick wall preventing further optimization without rewriting
>    ST, but the work on STv4 should give another order of magnitude
>    improvement in template rendering performance.
>
> Things Holding Me Up
>
> 1. I haven't finalized the way I'm going to do assembly versioning,
>    although I think I've got that worked out now. I'll send a separate mail
>    to the list regarding this.
>
> 2. StringTemplate is only tested in regards to code generation for
>    the ANTLR tool. In particular, its ability to locate templates in
>    resources or on the file system is not documented and may or may not
>    behave as people expect.
>
> 3. I'm still making periodic changes to the API as I finalize things,
>    and breaking changes in production code aren't good. I don't want to
>    suggest replacing the CSharp2 target until the CSharp3 target is more
>    tested by other people.
>
> Things I want to do, but not really holding up the builds
>
> I really want to package a clean integration of ANTLR+CSharp3 for MSBuild.
> We need this. This would include at least MSBuild targets file and templates
> for adding grammars to a project. Unfortunately, there are many issues I
> still need to resolve for this to be a reality, most of which have answers
> in shades of gray.
>
> Status of the Posted Build
>
> The build available for download was uploaded on fairly short notice.
> Mistakes (by me) included not having the assembly version set correctly and
> not posting the source code from the build with the binaries. I've been
> trying to wrap some of these things up before posting another build.
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://www.antlr.org/pipermail/antlr-interest/attachments/20091021/45738851/attachment-0001.html
>
> ------------------------------
>
> Message: 5
> Date: Wed, 21 Oct 2009 13:44:25 +0530
> From: "Jim Idle" <[email protected]>
> Subject: Re: [antlr-interest] Using multiple grammars with a single
>        parser
> To: "[email protected]" <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> This has been covered in previous discussions (use the search), but
> basically you create tokens that look for dates using code that is sensitive
> to the locale, which leaves you with a single lexer with a code based match
> rather than pattern based match.
>
> DATE : '#' // Assuming that you delimit the dates somehow
>        {
>            setText(myDateFunctionThatReturnsString());
>        }
>        '#'
>      ;
>
> There are other approaches than returning the string but you should get the
> picture?
>
> Jim
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Parambir Singh
> Sent: Tuesday, October 20, 2009 7:06 PM
> To: [email protected]
> Subject: [antlr-interest] Using multiple grammars with a single parser
>
> Hi
>
> I am working on a project where I want to parse input in different locales
> (e.g. english, french & german dates). I don't want to create multiple
> parsers, since the semantics of the grammar don't change between locales. So
> probably I'll need multiple lexers and a single parser. Moreover, I want to
> specify a locale to the parser and the input should be matched against only
> that particular locale (e.g. german dates should be invalid in english
> locale).
>
> What would be the best approach to construct such a parser using ANTLR. I
> don't have much experience with ANTLR but I read about grammar inheritance
> and think it could be useful here.
>
> Thanks
> Param
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://www.antlr.org/pipermail/antlr-interest/attachments/20091021/f2d7d8df/attachment-0001.html
>
> ------------------------------
>
> Message: 6
> Date: Wed, 21 Oct 2009 18:23:47 +0800
> From: Hieu Phung <[email protected]>
> Subject: [antlr-interest] [Antlr3 grammar] how to specify alpha token,
>        numeric token and mix of both
> To: [email protected]
> Message-ID:
>        <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi all,
>
> My grammar has 3 kinds of tokens:
> 1) number: contain numeric character
> 2) alpha: contain alphabetic character;
> 3) mix: contain number and alpha and hyphen, full stop or space
>
> For example:
> 1/VEC305/03MAR/PTY
> => in the above input data, 03MAR should be interpreted as a number of
> length 2 followed by alpha of length 3. But VEC305 is a mix of length 6.
>
> If I define grammar like below:
>
> NUMBER : ('0'..'9')+ ;
> ALPHA : ('a'..'z'|'A'..'Z')+;
> MIX : (NUMBER | ALPHA | OTHER)+;
> fragment OTHER : (' ' | '-' | '.')+;
> SLANT : '/';
>
> Antlr will return me VEC305 and 03MAR as two MIX tokens. Is there any way
> to define tokens such that Antlr will return me number, slant, mix, slant,
> number, alpha, slant, alpha for the input "1/VEC305/03MAR/PTY" ?
>
> Thank you very much for your suggestions.
>
> Regards,
> Helen
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://www.antlr.org/pipermail/antlr-interest/attachments/20091021/c2aac0a2/attachment-0001.html
>
> ------------------------------
>
> Message: 7
> Date: Wed, 21 Oct 2009 08:36:07 -0700
> From: Kaleb Pederson <[email protected]>
> Subject: Re: [antlr-interest] [Antlr3 grammar] how to specify alpha
>        token, numeric token and mix of both
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: Text/Plain; charset="us-ascii"
>
> On Wednesday 21 October 2009 03:23:47 am Hieu Phung wrote:
> > My grammar has 3 kinds of tokens:
> > 1) number: contain numeric character
> > 2) alpha: contain alphabetic character;
> > 3) mix: contain number and alpha and hyphen, full stop or space
> >
> > For example:
> > 1/VEC305/03MAR/PTY
> > => in the above input data, 03MAR should be interpreted as a number of
> > length 2 followed by alpha of length 3. But VEC305 is a mix of length 6.
>
> Hieu,
>
> How do you know that VEC305 is a mix of length six? It sure looks like an
> alpha followed by a number to me... so what makes it special or different
> than 03MAR?
>
> --
> Kaleb Pederson
>
> Twitter - http://twitter.com/kalebpederson
> Blog - http://kalebpederson.com
>
> ------------------------------
>
> _______________________________________________
> antlr-interest mailing list
> [email protected]
> http://www.antlr.org/mailman/listinfo/antlr-interest
>
> End of antlr-interest Digest, Vol 59, Issue 22
> **********************************************

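For reference, the workaround David-Sarah Hopwood describes in Message 2 of the quoted digest (cache the predicate condition in a parser member, set it in @init, and test the member in the predicate) might look roughly like the ANTLR 3 sketch below. This is only an illustration under assumptions: the grammar name, the rule attribute, the token ATTR_NAME, and the members strictMode, schemaVersion and extendedAttrsAllowed are all made up. Only the token names ATTR_EQ and ATTR_VALUE come from the rule quoted at the top of this message, and their definitions here are guesses.

```antlr
grammar PredicateCacheSketch;   // hypothetical name; the file would be PredicateCacheSketch.g

@parser::members {
    // Hypothetical inputs to the predicate; stand-ins for whatever the
    // original, longer condition actually tests.
    boolean strictMode = false;
    int schemaVersion = 2;

    // Cached predicate result, referenced in place of the full expression.
    boolean extendedAttrsAllowed;
}

// Both alternatives begin with the same tokens, so ANTLR has to consult the
// predicate to choose between them; the predicate text is what gets copied
// into the generated prediction code.
attribute
@init {
    // Evaluated once on rule entry. As the quoted message warns, this only
    // preserves behaviour if the condition has the same value here as at
    // every point where ANTLR evaluates the predicate.
    extendedAttrsAllowed = !strictMode && schemaVersion >= 2;
}
    : {extendedAttrsAllowed}? ATTR_NAME ATTR_EQ ATTR_VALUE   // extended form
    | ATTR_NAME ATTR_EQ ATTR_VALUE                           // basic form
    ;

// Assumed token definitions, not taken from the original grammar.
ATTR_NAME  : ('a'..'z' | 'A'..'Z')+ ;
ATTR_EQ    : '=' ;
ATTR_VALUE : '"' (~'"')* '"' ;
WS         : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Whether this brings specialStateTransition under the 65535-byte limit depends on how often the predicate is duplicated in the generated parser, so it is worth comparing the size of that method before and after the change.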