[The Java Posse] Re: Java officially lags C

mbien Wed, 02 Sep 2009 16:52:18 -0700

Hi,
or just do it like html does it...

@DSL(lang="Brainfuck") {
/*
++++++++++[>+++++++>++++++++++>+++>
+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++
++++++++.>.+++.------.--------.>+.>.
*/
}


javac ignores the comment, so preprocessors like Project Lombok
(projectlombok.org) could do whatever they want with it (convert it to
a string, generate code, etc.).
(Of course, the escaping issue remains: the DSL text cannot contain the
comment terminator itself.)
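
A rough sketch of what such a preprocessor step might look like. The class and method names are hypothetical, and this is not how Project Lombok works internally; it only pulls the body of the block comment that follows an @DSL(lang="...") marker out of raw source text and turns it into a Java string literal:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of javac or Project Lombok.
final class DslCommentExtractor {
    // Matches: @DSL(lang="X") { /* body */ }
    private static final Pattern BLOCK = Pattern.compile(
        "@DSL\\(lang=\"([^\"]*)\"\\)\\s*\\{\\s*/\\*(.*?)\\*/\\s*\\}",
        Pattern.DOTALL);

    /** Returns {lang, body} for the first DSL block, or null if none is found. */
    static String[] extract(String source) {
        Matcher m = BLOCK.matcher(source);
        return m.find() ? new String[] { m.group(1), m.group(2).trim() } : null;
    }

    /** Escapes the body so it can be pasted into Java source as a string literal. */
    static String toJavaLiteral(String body) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : body.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

The non-greedy match stops at the first comment terminator, so a DSL body that contains one gets cut short. That is the escaping issue in practice.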

regards,
- - -
http://michael-bien.com

On Sep 2, 10:22 pm, Reinier Zwitserloot <reini...@gmail.com> wrote:
> Casper: I don't think you grokked my point.
>
> I'm saying it's impossible to build any java, vanilla or otherwise,
> that can handle this. For the reasons I stated: You'd have to flip the
> architecture upside down and resolve 'DSL' properly midway through
> tokenizing it. Be aware that this automatically means that any error
> caused by the DSL provider *HAS TO* stop the parsing right there on
> the spot, no further error reporting for anything that follows the DSL
> block. Tricks IDEs do to make a class file with whatever methods have
> syntax errors in them replaced with dummies that throw exceptions
> would be impossible.
>
> You'd be giving up an awful lot.
>
> Don't get me wrong, I love the idea, but I haven't seen a workable
> proposal yet. I'm leaning towards the notion that it's impossible to
> get right. Fan tries to use a sufficiently arcane separator (bar-
> angle, so <| special code goes here |>), but if java uses the same
> thing, then you can't embed fan in java. That's not a solution.
>
> Here's a simplistic approach to something that might actually work:
>
> 1. identifier resolution is decoupled from the rest of the source file
> for parsing. In other words, the parser will parse all import
> statements, resolve them, and only then continue on its way.
>
> 2. blocks start with a hash, followed by a type identifier. This type
> identifier is resolved only according to import statements; to make
> this smooth, the definitions for how to handle these blocks MUST
> ALWAYS be top level members, no exceptions. Now the parser does not
> have to consider inner classes and such to resolve the name; the
> process of checking the current package and all import statements
> suffices.
>
> 3. The tokenizer will remember the character that followed the token
> (e.g. the non-identifier character that immediately followed the last
> identifier character in the DSL name, which can be a space, a quote, a
> brace, whatever), and restuffs this back into the source view. The
> tokenizer then hands the raw source (as a Reader or some such) off to
> the .tokenize() method of the provider. That method MUST return an
> object (of any type), and must have consumed the input exactly up to
> (and including) the closing element of the DSL block.
>
> 4. During compilation, the DSL block (which is an expression which can
> have an arbitrary type, including void) is translated into a pure java
> expression by calling the .parse() method of the DSL provider.
>
> 5. Exceptions during the tokenize phase result in the immediate end of
> parsing that java source file, as javac will not know where to
> continue. Exceptions during the parse method aren't nearly as drastic;
> it just means there's an error in the DSL block and the block's
> expression is of an unknown type - certainly not rocket science
> compared to the advanced error recovery employed in many IDEs.
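
Steps 2, 3 and 5 above could be sketched roughly as follows. Everything here is hypothetical: the registry map stands in for resolving the name via import statements, BlockTokenizer stands in for the provider's tokenize step, and SlashBlock is a toy provider. The only real API used is java.io.PushbackReader, which gives the "restuff the character" behaviour.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the provider's tokenize step.
interface BlockTokenizer {
    Object tokenize(PushbackReader in) throws IOException;
}

final class HostScanner {
    // Stands in for "resolve the type identifier via import statements".
    private final Map<String, BlockTokenizer> imported = new HashMap<>();

    void register(String name, BlockTokenizer t) { imported.put(name, t); }

    /** Call with the reader positioned just after a '#'. */
    Object handOff(PushbackReader in) throws IOException {
        StringBuilder name = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && Character.isJavaIdentifierPart(c)) {
            name.append((char) c);
        }
        if (c != -1) in.unread(c);   // step 3: restuff the character after the name
        BlockTokenizer t = imported.get(name.toString());
        if (t == null) {
            throw new IllegalStateException("unknown DSL: " + name);
        }
        // Step 5: anything thrown here ends parsing of the file, because
        // the host no longer knows where the block stops.
        return t.tokenize(in);
    }
}

// Toy provider: a block delimited by slashes, e.g. " /abc/".
final class SlashBlock implements BlockTokenizer {
    public Object tokenize(PushbackReader in) throws IOException {
        int c;
        while ((c = in.read()) != -1 && Character.isWhitespace(c)) { /* skip */ }
        if (c != '/') throw new IOException("expected '/'");
        StringBuilder sb = new StringBuilder();
        while ((c = in.read()) != -1 && c != '/') sb.append((char) c);
        if (c == -1) throw new IOException("unterminated block");
        return sb.toString();   // closing '/' consumed, nothing after it
    }
}
```

After handOff returns, the host simply keeps reading from the same reader, which is why the provider has to stop exactly after its closing element.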
>
> public interface DSLProvider<T> {
>     public T tokenize(SourceReader reader);
>     public String parse(T token, Context c);
> }
>
> Some open issues are: What should 'parse' return? There's an argument
> to be made for 'bytecode', 'raw java source as a String', and 'a
> JCExpression object' (from javac's internal AST classes). Each has its
> advantages and disadvantages.
>
> Context is some useful construct that allows access to variables legal
> in the current scope, the filer (for looking up types), and similar
> things. A lot of this API already exists (annotation processor API).
>
> Such a system could rather easily support a wide variety of stuff you
> may wish to inject into java source files:
>
>  - String literals
>  - Regexp literals - the compiled regexp tree would be stored into the
> class file.
>  - XML literals
>  - multiline and/or raw string literals.
>  - python - even including python's whitespace based delimiting as the
> mechanism to delimit the block ITSELF, if you think that is a good
> idea.
>  - Clojure, LISP, and other lisp dialects.
>  - just about every programming language in existence (incl. ruby,
> Javascript, C, C#, C++, fortran, ada, and, sure, why not - APL).
>
> The documentation should stress that the .tokenize() method really
> should try its very best to return and not throw an exception.
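
One way a provider might follow that advice, sketched with hypothetical names throughout (SourceCursor is a simplified stand-in for the SourceReader above): instead of throwing on malformed input, tokenize returns a token that carries an error message, so the host can carry on and the problem can be reported during the parse phase.

```java
// Hypothetical, simplified stand-in for the SourceReader in the interface.
final class SourceCursor {
    private final String src;
    private int pos;
    SourceCursor(String src) { this.src = src; }
    boolean startsWith(String s) { return src.startsWith(s, pos); }
    boolean atEnd() { return pos >= src.length(); }
    char next() { return src.charAt(pos++); }
    void skip(int n) { pos += n; }
    void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
    String rest() { return src.substring(pos); }
}

// Token that records a failure instead of throwing.
final class RawStringToken {
    final String text;
    final String error;   // null when the block was well formed
    RawStringToken(String text, String error) { this.text = text; this.error = error; }
}

final class LongStringTokenizer {
    private static final String QUOTES = "\"\"\"";

    RawStringToken tokenize(SourceCursor in) {
        in.skipWhitespace();
        if (!in.startsWith(QUOTES)) {
            // Only whitespace has been consumed, so the host can resume here.
            return new RawStringToken("", "expected opening triple quote");
        }
        in.skip(QUOTES.length());
        StringBuilder sb = new StringBuilder();
        while (!in.atEnd() && !in.startsWith(QUOTES)) sb.append(in.next());
        if (in.atEnd()) {
            return new RawStringToken(sb.toString(), "unterminated long string");
        }
        in.skip(QUOTES.length());   // consume the closing delimiter as well
        return new RawStringToken(sb.toString(), null);
    }
}
```

In the unterminated case the rest of the file is still swallowed, so this only softens the failure; it cannot tell the host where the block was meant to end.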
>
> hypothetical source:
>
> int x = #python:
>     5 + 5
> int thisIsJavaAgain;
>
> String longText = #long """This is a long string where \backslashes need
> not be escaped""" + "this is parsed by javac again";
> Pattern p = #regexp /[abc]d\s+(\d*)/i;
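
For the #regexp case, and taking the "raw java source as a String" option for what parse returns, the translation might look roughly like this. The class name is hypothetical, only the i, m and s flags are handled, and it emits a plain Pattern.compile call rather than storing a compiled tree in the class file. java.util.regex.Pattern is the only real API involved.

```java
import java.util.regex.Pattern;

// Hypothetical translator for a literal such as /[abc]d\s+(\d*)/i
final class RegexpLiteral {
    /** Turns "/body/flags" into Java source for a Pattern.compile call. */
    static String toJavaSource(String literal) {
        int end = literal.lastIndexOf('/');
        if (!literal.startsWith("/") || end == 0) {
            throw new IllegalArgumentException("not a /regexp/ literal: " + literal);
        }
        String body = literal.substring(1, end);
        Pattern.compile(body);   // fail early if the regexp itself is invalid
        StringBuilder flags = new StringBuilder();
        for (char f : literal.substring(end + 1).toCharArray()) {
            if (flags.length() > 0) flags.append(" | ");
            switch (f) {
                case 'i': flags.append("java.util.regex.Pattern.CASE_INSENSITIVE"); break;
                case 'm': flags.append("java.util.regex.Pattern.MULTILINE"); break;
                case 's': flags.append("java.util.regex.Pattern.DOTALL"); break;
                default: throw new IllegalArgumentException("unknown flag: " + f);
            }
        }
        // Backslashes and quotes must be escaped for the generated string literal.
        String escaped = body.replace("\\", "\\\\").replace("\"", "\\\"");
        return "java.util.regex.Pattern.compile(\"" + escaped + "\", "
            + (flags.length() == 0 ? "0" : flags.toString()) + ")";
    }
}
```

Finding the closing slash with lastIndexOf only works because the literal has already been cut out of the source; locating that end in the first place is the tokenize step's job.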
>
> Presuming that the context object is sufficiently advanced, this
> should also be possible, especially if you add a way to parse a java
> snippet in that context:
>
> private final Comparator<Integer> absoluteComparator = #closure
> Comparator(Integer a, Integer b) { return Integer.compare(Math.abs(a),
> Math.abs(b)); };
>
> Of course, trying to include java inside such a block has the same
> issue as javac's original problem: How does the closure DSL provider
> know where the closure ends without being as complicated as javac's
> tokenizer? Theoretically java itself could be implemented with this
> scheme, and you could then start the snippet parser at the 'return'
> statement, getting a tokenized object back, which, during your parse
> phase, you can get parsed by calling on javac's own parse method.
>
> The central point is this: You have to split tokenizing and parsing.
> This is yet another instance where fan tries to take the easy way out.
>
> On Sep 2, 7:39 pm, Casper Bang <casper.b...@gmail.com> wrote:
>
> > > tell me how the compiler could possibly sort this out? The only way is
> > > for the compiler to hand off the entire process of TOKENIZING this
> > > stream to the DSL provider for 'longString', which is an entirely
> > > different architecture - right now all java parsers do the fairly
> > > usual thing of tokenizing the whole deal, then tree-izing the whole
> > > thing, and only then starting the process of resolving 'DSL' into
> > > "java.lang.DSL" or whatever you had in mind.
>
> > Oh sure, I should have mentioned explicitly that this obviously won't
> > work with a vanilla javac. Anyway, here's the original post I was
> > referring to: http://www.jroller.com/scolebourne/entry/enhancing_java_multi_lingual...
>
> > > You'd have to create very specific rules about how the compiler can
> > > find the end of the DSL string. I've thought about this and have not
> > > been able to come up with a particularly sensible rule. The only one I
> > > can think of is to stick to C-esque rules: strings are things in
> > > double or single quotes, and use backslash internally for escapes, and
> > > braces are supposed to be matched. However, these restrictions already
> > > remove most other languages: You can't put python in there (multi-line
> > > strings will screw up java's parser), you can't put regular
> > > expressions in there (no rule enforcing matched quotes or braces). You
> > > can't put XML in there (no rule enforcing matched braces or quotes).
> > > No go.
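
The C-esque rule described above could be sketched as follows. This is a hypothetical helper that assumes the block starts at an opening brace; it also shows why the rule rejects a regular expression containing an unmatched brace or quote.

```java
// Hypothetical helper illustrating the "C-esque" end-of-block rule.
final class CStyleBlockEnd {
    /**
     * Returns the index of the '}' matching the '{' at position open,
     * or -1 if the braces or quotes never balance.
     */
    static int find(String src, int open) {
        int depth = 0;
        char quote = 0;   // 0 when outside a string, else the active quote char
        for (int i = open; i < src.length(); i++) {
            char c = src.charAt(i);
            if (quote != 0) {
                if (c == '\\') i++;              // skip the escaped character
                else if (c == quote) quote = 0;  // string ends
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

A block such as { /a\{b"/ } never balances under this rule, because the brace and the quote inside the regular expression are counted as if they were Java's own.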
>
> > Well, it's not a trivial issue, no, but this is how it works in
> > Fan: http://fandev.org/sidewalk/topic/438
>
> > /Casper