Re: [GSoC 2015 - JENA-491] JavaCC with master.jj
Hi,

I've studied the Jena tests. It looks like the syntax tests are generated by syn.sh, whereas the execution tests are not generated by scripts but written by hand, one by one. Is that right?

Since I have enough time, I'd like to go directly to syn.sh and syn-arq.sh to generate the tests for constructing quads. Thanks!

regards,
Qihong

On Tue, Jun 16, 2015 at 9:24 PM, Andy Seaborne a...@apache.org wrote:
> On 16/06/15 09:06, Qihong Lin wrote:
> > Hi,
> >
> > Thanks! I just marked GRAPH mandatory, and it worked without producing the warnings. I'll look into the details later.
> >
> > By the way, if the new parser is ready, how do I test it? I mean, where should I put the unit test code and the query strings to be tested? I'm confused by org.apache.jena.sparql.junit.QueryTest (is that what I need to deal with?). Is there any guideline or documentation for the ARQ tests?
> >
> > regards,
> > Qihong
>
> Most testing of queries is by externally defined manifest files (manifest.ttl):
>
>     jena-arq/testing/ARQ
>
> For now, keep it clean and start a new directory, jena-arq/testing/ARQ/ConstructQuads, with both syntax and execution tests. This is just to keep everything in one place for now.
>
> See jena-arq/testing/ARQ/Syntax/Syntax-ARQ/manifest.ttl and jena-arq/testing/ARQ/Construct/manifest.ttl.
>
> A manifest can have syntax and execution tests - it so happens that they are in separate places in the current test suite, which was input to the working group.
>
> A syntax test looks like:
>
>     :test_1 rdf:type mfx:PositiveSyntaxTestARQ ;
>         dawgt:approval dawgt:NotClassified ;
>         mf:name "syntax-select-expr-01.arq" ;
>         mf:action <syntax-select-expr-01.arq> ;.
>
> to parse syntax-select-expr-01.arq, expecting it to be good, and an execution test is an action and a result:
>
>     :someExecutionTest rdf:type mfx:TestQuery ;
>         mf:name "Construct Quads 1" ;
>         mf:action [ qt:query <q-construct-1.rq> ;
>                     qt:data <data-1.ttl> ] ;
>         mf:result <results-construct-1.ttl> .
>
> An action is a query and a data file.
>
> There are different styles of layout in different places. The test suite has grown incrementally over the years of SPARQL 1.0 and SPARQL 1.1. Some tests come from outside the project.
>
> You can test from the command line using the arq.qparse tool. See other message.
>
> There is a command, qtest, for running manifests.
>
> Background FYI: You won't need this when you put everything in jena-arq/testing/ARQ/ConstructQuads, but to explain: the main syntax test suites are auto-generated by syn.sh. Part of that is syn-arq.sh. But hand-writing syntax tests is easier for now.
>
> Andy
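As a rough illustration of the two manifest entry shapes Andy describes, here is a small Python sketch that renders them as text. It is not part of Jena: the helper names `syntax_test` and `execution_test` are hypothetical, the file names are the ones used in the message above, and the sketch assumes the usual Turtle conventions (a quoted string for mf:name, `<...>` for file references). A complete manifest would also need prefix declarations and the surrounding manifest structure, which are best copied from the existing manifest.ttl files mentioned above.

```python
def syntax_test(test_id: str, query_file: str) -> str:
    """Render one positive ARQ syntax test entry (hypothetical helper)."""
    return (
        f":{test_id} rdf:type mfx:PositiveSyntaxTestARQ ;\n"
        f"    dawgt:approval dawgt:NotClassified ;\n"
        f'    mf:name "{query_file}" ;\n'
        f"    mf:action <{query_file}> ;.\n"
    )


def execution_test(test_id: str, name: str, query_file: str,
                   data_file: str, result_file: str) -> str:
    """Render one execution test entry: an action (query + data) and a result."""
    return (
        f":{test_id} rdf:type mfx:TestQuery ;\n"
        f'    mf:name "{name}" ;\n'
        f"    mf:action [ qt:query <{query_file}> ;\n"
        f"                qt:data <{data_file}> ] ;\n"
        f"    mf:result <{result_file}> .\n"
    )


if __name__ == "__main__":
    print(syntax_test("test_1", "syntax-select-expr-01.arq"))
    print(execution_test("someExecutionTest", "Construct Quads 1",
                         "q-construct-1.rq", "data-1.ttl",
                         "results-construct-1.ttl"))
```

Generating entries this way only saves typing; for a handful of ConstructQuads tests, writing the entries by hand is just as reasonable.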
Re: [GSoC 2015 - JENA-491] JavaCC with master.jj
Hi,

Thanks! I just marked GRAPH mandatory, and it worked without producing the warnings. I'll look into the details later.

By the way, if the new parser is ready, how do I test it? I mean, where should I put the unit test code and the query strings to be tested? I'm confused by org.apache.jena.sparql.junit.QueryTest (is that what I need to deal with?). Is there any guideline or documentation for the ARQ tests?

regards,
Qihong

On Mon, Jun 15, 2015 at 11:45 PM, Ying Jiang jpz6311...@gmail.com wrote:
> Hi Qihong,
>
> In addition to Andy's explanation, you might take a look at this tutorial for more details on JavaCC lookahead:
> https://javacc.java.net/doc/lookahead.html
>
> Best regards,
> Ying Jiang
>
> On Mon, Jun 15, 2015 at 10:42 PM, Andy Seaborne a...@apache.org wrote:
> > Qihong,
> >
> > There is an ambiguity in the grammar if you make GRAPH optional. See rule 'Quads'.
> >
> > Consider these two cases:
> >
> >     :s :p :o .
> >     :z { :s1 :p1 :o1 } .
> >
> >     :s :p :o .
> >     :z :q :o2 .
> >
> > When the parser gets to the end of the triple in the default graph:
> >
> >     :s :p :o .
> >
> > there are two ways forward: more triples (TriplesTemplate), or the end of the triples part and the start of a named graph.
> >
> > It looks ahead one token, sees :z, and needs to decide whether the next rule is more triples, the
> >
> >     :z :q :o2 .
> >
> > case, or the end of the triples for the default graph and the start of a named graph, the
> >
> >     :z { :s1 :p1 :o1 } .
> >
> > case, where it exits TriplesTemplate and moves on to QuadsNotTriples.
> >
> > If GRAPH is required, then the entry to QuadsNotTriples is marked by a GRAPH, which is never in triples.
> >
> > The grammar is LL(1) - a lookahead of 1 - by default.
> >
> > There are two solutions (I haven't checked exact details):
> >
> > 1/ Use LOOKAHEAD(2) so it sees tokens ':z' and ':q' (triples), or ':z' and '{', which is the named graph case. I think this is in Quads somewhere.
> >
> > 2/ Leave GRAPH required.
> >
> > (2) is fine for now - it will not be too unexpected to users because INSERT DATA requires a GRAPH and it is legal TriG, even if not the short form in TriG.
> >
> > You can come back and look at (1) later. I'm keen for you to get something going as soon as possible, not get lost in details.
> >
> > Background:
> >
> > There is a third solution, but it's not as simple: introduce an intermediate state of MaybeTriplesMaybeQuads. If you do that, more of the grammar needs rewriting. I'm not sure how widespread the changes would be.
> >
> > Jena's TriG parser (which is not JavaCC based; see LangTriG::oneNamedGraphBlock2) has this comment:
> >
> >     // Either :s :p :o or :g { ... }
> >
> > and does one lookahead to get the :s or :g (the :z above), keeps that hanging around, does another lookahead to see '{' or not, then calls turtle(n) if triples.
> >
> > In LangTriG:
> >     turtle() is roughly TriplesSameSubject
> >     turtle(n) is roughly PropertyListNotEmpty
> >
> > Andy
> >
> > On 15/06/15 11:53, Qihong Lin wrote:
> > > Hi,
> > >
> > > I'm trying to play with master.jj, but the grammar script sometimes prints warning messages. The behavior is strange. To simplify my question, I'd like to take the following example.
> > >
> > > In QuadsNotTriples(), line 691 in master.jj, in the master branch:
> > >
> > >     <GRAPH>
> > >
> > > If I change it to optional (which is required in future implementations, for the new grammar):
> > >
> > >     (<GRAPH>)?
> > >
> > > the grammar script goes like this:
> > >
> > >     $ ./grammar
> > >     Process grammar -- sparql_11.jj
> > >     Java Compiler Compiler Version 5.0 (Parser Generator)
> > >     (type javacc with no arguments for help)
> > >     Reading from file sparql_11.jj . . .
> > >     Warning: Choice conflict in [...] construct at line 464, column 4.
> > >       Expansion nested within construct and expansion following construct
> > >       have common prefixes, one of which is: VAR1
> > >       Consider using a lookahead of 2 or more for nested expansion.
> > >     Warning: Choice conflict in [...] construct at line 468, column 6.
> > >       Expansion nested within construct and expansion following construct
> > >       have common prefixes, one of which is: VAR1
> > >       Consider using a lookahead of 2 or more for nested expansion.
> > >     Warning: Choice conflict in [...] construct at line 484, column 12.
> > >       Expansion nested within construct and expansion following construct
> > >       have common prefixes, one of which is: VAR1
> > >       Consider using a lookahead of 2 or more for nested expansion.
> > >     Warning: Choice conflict in [...] construct at line 759, column 3.
> > >       Expansion nested within construct and expansion following construct
> > >       have common prefixes, one of which is: VAR1
> > >       Consider using a lookahead of 2 or more for nested expansion.
> > >     Warning: Choice conflict in [...] construct at line 767, column 5.
> > >       Expansion nested within construct and expansion following construct
> > >       have common prefixes, one of which is: VAR1
> > >       Consider using a lookahead of 2 or more for nested expansion.
> > >     File TokenMgrError.java does not exist. Will create one.
> > >     File ParseException.java does not exist. Will create one.
> > >     File
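The hand-written strategy Andy attributes to LangTriG (take one token, hold on to it, then peek at the next token for '{') can be sketched in a few lines of Python. This is only a toy under strong assumptions, not Jena's code: the function name `split_quads` is hypothetical, input is a pre-split token list, every triple is exactly three terms followed by '.', and there is no error handling.

```python
def split_quads(tokens):
    """Group tokens into (graph_name, triples) blocks; graph_name is None
    for the default graph. Toy grammar only, see the assumptions above."""
    blocks = []    # finished (graph_name, triples) blocks, in input order
    default = []   # default-graph triples collected so far
    i = 0
    while i < len(tokens):
        held = tokens[i]                                       # first lookahead: the :s or :g
        peek = tokens[i + 1] if i + 1 < len(tokens) else None  # second lookahead
        if peek == "{":
            # Named graph: close any pending default-graph block first.
            if default:
                blocks.append((None, default))
                default = []
            i += 2                                             # consume ':g' and '{'
            triples = []
            while i < len(tokens) and tokens[i] != "}":
                triples.append(tuple(tokens[i:i + 3]))
                i += 3
                if i < len(tokens) and tokens[i] == ".":
                    i += 1
            i += 1                                             # consume '}'
            if i < len(tokens) and tokens[i] == ".":
                i += 1                                         # optional trailing '.'
            blocks.append((held, triples))
        else:
            # Triples: the held token is the subject.
            default.append(tuple(tokens[i:i + 3]))
            i += 3
            if i < len(tokens) and tokens[i] == ".":
                i += 1
    if default:
        blocks.append((None, default))
    return blocks


if __name__ == "__main__":
    print(split_quads(":s :p :o . :z { :s1 :p1 :o1 } .".split()))
    print(split_quads(":s :p :o . :z :q :o2 .".split()))
```

The point of the sketch is that the decision is deferred until the second token is visible, which is what a one-token lookahead in the generated parser cannot do once GRAPH is optional.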
Re: [GSoC 2015 - JENA-491] JavaCC with master.jj
Qihong,

There is an ambiguity in the grammar if you make GRAPH optional. See rule 'Quads'.

Consider these two cases:

    :s :p :o .
    :z { :s1 :p1 :o1 } .

    :s :p :o .
    :z :q :o2 .

When the parser gets to the end of the triple in the default graph:

    :s :p :o .

there are two ways forward: more triples (TriplesTemplate), or the end of the triples part and the start of a named graph.

It looks ahead one token, sees :z, and needs to decide whether the next rule is more triples, the

    :z :q :o2 .

case, or the end of the triples for the default graph and the start of a named graph, the

    :z { :s1 :p1 :o1 } .

case, where it exits TriplesTemplate and moves on to QuadsNotTriples.

If GRAPH is required, then the entry to QuadsNotTriples is marked by a GRAPH, which is never in triples.

The grammar is LL(1) - a lookahead of 1 - by default.

There are two solutions (I haven't checked exact details):

1/ Use LOOKAHEAD(2) so it sees tokens ':z' and ':q' (triples), or ':z' and '{', which is the named graph case. I think this is in Quads somewhere.

2/ Leave GRAPH required.

(2) is fine for now - it will not be too unexpected to users because INSERT DATA requires a GRAPH and it is legal TriG, even if not the short form in TriG.

You can come back and look at (1) later. I'm keen for you to get something going as soon as possible, not get lost in details.

Background:

There is a third solution, but it's not as simple: introduce an intermediate state of MaybeTriplesMaybeQuads. If you do that, more of the grammar needs rewriting. I'm not sure how widespread the changes would be.

Jena's TriG parser (which is not JavaCC based; see LangTriG::oneNamedGraphBlock2) has this comment:

    // Either :s :p :o or :g { ... }

and does one lookahead to get the :s or :g (the :z above), keeps that hanging around, does another lookahead to see '{' or not, then calls turtle(n) if triples.

In LangTriG:
    turtle() is roughly TriplesSameSubject
    turtle(n) is roughly PropertyListNotEmpty

Andy

On 15/06/15 11:53, Qihong Lin wrote:
> Hi,
>
> I'm trying to play with master.jj, but the grammar script sometimes prints warning messages. The behavior is strange. To simplify my question, I'd like to take the following example.
>
> In QuadsNotTriples(), line 691 in master.jj, in the master branch:
>
>     <GRAPH>
>
> If I change it to optional (which is required in future implementations, for the new grammar):
>
>     (<GRAPH>)?
>
> the grammar script goes like this:
>
>     $ ./grammar
>     Process grammar -- sparql_11.jj
>     Java Compiler Compiler Version 5.0 (Parser Generator)
>     (type javacc with no arguments for help)
>     Reading from file sparql_11.jj . . .
>     Warning: Choice conflict in [...] construct at line 464, column 4.
>       Expansion nested within construct and expansion following construct
>       have common prefixes, one of which is: VAR1
>       Consider using a lookahead of 2 or more for nested expansion.
>     Warning: Choice conflict in [...] construct at line 468, column 6.
>       Expansion nested within construct and expansion following construct
>       have common prefixes, one of which is: VAR1
>       Consider using a lookahead of 2 or more for nested expansion.
>     Warning: Choice conflict in [...] construct at line 484, column 12.
>       Expansion nested within construct and expansion following construct
>       have common prefixes, one of which is: VAR1
>       Consider using a lookahead of 2 or more for nested expansion.
>     Warning: Choice conflict in [...] construct at line 759, column 3.
>       Expansion nested within construct and expansion following construct
>       have common prefixes, one of which is: VAR1
>       Consider using a lookahead of 2 or more for nested expansion.
>     Warning: Choice conflict in [...] construct at line 767, column 5.
>       Expansion nested within construct and expansion following construct
>       have common prefixes, one of which is: VAR1
>       Consider using a lookahead of 2 or more for nested expansion.
>     File TokenMgrError.java does not exist. Will create one.
>     File ParseException.java does not exist. Will create one.
>     File Token.java does not exist. Will create one.
>     File JavaCharStream.java does not exist. Will create one.
>     Parser generated with 0 errors and 5 warnings.
>     Create text form
>     Java Compiler Compiler Version 5.0 (Documentation Generator Version 0.1.4)
>     (type jjdoc with no arguments for help)
>     Reading from file sparql_11.jj . . .
>     Grammar documentation generated successfully in sparql_11.txt
>     Fixing Java warnings in TokenManager ...
>     Fixing Java warnings in Token ...
>     Fixing Java warnings in TokenMgrError ...
>     Fixing Java warnings in SPARQLParser11 ...
>     Done
>     Process grammar -- arq.jj
>     Java Compiler Compiler Version 5.0 (Parser Generator)
>     (type javacc with no arguments for help)
>     Reading from file arq.jj . . .
>     Warning: Choice conflict in [...] construct at line 486, column 4.
>       Expansion nested within
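To make the "common prefixes" warning concrete, the following Python sketch compares the first k tokens of the two ways forward from Andy's example. It is a deliberate simplification: JavaCC analyses the grammar rules themselves, whereas this only compares sample token sequences, and the names `prefixes`, `conflict`, and the three constants are made up for illustration.

```python
def prefixes(sentences, k):
    """Set of k-token prefixes that can begin any of the given sentences."""
    return {tuple(s.split()[:k]) for s in sentences}


def conflict(alt_a, alt_b, k):
    """Prefixes shared by both alternatives; non-empty means a lookahead of k
    tokens cannot choose between them."""
    return prefixes(alt_a, k) & prefixes(alt_b, k)


# After ':s :p :o .' the parser can either continue with more triples ...
MORE_TRIPLES = [":z :q :o2 ."]
# ... or start a named graph, here with the GRAPH keyword omitted.
NAMED_GRAPH = [":z { :s1 :p1 :o1 } ."]
# The same named graph when the keyword is required.
NAMED_GRAPH_WITH_KEYWORD = ["GRAPH :z { :s1 :p1 :o1 } ."]

if __name__ == "__main__":
    print(conflict(MORE_TRIPLES, NAMED_GRAPH, 1))               # shared ':z'
    print(conflict(MORE_TRIPLES, NAMED_GRAPH, 2))               # nothing shared
    print(conflict(MORE_TRIPLES, NAMED_GRAPH_WITH_KEYWORD, 1))  # nothing shared
```

With one token of lookahead both alternatives start with ':z'; with two tokens, or with GRAPH kept mandatory, the shared prefix disappears, which corresponds to solutions (1) and (2) in the message above.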