Re: [DISCUSS] Tests vs multiline strings

2019-12-17 Thread Stamatis Zampetakis
Undeniably multi-line strings can improve the readability of the code and
facilitate reviews but I would prefer if we didn't refactor existing tests
from Java to Kotlin mainly for two reasons:
1. Losing the history is an important disadvantage for me. Quite often, I
find my self looking in the history of the tests to find the parts of
the core that were changed. Or the opposite, looking for instance how an
API works by checking the tests that were added in a particular commit.
2. The feature is already present as experimental in Java 13 [1] so we
could start using it in the near future (eventually we will start using a
newer jdk) so from my point of view there is no big rush to use Kotlin for
this.

If I had to choose between the history and multi-line strings I would opt
for the history. I lived with normal string concatenation for a while so I
am accustomed to reading the code the way it is without too much struggle.

No objections for using Kotlin for new tests.

Best,
Stamatis

[1] https://openjdk.java.net/jeps/355


On Tue, Dec 17, 2019 at 1:27 AM Kevin Risden  wrote:

> I don't see a major benefit to switching to an entirely new language to get
> multiline strings. I agree that sticking to Java makes sense.
>
> Kevin Risden
>
>
> On Mon, Dec 16, 2019 at 12:50 AM Vladimir Sitnikov <
> sitnikov.vladi...@gmail.com> wrote:
>
> > Feng>Introducing another language
> >
> > It is the same language that is used for the build scripts, so it is not
> a
> > new language.
> >
> > Danny> in the code evolving and the Scala has many
> > Danny> tricky problems especially the version compatibility
> >
> > Kotlin has strong backward compatibility.
> >
> > Julian>Transitioning our tests to a different language (Kotlin) is a
> > Julian> drastic solution. It requires developers to understand a new
> > language,
> >
> > Note: most of the time, test code is just a glue code between input data
> > and asserts. The complicated logic is not there (which is good by the
> way).
> >
> > At the same time, it means it will be very little new to understand.
> >
> > Vladimir
> >
>


Re: [DISCUSS] Tests vs multiline strings

2019-12-16 Thread Kevin Risden
I don't see a major benefit to switching to an entirely new language to get
multiline strings. I agree that sticking to Java makes sense.

Kevin Risden


On Mon, Dec 16, 2019 at 12:50 AM Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Feng>Introducing another language
>
> It is the same language that is used for the build scripts, so it is not a
> new language.
>
> Danny> in the code evolving and the Scala has many
> Danny> tricky problems especially the version compatibility
>
> Kotlin has strong backward compatibility.
>
> Julian>Transitioning our tests to a different language (Kotlin) is a
> Julian> drastic solution. It requires developers to understand a new
> language,
>
> Note: most of the time, test code is just a glue code between input data
> and asserts. The complicated logic is not there (which is good by the way).
>
> At the same time, it means it will be very little new to understand.
>
> Vladimir
>


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread Vladimir Sitnikov
Feng>Introducing another language

It is the same language that is used for the build scripts, so it is not a
new language.

Danny> in the code evolving and the Scala has many
Danny> tricky problems especially the version compatibility

Kotlin has strong backward compatibility.

Julian>Transitioning our tests to a different language (Kotlin) is a
Julian> drastic solution. It requires developers to understand a new
language,

Note: most of the time, test code is just a glue code between input data
and asserts. The complicated logic is not there (which is good by the way).

At the same time, it means it will be very little new to understand.

Vladimir


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread Feng Zhu
" It is pretty good as a pure Java project"
+1 for Danny's comment.
Introducing another language (without strong demands) brings burden for
project maintenance and evolution.

Best,
Feng

Danny Chan  于2019年12月16日周一 上午11:34写道:

> I also have the same concern with Julian, in Apache Flink SQL, we did a
> lot of code with Scala, but in the code evolving and the Scala has many
> tricky problems especially the version compatibility. So finally, we decide
> to move on to Java code: rewrite exiting Scala code to Java again.
>
> I don’t think Kotlin is more popular than Scala or better than Scala, so
> I’m scared for this suggestion.
>
> I also didn’t want Calcite to repeat the mistake, it is pretty good as a
> pure Java project.
>
> BTW, I always saw some new code as Kotlin in Calcite[1](Some of them are
> test APIs), maybe we should change them back to Java.
>
> [1]
> https://github.com/apache/calcite/commit/0d6bec4140da46af07d58cf07a5e682d61529603
>
> Best,
> Danny Chan
> 在 2019年12月16日 +0800 AM9:02,Julian Hyde ,写道:
> > I don't think we should do this.
> >
> > Multi-line strings are a bit unwieldy, but they're not a major
> > problem. Transitioning our tests to a different language (Kotlin) is a
> > drastic solution. It requires developers to understand a new language,
> > and it loses all history from the source files.
> >
> > Julian
> >
> > On Sun, Dec 15, 2019 at 4:37 AM Vladimir Sitnikov
> >  wrote:
> > >
> > > I've filed two PRs to evaluate the impact of the replacement.
> > >
> > > $ to _: https://github.com/apache/calcite/pull/1659
> > > 203.3sec, 5510 completed, 3 failed, 91 skipped, Gradle Test Run
> > > :core:test
> > >
> > > $ to #: https://github.com/apache/calcite/pull/1660
> > > 196.7sec, 5510 completed, 53 failed, 91 skipped, Gradle Test Run
> > > :core:test
> > >
> > > There are test failures, however, both of them are almost ready.
> > >
> > > Both $ and _ are valid "Java identifier start" characters, so variable
> name
> > > like _3 is valid.
> > > If we use # instead of $, then code generator needs to be updated as
> well
> > > as it sometimes uses Java variables like $3, and #3 would become
> invalid.
> > >
> > > Any thoughts?
> > >
> > > Vladimir
>


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread Danny Chan
I also have the same concern with Julian, in Apache Flink SQL, we did a lot of 
code with Scala, but in the code evolving and the Scala has many tricky 
problems especially the version compatibility. So finally, we decide to move on 
to Java code: rewrite exiting Scala code to Java again.

I don’t think Kotlin is more popular than Scala or better than Scala, so I’m 
scared for this suggestion.

I also didn’t want Calcite to repeat the mistake, it is pretty good as a pure 
Java project.

BTW, I always saw some new code as Kotlin in Calcite[1](Some of them are test 
APIs), maybe we should change them back to Java.

[1] 
https://github.com/apache/calcite/commit/0d6bec4140da46af07d58cf07a5e682d61529603

Best,
Danny Chan
在 2019年12月16日 +0800 AM9:02,Julian Hyde ,写道:
> I don't think we should do this.
>
> Multi-line strings are a bit unwieldy, but they're not a major
> problem. Transitioning our tests to a different language (Kotlin) is a
> drastic solution. It requires developers to understand a new language,
> and it loses all history from the source files.
>
> Julian
>
> On Sun, Dec 15, 2019 at 4:37 AM Vladimir Sitnikov
>  wrote:
> >
> > I've filed two PRs to evaluate the impact of the replacement.
> >
> > $ to _: https://github.com/apache/calcite/pull/1659
> > 203.3sec, 5510 completed, 3 failed, 91 skipped, Gradle Test Run
> > :core:test
> >
> > $ to #: https://github.com/apache/calcite/pull/1660
> > 196.7sec, 5510 completed, 53 failed, 91 skipped, Gradle Test Run
> > :core:test
> >
> > There are test failures, however, both of them are almost ready.
> >
> > Both $ and _ are valid "Java identifier start" characters, so variable name
> > like _3 is valid.
> > If we use # instead of $, then code generator needs to be updated as well
> > as it sometimes uses Java variables like $3, and #3 would become invalid.
> >
> > Any thoughts?
> >
> > Vladimir


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread XING JIN
Before coming to Calcite, I works quite some time on Scala. The code style
shares some similarities with Kotlin -- It's much simple and easier to read
when you write a test.
But we should think twice whether to bring in another language.
To answer Haisheng's question:
Because default lex config is Lex.ORACLE.

Best,
Jin

Chunwei Lei  于2019年12月16日周一 上午10:34写道:

> I agree with Julian.
>
> Changing to Kotlin needs lots of error, but gets a little gain. Besides, It
> costs much more
> time to write a test if developers are not familiar with Kotlin. I
> prefer to use Java as it is now.
>
>
> Best,
> Chunwei
>
>
> On Mon, Dec 16, 2019 at 9:02 AM Julian Hyde  wrote:
>
> > I don't think we should do this.
> >
> > Multi-line strings are a bit unwieldy, but they're not a major
> > problem. Transitioning our tests to a different language (Kotlin) is a
> > drastic solution. It requires developers to understand a new language,
> > and it loses all history from the source files.
> >
> > Julian
> >
> > On Sun, Dec 15, 2019 at 4:37 AM Vladimir Sitnikov
> >  wrote:
> > >
> > > I've filed two PRs to evaluate the impact of the replacement.
> > >
> > > $ to _: https://github.com/apache/calcite/pull/1659
> > > 203.3sec, 5510 completed,   3 failed,  91 skipped, Gradle Test Run
> > > :core:test
> > >
> > > $ to #: https://github.com/apache/calcite/pull/1660
> > > 196.7sec, 5510 completed,  53 failed,  91 skipped, Gradle Test Run
> > > :core:test
> > >
> > > There are test failures, however, both of them are almost ready.
> > >
> > > Both $ and _ are valid "Java identifier start" characters, so variable
> > name
> > > like _3 is valid.
> > > If we use # instead of $, then code generator needs to be updated as
> well
> > > as it sometimes uses Java variables like $3, and #3 would become
> invalid.
> > >
> > > Any thoughts?
> > >
> > > Vladimir
> >
>


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread Chunwei Lei
I agree with Julian.

Changing to Kotlin needs lots of error, but gets a little gain. Besides, It
costs much more
time to write a test if developers are not familiar with Kotlin. I
prefer to use Java as it is now.


Best,
Chunwei


On Mon, Dec 16, 2019 at 9:02 AM Julian Hyde  wrote:

> I don't think we should do this.
>
> Multi-line strings are a bit unwieldy, but they're not a major
> problem. Transitioning our tests to a different language (Kotlin) is a
> drastic solution. It requires developers to understand a new language,
> and it loses all history from the source files.
>
> Julian
>
> On Sun, Dec 15, 2019 at 4:37 AM Vladimir Sitnikov
>  wrote:
> >
> > I've filed two PRs to evaluate the impact of the replacement.
> >
> > $ to _: https://github.com/apache/calcite/pull/1659
> > 203.3sec, 5510 completed,   3 failed,  91 skipped, Gradle Test Run
> > :core:test
> >
> > $ to #: https://github.com/apache/calcite/pull/1660
> > 196.7sec, 5510 completed,  53 failed,  91 skipped, Gradle Test Run
> > :core:test
> >
> > There are test failures, however, both of them are almost ready.
> >
> > Both $ and _ are valid "Java identifier start" characters, so variable
> name
> > like _3 is valid.
> > If we use # instead of $, then code generator needs to be updated as well
> > as it sometimes uses Java variables like $3, and #3 would become invalid.
> >
> > Any thoughts?
> >
> > Vladimir
>


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread Julian Hyde
I don't think we should do this.

Multi-line strings are a bit unwieldy, but they're not a major
problem. Transitioning our tests to a different language (Kotlin) is a
drastic solution. It requires developers to understand a new language,
and it loses all history from the source files.

Julian

On Sun, Dec 15, 2019 at 4:37 AM Vladimir Sitnikov
 wrote:
>
> I've filed two PRs to evaluate the impact of the replacement.
>
> $ to _: https://github.com/apache/calcite/pull/1659
> 203.3sec, 5510 completed,   3 failed,  91 skipped, Gradle Test Run
> :core:test
>
> $ to #: https://github.com/apache/calcite/pull/1660
> 196.7sec, 5510 completed,  53 failed,  91 skipped, Gradle Test Run
> :core:test
>
> There are test failures, however, both of them are almost ready.
>
> Both $ and _ are valid "Java identifier start" characters, so variable name
> like _3 is valid.
> If we use # instead of $, then code generator needs to be updated as well
> as it sometimes uses Java variables like $3, and #3 would become invalid.
>
> Any thoughts?
>
> Vladimir


Re: [DISCUSS] Tests vs multiline strings

2019-12-15 Thread Vladimir Sitnikov
I've filed two PRs to evaluate the impact of the replacement.

$ to _: https://github.com/apache/calcite/pull/1659
203.3sec, 5510 completed,   3 failed,  91 skipped, Gradle Test Run
:core:test

$ to #: https://github.com/apache/calcite/pull/1660
196.7sec, 5510 completed,  53 failed,  91 skipped, Gradle Test Run
:core:test

There are test failures, however, both of them are almost ready.

Both $ and _ are valid "Java identifier start" characters, so variable name
like _3 is valid.
If we use # instead of $, then code generator needs to be updated as well
as it sometimes uses Java variables like $3, and #3 would become invalid.

Any thoughts?

Vladimir


Re: [DISCUSS] Tests vs multiline strings

2019-12-14 Thread Rui Wang
Replace $ with something else will have a large impact. E.g. projects using
Calcite might have tests that verify plans by plan.toString() thus $ is
exposed in many places.


Can we only use Kotlin for tests that does not evaluate plans? I think
there are still many tests that does not have "$" can benefit from
multiple strings.


-Rui

On Sat, Dec 14, 2019 at 3:03 PM Haisheng Yuan 
wrote:

> The plan becomes not so easier to read, which is more important to me.
> Unless we change $ to other symbol, I am inclined to keep using Java.
>
> I am not sure why schemas in MaterializationTest require column names to
> be quoted, If we can change that, queries will be much easier to read (even
> for multiline strings), like RelOptRulesTest does.
>
> - Haisheng
>
> --
> 发件人:Vladimir Sitnikov
> 日 期:2019年12月15日 06:09:37
> 收件人:Apache Calcite dev list
> 主 题:[DISCUSS] Tests vs multiline strings
>
> Hi,
>
> Calcite tests sometimes require multiline strings.
> For instance: input SQL, output plan.
>
> TL;DR: let's use Kotlin to improve test code readability. What do you
> think?
>
> Unfortunately, Java <14 lacks multiline strings, so the current tests are
> not that pretty :(
> They are hard to read, hard to understand, and it is hard to maintain them.
>
> Sample test (from MaterializationTest):
>
>   @Test public void testFilterQueryOnProjectView5() {
> checkMaterialize(
> "select \"deptno\" - 10 as \"x\", \"empid\" + 1 as ee, \"name\"\n"
> + "from \"emps\"",
> "select \"name\", \"empid\" + 1 as e\n"
> + "from \"emps\" where \"deptno\" - 10 = 2",
> HR_FKUK_MODEL,
> CalciteAssert.checkResultContains(
> "EnumerableCalc(expr#0..2=[{inputs}], expr#3=[2], "
> + "expr#4=[=($t0, $t3)], name=[$t2], E=[$t1],
> $condition=[$t4])\n"
> + "  EnumerableTableScan(table=[[hr, m0]]"));
>   }
>
> What do you think if we migrate those types of tests to Kotlin?
> Note: I'm well aware there are Quidem-based tests, and those do not replace
> Java-based ones.
>
> Here's how the same test might look like in Kotlin:
>
> @Test
> fun `filter query on project view5`() {
> checkMaterialize(
> """select "deptno" - 10 as "x", "empid" + 1 as ee, "name" from
> "emps" """,
> """select "name", "empid" + 1 as e from "emps" where "deptno" -
> 10 = 2""",
> HR_FKUK_MODEL,
> CalciteAssert.checkResultContains(
> """
> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[2],
> expr#4=[=(${'$'}t0, ${'$'}t3)], name=[${'$'}t2], E=[${'$'}t1],
> ${'$'}condition=[${'$'}t4])
>   EnumerableTableScan(table=[[hr, m0]]
> """.trimIndent()
> )
> )
> }
>
> Pros:
> * SQLs are much easier to read since all the slashes are gone
> * Multiline text is much easier to follow
> * ctrl+c/v works much better: I can copy a substring, even multiple lines
> from the middle, and it just works
> * PR review would be simplified as SQL in PRs would be easier to read
>
> Cons:
> * A bit of a sad point is $ is a special symbol there, and it needs to be
> escaped.
> A (not that?) crazy idea might be to make Calcite use another character
> instead of $ (as $ is used in string templating languages quite often
> anyway),
> * If a class is converted from Java to Kotlin, then default Git annotate
> fails to attribute lines to commits (== the file looks like a brand new
> one)
> * Something else?
>
> I don't suggest to go ahead and convert all the things, however, in my
> opinion, converting at least some of the tests or using multiline strings
> for newly added tests makes sense.
>
> What do you think?
>
> Vladimir
>
>


Re: [DISCUSS] Tests vs multiline strings

2019-12-14 Thread Haisheng Yuan
The plan becomes not so easier to read, which is more important to me. 
Unless we change $ to other symbol, I am inclined to keep using Java.

I am not sure why schemas in MaterializationTest require column names to be 
quoted, If we can change that, queries will be much easier to read (even for 
multiline strings), like RelOptRulesTest does.

- Haisheng

--
发件人:Vladimir Sitnikov
日 期:2019年12月15日 06:09:37
收件人:Apache Calcite dev list
主 题:[DISCUSS] Tests vs multiline strings

Hi,

Calcite tests sometimes require multiline strings.
For instance: input SQL, output plan.

TL;DR: let's use Kotlin to improve test code readability. What do you think?

Unfortunately, Java <14 lacks multiline strings, so the current tests are
not that pretty :(
They are hard to read, hard to understand, and it is hard to maintain them.

Sample test (from MaterializationTest):

  @Test public void testFilterQueryOnProjectView5() {
checkMaterialize(
"select \"deptno\" - 10 as \"x\", \"empid\" + 1 as ee, \"name\"\n"
+ "from \"emps\"",
"select \"name\", \"empid\" + 1 as e\n"
+ "from \"emps\" where \"deptno\" - 10 = 2",
HR_FKUK_MODEL,
CalciteAssert.checkResultContains(
"EnumerableCalc(expr#0..2=[{inputs}], expr#3=[2], "
+ "expr#4=[=($t0, $t3)], name=[$t2], E=[$t1],
$condition=[$t4])\n"
+ "  EnumerableTableScan(table=[[hr, m0]]"));
  }

What do you think if we migrate those types of tests to Kotlin?
Note: I'm well aware there are Quidem-based tests, and those do not replace
Java-based ones.

Here's how the same test might look like in Kotlin:

@Test
fun `filter query on project view5`() {
checkMaterialize(
"""select "deptno" - 10 as "x", "empid" + 1 as ee, "name" from
"emps" """,
"""select "name", "empid" + 1 as e from "emps" where "deptno" -
10 = 2""",
HR_FKUK_MODEL,
CalciteAssert.checkResultContains(
"""
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[2],
expr#4=[=(${'$'}t0, ${'$'}t3)], name=[${'$'}t2], E=[${'$'}t1],
${'$'}condition=[${'$'}t4])
  EnumerableTableScan(table=[[hr, m0]]
""".trimIndent()
)
)
}

Pros:
* SQLs are much easier to read since all the slashes are gone
* Multiline text is much easier to follow
* ctrl+c/v works much better: I can copy a substring, even multiple lines
from the middle, and it just works
* PR review would be simplified as SQL in PRs would be easier to read

Cons:
* A bit of a sad point is $ is a special symbol there, and it needs to be
escaped.
A (not that?) crazy idea might be to make Calcite use another character
instead of $ (as $ is used in string templating languages quite often
anyway),
* If a class is converted from Java to Kotlin, then default Git annotate
fails to attribute lines to commits (== the file looks like a brand new one)
* Something else?

I don't suggest to go ahead and convert all the things, however, in my
opinion, converting at least some of the tests or using multiline strings
for newly added tests makes sense.

What do you think?

Vladimir



[DISCUSS] Tests vs multiline strings

2019-12-14 Thread Vladimir Sitnikov
Hi,

Calcite tests sometimes require multiline strings.
For instance: input SQL, output plan.

TL;DR: let's use Kotlin to improve test code readability. What do you think?

Unfortunately, Java <14 lacks multiline strings, so the current tests are
not that pretty :(
They are hard to read, hard to understand, and it is hard to maintain them.

Sample test (from MaterializationTest):

  @Test public void testFilterQueryOnProjectView5() {
checkMaterialize(
"select \"deptno\" - 10 as \"x\", \"empid\" + 1 as ee, \"name\"\n"
+ "from \"emps\"",
"select \"name\", \"empid\" + 1 as e\n"
+ "from \"emps\" where \"deptno\" - 10 = 2",
HR_FKUK_MODEL,
CalciteAssert.checkResultContains(
"EnumerableCalc(expr#0..2=[{inputs}], expr#3=[2], "
+ "expr#4=[=($t0, $t3)], name=[$t2], E=[$t1],
$condition=[$t4])\n"
+ "  EnumerableTableScan(table=[[hr, m0]]"));
  }

What do you think if we migrate those types of tests to Kotlin?
Note: I'm well aware there are Quidem-based tests, and those do not replace
Java-based ones.

Here's how the same test might look like in Kotlin:

@Test
fun `filter query on project view5`() {
checkMaterialize(
"""select "deptno" - 10 as "x", "empid" + 1 as ee, "name" from
"emps" """,
"""select "name", "empid" + 1 as e from "emps" where "deptno" -
10 = 2""",
HR_FKUK_MODEL,
CalciteAssert.checkResultContains(
"""
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[2],
expr#4=[=(${'$'}t0, ${'$'}t3)], name=[${'$'}t2], E=[${'$'}t1],
${'$'}condition=[${'$'}t4])
  EnumerableTableScan(table=[[hr, m0]]
""".trimIndent()
)
)
}

Pros:
* SQLs are much easier to read since all the slashes are gone
* Multiline text is much easier to follow
* ctrl+c/v works much better: I can copy a substring, even multiple lines
from the middle, and it just works
* PR review would be simplified as SQL in PRs would be easier to read

Cons:
* A bit of a sad point is $ is a special symbol there, and it needs to be
escaped.
A (not that?) crazy idea might be to make Calcite use another character
instead of $ (as $ is used in string templating languages quite often
anyway),
* If a class is converted from Java to Kotlin, then default Git annotate
fails to attribute lines to commits (== the file looks like a brand new one)
* Something else?

I don't suggest to go ahead and convert all the things, however, in my
opinion, converting at least some of the tests or using multiline strings
for newly added tests makes sense.

What do you think?

Vladimir