[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-17 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r203261248
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -318,7 +318,8 @@ case class SampleExec(
 v => s"""
   | $v = new $samplerClass($lowerBound, $upperBound, false);
   | $v.setSeed(${seed}L + partitionIndex);
- """.stripMargin.trim)
+ """.stripMargin.trim,
+forceInline = true)
--- End diff --

Aha, OK. Let's wait for the other developers' comments.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-16 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202623270
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -318,7 +318,8 @@ case class SampleExec(
 v => s"""
   | $v = new $samplerClass($lowerBound, $upperBound, false);
   | $v.setSeed(${seed}L + partitionIndex);
- """.stripMargin.trim)
+ """.stripMargin.trim,
+forceInline = true)
--- End diff --

This may cause problems, though, leading to the 64KB method size limit issue. So if we are not including the JDK compiler support, I'd be against this change...


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202563865
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -1588,6 +1588,15 @@ object CodeGenerator extends Logging {
 
   def primitiveTypeName(dt: DataType): String = primitiveTypeName(javaType(dt))
 
+  def getClassName(cls: Class[_]): String = {
+try {
+  return Option(cls.getCanonicalName).getOrElse(cls.getName)
+} catch {
+  case err: InternalError =>
--- End diff --

Anyway, `return` removed.


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202563817
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -1588,6 +1588,15 @@ object CodeGenerator extends Logging {
 
   def primitiveTypeName(dt: DataType): String = primitiveTypeName(javaType(dt))
 
+  def getClassName(cls: Class[_]): String = {
+try {
+  return Option(cls.getCanonicalName).getOrElse(cls.getName)
+} catch {
+  case err: InternalError =>
--- End diff --

Please see the commit https://github.com/apache/spark/commit/cc88d7fad16e8b5cbf7b6b9bfe412908782b4a45; this is the same fix as [Utils.getSimpleName](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L2732). Yeah, I see, and I think this is a workaround for now.

Yeah, in some cases; the doc says that `getCanonicalName` can return null:
```
/**
 * Returns the canonical name of the underlying class as
 * defined by the Java Language Specification.  Returns null if
 * the underlying class does not have a canonical name (i.e., if
 * it is a local or anonymous class or an array whose component
 * type does not have a canonical name).
 * @return the canonical name of the underlying class if it exists, and
 * {@code null} otherwise.
 * @since 1.5
 */
public String getCanonicalName() {
```
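A quick way to see the null case outside Spark (a hypothetical standalone check, not Spark code): anonymous and local classes have no canonical name, which is exactly why the fallback to `getName` in `getClassName` is needed:

```java
public class CanonicalNameDemo {
    public static void main(String[] args) {
        // A top-level class has a canonical name.
        System.out.println(String.class.getCanonicalName());  // java.lang.String

        // An anonymous class does not: getCanonicalName() returns null,
        // while getName() still yields a usable binary name like CanonicalNameDemo$1.
        Runnable anon = new Runnable() {
            @Override
            public void run() {}
        };
        System.out.println(anon.getClass().getCanonicalName());  // null
        System.out.println(anon.getClass().getName());           // CanonicalNameDemo$1

        // The fallback pattern from the diff, expressed in plain Java:
        Class<?> cls = anon.getClass();
        String className = cls.getCanonicalName() != null ? cls.getCanonicalName() : cls.getName();
        System.out.println(className);  // CanonicalNameDemo$1
    }
}
```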


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202562989
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -890,7 +890,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
 case StringType =>
   val intOpt = ctx.freshName("intOpt")
   (c, evPrim, evNull) => s"""
-scala.Option<Integer> $intOpt =
+scala.Option $intOpt =
--- End diff --

I haven't looked into the bytecode that JDK compilers generate, but the type of the return value seems to be erased in the bytecode:
```
...
/* 045 */   scala.Option intOpt_0 =
/* 046 */   org.apache.spark.sql.catalyst.util.DateTimeUtils.stringToDate(((UTF8String) references[0] /* literal */));
/* 047 */   if (intOpt_0.isDefined()) {
/* 048 */ value_1 = ((Integer) intOpt_0.get()).intValue();
/* 049 */   } else {
/* 050 */ isNull_1 = true;
/* 051 */   }
/* 052 */
...
- Day / DayOfMonth *** FAILED ***
  Code generation of dayofmonth(cast(2000-02-29 as date)) failed:
  java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile: (Line 67, Column 62) incompatible types: scala.Option cannot be converted to scala.Option
    at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
    at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
```
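The raw-type behavior above can be sketched with plain Java, using `java.util.Optional` as a stand-in for `scala.Option` (purely illustrative; `parse` is a made-up stand-in for `stringToDate`): declaring the local with the raw type sidesteps the generic-type mismatch, and the value is recovered with an explicit cast, just as in the generated code:

```java
import java.util.Optional;

public class RawOptionDemo {
    // Stand-in for a method whose generic return type is effectively erased
    // from the caller's point of view (like DateTimeUtils.stringToDate here).
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Optional parse(String s) {
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    @SuppressWarnings("rawtypes")
    public static void main(String[] args) {
        // Raw declaration, mirroring `scala.Option intOpt_0 = ...` in the generated code.
        Optional intOpt = parse("42");
        int value;
        boolean isNull;
        if (intOpt.isPresent()) {
            // The explicit cast recovers the element type,
            // as in `((Integer) intOpt_0.get()).intValue()`.
            value = ((Integer) intOpt.get()).intValue();
            isNull = false;
        } else {
            value = -1;
            isNull = true;
        }
        System.out.println(value + " " + isNull);  // 42 false
    }
}
```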


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202548522
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -890,7 +890,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
 case StringType =>
   val intOpt = ctx.freshName("intOpt")
   (c, evPrim, evNull) => s"""
-scala.Option<Integer> $intOpt =
+scala.Option $intOpt =
--- End diff --

Out of curiosity, why is this change needed? Is it because it's really a Scala `Int` that's returned?


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202548556
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -1588,6 +1588,15 @@ object CodeGenerator extends Logging {
 
   def primitiveTypeName(dt: DataType): String = primitiveTypeName(javaType(dt))
 
+  def getClassName(cls: Class[_]): String = {
+try {
+  return Option(cls.getCanonicalName).getOrElse(cls.getName)
+} catch {
+  case err: InternalError =>
--- End diff --

What would cause `InternalError` here? Just wondering if there's a cleaner way.
Will `getCanonicalName` ever return null? Otherwise it seems like you don't need to fall back to `getName` above. Also, `return` isn't needed below?


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202547003
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3758,7 +3758,14 @@ case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike
   } else {
 val arrayUnion = classOf[ArrayUnion].getName
 val et = ctx.addReferenceObj("elementTypeUnion", elementType)
-val order = ctx.addReferenceObj("orderingUnion", ordering)
+// Some data types (e.g., `BinaryType`) have anonymous classes for ordering and
+// `addReferenceObj` generates code below for anonymous classes;
+//
+//   (org.apache.spark.sql.types.BinaryType$$anon$1) references[0];
+//
+// Janino can compile this statement, but JDK Java compilers can't. So, we explicitly
+// set `className` here.
+val order = ctx.addReferenceObj("orderingUnion", ordering, "scala.math.Ordering")
--- End diff --

@mgaido91 I misunderstood a little, so I updated the comment. I don't know yet why this fails with JDK compilers, so I think I need to dig into this more.
cc: @rednaxelafx
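The underlying constraint can be reproduced with plain Java (a hypothetical example; the class and variable names are made up): an anonymous class's type can't be written in source, so a generated cast has to target a nameable supertype such as the interface:

```java
import java.util.Comparator;

public class AnonymousCastDemo {
    public static void main(String[] args) {
        // An anonymous Comparator, analogous to BinaryType's anonymous Ordering.
        Comparator<byte[]> ordering = new Comparator<byte[]>() {
            @Override
            public int compare(byte[] a, byte[] b) {
                return Integer.compare(a.length, b.length);
            }
        };

        // Generated code holds references as Object and must cast them back.
        Object[] references = new Object[] { ordering };

        // The concrete class (e.g. AnonymousCastDemo$1) has no canonical name,
        // so a cast to that class can't be written in valid Java source. Casting
        // to the nameable supertype works with any compiler:
        @SuppressWarnings("unchecked")
        Comparator<byte[]> order = (Comparator<byte[]>) references[0];
        System.out.println(order.compare(new byte[3], new byte[1]));  // 1
    }
}
```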


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202543135
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -1585,6 +1585,9 @@ case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Exp
   }
 
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+// Resolves setters before compilation
+require(resolvedSetters.nonEmpty)
--- End diff --

In master, [this test](https://github.com/apache/spark/blob/69993217fc4f5e5e41a297702389e86fe534dc2f/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala#L217) currently depends on the message of Janino compilation errors. So, if we used JDK Java compilers, the test could fail (because those compilers throw an exception with a different message). To fix this, the change checks the setters before compilation.
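A rough sketch of the idea in Java (hypothetical helper names, not the Spark implementation): resolve the setters reflectively up front, so a missing setter fails with a stable, compiler-independent message instead of surfacing later as a Janino- or javac-specific compile error:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class SetterResolutionDemo {
    public static class Bean {
        private int x;
        public void setX(int x) { this.x = x; }
    }

    // Resolve one-argument setters by name before any code generation happens,
    // so failures are reported early with a message we control.
    static List<Method> resolveSetters(Class<?> beanClass, List<String> names) {
        List<Method> resolved = new ArrayList<>();
        for (String name : names) {
            Method found = null;
            for (Method m : beanClass.getMethods()) {
                if (m.getName().equals(name) && m.getParameterCount() == 1) {
                    found = m;
                    break;
                }
            }
            if (found == null) {
                throw new IllegalArgumentException("Cannot find setter: " + name);
            }
            resolved.add(found);
        }
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(resolveSetters(Bean.class, List.of("setX")).size());  // 1
        try {
            resolveSetters(Bean.class, List.of("setMissing"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());  // Cannot find setter: setMissing
        }
    }
}
```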


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202542918
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -318,7 +318,8 @@ case class SampleExec(
 v => s"""
   | $v = new $samplerClass($lowerBound, $upperBound, false);
   | $v.setSeed(${seed}L + partitionIndex);
- """.stripMargin.trim)
+ """.stripMargin.trim,
+forceInline = true)
--- End diff --

Generic array creation is not allowed:
```
/* 019 */   private org.apache.spark.util.random.BernoulliCellSampler[] sample_mutableStateArray_0 = new org.apache.spark.util.random.BernoulliCellSampler[1];
--- 
 Cause: java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: failed to compile:
(Line 40, Column 101) generic array creation
```
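For reference, the JLS forbids `new T<...>[n]`; the usual workarounds are creating a raw-typed array with an unchecked assignment, or `java.lang.reflect.Array.newInstance`. A small illustrative sketch (the generic `Sampler` class is made up, not Spark's sampler):

```java
import java.lang.reflect.Array;

public class GenericArrayDemo {
    static class Sampler<T> {
        T last;
        T sample(T value) { last = value; return value; }
    }

    public static void main(String[] args) {
        // `new Sampler<String>[1]` would be a compile error: "generic array creation".
        // Workaround 1: create a raw array and use an unchecked assignment.
        @SuppressWarnings({"unchecked", "rawtypes"})
        Sampler<String>[] samplers = new Sampler[1];
        samplers[0] = new Sampler<>();
        System.out.println(samplers[0].sample("a"));  // a

        // Workaround 2: reflective creation with an explicit component class.
        @SuppressWarnings("unchecked")
        Sampler<String>[] viaReflection = (Sampler<String>[]) Array.newInstance(Sampler.class, 1);
        viaReflection[0] = new Sampler<>();
        System.out.println(viaReflection.length);  // 1
    }
}
```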


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202534510
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -1585,6 +1585,9 @@ case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Exp
   }
 
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+// Resolves setters before compilation
+require(resolvedSetters.nonEmpty)
--- End diff --

why do we need to add this?


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202534515
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -318,7 +318,8 @@ case class SampleExec(
 v => s"""
   | $v = new $samplerClass($lowerBound, $upperBound, false);
   | $v.setSeed(${seed}L + partitionIndex);
- """.stripMargin.trim)
+ """.stripMargin.trim,
+forceInline = true)
--- End diff --

why do we need this?


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-15 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21770#discussion_r202534499
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3758,7 +3758,10 @@ case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike
   } else {
 val arrayUnion = classOf[ArrayUnion].getName
 val et = ctx.addReferenceObj("elementTypeUnion", elementType)
-val order = ctx.addReferenceObj("orderingUnion", ordering)
+// Some data types (e.g., `BinaryType`) have anonymous classes for ordering and
+// `getCanonicalName` returns null in these classes. Therefore, we need to
+// explicitly set `className` here.
--- End diff --

nit: as we are adding this comment, shall we also mention that Janino works anyway, but the JDK compiler complains here?


---




[GitHub] spark pull request #21770: [SPARK-24806][SQL] Brush up generated code so tha...

2018-07-14 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/21770

[SPARK-24806][SQL] Brush up generated code so that JDK compilers can handle it

## What changes were proposed in this pull request?
This PR brushes up the generated code so that JDK compilers can handle it, because master sometimes generates Java code that only Janino can compile. This is an issue I found while working on SPARK-24498.

## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark FixJavaCompilerErrors

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21770


commit 08fddcbb755b8b4e84238a2e3d96c2e0967daa99
Author: Takeshi Yamamuro 
Date:   2018-06-18T12:33:43Z

Fix




---
