[jira] [Commented] (SPARK-23251) ClassNotFoundException: scala.Any when there's a missing implicit Map encoder

2018-02-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-23251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351921#comment-16351921
 ] 

Michal Šenkýř commented on SPARK-23251:
---

I tried multiple solutions but this was the only one that worked for me. 
Unfortunately, I was unable to get it to use the proper error message as Scala 
throws a different error when more information is available (see 
[ContextErrors|[https://github.com/scala/scala/blob/v2.11.12/src/compiler/scala/tools/nsc/typechecker/ContextErrors.scala#L77])
 and the 
implicitNotFound|https://github.com/scala/scala/blob/v2.11.12/src/compiler/scala/tools/nsc/typechecker/ContextErrors.scala#L77]).]
 annotation, which is used by Spark to modify the error message, isn't used in 
this case (a Scala bug?). Still, it's much better than the present one and the 
Encoders are enforced.

> ClassNotFoundException: scala.Any when there's a missing implicit Map encoder
> -
>
> Key: SPARK-23251
> URL: https://issues.apache.org/jira/browse/SPARK-23251
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
> Environment: mac os high sierra, centos 7
>Reporter: Bruce Robbins
>Priority: Minor
>
> In branch-2.2, when you attempt to use row.getValuesMap[Any] without an 
> implicit Map encoder, you get a nice descriptive compile-time error:
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> :26: error: Unable to find encoder for type stored in a Dataset.  
> Primitive types (Int, String, etc) and Product types (case classes) are 
> supported by importing spark.implicits._  Support for serializing other types 
> will be added in future releases.
>        df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
>              ^
> scala> implicit val mapEncoder = 
> org.apache.spark.sql.Encoders.kryo[Map[String, Any]]
> mapEncoder: org.apache.spark.sql.Encoder[Map[String,Any]] = class[value[0]: 
> binary]
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> res1: Array[Map[String,Any]] = Array(Map(stationName -> 007026 9, year -> 
> 2014), Map(stationName -> 007026 9, year -> 2014), Map(stationName -> 
> 007026 9, year -> 2014),
> etc...
> {noformat}
>  
>  On the latest master and also on branch-2.3, the transformation compiles (at 
> least on spark-shell), but throws a ClassNotFoundException:
>  
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> java.lang.ClassNotFoundException: scala.Any
>  at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at java.lang.Class.forName0(Native Method)
>  at java.lang.Class.forName(Class.java:348)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.javaClass(JavaMirrors.scala:555)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1211)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache$$anonfun$toJava$1.apply(TwoWayCaches.scala:49)
>  at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
>  at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache.toJava(TwoWayCaches.scala:44)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.classToJava(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:194)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:700)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:65)
>  at 
> scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:64)
>  at 
> 

[jira] [Commented] (SPARK-23251) ClassNotFoundException: scala.Any when there's a missing implicit Map encoder

2018-02-04 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351920#comment-16351920
 ] 

Apache Spark commented on SPARK-23251:
--

User 'michalsenkyr' has created a pull request for this issue:
https://github.com/apache/spark/pull/20505

> ClassNotFoundException: scala.Any when there's a missing implicit Map encoder
> -
>
> Key: SPARK-23251
> URL: https://issues.apache.org/jira/browse/SPARK-23251
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
> Environment: mac os high sierra, centos 7
>Reporter: Bruce Robbins
>Priority: Minor
>
> In branch-2.2, when you attempt to use row.getValuesMap[Any] without an 
> implicit Map encoder, you get a nice descriptive compile-time error:
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> :26: error: Unable to find encoder for type stored in a Dataset.  
> Primitive types (Int, String, etc) and Product types (case classes) are 
> supported by importing spark.implicits._  Support for serializing other types 
> will be added in future releases.
>        df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
>              ^
> scala> implicit val mapEncoder = 
> org.apache.spark.sql.Encoders.kryo[Map[String, Any]]
> mapEncoder: org.apache.spark.sql.Encoder[Map[String,Any]] = class[value[0]: 
> binary]
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> res1: Array[Map[String,Any]] = Array(Map(stationName -> 007026 9, year -> 
> 2014), Map(stationName -> 007026 9, year -> 2014), Map(stationName -> 
> 007026 9, year -> 2014),
> etc...
> {noformat}
>  
>  On the latest master and also on branch-2.3, the transformation compiles (at 
> least on spark-shell), but throws a ClassNotFoundException:
>  
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> java.lang.ClassNotFoundException: scala.Any
>  at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at java.lang.Class.forName0(Native Method)
>  at java.lang.Class.forName(Class.java:348)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.javaClass(JavaMirrors.scala:555)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1211)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache$$anonfun$toJava$1.apply(TwoWayCaches.scala:49)
>  at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
>  at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache.toJava(TwoWayCaches.scala:44)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.classToJava(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:194)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:700)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:65)
>  at 
> scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:64)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:512)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:445)
>  at 
> scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
>  at 
> 

[jira] [Commented] (SPARK-23251) ClassNotFoundException: scala.Any when there's a missing implicit Map encoder

2018-01-31 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-23251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347808#comment-16347808
 ] 

Michal Šenkýř commented on SPARK-23251:
---

Yes, this does seem like the Map encoder is not checking whether appropriate 
encoders exist for the key and value types. I think I couldn't get the compiler 
to resolve it if I added the appropriate typeclass checks and wanted to support 
subclasses of the collection type at the same time. I will check in the 
following few days whether that was the case and try to figure out some 
alternative.

> ClassNotFoundException: scala.Any when there's a missing implicit Map encoder
> -
>
> Key: SPARK-23251
> URL: https://issues.apache.org/jira/browse/SPARK-23251
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
> Environment: mac os high sierra, centos 7
>Reporter: Bruce Robbins
>Priority: Minor
>
> In branch-2.2, when you attempt to use row.getValuesMap[Any] without an 
> implicit Map encoder, you get a nice descriptive compile-time error:
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> :26: error: Unable to find encoder for type stored in a Dataset.  
> Primitive types (Int, String, etc) and Product types (case classes) are 
> supported by importing spark.implicits._  Support for serializing other types 
> will be added in future releases.
>        df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
>              ^
> scala> implicit val mapEncoder = 
> org.apache.spark.sql.Encoders.kryo[Map[String, Any]]
> mapEncoder: org.apache.spark.sql.Encoder[Map[String,Any]] = class[value[0]: 
> binary]
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> res1: Array[Map[String,Any]] = Array(Map(stationName -> 007026 9, year -> 
> 2014), Map(stationName -> 007026 9, year -> 2014), Map(stationName -> 
> 007026 9, year -> 2014),
> etc...
> {noformat}
>  
>  On the latest master and also on branch-2.3, the transformation compiles (at 
> least on spark-shell), but throws a ClassNotFoundException:
>  
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> java.lang.ClassNotFoundException: scala.Any
>  at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at java.lang.Class.forName0(Native Method)
>  at java.lang.Class.forName(Class.java:348)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.javaClass(JavaMirrors.scala:555)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1211)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache$$anonfun$toJava$1.apply(TwoWayCaches.scala:49)
>  at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
>  at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache.toJava(TwoWayCaches.scala:44)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.classToJava(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:194)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:700)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:65)
>  at 
> scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:64)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:512)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:445)
>  at 
> 

[jira] [Commented] (SPARK-23251) ClassNotFoundException: scala.Any when there's a missing implicit Map encoder

2018-01-30 Thread Bruce Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346246#comment-16346246
 ] 

Bruce Robbins commented on SPARK-23251:
---

[~srowen] This also occurs with compiled apps submitted via spark-submit.

For example, this app:
{code:java}
object Implicit1 {
 def main(args: Array[String]) {
 if (args.length < 1) {
 Console.err.println("No input file specified")
 System.exit(1)
 }
 val inputFilename = args(0)

 val spark = SparkSession.builder().appName("Implicit1").getOrCreate()
 import spark.implicits._

 val df = spark.read.json(inputFilename)
 //implicit val mapEncoder = org.apache.spark.sql.Encoders.kryo[Map[String, 
Any]]
 val results = df.map(row => row.getValuesMap[Any](List("stationName", 
"year"))).take(15)
 results.foreach(println)
 }
}{code}
When run on Spark 2.3 (via spark-submit), I get the same exception as I see 
with spark-shell.

With the implicit mapEncoder line uncommented, this compiles and runs fine on 
both 2.2 and 2.3.

Here's the exception from spark-submit on spark 2.3:
{noformat}
bash-3.2$ ./bin/spark-submit --version
Welcome to
  __
 / __/__ ___ _/ /__
 _\ \/ _ \/ _ `/ __/ '_/
 /___/ .__/\_,_/_/ /_/\_\ version 2.3.1-SNAPSHOT
 /_/
 
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_161
Branch branch-2.3
Compiled by user brobbins on 2018-01-28T01:25:18Z
Revision 3b6fc286d105ae7de737c46e50cf941e6831ab98
Url https://github.com/apache/spark.git
Type --help for more information.

bash-3.2$ ./bin/spark-submit --class "Implicit1" 
~/github/sparkAppPlay/target/scala-2.11/temps_2.11-1.0.jar 
~/ncdc_gsod_short.jsonl 
..
Exception in thread "main" java.lang.ClassNotFoundException: scala.Any
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:348)
 at 
scala.reflect.runtime.JavaMirrors$JavaMirror.javaClass(JavaMirrors.scala:555)
 at 
scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1211)
 at 
scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1203)
 at 
scala.reflect.runtime.TwoWayCaches$TwoWayCache$$anonfun$toJava$1.apply(TwoWayCaches.scala:49)
 at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
 at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
 at scala.reflect.runtime.TwoWayCaches$TwoWayCache.toJava(TwoWayCaches.scala:44)
 at 
scala.reflect.runtime.JavaMirrors$JavaMirror.classToJava(JavaMirrors.scala:1203)
 at 
scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:194)
 at 
scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:700)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:65)
 at 
scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:64)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:512)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:445)
 at 
scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:445)
 at 
org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:434)
 at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:71)
 at org.apache.spark.sql.SQLImplicits.newMapEncoder(SQLImplicits.scala:172)
 at Implicit1$.main(Implicit1.scala:17)
 at Implicit1.main(Implicit1.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 

[jira] [Commented] (SPARK-23251) ClassNotFoundException: scala.Any when there's a missing implicit Map encoder

2018-01-30 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346075#comment-16346075
 ] 

Sean Owen commented on SPARK-23251:
---

Hm. I don't know is this is related to Encoders and the mechanism you cite, not 
directly. The error is that {{scala.Any}} can't be found, which of course must 
certainly be available. This is typically a classloader issue, and in 
{{spark-shell}} the classloader situation is complicated.

It may be a real problem still, or at least a symptom of a known class of 
problems. But can you confirm that this doesn't happen without the shell?

> ClassNotFoundException: scala.Any when there's a missing implicit Map encoder
> -
>
> Key: SPARK-23251
> URL: https://issues.apache.org/jira/browse/SPARK-23251
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
> Environment: mac os high sierra, centos 7
>Reporter: Bruce Robbins
>Priority: Minor
>
> In branch-2.2, when you attempt to use row.getValuesMap[Any] without an 
> implicit Map encoder, you get a nice descriptive compile-time error:
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> :26: error: Unable to find encoder for type stored in a Dataset.  
> Primitive types (Int, String, etc) and Product types (case classes) are 
> supported by importing spark.implicits._  Support for serializing other types 
> will be added in future releases.
>        df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
>              ^
> scala> implicit val mapEncoder = 
> org.apache.spark.sql.Encoders.kryo[Map[String, Any]]
> mapEncoder: org.apache.spark.sql.Encoder[Map[String,Any]] = class[value[0]: 
> binary]
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> res1: Array[Map[String,Any]] = Array(Map(stationName -> 007026 9, year -> 
> 2014), Map(stationName -> 007026 9, year -> 2014), Map(stationName -> 
> 007026 9, year -> 2014),
> etc...
> {noformat}
>  
>  On the latest master and also on branch-2.3, the transformation compiles (at 
> least on spark-shell), but throws a ClassNotFoundException:
>  
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> java.lang.ClassNotFoundException: scala.Any
>  at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at java.lang.Class.forName0(Native Method)
>  at java.lang.Class.forName(Class.java:348)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.javaClass(JavaMirrors.scala:555)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1211)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache$$anonfun$toJava$1.apply(TwoWayCaches.scala:49)
>  at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
>  at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache.toJava(TwoWayCaches.scala:44)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.classToJava(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:194)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:700)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:65)
>  at 
> scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:64)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:512)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:445)
>  at 
> 

[jira] [Commented] (SPARK-23251) ClassNotFoundException: scala.Any when there's a missing implicit Map encoder

2018-01-30 Thread Bruce Robbins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346050#comment-16346050
 ] 

Bruce Robbins commented on SPARK-23251:
---

I commented out the following line in 

sql/core/src/main/scala/org/apache/spark/sql/SQLImplicits.scala and the problem 
went away:

 

 
{code:java}
implicit def newMapEncoder[T <: Map[_, _] : TypeTag]: Encoder[T] = 
ExpressionEncoder()
{code}
 

By "went away", I mean I now had to specify a Map encoder for my map function 
to compile (rather than have it compile and then throw an exception).

Checking with [~michalsenkyr], who will know more than I do.

> ClassNotFoundException: scala.Any when there's a missing implicit Map encoder
> -
>
> Key: SPARK-23251
> URL: https://issues.apache.org/jira/browse/SPARK-23251
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
> Environment: mac os high sierra, centos 7
>Reporter: Bruce Robbins
>Priority: Minor
>
> In branch-2.2, when you attempt to use row.getValuesMap[Any] without an 
> implicit Map encoder, you get a nice descriptive compile-time error:
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> :26: error: Unable to find encoder for type stored in a Dataset.  
> Primitive types (Int, String, etc) and Product types (case classes) are 
> supported by importing spark.implicits._  Support for serializing other types 
> will be added in future releases.
>        df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
>              ^
> scala> implicit val mapEncoder = 
> org.apache.spark.sql.Encoders.kryo[Map[String, Any]]
> mapEncoder: org.apache.spark.sql.Encoder[Map[String,Any]] = class[value[0]: 
> binary]
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> res1: Array[Map[String,Any]] = Array(Map(stationName -> 007026 9, year -> 
> 2014), Map(stationName -> 007026 9, year -> 2014), Map(stationName -> 
> 007026 9, year -> 2014),
> etc...
> {noformat}
>  
>  On the latest master and also on branch-2.3, the transformation compiles (at 
> least on spark-shell), but throws a ClassNotFoundException:
>  
> {noformat}
> scala> df.map(row => row.getValuesMap[Any](List("stationName", 
> "year"))).collect
> java.lang.ClassNotFoundException: scala.Any
>  at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at java.lang.Class.forName0(Native Method)
>  at java.lang.Class.forName(Class.java:348)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.javaClass(JavaMirrors.scala:555)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1211)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$classToJava$1.apply(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache$$anonfun$toJava$1.apply(TwoWayCaches.scala:49)
>  at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
>  at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
>  at 
> scala.reflect.runtime.TwoWayCaches$TwoWayCache.toJava(TwoWayCaches.scala:44)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.classToJava(JavaMirrors.scala:1203)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:194)
>  at 
> scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.getClassFromType(ScalaReflection.scala:700)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:84)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor$1.apply(ScalaReflection.scala:65)
>  at 
> scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:56)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$class.cleanUpReflectionObjects(ScalaReflection.scala:824)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:39)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$dataTypeFor(ScalaReflection.scala:64)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:512)
>  at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1.apply(ScalaReflection.scala:445)