[ https://issues.apache.org/jira/browse/SPARK-12350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcelo Vanzin resolved SPARK-12350.
------------------------------------
    Resolution: Fixed
      Assignee: Marcelo Vanzin  (was: Apache Spark)
 Fix Version/s: 2.0.0

> VectorAssembler#transform() initially throws an exception
> ---------------------------------------------------------
>
>                 Key: SPARK-12350
>                 URL: https://issues.apache.org/jira/browse/SPARK-12350
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Shell
>         Environment: sparkShell command from sbt
>            Reporter: Jakob Odersky
>            Assignee: Marcelo Vanzin
>             Fix For: 2.0.0
>
> Calling VectorAssembler.transform() initially throws an exception; subsequent calls work.
>
> h3. Steps to reproduce
> In spark-shell:
> 1. Create a dummy dataframe and define an assembler:
> {code}
> import org.apache.spark.ml.feature.VectorAssembler
> val df = sc.parallelize(List((1,2), (3,4))).toDF
> val assembler = new VectorAssembler().setInputCols(Array("_1", "_2")).setOutputCol("features")
> {code}
> 2. Run:
> {code}
> assembler.transform(df).show
> {code}
> Initially, the following exception is thrown:
> {code}
> 15/12/15 16:20:19 ERROR TransportRequestHandler: Error opening stream /classes/org/apache/spark/sql/catalyst/expressions/Object.class for request from /9.72.139.102:60610
> java.lang.IllegalArgumentException: requirement failed: File not found: /classes/org/apache/spark/sql/catalyst/expressions/Object.class
> 	at scala.Predef$.require(Predef.scala:233)
> 	at org.apache.spark.rpc.netty.NettyStreamManager.openStream(NettyStreamManager.scala:60)
> 	at org.apache.spark.network.server.TransportRequestHandler.processStreamRequest(TransportRequestHandler.java:136)
> 	at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:106)
> 	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
> 	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> Subsequent calls work:
> {code}
> +---+---+---------+
> | _1| _2| features|
> +---+---+---------+
> |  1|  2|[1.0,2.0]|
> |  3|  4|[3.0,4.0]|
> +---+---+---------+
> {code}
> It seems as though there is some internal state that is not initialized.
> [~iyounus] originally found this issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
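For convenience, the two reproduction steps above can be pasted into spark-shell as one session. This is a sketch only: it assumes a Spark 1.6.x shell, where `sc` and the `toDF` implicits are pre-imported by the REPL, so it will not compile as a standalone program.

{code}
// Paste into spark-shell (assumes Spark 1.6.x; `sc` and the toDF
// implicits come from the shell environment).
import org.apache.spark.ml.feature.VectorAssembler

val df = sc.parallelize(List((1, 2), (3, 4))).toDF
val assembler = new VectorAssembler()
  .setInputCols(Array("_1", "_2"))
  .setOutputCol("features")

// Per the report, the first call fails with the
// "File not found: .../Object.class" error shown above;
// repeating the same call then prints the expected table.
assembler.transform(df).show
{code}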