Is it possible "tbBER" is empty? Even if it is, count() shouldn't fail like this, of course.
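To rule that out, here's a quick check you could run first. It's a minimal sketch against the Spark 1.3 DataFrame Java API, reusing the sqlContext and temp table name from your message; take(1) pulls at most one row instead of scanning the whole table the way count() does:

    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.Row;

    // Minimal sketch: assumes the sqlContext and the registered temp table
    // "tbBER" from the original message. take(1) fetches at most one row,
    // so it's a cheap way to see whether the table has any data at all.
    DataFrame df = sqlContext.sql("SELECT * FROM tbBER");
    Row[] first = df.take(1);
    if (first.length == 0) {
        System.out.println("tbBER is empty");
    } else {
        System.out.println("First row: " + first[0]);
    }

If take(1) comes back clean but count() still blows up, the table contents aren't the problem and the failure is in task serialization itself.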
Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Wed, Apr 1, 2015 at 5:57 PM, ARose <ashley.r...@telarix.com> wrote:

> Note: I am running Spark on Windows 7 in standalone mode.
>
> In my app, I run the following:
>
>     DataFrame df = sqlContext.sql("SELECT * FROM tbBER");
>     System.out.println("Count: " + df.count());
>
> tbBER is registered as a temp table in my SQLContext. When I try to print
> the number of rows in the DataFrame, the job fails and I get the following
> error message:
>
>     java.io.EOFException
>         at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2747)
>         at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1033)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>         at org.apache.hadoop.io.UTF8.readChars(UTF8.java:216)
>         at org.apache.hadoop.io.UTF8.readString(UTF8.java:208)
>         at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
>         at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
>         at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
>         at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:43)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1137)
>         at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:483)
>         at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>         at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
>         at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> This only happens when I try to call df.count(). The rest runs fine.
> Is the count() function not supported in standalone mode? The stack trace
> makes it appear to be Hadoop functionality...