Hi Nick,

I will try to reproduce your issue in a couple of environments. First, I wanted to ask which kind of environment you use: Spark standalone, Spark on YARN, or Spark on Mesos?

Does it occur for you with any transform() on any RDD, or do you use a specific kind of RDD?

I plan to put your code in a main method and run it with spark-submit: do you use that kind of deployment?

Thanks!
Regards,
JB

On 10/07/2015 07:18 AM, pnpritchard wrote:
Hi Spark community,

I was hoping someone could help me by running the code snippet below in the
Spark shell and checking whether they see the same buggy behavior I see. Full
details of the bug can be found in this JIRA issue I filed:
https://issues.apache.org/jira/browse/SPARK-10942.

The issue was closed as "cannot reproduce"; however, I can't seem to shake
it. I have worked on this for a while, removing all known variables and
trying different versions of Spark (1.5.0, 1.5.1, master) and different OSs
(Mac OS X, Debian Linux). My coworkers have tried as well and see the same
behavior. This has me convinced that I cannot be the only one in the
community able to reproduce this.

If you have a minute or two, please open a Spark shell and copy/paste the
code below. After 30 seconds, check the Storage tab of the Spark UI. If you
see some cached RDDs listed, then the bug has been reproduced. If not, then
there is no bug... and I may be losing my mind.

Thanks in advance!

Nick


------------


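import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

// Streaming context with a 1-second batch interval, built on the shell's SparkContext (sc)
val ssc = new StreamingContext(sc, Seconds(1))

// Queue of 30 single-element RDDs to feed the stream
val inputRDDs = mutable.Queue.tabulate(30) { i =>
  sc.parallelize(Seq(i))
}

val input = ssc.queueStream(inputRDDs)

val output = input.transform { rdd =>
  if (rdd.isEmpty()) {
    rdd
  } else {
    // Cache an intermediate RDD and name it after its element so it is easy to spot in the Storage tab
    val rdd2 = rdd.map(identity)
    rdd2.cache()
    rdd2.setName(rdd.first().toString)
    // Use the cached RDD twice by unioning two maps over it
    val rdd3 = rdd2.map(identity) ++ rdd2.map(identity)
    rdd3
  }
}

output.print()

ssc.start()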





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Help-needed-to-reproduce-bug-tp24965.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
