Thank you for the reply. I had implemented my InputDStream to return None when
there is no data. After changing it to return an empty RDD, the exception is
gone.
I am curious why all the other processing worked correctly with my old,
incorrect implementation, with or without data. My actual code, witho
Looking at the source code of DStream.scala:
> /**
>  * Return a new DStream in which each RDD has a single element generated
>  * by counting each RDD of this DStream.
>  */
> def count(): DStream[Long] = {
>   this.map(_ => (null, 1L))
>       .transform(_.union(context.sparkContext.makeRDD(Seq((null, 0L)), 1)))
>       .reduceByKey(_ + _)
>       .map(_._2)
> }
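For anyone else hitting this, here is a minimal sketch of the fix being described. The class name and the fetch helper are placeholders of mine, not code from this thread; the point is only that compute should hand back an empty RDD rather than None when a batch carries no data:

import scala.reflect.ClassTag

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{StreamingContext, Time}
import org.apache.spark.streaming.dstream.InputDStream

// Hypothetical custom input stream; fetch() stands in for whatever source you poll.
class MyInputDStream[T: ClassTag](ssc_ : StreamingContext, fetch: Time => Seq[T])
  extends InputDStream[T](ssc_) {

  override def start(): Unit = {}
  override def stop(): Unit = {}

  override def compute(validTime: Time): Option[RDD[T]] = {
    val data = fetch(validTime)
    if (data.isEmpty) {
      // Returning None here is what triggered the exception; handing back an
      // empty RDD keeps operators such as count() working on empty batches.
      Some(ssc_.sparkContext.parallelize(Seq.empty[T]))
    } else {
      Some(ssc_.sparkContext.parallelize(data))
    }
  }
}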
I have not used this myself; I only watched a presentation on it at Spark Summit 2013.
https://github.com/radlab/sparrow
https://spark-summit.org/talk/ousterhout-next-generation-spark-scheduling-with-sparrow/
This is pure conjecture based on your high scheduling latency and the size of
your cluster, but it seems one way
Assuming "this should not happen", I don't want to have to keep building a
custom version of spark for every new release, thus preferring the
workaround.
I found a workaround by adding "SPARK_CLASSPATH=.../commons-codec-xxx.jar" to
spark-env.sh.
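In other words, the workaround is a single line in conf/spark-env.sh along these lines (the jar path and version are placeholders; the original message elides the exact file name):

# conf/spark-env.sh
# Placeholder path -- point this at the commons-codec jar your application actually needs.
export SPARK_CLASSPATH=/path/to/commons-codec-xxx.jar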
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-Spark-Accumulo-Error-java-lang-NoSuchMethodError-org-apache-commons-codec-binary-Base64-eng-tp7667p8117.html
I used Java Decompiler to inspect the included
"org.apache.commons.codec.binary.Base64" .class file (in the spark-assembly jar),
and for both "encodeBase64" and "decodeBase64" there is only the (byte[])
overload; there is no encodeBase64(String) or decodeBase64(String).
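If it helps anyone reproduce this without a decompiler, a small check like the following (my own sketch, not from the thread) prints which jar Base64 was loaded from and whether the String overload is present; commons-codec 1.4+ has decodeBase64(String), while older releases only have the byte[] form:

import org.apache.commons.codec.binary.Base64

object CodecCheck {
  def main(args: Array[String]): Unit = {
    // Which jar did the Base64 class actually come from?
    val source = classOf[Base64].getProtectionDomain.getCodeSource
    val location = if (source != null) source.getLocation.toString else "unknown"
    println("Base64 loaded from: " + location)

    // Is the String overload there? If not, code compiled against a newer
    // commons-codec fails at runtime with NoSuchMethodError.
    val hasStringOverload = classOf[Base64].getMethods.exists { m =>
      m.getName == "decodeBase64" && m.getParameterTypes.toSeq == Seq(classOf[String])
    }
    println("decodeBase64(String) available: " + hasStringOverload)
  }
}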
I have encountered the reported issue. This conflict appears to come from an
older commons-codec bundled into the spark-assembly jar.
I checked the META-INF/DEPENDENCIES file in the spark-assembly jar from the
official 1.0.0 binary release for CDH4, and found one "commons-codec" entry:
> From: 'The Apache Software Foundation' (http://jakarta.apache.org)
>   - Codec (http://jakarta.apache.org/commons/codec/)
>     commons-codec:commons-codec:ja
It is my understanding that there is no way to make FlumeInputDStream work in
a cluster environment with the current release. Switching to Kafka, if you can,
would be my suggestion, although I have not used KafkaInputDStream myself. There
is a big difference between the Kafka and Flume InputDStreams: KafkaInputDStream
pulls data from the Kafka brokers as a consumer, whereas FlumeInputDStream has
to act as a server that the Flume agent pushes events to, so it must bind to a
specific host and port on the machine where the receiver runs.
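To make the contrast concrete, here is a rough sketch of the Kafka side using the old receiver-based KafkaUtils API (the ZooKeeper address, consumer group, and topic are placeholders of mine, not code from the thread). The receiver is a consumer that connects out to ZooKeeper and the brokers, so Spark can place it on any worker:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaCountSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("kafka-count-sketch")
    val ssc = new StreamingContext(conf, Seconds(30))

    // Pull-based: the receiver subscribes via ZooKeeper and fetches from the
    // brokers, so it does not need to bind a well-known host:port itself.
    val kafkaStream = KafkaUtils.createStream(
      ssc,
      "zk-host:2181",          // placeholder ZooKeeper quorum
      "my-consumer-group",     // placeholder consumer group id
      Map("my-topic" -> 1))    // placeholder topic -> receiver thread count

    kafkaStream.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}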
Hi,
This is my summary of the gap between expected behavior and actual behavior.
FlumeEventCount spark://<master>:7077
Expected: an 'agent' listening on <host>:<port> (bind to). In the context
of Spark, this agent should be running on one of the slaves, which should be
the slave whose ip/hostname is <host>.
Observed:
Dear all,
I encountered a NullPointerException running a simple program like the one below:
> val sparkconf = new SparkConf()
>   .setMaster(master)
>   .setAppName("myapp")
>   // and other setups
>
> val ssc = new StreamingContext(sparkconf, Seconds(30))
> val flume = new FlumeInputDStream(ssc, f
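The quoted program is cut off above, so here is a self-contained approximation of the same setup using the public FlumeUtils.createStream API (the master URL, receiver host, and port are placeholders of mine, not the original poster's values). It also shows where the binding problem bites: the receiver opens a server socket on the given host:port and waits for the Flume agent to push Avro events to it, so that host has to be the worker the receiver is actually scheduled on:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumeEventCountSketch {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setMaster("spark://master-host:7077")   // placeholder master URL
      .setAppName("myapp")

    val ssc = new StreamingContext(sparkConf, Seconds(30))

    // The receiver binds receiver-host:41414 and waits for Flume to push
    // events to it; both values here are placeholders.
    val flume = FlumeUtils.createStream(ssc, "receiver-host", 41414)

    flume.count().map(cnt => "Received " + cnt + " flume events.").print()

    ssc.start()
    ssc.awaitTermination()
  }
}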