Hi, all
*Spark version: bae07e3 [behind 1] fix different versions of commons-lang
dependency and apache/spark#746 addendum*
I have six worker nodes, and four of them hit this NoClassDefFoundError when
I use the start-slaves.sh script on my driver node. However, running ./bin/spark-class
Hi, xiangrui
I checked the stderr of the worker node; yes, it failed to load the implementation
from:
com.github.fommil.netlib.NativeSystemBLAS...
What do you mean by "include breeze-natives or netlib:all"?
Things I've already done:
1. added the breeze and breeze-natives dependencies to the sbt build file
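For reference, declaring those dependencies in an sbt build file might look like the sketch below; the version numbers are illustrative assumptions, not taken from this thread, so check Maven Central for the versions matching your Spark build:

```scala
// build.sbt — illustrative sketch; version numbers are assumptions
libraryDependencies ++= Seq(
  "org.scalanlp" %% "breeze" % "0.7",
  // breeze-natives pulls in the netlib-java native BLAS/LAPACK wrappers
  "org.scalanlp" %% "breeze-natives" % "0.7",
  // alternatively, the netlib "all" artifact bundles every native backend
  "com.github.fommil.netlib" % "all" % "1.1.2"
)
```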
Hi, xiangrui
You said: "It doesn't work if you put the netlib-native jar inside an assembly
jar. Try to mark it provided in the dependencies, and use --jars to
include them with spark-submit. -Xiangrui"
I'm not using an assembly jar which contains everything; I also marked
breeze
Hi
Try using *reduceByKeyLocally*.
Regards
Lukas Nalezenec
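For context, reduceByKeyLocally merges values within each partition and then combines the per-partition results into a Map on the driver, rather than producing a shuffled RDD. A minimal spark-shell sketch (the data is illustrative):

```scala
// In spark-shell; sample data is illustrative
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

// Returns a scala.collection.Map on the driver instead of an RDD,
// so the result must fit in driver memory
val counts: scala.collection.Map[String, Int] = pairs.reduceByKeyLocally(_ + _)
// counts("a") == 4, counts("b") == 2
```

Note the trade-off: this avoids a shuffle but ships the whole result to the driver, so it only makes sense when the set of distinct keys is small.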
On Sun, May 18, 2014 at 3:33 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
Make sure you set up enough reduce partitions so you don’t overload them.
Another thing that may help is checking whether you’ve run out of local
disk space
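Setting the number of reduce partitions explicitly, as suggested above, can be done by passing a partition count to reduceByKey; a spark-shell sketch, where the value 200 is an arbitrary assumption:

```scala
// In spark-shell; the data and the partition count 200 are illustrative
val pairs = sc.parallelize(1 to 100000).map(i => (i % 1000, 1))

// Explicitly request 200 reduce partitions so no single reducer
// is overloaded during the shuffle
val reduced = pairs.reduceByKey(_ + _, 200)
// reduced.partitions.length == 200
```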
Doesn’t using an HDFS path pattern then restrict the URI to an HDFS URI? Since
Spark supports several FS schemes, I’m unclear about how much to assume about
using the Hadoop file system APIs and conventions. Concretely, if I pass a
pattern in with an HTTPS file system, will the pattern work?
Spark's sc.textFile() method
(https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L456)
delegates to sc.hadoopFile(), which uses Hadoop's
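As an illustration, textFile accepts Hadoop-style glob patterns for any scheme whose FileSystem implementation supports globbing; the paths below are hypothetical:

```scala
// In spark-shell; all paths are hypothetical
val local = sc.textFile("file:///data/logs/2014-05-*.log")  // glob over local files
val hdfs  = sc.textFile("hdfs://namenode:8020/logs/part-*") // same glob syntax on HDFS
val multi = sc.textFile("/logs/a.txt,/logs/b.txt")          // comma-separated paths also work
```

Whether a given scheme honors the pattern thus depends on that scheme's FileSystem implementation, not on Spark itself.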
I see - I didn't realize that scope would work like that. Are you
saying that any variable that is in scope of the lambda passed to map
will be automagically propagated to all workers? What if it's not
explicitly referenced in the map, only used by it? E.g.:
def main:
settings.setSettings
Hi, all
I tried to write data to HBase in a Spark-1.0 rc8 application, but
the application terminated due to a java.lang.IllegalAccessError. The HBase
shell works fine, and the same application works with a standalone HBase
deployment.
java.lang.IllegalAccessError:
I tried HBase 0.96.2/0.98.1/0.98.2; the HDFS version is 2.3.
--
Nan Zhu
Hi Shangyu (and everyone else looking to unsubscribe!),
If you'd like to get off this mailing list, please send an email to
*user-unsubscribe@spark.apache.org*, not the regular user@spark.apache.org list.
How to use the Apache mailing list infrastructure is documented here:
I think the shuffle is unavoidable given that the input partitions
(probably Hadoop input splits in your case) are not arranged in the way a
cogroup job needs. But maybe you can try:
1) co-partition your data for cogroup:
val par = new HashPartitioner(128)
val big =
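A fuller sketch of the co-partitioning idea (the datasets and the partition count 128 are illustrative):

```scala
// In spark-shell; datasets are illustrative
import org.apache.spark.HashPartitioner

val par = new HashPartitioner(128)
// Partition both sides with the SAME partitioner and persist them, so the
// subsequent cogroup sees co-located partitions and avoids re-shuffling
val big   = sc.parallelize(Seq((1, "a"), (2, "b"))).partitionBy(par).cache()
val small = sc.parallelize(Seq((1, "x"), (3, "y"))).partitionBy(par).cache()

// Because both RDDs share `par`, cogroup can run without a new shuffle
val grouped = big.cogroup(small)
```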
Hi,
I'm quite new to Spark Streaming and developed the following
application to pass 4 strings, process them and shut down:
val conf = new SparkConf(false) // skip loading external settings
.setMaster("local[1]") // run locally with one thread
.setAppName(Spark Streaming with
ok
Spark Executor Command: java -cp
When you reference any variable outside the executor's scope, Spark will
automatically serialize it in the driver and send it to the executors,
which implies those variables have to implement Serializable.
For the example you mention, Spark will serialize object F, and if it's
not
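A minimal sketch of this capture-and-serialize behavior (the Settings class is hypothetical, standing in for the "object F" of the example):

```scala
// In spark-shell; Settings is a hypothetical stand-in for "object F"
class Settings(val factor: Int) extends Serializable // must be Serializable

val settings = new Settings(10)
val rdd = sc.parallelize(1 to 5)

// The lambda captures `settings`, so Spark serializes it with the task
// closure and ships it to every executor. If Settings did not implement
// Serializable, job submission would fail with a NotSerializableException.
val scaled = rdd.map(x => x * settings.factor).collect()
// scaled: Array(10, 20, 30, 40, 50)
```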
I am launching a rather large cluster on ec2.
It seems like the launch is taking forever on
Setting up spark
RSYNC'ing /root/spark to slaves...
...
It seems that bittorrent might be a faster way to replicate
the sizeable spark directory to the slaves
particularly if there is a lot of not
Out of curiosity, do you have a library in mind that would make it easy to
set up a BitTorrent network and distribute files in an rsync (i.e., apply a
diff to a tree, ideally) fashion? I'm not familiar with this space, but we
do want to minimize the complexity of our standard ec2 launch scripts to
I am not an expert in this space either. I thought the initial rsync during
launch is really just a straight copy that did not need the tree diff. So
it seemed like having the slaves do the copying among each other would
be better than having the master copy to everyone directly. That made me