Egor, I encountered the same problem you asked about in this thread:
http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E
Have you fixed it?
I am using Shark to read a table which I have created on
Much of this sounds related to the memory issue mentioned earlier in this
thread. Are you using a build that has fixed that? That would be by far the
most important thing here.
If the raw memory requirement is 8GB, the actual heap size needed could be a
lot larger -- object overhead, plus everything else on the heap.
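To make that overhead point concrete, here is a rough back-of-the-envelope sketch. The 2.5x factor is an illustrative assumption for JVM object overhead (headers, boxing, pointers), not a measured Spark number:

```python
def estimated_heap_gb(raw_gb, overhead_factor=2.5):
    """Rough heap estimate: raw data size inflated by JVM object
    overhead. The default 2.5x factor is an illustrative assumption,
    not a measured value."""
    return raw_gb * overhead_factor

# Under this assumed factor, an 8GB raw requirement could easily
# need around 20GB of heap.
print(estimated_heap_gb(8))  # -> 20.0
```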
Thanks Sean. Looking into executor memory options now...
I am at incubator-spark head. Does that have all the fixes, or do I need
Spark head? I can deploy the Spark head as well...
I am not running implicit feedback yet... I remember the memory enhancements
were mainly for implicit feedback, right?
For ulimit
Hey TD,
This one we just merged into master this morning:
https://spark-project.atlassian.net/browse/SPARK-1322
It should definitely go into the 0.9 branch, because there was a bug in the
semantics of top(), which at this point is unreleased in Python.
I didn't backport it yet because I figured
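For reference, top(n) is expected to return the n largest elements in descending order. A minimal pure-Python sketch of those semantics (not the actual PySpark implementation) is:

```python
import heapq

def top(iterable, n):
    """Return the n largest elements in descending order,
    mirroring the intended semantics of RDD.top()."""
    return heapq.nlargest(n, iterable)

print(top([3, 1, 5, 2, 4], 3))  # -> [5, 4, 3]
```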
@qingyang, spark 0.9.0 works perfectly for me when accessing (reading/writing)
data on HDFS. BTW, if you look at pom.xml, you have to choose the yarn profile
when compiling Spark so that it won't include protobuf 2.4.1 in your final
jars. Here is the command line we use to compile Spark with Hadoop 2.2:
mvn
Cool. It sounds like focusing on sbt-pom-reader would be a good thing for
you guys then.
There are a few... fun... issues around Maven parent projects that are still
running around with sbt-pom-reader, which appear to be fundamental Ivy-Maven
hate-based issues.
In any case, while I'm generally
My Spark also works well with Hadoop 2.2.0, but my Shark does not, because of
a protobuf version problem.
In the Shark directory, I found two versions of protobuf, and both are loaded
into the classpath.
[root@bigdata001 shark-0.9.0]# find . -name proto*.jar
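As a quick way to spot this kind of conflict, a small script like the following (a hypothetical helper, not part of Shark) can flag artifacts that appear with more than one version among a set of jar filenames:

```python
import re
from collections import defaultdict

def find_version_conflicts(jar_names):
    """Group jars by artifact name and report artifacts that appear
    with more than one version on the classpath."""
    versions = defaultdict(set)
    for name in jar_names:
        # Split "<artifact>-<version>.jar"; version must start with a digit.
        m = re.match(r"(.+?)-(\d[\w.]*)\.jar$", name)
        if m:
            versions[m.group(1)].add(m.group(2))
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}

# Illustrative filenames, matching the protobuf clash described above.
jars = ["protobuf-java-2.4.1.jar", "protobuf-java-2.5.0.jar", "guava-14.0.1.jar"]
print(find_version_conflicts(jars))  # -> {'protobuf-java': ['2.4.1', '2.5.0']}
```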