Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread qingyang li
Egor, I encountered the same problem that you asked about in this thread: http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E Have you fixed this problem? I am using Shark to read a table which I have created on

Re: ALS memory limits

2014-03-26 Thread Sean Owen
Much of this sounds related to the memory issue mentioned earlier in this thread. Are you using a build that has fixed that? That would be by far the most important thing here. If the raw memory requirement is 8GB, the actual heap size necessary could be a lot larger -- object overhead, all the other stuff

Re: ALS memory limits

2014-03-26 Thread Debasish Das
Thanks Sean. Looking into executor memory options now... I am at incubator_spark head. Does that have all the fixes, or do I need spark head? I can deploy the spark head as well... I am not running implicit feedback yet... I remember memory enhancements were mainly for implicit, right? For ulimit

Re: Spark 0.9.1 release

2014-03-26 Thread Patrick Wendell
Hey TD, This one we just merged into master this morning: https://spark-project.atlassian.net/browse/SPARK-1322 It should definitely go into the 0.9 branch because there was a bug in the semantics of top() which at this point is unreleased in Python. I didn't backport it yet because I figured

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread yao
@qingyang, Spark 0.9.0 works perfectly for me when accessing (read/write) data on HDFS. BTW, if you look at pom.xml, you have to choose the yarn profile when compiling Spark, so that it won't include protobuf 2.4.1 in your final jars. Here is the command line we use to compile Spark with Hadoop 2.2: mvn

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-03-26 Thread Josh Suereth
Cool. It sounds like focusing on sbt-pom-reader would be a good thing for you guys then. There are a few... fun... issues around maven parent projects that are still running around with sbt-pom-reader that appear to be fundamental ivy-maven hate-based issues. In any case, while I'm generally

Re: Error executing sql using shark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread qingyang li
My Spark also works well with Hadoop 2.2.0, but my Shark does not, because of a protobuf version problem. In the Shark directory, I found two versions of protobuf, and both are loaded into the classpath. [root@bigdata001 shark-0.9.0]# find . -name proto*.jar