Re: [SPARK-2878] Kryo serialisation with custom Kryo registrator failing

2014-08-19 Thread Debasish Das
@rxin With the fixes, I could run it fine on top of branch-1.0. On master, when running on YARN, I am getting another KryoException: Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 247 in stage 52.0 failed 4 times, most recent failure: Lost task

Re: mvn test error

2014-08-19 Thread Cheng Lian
Just FYI, in case this is helpful: I'm refactoring the Hive Thrift server test suites. These suites also fork new processes and suffer from similar issues. In the new version of the test suites, stdout and stderr of forked processes are logged with utilities under the scala.sys.process package https://github.c

Re: Spark on YARN webui

2014-08-19 Thread Tom Graves
Yes, the web UI works on YARN. You should be able to go to the YARN ResourceManager UI, which will have a link to the web UI for a running Spark application. You can also set it up to save the history and view it after the application has finished. History info can be found here: Monitoring and Instrumentati

Spark SQL Query and join different data sources.

2014-08-19 Thread alexliu68
Has anyone gotten a query that joins different data sources to work, especially joining a Hive table with other data sources? For example, hql uses HiveContext and first needs a call to "use ", while other data sources use SQLContext; how can SQLContext know about Hive tables? I follow https://spark.apache.org/sql/ e

Re: Data Locality In Spark

2014-08-19 Thread Chris Fregly
and even the same process where the data might be cached. These are the different locality levels: PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY. Relevant code: https://github.com/apache/spark/blob/7712e724ad69dd0b83754e938e9799d13a4d43b9/core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSu
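
As a rough illustration of how the four locality levels listed above are ordered, here is a minimal Python sketch. It is not Spark's scheduler code: the `Locality` enum, the `best_locality` function, and all of its parameters are hypothetical names introduced only for this example, and it assumes a task simply carries sets of executors, hosts, and racks where its data lives.

```python
from enum import IntEnum


class Locality(IntEnum):
    """The four levels named in the post, most local first (illustrative only)."""
    PROCESS_LOCAL = 0  # data is cached in the same executor process
    NODE_LOCAL = 1     # data is on the same host
    RACK_LOCAL = 2     # data is on the same rack
    ANY = 3            # no locality; data may be anywhere


def best_locality(cached_executors, preferred_hosts, preferred_racks,
                  executor_id, host, rack):
    """Return the most local level at which a task could run on this executor.

    All arguments are hypothetical: sets describing where the task's data
    lives, followed by the identity of the executor offering resources.
    """
    if executor_id in cached_executors:
        return Locality.PROCESS_LOCAL
    if host in preferred_hosts:
        return Locality.NODE_LOCAL
    if rack in preferred_racks:
        return Locality.RACK_LOCAL
    return Locality.ANY


# Example: data cached in executor "exec-1" on host "h1" in rack "r1".
level = best_locality({"exec-1"}, {"h1"}, {"r1"}, "exec-2", "h1", "r1")
print(level.name)
```

Here the offer comes from a different executor on the same host, so the sketch falls through the first check and settles on the host match.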

Lost executor on YARN ALS iterations

2014-08-19 Thread Debasish Das
Hi, During the 4th ALS iteration, I am noticing that one of the executors gets disconnected: 14/08/19 23:40:00 ERROR network.ConnectionManager: Corresponding SendingConnectionManagerId not found 14/08/19 23:40:00 INFO cluster.YarnClientSchedulerBackend: Executor 5 disconnected, so removing it 14

Re: Lost executor on YARN ALS iterations

2014-08-19 Thread Xiangrui Meng
Hi Deb, I think this may be the same issue as described in https://issues.apache.org/jira/browse/SPARK-2121 . We know that the container got killed by YARN because it used much more memory than it requested. But we haven't figured out the root cause yet. +Sandy Best, Xiangrui On Tue, Aug 19, 20
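
The failure mode described here, a container killed for using more memory than it requested, can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the function names, the `overhead_mb` parameter, and every number below are hypothetical, and the code does not reproduce YARN's or Spark's actual accounting.

```python
def container_limit_mb(executor_memory_mb, overhead_mb):
    """Memory requested for the container: heap plus an assumed off-heap allowance."""
    return executor_memory_mb + overhead_mb


def exceeds_limit(observed_mb, executor_memory_mb, overhead_mb):
    """True if the observed process memory is above the requested container size."""
    return observed_mb > container_limit_mb(executor_memory_mb, overhead_mb)


# Hypothetical numbers: a 4096 MB executor with a 384 MB allowance.
limit = container_limit_mb(4096, 384)
print(limit)
print(exceeds_limit(4600, 4096, 384))
```

In this toy setup a process observed at 4600 MB is over the requested size, which is the kind of condition under which a resource manager would kill the container.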