saveAsTextFile doesn't work on EC2 cluster

2015-08-10 Thread Yasemin Kaya
Hi, I have an EC2 cluster and am using Spark 1.3, YARN and HDFS. When I submit locally there is no problem, but when I run on the cluster, saveAsTextFile doesn't work. It says: User class threw exception: Output directory hdfs://172.31.42.10:54310/./weblogReadResult
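The error message is cut off above, but this exception from Hadoop's FileOutputFormat usually continues "already exists": saveAsTextFile refuses to write into an existing HDFS directory. A common fix is to remove the stale output before resubmitting; the command below is a sketch that reuses the namenode address from the report:

```
# Destructive: removes the previous job output so the next run can write it.
hadoop fs -rm -r hdfs://172.31.42.10:54310/./weblogReadResult
```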

How to programmatically create, submit and report on Spark jobs?

2015-08-10 Thread mark
Hi All, I need to be able to create, submit and report on Spark jobs programmatically in response to events arriving on a Kafka bus. I also need end-users to be able to create data queries that launch Spark jobs 'behind the scenes'. I would expect to use the same API for both, and be able to
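One way to sketch the "create and submit programmatically" part is a driver service that shells out to spark-submit (Spark 1.4+ also ships the org.apache.spark.launcher.SparkLauncher API for exactly this). The function and argument names below are illustrative, not from the thread:

```python
import subprocess

def build_submit_command(app_jar, main_class, args, master="yarn-cluster"):
    """Assemble a spark-submit invocation as an argument list.

    All names here are placeholders; adapt jar, class and master to the
    actual deployment.
    """
    cmd = ["spark-submit", "--master", master, "--class", main_class, app_jar]
    return cmd + list(args)

# Launching and basic reporting: run the command as a child process and
# inspect its exit code (0 means spark-submit itself succeeded).
# exit_code = subprocess.call(
#     build_submit_command("app.jar", "com.example.Query", ["2015-08-10"]))
```

For richer reporting (job progress, not just exit status) one would typically poll the YARN ResourceManager REST API for the submitted application's state.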

Re: How to connect to spark remotely from java

2015-08-10 Thread Zsombor Egyed
Thank you for your response! If I understand correctly, I should set up a YARN cluster on the server/HDP: start the YARN services (nodemanager, resourcemanager, etc.). And I also need to install Spark on my machine, write Java code, build a jar file, and submit it to the server? Am I right? On Mon,
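That is broadly the right flow. Once the YARN services are up, a typical submission from the client machine looks like this sketch (the class name, jar, and config path are placeholders):

```
# HADOOP_CONF_DIR must point at the cluster's configuration so spark-submit
# can locate the remote ResourceManager and HDFS.
export HADOOP_CONF_DIR=/etc/hadoop/conf
spark-submit --master yarn-cluster --class com.example.MyApp my-app.jar
```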

spark vs flink low memory available

2015-08-10 Thread Pa Rö
hi community, i have built a Spark and a Flink k-means application. my test case is clustering 1 million points on a 3-node cluster. when memory becomes the bottleneck, Flink begins to spill to disk and works slowly, but it works. Spark, however, loses executors when memory is full and restarts them (infinitely
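If the Spark executors are dying from memory pressure, one mitigation sketch for this era of Spark (1.3-style settings; the values below are illustrative, not tuned for this workload) is to raise executor memory and shrink the cache fraction so more heap is left for computation:

```
spark-submit --master yarn-cluster \
  --executor-memory 4g \
  --conf spark.storage.memoryFraction=0.4 \
  --class com.example.KMeansJob kmeans.jar
```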

Re: Spark Maven Build

2015-08-10 Thread Benyi Wang
Never mind. Instead of setting the property in the profile (<profile> <id>cdh5.3.2</id> <properties> <hadoop.version>2.5.0-cdh5.3.2</hadoop.version> ... </properties> </profile>), I have to change the property hadoop.version from 2.2.0 to 2.5.0-cdh5.3.2 in spark-parent's pom.xml. Otherwise, maven
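An equivalent way to get the same effect without editing the parent pom, assuming a standard Spark 1.x source build, is to pass the Hadoop version on the Maven command line:

```
# -Phadoop-2.4 is the Spark 1.x build profile for Hadoop 2.4+;
# the CDH version string comes from the thread above.
mvn -Phadoop-2.4 -Dhadoop.version=2.5.0-cdh5.3.2 -DskipTests clean package
```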

Spark Streaming dealing with broken files without dying

2015-08-10 Thread Mario Pastorelli
Hey Sparkers, I would like to use Spark Streaming in production to watch a directory and process files that are put into it. The problem is that some of those files can be broken, leading to an IOException from the input reader. This should be fine for the framework, I think: the exception should
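A minimal sketch of the defensive pattern being asked about, in plain Python rather than the Spark Streaming API: catch the per-file exception and skip the broken file so one bad input does not kill the whole job. The parse functions are illustrative stand-ins for the input reader (ValueError plays the role of the IOException):

```python
def parse_file(lines):
    """Parse one file's lines; raises ValueError on a broken record
    (a stand-in for the IOException a broken input file would cause)."""
    return [int(x) for x in lines]

def parse_safely(files):
    """Process (name, lines) pairs, skipping files that fail to parse."""
    results = []
    for name, lines in files:
        try:
            results.append((name, parse_file(lines)))
        except ValueError:
            # Log and move on; a broken file should not stop the stream.
            print("skipping broken file:", name)
    return results
```

In actual Spark code the same idea would live inside the function passed to the transformation, wrapping the record-level parsing in a try/catch.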
