Re: Bind Exception

2015-01-19 Thread Deep Pradhan
I closed the Spark Shell and tried again, but there was no change. Here is the error: 15/01/17 14:33:39 INFO AbstractConnector: Started SocketConnector@0.0.0.0:59791 15/01/17 14:33:39 INFO Server: jetty-8.y.z-SNAPSHOT 15/01/17 14:33:39 WARN AbstractLifeCycle: FAILED

Bind Exception

2015-01-19 Thread Deep Pradhan
Hi, I am running a Spark job. I get the output correctly but when I see the logs file I see the following: AbstractLifeCycle: FAILED.: java.net.BindException: Address already in use... What could be the reason for this? Thank You

Re: Why custom parquet format hive table execute ParquetTableScan physical plan, not HiveTableScan?

2015-01-19 Thread Xiaoyu Wang
The spark.sql.parquet.filterPushdown=true flag has been turned on. But when spark.sql.hive.convertMetastoreParquet is set to false, the first parameter loses its effect! 2015-01-20 6:52 GMT+08:00 Yana Kadiyska yana.kadiy...@gmail.com: If you're talking about filter pushdowns for parquet files
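A minimal sketch of how the two flags interact, assuming a HiveContext and the behaviour described in this thread (the table name is hypothetical):

  import org.apache.spark.sql.hive.HiveContext
  val hiveContext = new HiveContext(sc)
  // enable Parquet filter pushdown
  hiveContext.setConf("spark.sql.parquet.filterPushdown", "true")
  // keep metastore Parquet conversion on; with it set to false the table is read
  // via HiveTableScan and the pushdown flag above no longer applies
  hiveContext.setConf("spark.sql.hive.convertMetastoreParquet", "true")
  hiveContext.sql("SELECT * FROM my_parquet_table WHERE id = 1")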

Re: Bind Exception

2015-01-19 Thread Deep Pradhan
I had the Spark Shell running throughout. Is it because of that? On Tue, Jan 20, 2015 at 9:47 AM, Ted Yu yuzhih...@gmail.com wrote: Was there another instance of Spark running on the same machine ? Can you pastebin the full stack trace ? Cheers On Mon, Jan 19, 2015 at 8:11 PM, Deep

Re: Bind Exception

2015-01-19 Thread Deep Pradhan
Yes, I have increased the driver memory in spark-default.conf to 2g. Still the error persists. On Tue, Jan 20, 2015 at 10:18 AM, Ted Yu yuzhih...@gmail.com wrote: Have you seen these threads ? http://search-hadoop.com/m/JW1q5tMFlb http://search-hadoop.com/m/JW1q5dabji1 Cheers On Mon, Jan

Re: Finding most occurrences in a JSON Nested Array

2015-01-19 Thread Pankaj Narang
I just checked the post. Do you still need help? I think getAs[Seq[String]] should help. If you are still stuck let me know. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Finding-most-occurrences-in-a-JSON-Nested-Array-tp20971p21252.html Sent from
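For reference, a minimal sketch of the getAs approach on a SchemaRDD row; the column name, JSON shape and file path are assumptions, not from the original post:

  // count how often each string occurs inside a nested JSON array column
  val people = sqlContext.jsonFile("people.json")   // e.g. {"name":"a","tags":["x","y"]}
  people.registerTempTable("people")
  val counts = sqlContext.sql("SELECT tags FROM people")
    .flatMap(row => row.getAs[Seq[String]](0))      // pull the nested array out of the Row
    .map(tag => (tag, 1))
    .reduceByKey(_ + _)
    .sortBy(_._2, ascending = false)
  counts.take(10).foreach(println)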

Re: If an RDD appeared twice in a DAG, of which calculation is triggered by a single action, will this RDD be calculated twice?

2015-01-19 Thread Tobias Pfeiffer
Hi, On Sat, Jan 17, 2015 at 3:37 AM, Peng Cheng pc...@uow.edu.au wrote: I'm talking about RDD1 (not persisted or checkpointed) in this situation: ...(somewhere) - RDD1 - RDD2 ...

RE: MatchError in JsonRDD.toLong

2015-01-19 Thread Wang, Daoyuan
Yes, actually that is what I mean exactly. And maybe you missed my last response: you can use the API jsonRDD(json:RDD[String], schema:StructType) to explicitly specify your schema. For numbers bigger than Long, we can use DecimalType. Thanks, Daoyuan From: Tobias Pfeiffer
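A small sketch of the suggested API with a made-up schema; the field names are placeholders, and DecimalType.Unlimited is an assumption about the 1.2 API:

  import org.apache.spark.sql._   // data types (their location differs between Spark versions)
  // declare the numeric field as DecimalType so values larger than Long
  // don't hit the MatchError in JsonRDD.toLong during schema inference
  val schema = StructType(Seq(
    StructField("id", StringType, nullable = false),
    StructField("amount", DecimalType.Unlimited, nullable = true)))
  val json = sc.parallelize(Seq("""{"id":"a","amount":123456789012345678901234}"""))
  val table = sqlContext.jsonRDD(json, schema)
  table.registerTempTable("events")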

How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread jagaximo
I want to compute an RDD[(String, Set[String])] in which some of the Set[String] values are very large. -- val hoge: RDD[(String, Set[String])] = ... val reduced = hoge.reduceByKey(_ ++ _) // creates a very large Set (shuffle read size 7GB) val counted = reduced.map{ case (key, strSeq) =

Re: Does Spark automatically run different stages concurrently when possible?

2015-01-19 Thread Ashish
Sean, a related question: when should the RDD be persisted, after step 2 or after step 3 (nothing would happen before step 3, I assume)? On Mon, Jan 19, 2015 at 5:17 PM, Sean Owen so...@cloudera.com wrote: From the OP: (1) val lines = Import full dataset using sc.textFile (2) val ABonly = Filter out

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread Pankaj Narang
Instead of counted.saveAsText("/path/to/save/dir"), if you call counted.collect what happens? If you still face the same issue please paste the stacktrace here. -- View this message in context:

Re: Bind Exception

2015-01-19 Thread Ted Yu
Have you seen these threads ? http://search-hadoop.com/m/JW1q5tMFlb http://search-hadoop.com/m/JW1q5dabji1 Cheers On Mon, Jan 19, 2015 at 8:33 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi Ted, When I am running the same job with small data, I am able to run. But when I run it with

Re: How to get the master URL at runtime inside driver program?

2015-01-19 Thread Tobias Pfeiffer
Hi, On Sun, Jan 18, 2015 at 11:08 AM, guxiaobo1982 guxiaobo1...@qq.com wrote: Driver programs submitted by the spark-submit script will get the runtime spark master URL, but how does it get the URL inside the main method when creating the SparkConf object? The master will be stored in the
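For completeness, a tiny sketch of reading the master back out at runtime; spark.master is the standard property that spark-submit fills in:

  import org.apache.spark.{SparkConf, SparkContext}
  val conf = new SparkConf().setAppName("WhereAmIRunning")   // no setMaster here
  val sc = new SparkContext(conf)
  // spark-submit has already injected the master URL into the conf
  val masterUrl = sc.getConf.get("spark.master")
  println("running against master: " + masterUrl)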

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread Kevin Jung
As far as I know, the tasks before calling saveAsText are transformations, so they are lazily computed. Then the saveAsText action performs all transformations and your Set[String] grows up at this time. It creates a large collection if you have few keys, and this easily causes OOM when your executor

Re: Issues with constants in Spark HiveQL queries

2015-01-19 Thread Pala M Muthaia
Yes we tried the master branch (sometime last week) and there was no issue, but the above repro is for branch 1.2 and Hive 0.13. Isn't that the final release branch for Spark 1.2? If so, a patch needs to be created or back-ported from master? (Yes the obvious typo in the column name was

Re: MatchError in JsonRDD.toLong

2015-01-19 Thread Tobias Pfeiffer
Hi, On Fri, Jan 16, 2015 at 6:14 PM, Wang, Daoyuan daoyuan.w...@intel.com wrote: The second parameter of jsonRDD is the sampling ratio when we infer schema. OK, I was aware of this, but I guess I understand the problem now. My sampling ratio is so low that I only see the Long values of data

How to output to S3 and keep the order

2015-01-19 Thread anny9699
Hi, I am using Spark on AWS and want to write the output to S3. It is a relatively small file and I don't want it to be output as multiple parts. So I use result.repartition(1).saveAsTextFile(s3://...) However as long as I am using the saveAsTextFile method, the output doesn't keep the original

Re: Why custom parquet format hive table execute ParquetTableScan physical plan, not HiveTableScan?

2015-01-19 Thread Yana Kadiyska
If you're talking about filter pushdowns for parquet files this also has to be turned on explicitly. Try spark.sql.parquet.filterPushdown=true. It's off by default. On Mon, Jan 19, 2015 at 3:46 AM, Xiaoyu Wang wangxy...@gmail.com wrote: Yes it works! But the filter can't pushdown!!! If

Aggregations based on sort order

2015-01-19 Thread justin.uang
Hi, I am trying to aggregate a key based on some timestamp, and I believe that spilling to disk is changing the order of the data fed into the combiner. I have some timeseries data that is of the form: (key, date, other data) Partition 1 (A, 2, ...) (B, 4, ...) (A, 1, ...)

Re: How to output to S3 and keep the order

2015-01-19 Thread Aniket Bhatnagar
When you repartition, ordering can get lost. You would need to sort after repartitioning. Aniket On Tue, Jan 20, 2015, 7:08 AM anny9699 anny9...@gmail.com wrote: Hi, I am using Spark on AWS and want to write the output to S3. It is a relatively small file and I don't want them to output as
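A sketch of that suggestion (the ordering key is hypothetical): repartition to a single partition first, then sort, so the single output part file keeps the desired order:

  // sortBy keeps one partition here because numPartitions defaults to the current count
  result
    .repartition(1)
    .sortBy(line => line.split(",")(0))        // hypothetical ordering key
    .saveAsTextFile("s3://my-bucket/output")   // bucket/path are placeholders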

Re: Bind Exception

2015-01-19 Thread Ted Yu
Was there another instance of Spark running on the same machine ? Can you pastebin the full stack trace ? Cheers On Mon, Jan 19, 2015 at 8:11 PM, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I am running a Spark job. I get the output correctly but when I see the logs file I see the

Re: Does Spark automatically run different stages concurrently when possible?

2015-01-19 Thread davidkl
Hi Jon, I am looking for an answer to a similar question in the doc now, so far no clue. I would need to know what Spark's behaviour is in a situation like the example you provided, but also taking into account that there are multiple partitions/workers. I could imagine it's possible that

Need some help to create user defined type for ML pipeline

2015-01-19 Thread Jaonary Rabarisoa
Hi all, I'm trying to implement a pipeline for computer vision based on the latest ML package in spark. The first step of my pipeline is to decode images (jpeg for instance) stored in a parquet file. For this, I begin to create a UserDefinedType that represents a decoded image stored in an array of

Re: Newbie Question on How Tasks are Executed

2015-01-19 Thread davidkl
Hello Mixtou, if you want to look at partition ID, I believe you want to use mapPartitionsWithIndex -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Newbie-Question-on-How-Tasks-are-Executed-tp21064p21228.html Sent from the Apache Spark User List mailing
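A small sketch of mapPartitionsWithIndex, tagging each record with the ID of the partition it lives in (the data is made up):

  val rdd = sc.parallelize(1 to 10, numSlices = 3)
  // the first argument of the closure is the partition index
  val tagged = rdd.mapPartitionsWithIndex { (partitionId, iter) =>
    iter.map(value => (partitionId, value))
  }
  tagged.collect().foreach(println)   // e.g. (0,1) (0,2) ... (2,10)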

Re: Why custom parquet format hive table execute ParquetTableScan physical plan, not HiveTableScan?

2015-01-19 Thread Xiaoyu Wang
Yes it works! But the filter can't be pushed down! What if the custom ParquetInputFormat only implements the data source API? https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala 2015-01-16 21:51 GMT+08:00 Xiaoyu Wang wangxy...@gmail.com: Thanks

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread Kevin (Sangwoo) Kim
In your code, you're combining large sets, like (set1 ++ set2).size, which is not a good idea. (rdd1 ++ rdd2).distinct is an equivalent implementation and will compute in a distributed manner. Not very sure whether your computation on keyed sets can be transformed into RDDs. Regards,
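One way to sketch the same idea without materializing one huge Set per key is to flatten to (key, element) pairs and let distinct run distributed; this is an alternative not spelled out in the thread, so treat it as a sketch:

  // hoge: RDD[(String, Set[String])] as in the original post
  val uniquePerKey = hoge
    .flatMap { case (key, values) => values.map(v => (key, v)) }   // (key, element) pairs
    .distinct()                                                    // distributed de-duplication
    .map { case (key, _) => (key, 1L) }
    .reduceByKey(_ + _)                                            // unique count per key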

com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

2015-01-19 Thread Hafiz Mujadid
Hi all! I am trying to use kinesis and spark streaming together. So when I execute the program I get the exception com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain Here is my piece of code val credentials = new

Re: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

2015-01-19 Thread Akhil Das
Try this piece of code: System.setProperty("AWS_ACCESS_KEY_ID", access_key) System.setProperty("AWS_SECRET_KEY", secret) val streamName = "mystream" val endpointUrl = "https://kinesis.us-east-1.amazonaws.com/" val kinesisClient = new AmazonKinesisClient(new
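The snippet above is cut off; a hedged completion of the same idea with the AWS Java SDK (stream name, endpoint and variable names are placeholders):

  import com.amazonaws.auth.BasicAWSCredentials
  import com.amazonaws.services.kinesis.AmazonKinesisClient
  val accessKey = sys.env("AWS_ACCESS_KEY_ID")
  val secretKey = sys.env("AWS_SECRET_KEY")
  // build the client from explicit credentials instead of the default provider chain
  val kinesisClient = new AmazonKinesisClient(new BasicAWSCredentials(accessKey, secretKey))
  kinesisClient.setEndpoint("https://kinesis.us-east-1.amazonaws.com")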

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread jagaximo
What I want to do is get a unique count for each key, so a plain map() or countByKey() doesn't give a unique count (because duplicate strings would be counted)... -- View this message in context:

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread Kevin (Sangwoo) Kim
If there are not too many keys, you can do it like this: val data = List( ("A", Set(1,2,3)), ("A", Set(1,2,4)), ("B", Set(1,2,3)) ) val rdd = sc.parallelize(data) rdd.persist() rdd.filter(_._1 == "A").flatMap(_._2).distinct.count rdd.filter(_._1 == "B").flatMap(_._2).distinct.count rdd.unpersist() == data:

Re: Bind Exception

2015-01-19 Thread Prashant Sharma
Deep, yes, you have another Spark shell or application sticking around somewhere. Try to inspect the running processes, look out for the java process, and kill it. This might be helpful https://www.digitalocean.com/community/tutorials/how-to-use-ps-kill-and-nice-to-manage-processes-in-linux Also, That

Re: Join DStream With Other Datasets

2015-01-19 Thread Sean Owen
I don't think this has anything to do with transferring anything from the driver, or per task. I'm talking about a singleton object in the JVM that loads whatever you want from wherever you want and holds it in memory once per JVM. That is, I do not think you have to use broadcast, or even any
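A rough sketch of that singleton idea in Scala (the file path and value type are placeholders): the object is initialized lazily, once per executor JVM, and every task on that executor reuses it without any broadcast:

  object ReferenceData {
    // loaded at most once per JVM, on first use by any task
    lazy val lookup: Map[String, String] =
      scala.io.Source.fromFile("/path/on/every/executor/ref.csv")
        .getLines()
        .map { line => val Array(k, v) = line.split(","); (k, v) }
        .toMap
  }
  // inside a DStream/RDD operation, just reference it
  val enriched = dstream.map(r => (r, ReferenceData.lookup.getOrElse(r, "unknown")))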

Re: Guava 11 dependency issue in Spark 1.2.0

2015-01-19 Thread Romi Kuntsman
I have recently encountered a similar problem with Guava version collision with Hadoop. Isn't it more correct to upgrade Hadoop to use the latest Guava? Why are they staying in version 11, does anyone know? *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com On Wed, Jan 7, 2015 at 7:59

Fwd: UnknownhostException : home

2015-01-19 Thread Kartheek.R
-- Forwarded message -- From: Rapelly Kartheek kartheek.m...@gmail.com Date: Mon, Jan 19, 2015 at 3:03 PM Subject: UnknownhostException : home To: user@spark.apache.org user@spark.apache.org Hi, I get the following exception when I run my application:

Re: Guava 11 dependency issue in Spark 1.2.0

2015-01-19 Thread Romi Kuntsman
Actually there is already someone on Hadoop-Common-Dev taking care of removing the old Guava dependency http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201501.mbox/browser https://issues.apache.org/jira/browse/HADOOP-11470 *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com

Re: Guava 11 dependency issue in Spark 1.2.0

2015-01-19 Thread Ted Yu
Please see this thread: http://search-hadoop.com/m/LgpTk2aVYgr/Hadoop+guava+upgradesubj=Re+Time+to+address+the+Guava+version+problem On Jan 19, 2015, at 6:03 AM, Romi Kuntsman r...@totango.com wrote: I have recently encountered a similar problem with Guava version collision with Hadoop.

Spark SQL: Assigning several aliases to the output (several return values) of an UDF

2015-01-19 Thread mucks17
Hello I use Hive on Spark and have an issue with assigning several aliases to the output (several return values) of a UDF. I ran into several issues and ended up with a workaround (described at the end of this message). - Is assigning several aliases to the output of a UDF not supported by

[SQL] Using HashPartitioner to distribute by column

2015-01-19 Thread Mick Davies
Is it possible to use a HashPartitioner or something similar to distribute a SchemaRDD's data by the hash of a particular column or set of columns? Having done this I would then hope that GROUP BY could avoid a shuffle E.g. set up a HashPartitioner on the CustomerCode field so that SELECT CustomerCode,
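On plain pair RDDs this is what HashPartitioner provides; a sketch keyed by a hypothetical CustomerCode field (whether Spark SQL 1.2 can exploit such partitioning for GROUP BY is exactly the open question here):

  import org.apache.spark.HashPartitioner
  // rows keyed by the column we want co-located, e.g. (customerCode, fullRow)
  val byCustomer = rows.map(r => (r.customerCode, r))   // rows/customerCode are placeholders
  val partitioned = byCustomer.partitionBy(new HashPartitioner(64)).persist()
  // later aggregations on the same key reuse the partitioning and avoid another shuffle
  val counts = partitioned.mapValues(_ => 1L).reduceByKey(_ + _)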

Spark 1.2.0 resource issue with Mesos 0.21.1

2015-01-19 Thread Brian Belgodere
:44.119282 19317 sched.cpp:242] No credentials provided. Attempting to register without authentication I0119 02:41:44.123064 19317 sched.cpp:408] Framework registered with 20150119-003609-201523392-5050-7198-0002 15/01/19 02:41:44 INFO MesosSchedulerBackend: Registered as framework ID 20150119-003609

unit tests with java.io.IOException: Could not create FileClient

2015-01-19 Thread Jianguo Li
Hi, I created some unit tests to test some of the functions in my project which use Spark. However, when I used the sbt tool to build it and then ran the sbt test, I ran into java.io.IOException: Could not create FileClient: 2015-01-19 08:50:38,1894 ERROR Client

Re: If an RDD appeared twice in a DAG, of which calculation is triggered by a single action, will this RDD be calculated twice?

2015-01-19 Thread Xuefeng Wu
I think it's always calculated twice; could you provide a demo case where RDD1 is calculated only once? On Sat, Jan 17, 2015 at 2:37 AM, Peng Cheng pc...@uow.edu.au wrote: I'm talking about RDD1 (not persisted or checkpointed) in this situation: ...(somewhere) - RDD1 - RDD2

Re: UnknownhostException : home

2015-01-19 Thread Ashish
It's not able to resolve 'home' to an IP. Assuming it's your local machine, add an entry like the following in your /etc/hosts file and then run the program again (use sudo to edit the file): 127.0.0.1 home On Mon, Jan 19, 2015 at 3:03 PM, Rapelly Kartheek kartheek.m...@gmail.com wrote: Hi, I get the

Re: UnknownhostException : home

2015-01-19 Thread Sean Owen
Sorry, to be clear, you need to write hdfs:///home/ Note three slashes; there is an empty host between the 2nd and 3rd. This is true of most URI schemes with a host. On Mon, Jan 19, 2015 at 9:56 AM, Rapelly Kartheek kartheek.m...@gmail.com wrote: Yes yes.. hadoop/etc/hadoop/hdfs-site.xml

Re: UnknownhostException : home

2015-01-19 Thread Ashish
+1 to Sean's suggestion On Mon, Jan 19, 2015 at 3:21 PM, Sean Owen so...@cloudera.com wrote: I bet somewhere you have a path like hdfs://home/... which would suggest that 'home' is a hostname, when I imagine you mean it as a root directory. On Mon, Jan 19, 2015 at 9:33 AM, Rapelly Kartheek

Re: using hiveContext to select a nested Map-data-type from an AVROmodel+parquet file

2015-01-19 Thread BB
I am quoting the reply I got on this - which for some reason did not get posted here. The suggestion in the reply below worked perfectly for me. The error mentioned in the reply is not related (or old). Hope this is helpful to someone. Cheers, BB Hi, BB Ideally you can do the query like:

is there documentation on spark sql catalyst?

2015-01-19 Thread critikaled
Where can I find a good documentation on sql catalyst? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/is-there-documentation-on-spark-sql-catalyst-tp21232.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: UnknownhostException : home

2015-01-19 Thread Rapelly Kartheek
Yes yes.. hadoop/etc/hadoop/hdfs-site.xml file has the path like: hdfs://home/... On Mon, Jan 19, 2015 at 3:21 PM, Sean Owen so...@cloudera.com wrote: I bet somewhere you have a path like hdfs://home/... which would suggest that 'home' is a hostname, when I imagine you mean it as a root

possible memory leak when re-creating SparkContext's in the same JVM

2015-01-19 Thread Noam Barcay
Problem we're seeing is a gradual memory leak in the driver's JVM. Executing jobs using a long running Java app which creates relatively short-lived SparkContext's. So our Spark drivers are created as part of this application's JVM. We're using standalone cluster mode, spark 1.0.2 Root cause of

Is there any way to support multiple users executing SQL on thrift server?

2015-01-19 Thread Yi Tian
Is there any way to support multiple users executing SQL on one thrift server? I think there are some problems for spark 1.2.0, for example: 1. Start thrift server with user A 2. Connect to thrift server via beeline with user B 3. Execute “insert into table dest select … from table src” then

Re: ALS.trainImplicit running out of mem when using higher rank

2015-01-19 Thread Sean Owen
The problem is clearly to do with the executor exceeding YARN allocations, so, this can't be in local mode. He said this was running on YARN at the outset. On Mon, Jan 19, 2015 at 2:27 AM, Raghavendra Pandey raghavendra.pan...@gmail.com wrote: If you are running spark in local mode, executor

Re: Spark SQL Parquet - data are reading very very slow

2015-01-19 Thread Mick Davies
Added a JIRA to track https://issues.apache.org/jira/browse/SPARK-5309 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Parquet-data-are-reading-very-very-slow-tp21061p21229.html Sent from the Apache Spark User List mailing list archive at

UnknownhostException : home

2015-01-19 Thread Rapelly Kartheek
Hi, I get the following exception when I run my application: karthik@karthik:~/spark-1.2.0$ ./bin/spark-submit --class org.apache.spark.examples.SimpleApp001 --deploy-mode client --master spark://karthik:7077 $SPARK_HOME/examples/*/scala-*/spark-examples-*.jar out1.txt log4j:WARN No such

Re: Determine number of running executors

2015-01-19 Thread Tobias Pfeiffer
Hi, On Sat, Jan 17, 2015 at 3:05 AM, Shuai Zheng szheng.c...@gmail.com wrote: Can you share more information about how do you do that? I also have similar question about this. Not very proud about it ;-), but here you go: // find the number of workers available to us. val _runCmd =
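For what it's worth, another rough way to approximate this from the driver; getExecutorStorageStatus was a developer API around 1.2, so treat this as a sketch:

  // one storage status per executor, plus one entry for the driver itself
  val executorCount = sc.getExecutorStorageStatus.length - 1
  println("executors currently registered: " + executorCount)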

Re: UnknownhostException : home

2015-01-19 Thread Sean Owen
I bet somewhere you have a path like hdfs://home/... which would suggest that 'home' is a hostname, when I imagine you mean it as a root directory. On Mon, Jan 19, 2015 at 9:33 AM, Rapelly Kartheek kartheek.m...@gmail.com wrote: Hi, I get the following exception when I run my application:

Re: UnknownhostException : home

2015-01-19 Thread Rapelly Kartheek
Actually, I don't have any entry in my /etc/hosts file with the hostname 'home'. In fact, I didn't use this hostname anywhere. Then why is it that it's trying to resolve this? On Mon, Jan 19, 2015 at 3:15 PM, Ashish paliwalash...@gmail.com wrote: it's not able to resolve home to an IP. Assuming it's

Re: UnknownhostException : home

2015-01-19 Thread Rapelly Kartheek
Yeah... I made that mistake in spark/conf/spark-defaults.conf for the setting spark.eventLog.dir. Now it works. Thank you Karthik On Mon, Jan 19, 2015 at 3:29 PM, Sean Owen so...@cloudera.com wrote: Sorry, to be clear, you need to write hdfs:///home/ Note three slashes; there is an

Re: Does Spark automatically run different stages concurrently when possible?

2015-01-19 Thread critikaled
+1, I too need to know. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Does-Spark-automatically-run-different-stages-concurrently-when-possible-tp21075p21233.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Does Spark automatically run different stages concurrently when possible?

2015-01-19 Thread Sean Owen
From the OP: (1) val lines = Import full dataset using sc.textFile (2) val ABonly = Filter out all rows from lines that are not of type A or B (3) val processA = Process only the A rows from ABonly (4) val processB = Process only the B rows from ABonly I assume that 3 and 4 are actions, or else
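Spelled out as a sketch (file path and row format are hypothetical), with 3 and 4 written as actions as assumed above:

  val lines  = sc.textFile("hdfs:///data/full")                          // (1)
  val ABonly = lines.filter(l => l.startsWith("A") || l.startsWith("B")) // (2)
  ABonly.cache()                                                         // reused by both branches
  val processA = ABonly.filter(_.startsWith("A")).count()                // (3) action
  val processB = ABonly.filter(_.startsWith("B")).count()                // (4) action

As written, the driver submits the two count() jobs one after another; running them at the same time requires submitting them from separate threads.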

Re: Streaming with Java: Expected ReduceByWindow to Return JavaDStream

2015-01-19 Thread Jeff Nadler
For anyone who finds this later, looks like Jerry already took care of this here: https://issues.apache.org/jira/browse/SPARK-5315 Thanks! On Sun, Jan 18, 2015 at 10:28 PM, Shao, Saisai saisai.s...@intel.com wrote: Hi Jeff, From my understanding it seems more like a bug, since

Re: unit tests with java.io.IOException: Could not create FileClient

2015-01-19 Thread Ted Yu
Your classpath has some MapR jar. Is that intentional ? Cheers On Mon, Jan 19, 2015 at 6:58 AM, Jianguo Li flyingfromch...@gmail.com wrote: Hi, I created some unit tests to test some of the functions in my project which use Spark. However, when I used the sbt tool to build it and then ran

Re: Does Spark automatically run different stages concurrently when possible?

2015-01-19 Thread critikaled
Hi John and David, I tried this to run them concurrently: List(RDD1, RDD2, ...).par.foreach{ rdd => rdd.collect().foreach(println) } This was able to successfully register the tasks, but the parallelism of the stages is limited; it was able to run 4 of them some of the time and only one of them at other times

Re: Does Spark automatically run different stages concurrently when possible?

2015-01-19 Thread Sean Owen
Keep in mind that your executors will be able to run some fixed number of tasks in parallel, given your configuration. You should not necessarily expect that arbitrarily many RDDs and tasks would schedule simultaneously. On Mon, Jan 19, 2015 at 5:34 PM, critikaled isasmani@gmail.com wrote:

Re: EC2 VPC script

2015-01-19 Thread Vladimir Grigor
I also found this issue. I have reported it as a bug https://issues.apache.org/jira/browse/SPARK-5242 and submitted a fix. You can find a link to the fixed fork in the comments on the issue page. Please vote on the issue; hopefully the guys will accept the pull request faster then :) Regards, Vladimir On Mon,

Re: Why Parquet Predicate Pushdown doesn't work?

2015-01-19 Thread Jerry Lam
Hi guys, Does this issue affect 1.2.0 only or all previous releases as well? Best Regards, Jerry On Thu, Jan 8, 2015 at 1:40 AM, Xuelin Cao xuelincao2...@gmail.com wrote: Yes, the problem is, I've turned the flag on. One possible reason for this is, the parquet file supports predicate

Error for first run from iPython Notebook

2015-01-19 Thread Dave
Hi, I've set up my first spark cluster (1 master, 2 workers) and an iPython notebook server that I'm trying to set up to access the cluster. I'm running the workers from Anaconda to make sure the python setup is correct on each box. The iPy notebook server appears to have everything set up

Re: How to get the master URL at runtime inside driver program?

2015-01-19 Thread Raghavendra Pandey
If you pass spark master URL to spark-submit, you don't need to pass the same to SparkConf object. You can create SparkConf without this property or for that matter any other property that you pass in spark-submit. On Sun, Jan 18, 2015 at 7:38 AM, guxiaobo1982 guxiaobo1...@qq.com wrote: Hi,
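In other words, something like this sketch, where the master comes only from the spark-submit command line:

  // no setMaster here; spark-submit --master spark://host:7077 supplies it
  val conf = new SparkConf().setAppName("MyApp")
  val sc = new SparkContext(conf)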

Re: ERROR TaskSchedulerImpl: Lost an executor

2015-01-19 Thread suresh
I am trying to run the SparkR shell on AWS. I am unable to access the worker nodes' web UI. 15/01/19 19:57:17 ERROR TaskSchedulerImpl: Lost an executor 0 (already removed): remote Akka client disassociated 15/01/19 19:57:17 ERROR TaskSchedulerImpl: Lost an executor 1 (already removed): remote Akka

Re: ERROR TaskSchedulerImpl: Lost an executor

2015-01-19 Thread Ted Yu
Have you seen this thread ? http://search-hadoop.com/m/JW1q5PgA7X What Spark release are you running ? Cheers On Mon, Jan 19, 2015 at 12:04 PM, suresh lanki.sur...@gmail.com wrote: I am trying to run SparkR shell on aws I am unable to access worker nodes webUI access. 15/01/19 19:57:17

Re: ERROR TaskSchedulerImpl: Lost an executor

2015-01-19 Thread Suresh Lankipalli
Hi Yu, I am able to run the Spark examples, but I am unable to run the SparkR examples (only the Pi example runs on SparkR). Thank you Regards Suresh On Mon, Jan 19, 2015 at 3:08 PM, Ted Yu yuzhih...@gmail.com wrote: Have you seen this thread ? http://search-hadoop.com/m/JW1q5PgA7X What Spark