Can you look more into the worker logs and see what's going on? It looks like a memory issue (GC overhead, etc.) — you need to look in the worker logs.
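With YARN log aggregation enabled, the executor logs for a finished or killed application can be pulled with the `yarn logs` CLI. A minimal sketch — `<application-id>` is a placeholder for the real ID shown in the ResourceManager UI, not a value from this thread:

```shell
# Fetch the aggregated container logs for the application.
# Replace <application-id> with the actual ID from the RM UI.
yarn logs -applicationId <application-id>

# In the executor stderr, memory trouble typically shows up as lines like:
#   java.lang.OutOfMemoryError: GC overhead limit exceeded
#   java.lang.OutOfMemoryError: Java heap space
```

If those appear, the usual next step is raising `--executor-memory` (and on YARN, `spark.yarn.executor.memoryOverhead`) rather than re-running the job unchanged.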
Thanks
Best Regards

On Fri, Aug 7, 2015 at 3:21 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
> Re-attaching the images.
>
> On Thu, Aug 6, 2015 at 2:50 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>> Code:
>>
>> import java.text.SimpleDateFormat
>> import java.util.Calendar
>> import java.sql.Date
>> import org.apache.spark.storage.StorageLevel
>>
>> def extract(array: Array[String], index: Integer) = {
>>   if (index < array.length) {
>>     array(index).replaceAll("\"", "")
>>   } else {
>>     ""
>>   }
>> }
>>
>> case class GuidSess(
>>   guid: String,
>>   sessionKey: String,
>>   sessionStartDate: String,
>>   siteId: String,
>>   eventCount: String,
>>   browser: String,
>>   browserVersion: String,
>>   operatingSystem: String,
>>   experimentChannel: String,
>>   deviceName: String)
>>
>> val rowStructText =
>>   sc.textFile("/user/zeppelin/guidsess/2015/08/05/part-m-00001.gz")
>> val guidSessRDD = rowStructText.filter(s => s.length != 1)
>>   .map(s => s.split(","))
>>   .map { s =>
>>     GuidSess(extract(s, 0),
>>       extract(s, 1),
>>       extract(s, 2),
>>       extract(s, 3),
>>       extract(s, 4),
>>       extract(s, 5),
>>       extract(s, 6),
>>       extract(s, 7),
>>       extract(s, 8),
>>       extract(s, 9))
>>   }
>>
>> val guidSessDF = guidSessRDD.toDF()
>> guidSessDF.registerTempTable("guidsess")
>>
>> Once the temp table is created, I wrote this query:
>>
>> select siteid, count(distinct guid) as total_visitor,
>>   count(sessionKey) as total_visits
>> from guidsess
>> group by siteid
>>
>> Metrics:
>>
>> Data size: 170 MB
>> Spark version: 1.3.1
>> YARN: 2.7.x
>>
>> Timeline:
>> There is 1 job, and 2 stages with 1 task each.
>>
>> 1st stage: mapPartitions
>> [image: Inline image 1]
>>
>> 1st stage: Task 1 started to fail, and a second attempt started for the
>> first task of the first stage. The first attempt failed with "Executor
>> LOST". When I go to the YARN ResourceManager and look at that particular
>> host, I see that it is running fine.
>>
>> Attempt #1
>> [image: Inline image 2]
>>
>> Attempt #2: Executor LOST again
>> [image: Inline image 3]
>>
>> Attempts #3 and #4
>> [image: Inline image 4]
>>
>> 2nd stage (runJob): SKIPPED
>> [image: Inline image 5]
>>
>> Any suggestions?
>>
>> --
>> Deepak
>>
>
> --
> Deepak

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
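One side note on the parsing code quoted above: `split(",")` silently misaligns any row whose fields contain commas inside quotes (e.g. a browser name like `"Safari, Mobile"`), which can corrupt columns long before memory becomes an issue. A minimal, pure-Scala sketch of a quote-aware splitter — the names here are hypothetical and no Spark is needed to try it; in practice a real CSV library is the safer choice:

```scala
// Mirrors the extract() helper from the thread: safe indexing plus
// stripping stray double-quote characters.
def extract(array: Array[String], index: Int): String =
  if (index < array.length) array(index).replaceAll("\"", "") else ""

// Quote-aware split: commas inside double quotes do not end a field.
// Sketch only -- no escape handling, no multi-line records.
def splitCsv(line: String): Array[String] = {
  val fields = scala.collection.mutable.ArrayBuffer[String]()
  val current = new StringBuilder
  var inQuotes = false
  for (c <- line) c match {
    case '"'                  => inQuotes = !inQuotes; current += c
    case ',' if !inQuotes     => fields += current.toString; current.clear()
    case other                => current += other
  }
  fields += current.toString
  fields.toArray
}

val row = splitCsv("""g1,s1,2015-08-05,77,"Safari, Mobile",OSX""")
// extract(row, 4) keeps the comma: Safari, Mobile
// extract(row, 42) is safely empty, as in the original helper
```

A naive `line.split(",")` would have produced seven fields here instead of six, shifting `operatingSystem` and everything after it by one column.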