My guess would be data skew. Do you know if there is some item ID that acts as a catch-all? Can it be null? Item ID 0? Lots of datasets have this sort of value, and it always kills joins. (Your task table points the same way: the stuck task has read 3.0 GB / ~57M records, while the finished tasks each read ~1.7M records.)
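To check that guess, counting records per key will expose a catch-all value. The sketch below is a local Scala stand-in (the `events` sample data is made up); on the cluster the equivalent would be `viEvents.map { case (id, _) => (id, 1L) }.reduceByKey(_ + _)` followed by a `top`/`sortBy` on the counts:

```scala
// Hypothetical sample of (itemId, payload) pairs; itemId 0 plays the
// suspected catch-all role.
val events: Seq[(Long, String)] =
  Seq(0L -> "a", 0L -> "b", 0L -> "c", 0L -> "d",
      42L -> "e", 7L -> "f", 7L -> "g")

// Count records per key. Locally this is a groupBy; in the real job it
// would be a reduceByKey over the RDD.
val keyCounts: Map[Long, Int] =
  events.groupBy(_._1).map { case (k, vs) => k -> vs.size }

// The heaviest keys are the skew suspects.
val hottest: Seq[(Long, Int)] = keyCounts.toSeq.sortBy(-_._2)
println(hottest.take(3))
```

If one key accounts for a large fraction of the records, the task that receives that key's partition will read and spill far more than its peers, which matches the table below.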
2015-04-13 11:32 GMT-04:00 ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>:
> Code:
>
> val viEventsWithListings: RDD[(Long, (DetailInputRecord, VISummary, Long))] =
>   lstgItem.join(viEvents).map {
>     case (itemId, (listing, viDetail)) =>
>       val viSummary = new VISummary
>       viSummary.leafCategoryId = listing.getLeafCategId().toInt
>       viSummary.itemSiteId = listing.getItemSiteId().toInt
>       viSummary.auctionTypeCode = listing.getAuctTypeCode().toInt
>       viSummary.sellerCountryId = listing.getSlrCntryId().toInt
>       viSummary.buyerSegment = "0"
>       viSummary.isBin = (if (listing.getBinPriceLstgCurncy.doubleValue() > 0) 1 else 0)
>       val sellerId = listing.getSlrId.toLong
>       (sellerId, (viDetail, viSummary, itemId))
>   }
>
> Running tasks (all Attempt 0, PROCESS_LOCAL, launched 2015/04/13 06:43:53):
>
> Index  ID   Status   Executor ID / Host                           Duration  GC Time  Shuffle Read Size / Records  Write Time  Shuffle Write Size / Records  Shuffle Spill (Memory)  Shuffle Spill (Disk)
> 0      216  RUNNING  181 / phxaishdc9dn0474.phx.ebay.com          1.7 h     13 min   3.0 GB / 56964921                        0.0 B / 0                     21.2 GB                 1902.6 MB
> 2      218  SUCCESS  582 / phxaishdc9dn0235.phx.ebay.com          15 min    31 s     2.2 GB / 1666851             0.1 s       3.0 MB / 2062                 54.8 GB                 1924.5 MB
> 1      217  SUCCESS  202 / phxdpehdc9dn2683.stratus.phx.ebay.com  19 min    1.3 min  2.2 GB / 1687086             75 ms       3.9 MB / 2692                 33.7 GB                 1960.4 MB
> 4      220  SUCCESS  218 / phxaishdc9dn0855.phx.ebay.com          15 min    56 s     2.2 GB / 1675654             40 ms       3.3 MB / 2260                 26.2 GB                 1928.4 MB
>
> Command:
>
> ./bin/spark-submit -v --master yarn-cluster \
>   --driver-class-path /apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/hdfs/hadoop-hdfs-2.4.1-EBAY-2.jar \
>   --jars /apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/hdfs/hadoop-hdfs-2.4.1-EBAY-2.jar,/home/dvasthimal/spark1.3/spark_reporting_dep_only-1.0-SNAPSHOT.jar \
>   --num-executors 3000 --driver-memory 12g \
>   --driver-java-options "-XX:MaxPermSize=6G" \
>   --executor-memory 12g --executor-cores 1 \
>   --queue hdmi-express \
>   --class com.ebay.ep.poc.spark.reporting.SparkApp \
>   /home/dvasthimal/spark1.3/spark_reporting-1.0-SNAPSHOT.jar \
>   startDate=2015-04-6 endDate=2015-04-7 \
>   input=/user/dvasthimal/epdatasets_small/exptsession subcommand=viewItem \
>   output=/user/dvasthimal/epdatasets/viewItem buffersize=128 \
>   maxbuffersize=1068 maxResultSize=2G
>
> What do I do? I killed the job twice and it's stuck again. Where is it stuck?
>
> --
> Deepak
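Not part of the thread, but if the per-key counts do confirm a single catch-all item ID, one common workaround is to treat that key separately: filter it out of the shuffled join (and, if its rows are still needed, handle them via a separate path such as a broadcast join), or salt the hot key across partitions. A minimal local sketch of the filter approach, with made-up data and local `Seq`s standing in for the `lstgItem` and `viEvents` RDDs:

```scala
// Made-up stand-ins for the two datasets in the quoted job; in Spark these
// would be RDDs and the filter/join below would be the RDD operations of
// the same names.
val lstgItem: Seq[(Long, String)] = Seq(0L -> "catchAllListing", 7L -> "listing7")
val viEvents: Seq[(Long, String)] =
  Seq(0L -> "e1", 0L -> "e2", 0L -> "e3", 7L -> "e4")

// Keys identified as catch-alls by the earlier per-key count.
val hotKeys: Set[Long] = Set(0L)

// Join only the well-behaved keys; the hot key no longer lands on a
// single reducer.
val cleanEvents = viEvents.filter { case (id, _) => !hotKeys.contains(id) }
val joined: Seq[(Long, (String, String))] =
  for {
    (id, ev)       <- cleanEvents
    (lid, listing) <- lstgItem if lid == id
  } yield (id, (listing, ev))
```

Whether dropping, salting, or broadcasting the hot key is appropriate depends on whether the catch-all rows carry real signal, which only the data owner can judge.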