Greetings!

I've been running several Spark Streaming jobs that persist data from Kafka 
topics, and one persister in particular is having issues. I've verified that 
the number of messages per partition is roughly the same, and this persister's 
data volume is a fraction of that of other persisters that appear to be 
working fine. 

The tasks run fine until roughly 74-80 of the 96 tasks have finished, and then 
the remaining tasks take far longer (about 310 s each in the run below, versus 
10-14 s for the earlier ones). I'm using EMR / Spark 2.1.0 / Kafka 0.10.0.1 / 
EMRFS (EMR's S3 solution). Any help would be greatly appreciated!

Here's the code I'm using to do the transformation:

val transformedData =
  transformer(sqlContext.createDataFrame(values, converter.schema))

transformedData
  .write
  .mode(Append)
  .partitionBy(persisterConfig.partitioning: _*)
  .format("parquet")
  .save(parquetPath)

Here's the output of the job as it runs (the flow is Thrift -> Parquet/Snappy 
-> S3). The files are roughly the same size (96 files per 10-minute window):

17/04/05 16:43:43 INFO TaskSetManager: Finished task 72.0 in stage 7.0 (TID 
722) in 10089 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 57) 
(1/96)
17/04/05 16:43:43 INFO TaskSetManager: Finished task 58.0 in stage 7.0 (TID 
680) in 10099 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 90) 
(2/96)
17/04/05 16:43:43 INFO TaskSetManager: Finished task 81.0 in stage 7.0 (TID 
687) in 10244 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 8) 
(3/96)
17/04/05 16:43:43 INFO TaskSetManager: Finished task 23.0 in stage 7.0 (TID 
736) in 10236 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 82) 
(4/96)
17/04/05 16:43:43 INFO TaskSetManager: Finished task 52.0 in stage 7.0 (TID 
730) in 10275 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 78) 
(5/96)
17/04/05 16:43:43 INFO TaskSetManager: Finished task 45.0 in stage 7.0 (TID 
691) in 10289 ms on ip-172-20-215-172.us-west-2.compute.internal (executor 41) 
(6/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 13.0 in stage 7.0 (TID 
712) in 10532 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 65) 
(7/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 42.0 in stage 7.0 (TID 
694) in 10595 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 18) 
(8/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 2.0 in stage 7.0 (TID 763) 
in 10623 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 74) (9/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 82.0 in stage 7.0 (TID 
727) in 10631 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 72) 
(10/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 69.0 in stage 7.0 (TID 
729) in 10716 ms on ip-172-20-215-172.us-west-2.compute.internal (executor 55) 
(11/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 65.0 in stage 7.0 (TID 
673) in 10733 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 67) 
(12/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 15.0 in stage 7.0 (TID 
684) in 10737 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 85) 
(13/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 27.0 in stage 7.0 (TID 
748) in 10747 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 10) 
(14/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 46.0 in stage 7.0 (TID 
699) in 10834 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 48) 
(15/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 6.0 in stage 7.0 (TID 719) 
in 10838 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 52) 
(16/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 11.0 in stage 7.0 (TID 
739) in 10892 ms on ip-172-20-215-172.us-west-2.compute.internal (executor 83) 
(17/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 88.0 in stage 7.0 (TID 
697) in 10900 ms on ip-172-20-212-43.us-west-2.compute.internal (executor 70) 
(18/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 35.0 in stage 7.0 (TID 
678) in 10909 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 77) 
(19/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 0.0 in stage 7.0 (TID 700) 
in 10906 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 46) 
(20/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 36.0 in stage 7.0 (TID 
732) in 10935 ms on ip-172-20-215-172.us-west-2.compute.internal (executor 69) 
(21/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 19.0 in stage 7.0 (TID 
759) in 10948 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 37) 
(22/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 41.0 in stage 7.0 (TID 
703) in 11013 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 81) 
(23/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 8.0 in stage 7.0 (TID 745) 
in 11007 ms on ip-172-20-215-172.us-west-2.compute.internal (executor 13) 
(24/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 12.0 in stage 7.0 (TID 
742) in 11014 ms on ip-172-20-212-43.us-west-2.compute.internal (executor 56) 
(25/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 55.0 in stage 7.0 (TID 
734) in 11105 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 6) 
(26/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 48.0 in stage 7.0 (TID 
698) in 11139 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 20) 
(27/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 64.0 in stage 7.0 (TID 
685) in 11160 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 63) 
(28/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 33.0 in stage 7.0 (TID 
708) in 11168 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 22) 
(29/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 53.0 in stage 7.0 (TID 
749) in 11165 ms on ip-172-20-215-172.us-west-2.compute.internal (executor 27) 
(30/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 91.0 in stage 7.0 (TID 
723) in 11179 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 59) 
(31/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 34.0 in stage 7.0 (TID 
743) in 11187 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 32) 
(32/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 32.0 in stage 7.0 (TID 
676) in 11201 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 25) 
(33/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 59.0 in stage 7.0 (TID 
755) in 11191 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 33) 
(34/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 57.0 in stage 7.0 (TID 
738) in 11206 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 71) 
(35/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 17.0 in stage 7.0 (TID 
728) in 11226 ms on ip-172-20-212-43.us-west-2.compute.internal (executor 28) 
(36/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 47.0 in stage 7.0 (TID 
689) in 11233 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 51) 
(37/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 70.0 in stage 7.0 (TID 
737) in 11228 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 92) 
(38/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 79.0 in stage 7.0 (TID 
710) in 11238 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 88) 
(39/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 80.0 in stage 7.0 (TID 
679) in 11253 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 16) 
(40/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 31.0 in stage 7.0 (TID 
746) in 11298 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 23) 
(41/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 89.0 in stage 7.0 (TID 
718) in 11314 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 66) 
(42/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 77.0 in stage 7.0 (TID 
706) in 11329 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 93) 
(43/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 95.0 in stage 7.0 (TID 
767) in 11365 ms on ip-172-20-212-43.us-west-2.compute.internal (executor 42) 
(44/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 43.0 in stage 7.0 (TID 
696) in 11382 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 39) 
(45/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 71.0 in stage 7.0 (TID 
713) in 11426 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 21) 
(46/96)
17/04/05 16:43:44 INFO TaskSetManager: Finished task 20.0 in stage 7.0 (TID 
721) in 11437 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 7) 
(47/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 60.0 in stage 7.0 (TID 
733) in 11534 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 43) 
(48/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 21.0 in stage 7.0 (TID 
741) in 11548 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 11) 
(49/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 66.0 in stage 7.0 (TID 
758) in 11657 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 35) 
(50/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 40.0 in stage 7.0 (TID 
765) in 11659 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 73) 
(51/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 49.0 in stage 7.0 (TID 
702) in 11711 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 68) 
(52/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 22.0 in stage 7.0 (TID 
754) in 11732 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 2) 
(53/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 54.0 in stage 7.0 (TID 
711) in 11784 ms on ip-172-20-212-43.us-west-2.compute.internal (executor 14) 
(54/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 78.0 in stage 7.0 (TID 
675) in 11837 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 87) 
(55/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 7.0 in stage 7.0 (TID 701) 
in 11842 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 45) 
(56/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 14.0 in stage 7.0 (TID 
747) in 11839 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 34) 
(57/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 26.0 in stage 7.0 (TID 
760) in 11888 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 54) 
(58/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 9.0 in stage 7.0 (TID 693) 
in 11911 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 94) 
(59/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 76.0 in stage 7.0 (TID 
750) in 11961 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 49) 
(60/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 30.0 in stage 7.0 (TID 
764) in 12031 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 40) 
(61/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 39.0 in stage 7.0 (TID 
674) in 12084 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 12) 
(62/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 29.0 in stage 7.0 (TID 
740) in 12091 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 47) 
(63/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 61.0 in stage 7.0 (TID 
683) in 12163 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 62) 
(64/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 50.0 in stage 7.0 (TID 
705) in 12185 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 44) 
(65/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 10.0 in stage 7.0 (TID 
707) in 12266 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 61) 
(66/96)
17/04/05 16:43:45 INFO TaskSetManager: Finished task 62.0 in stage 7.0 (TID 
688) in 12374 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 89) 
(67/96)
17/04/05 16:43:46 INFO TaskSetManager: Finished task 5.0 in stage 7.0 (TID 752) 
in 12491 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 9) (68/96)
17/04/05 16:43:46 INFO TaskSetManager: Finished task 83.0 in stage 7.0 (TID 
751) in 12649 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 26) 
(69/96)
17/04/05 16:43:46 INFO TaskSetManager: Finished task 67.0 in stage 7.0 (TID 
682) in 12724 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 38) 
(70/96)
17/04/05 16:43:46 INFO TaskSetManager: Finished task 90.0 in stage 7.0 (TID 
756) in 12825 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 30) 
(71/96)
17/04/05 16:43:46 INFO TaskSetManager: Finished task 25.0 in stage 7.0 (TID 
757) in 13302 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 58) 
(72/96)
17/04/05 16:43:47 INFO TaskSetManager: Finished task 28.0 in stage 7.0 (TID 
735) in 13667 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 17) 
(73/96)
17/04/05 16:44:07 INFO TaskSetManager: Finished task 93.0 in stage 7.0 (TID 
681) in 33805 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 31) 
(74/96)
17/04/05 16:48:43 INFO TaskSetManager: Finished task 87.0 in stage 7.0 (TID 
744) in 310121 ms on ip-172-20-223-100.us-west-2.compute.internal (executor 80) 
(75/96)
17/04/05 16:48:43 INFO TaskSetManager: Finished task 3.0 in stage 7.0 (TID 709) 
in 310221 ms on ip-172-20-212-63.us-west-2.compute.internal (executor 91) 
(76/96)
17/04/05 16:48:43 INFO TaskSetManager: Finished task 85.0 in stage 7.0 (TID 
726) in 310370 ms on ip-172-20-209-248.us-west-2.compute.internal (executor 96) 
(77/96)
17/04/05 16:48:43 INFO TaskSetManager: Finished task 38.0 in stage 7.0 (TID 
725) in 310391 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 75) 
(78/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 37.0 in stage 7.0 (TID 
766) in 310617 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 19) 
(79/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 16.0 in stage 7.0 (TID 
720) in 310678 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 64) 
(80/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 68.0 in stage 7.0 (TID 
753) in 310779 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 50) 
(81/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 24.0 in stage 7.0 (TID 
695) in 310802 ms on ip-172-20-212-76.us-west-2.compute.internal (executor 86) 
(82/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 86.0 in stage 7.0 (TID 
714) in 310808 ms on ip-172-20-218-144.us-west-2.compute.internal (executor 36) 
(83/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 51.0 in stage 7.0 (TID 
716) in 310837 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 24) 
(84/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 92.0 in stage 7.0 (TID 
761) in 310858 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 1) 
(85/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 75.0 in stage 7.0 (TID 
672) in 310995 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 29) 
(86/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 1.0 in stage 7.0 (TID 715) 
in 311159 ms on ip-172-20-212-43.us-west-2.compute.internal (executor 84) 
(87/96)
17/04/05 16:48:44 INFO TaskSetManager: Finished task 4.0 in stage 7.0 (TID 677) 
in 311443 ms on ip-172-20-220-110.us-west-2.compute.internal (executor 3) 
(88/96)
17/04/05 16:48:45 INFO TaskSetManager: Finished task 73.0 in stage 7.0 (TID 
690) in 311523 ms on ip-172-20-218-229.us-west-2.compute.internal (executor 76) 
(89/96)
17/04/05 16:48:45 INFO TaskSetManager: Finished task 84.0 in stage 7.0 (TID 
686) in 311554 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 60) 
(90/96)
17/04/05 16:48:45 INFO TaskSetManager: Finished task 44.0 in stage 7.0 (TID 
692) in 312165 ms on ip-172-20-208-230.us-west-2.compute.internal (executor 4) 
(91/96)
17/04/05 16:48:45 INFO TaskSetManager: Finished task 63.0 in stage 7.0 (TID 
762) in 312299 ms on ip-172-20-211-125.us-west-2.compute.internal (executor 79) 
(92/96)
17/04/05 16:48:46 INFO TaskSetManager: Finished task 94.0 in stage 7.0 (TID 
724) in 313148 ms on ip-172-20-219-239.us-west-2.compute.internal (executor 5) 
(93/96)
17/04/05 16:48:46 INFO TaskSetManager: Finished task 18.0 in stage 7.0 (TID 
717) in 313332 ms on ip-172-20-213-64.us-west-2.compute.internal (executor 15) 
(94/96)
17/04/05 16:48:48 INFO TaskSetManager: Finished task 56.0 in stage 7.0 (TID 
731) in 314838 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 95) 
(95/96)
17/04/05 16:48:52 INFO TaskSetManager: Finished task 74.0 in stage 7.0 (TID 
704) in 318573 ms on ip-172-20-217-201.us-west-2.compute.internal (executor 53) 
(96/96)
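
In case it helps anyone slice the timings above, here is a minimal sketch of 
how the slow tasks could be pulled out of the log. It is not part of the job. 
It assumes each log entry has been re-joined onto a single line (the mail 
client wrapped them), and the names TaskDurations, TaskRun, parse and 
stragglers, as well as the default factor of 3x the median, are made up for 
illustration:

```scala
// Hypothetical helper, not part of the streaming job: summarises task
// durations from TaskSetManager "Finished task" lines (one entry per line).
object TaskDurations {
  // Matches e.g. "Finished task 72.0 in stage 7.0 (TID 722) in 10089 ms on
  // <host> (executor 57)" anywhere inside a log line.
  private val Finished =
    """Finished task (\d+)\.\d+ in stage \S+ \(TID (\d+)\) in (\d+) ms on (\S+) \(executor (\d+)\)""".r.unanchored

  final case class TaskRun(task: Int, tid: Int, millis: Long, host: String, executor: Int)

  // Returns None for lines that are not "Finished task" entries.
  def parse(line: String): Option[TaskRun] = line match {
    case Finished(task, tid, ms, host, exec) =>
      Some(TaskRun(task.toInt, tid.toInt, ms.toLong, host, exec.toInt))
    case _ => None
  }

  // Tasks whose duration exceeds `factor` times the median duration.
  // The factor is an arbitrary illustrative threshold.
  def stragglers(runs: Seq[TaskRun], factor: Double = 3.0): Seq[TaskRun] =
    if (runs.isEmpty) Seq.empty
    else {
      val sorted = runs.map(_.millis).sorted
      val median = sorted(sorted.size / 2)
      runs.filter(_.millis > median * factor)
    }
}
```

Usage would be along the lines of 
TaskDurations.stragglers(logLines.flatMap(TaskDurations.parse)), after which 
grouping the result by host or executor shows whether the slow tasks cluster 
on particular nodes.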

Thanks,
Justin

