14/03/19 19:11:37 INFO Executor: Serialized size of result for 678 is 1423
14/03/19 19:11:37 INFO Executor: Sending result for 678 directly to driver
14/03/19 19:11:37 INFO Executor: Finished task ID 678
14/03/19 19:11:37 INFO NativeS3FileSystem: Opening key 'test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_1.csv' for reading at position '134217727'
14/03/19 19:11:37 INFO NativeS3FileSystem: Opening key 'test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_1.csv' for reading at position '402653183'
14/03/19 19:11:38 INFO MemoryStore: ensureFreeSpace(82814164) called with curMem=11412250398, maxMem=35160431001
14/03/19 19:11:38 INFO MemoryStore: Block rdd_5_681 stored as bytes to memory (size 79.0 MB, free 22.0 GB)
14/03/19 19:11:38 INFO BlockManagerMaster: Updated info of block rdd_5_681
14/03/19 19:11:38 INFO MemoryStore: ensureFreeSpace(83081354) called with curMem=11495064562, maxMem=35160431001
14/03/19 19:11:38 INFO MemoryStore: Block rdd_5_693 stored as bytes to memory (size 79.2 MB, free 22.0 GB)
14/03/19 19:11:38 INFO BlockManagerMaster: Updated info of block rdd_5_693
14/03/19 19:11:38 INFO Executor: Serialized size of result for 681 is 1423
14/03/19 19:11:38 INFO Executor: Sending result for 681 directly to driver
14/03/19 19:11:38 INFO Executor: Finished task ID 681
14/03/19 19:11:39 INFO CoarseGrainedExecutorBackend: Got assigned task 707
14/03/19 19:11:39 INFO Executor: Running task ID 707
14/03/19 19:11:39 INFO BlockManager: Found block broadcast_1 locally
14/03/19 19:11:39 INFO CacheManager: Partition rdd_5_685 not found, computing it
14/03/19 19:11:39 INFO NewHadoopRDD: Input split: s3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/proton:0+0
14/03/19 19:11:39 INFO Executor: Serialized size of result for 693 is 1423
14/03/19 19:11:39 INFO Executor: Sending result for 693 directly to driver
14/03/19 19:11:39 INFO Executor: Finished task ID 693
14/03/19 19:11:39 ERROR Executor: Exception in task ID 706
java.io.IOException: 's3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/base-maps' is a directory
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:711)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:75)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:96)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:84)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:48)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:71)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
14/03/19 19:11:39 INFO CoarseGrainedExecutorBackend: Got assigned task 708
14/03/19 19:11:39 INFO Executor: Running task ID 708
14/03/19 19:11:39 INFO BlockManager: Found block broadcast_1 locally
14/03/19 19:11:39 INFO CacheManager: Partition rdd_5_684 not found, computing it
14/03/19 19:11:39 INFO NewHadoopRDD: Input split: s3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/base-maps:0+0
14/03/19 19:11:39 ERROR Executor: Exception in task ID 707
java.io.IOException: 's3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/proton' is a directory
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:711)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:75)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:96)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:84)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:48)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:71)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
14/03/19 19:11:39 INFO CoarseGrainedExecutorBackend: Got assigned task 709
14/03/19 19:11:39 INFO Executor: Running task ID 709
14/03/19 19:11:39 INFO BlockManager: Found block broadcast_1 locally
14/03/19 19:11:39 INFO CacheManager: Partition rdd_5_685 not found, computing it
14/03/19 19:11:39 INFO NewHadoopRDD: Input split: s3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/proton:0+0
14/03/19 19:11:39 ERROR Executor: Exception in task ID 708
java.io.IOException: 's3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/base-maps' is a directory
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:711)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:75)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:96)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:84)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:48)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:71)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
14/03/19 19:11:39 INFO CoarseGrainedExecutorBackend: Got assigned task 710
14/03/19 19:11:39 INFO Executor: Running task ID 710
14/03/19 19:11:39 INFO BlockManager: Found block broadcast_1 locally
14/03/19 19:11:39 INFO CacheManager: Partition rdd_5_684 not found, computing it
14/03/19 19:11:39 INFO NewHadoopRDD: Input split: s3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/base-maps:0+0
14/03/19 19:11:40 INFO MemoryStore: ensureFreeSpace(49980563) called with curMem=11578145916, maxMem=35160431001
14/03/19 19:11:40 INFO MemoryStore: Block rdd_5_694 stored as bytes to memory (size 47.7 MB, free 21.9 GB)
14/03/19 19:11:40 ERROR Executor: Exception in task ID 709
java.io.IOException: 's3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/proton' is a directory
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:711)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:75)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:96)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:84)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:48)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:71)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
14/03/19 19:11:40 INFO BlockManagerMaster: Updated info of block rdd_5_694
14/03/19 19:11:40 ERROR Executor: Exception in task ID 710
java.io.IOException: 's3n://AKIAJ346M2WM3VKBHFJA:ezu6d3li5gu6j3panqtxmihlypliwhqme+du8...@platfora.qa/test_data/jws/video_logs2/video_logs2_00128G/video_view/video_views_1376612461124_1_data_*.csv/base-maps' is a directory
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:711)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:75)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:96)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:84)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:48)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:71)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)