[ https://issues.apache.org/jira/browse/HADOOP-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706266#comment-15706266 ]
Luke Miner edited comment on HADOOP-13811 at 11/29/16 7:22 PM:
---------------------------------------------------------------

It turns out I still had {{hadoop-aws:2.7.3}} in my Spark conf file. I ended up including the {{hadoop-aws-2.9.0-SNAPSHOT.jar}} that I built using the instructions you gave above, and I also bumped Amazon's {{aws-java-sdk}} up to {{1.11.57}}. I'm still seeing the same error, only on a different line number now. Oddly, it also tells me that I should be using {{S3AFileSystem}} instead of {{S3FileSystem}}:

{code}
S3FileSystem is deprecated and will be removed in future releases. Use NativeS3FileSystem or S3AFileSystem instead.
16/11/29 19:00:19 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 177.7 KB, free 366.1 MB)
16/11/29 19:00:19 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 21.0 KB, free 366.1 MB)
16/11/29 19:00:19 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.229.45:52703 (size: 21.0 KB, free: 366.3 MB)
16/11/29 19:00:19 INFO SparkContext: Created broadcast 0 from textFile at json2pq.scala:130
Exception in thread "main" java.lang.NumberFormatException: For input string: "100M"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:441)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1320)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:234)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2904)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:101)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2941)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2923)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:265)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:236)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:322)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1957)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:928)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:927)
    at Json2Pq$.main(json2pq.scala:130)
    at Json2Pq.main(json2pq.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{code}
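For what it's worth, the parse failure is easy to reproduce in isolation: {{Configuration.getLong()}} hands the raw value straight to {{Long.parseLong()}}, which rejects unit suffixes, while the suffix-aware {{Configuration.getLongBytes()}} accessor handles them. A minimal sketch (the class name is mine):

{code}
// Minimal sketch of why "100M" blows up: Configuration.getLong() cannot
// parse unit suffixes, while getLongBytes() understands K/M/G and
// returns a plain byte count.
import org.apache.hadoop.conf.Configuration;

public class SuffixParseDemo {
  public static void main(String[] args) {
    // Configuration(false) skips loading the default resources.
    Configuration conf = new Configuration(false);
    conf.set("fs.s3a.multipart.size", "100M");

    // Prints 104857600: the suffix-aware accessor.
    System.out.println(conf.getLongBytes("fs.s3a.multipart.size", 0));

    // Throws java.lang.NumberFormatException: For input string: "100M",
    // exactly as in the trace above.
    System.out.println(conf.getLong("fs.s3a.multipart.size", 0));
  }
}
{code}

Since the trace shows {{initialize()}} still going through {{Configuration.getLong()}}, my suspicion is that an older {{S3AFileSystem}} is still winning on the classpath.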
I also tried running it with the {{hadoop-aws-3.0.0-alpha1.jar}} that is currently on Maven Central, but got this error instead, perhaps because I'm running against a Hadoop 2.9 snapshot:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SessionState':
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:965)
    at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
    at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:862)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:862)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:862)
    at Json2Pq$.main(json2pq.scala:126)
    at Json2Pq.main(json2pq.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:962)
    ... 21 more
Caused by: java.lang.UnsupportedClassVersionError: org/apache/hadoop/fs/s3native/NativeS3FileSystem : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:442)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:64)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:354)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:278)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:363)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2854)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2881)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2902)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:101)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2941)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2923)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
    at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.liftedTree1$1(InMemoryCatalog.scala:109)
    at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.createDatabase(InMemoryCatalog.scala:107)
    at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:96)
    at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
    at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
    at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
    at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
    ... 26 more
{code}
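Class-file major version 52 corresponds to Java 8 bytecode, and the {{Method.java:606}} frame above looks like a JDK 7 runtime, so my guess is the alpha1 jar simply targets a newer Java than the JVM I'm running on. A small sketch for checking what a jar targets (the class name and the entry I probe are illustrative):

{code}
// Hedged sketch: read the class-file version of one entry in a jar to see
// which Java release it targets (major 52 = Java 8, 51 = Java 7).
import java.io.DataInputStream;
import java.io.InputStream;
import java.util.jar.JarFile;

public class ClassVersionCheck {
  public static void main(String[] args) throws Exception {
    // args[0]: path to the jar, e.g. hadoop-aws-3.0.0-alpha1.jar
    try (JarFile jar = new JarFile(args[0])) {
      String entry = "org/apache/hadoop/fs/s3a/S3AFileSystem.class";
      try (InputStream in = jar.getInputStream(jar.getJarEntry(entry));
           DataInputStream data = new DataInputStream(in)) {
        data.readInt();                        // magic 0xCAFEBABE
        int minor = data.readUnsignedShort();  // minor_version
        int major = data.readUnsignedShort();  // major_version
        System.out.println(entry + " -> major.minor " + major + "." + minor);
      }
    }
  }
}
{code}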
When I change {{fs.s3a.multipart.size}} to {{104857600}} I get the following, similar error:

{code}
Exception in thread "main" java.lang.NumberFormatException: For input string: "32M"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:441)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1320)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.longOption(S3AFileSystem.java:1974)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:247)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2904)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:101)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2941)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2923)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:265)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:236)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:322)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1957)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:928)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:927)
    at Json2Pq$.main(json2pq.scala:130)
    at Json2Pq.main(json2pq.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{code}
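If the underlying problem really is an older {{Configuration.getLong()}} choking on suffixed 2.8+-style defaults, then presumably every size option with a suffixed default needs to be overridden with a plain byte count, not just {{fs.s3a.multipart.size}}. A sketch of such a {{core-site.xml}} workaround; {{fs.s3a.block.size}} is only my guess at where the "32M" string comes from, since that matches its newer default:

{code}
<!-- Hedged workaround sketch: override suffix-style defaults with plain
     byte counts so that an old Configuration.getLong() can still parse
     them. fs.s3a.block.size is a guess for the source of "32M". -->
<property>
  <name>fs.s3a.multipart.size</name>
  <value>104857600</value> <!-- 100M -->
</property>
<property>
  <name>fs.s3a.block.size</name>
  <value>33554432</value> <!-- 32M -->
</property>
{code}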
> s3a: getFileStatus fails with com.amazonaws.AmazonClientException: Failed to
> sanitize XML document destined for handler class
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-13811
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13811
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> Sometimes, occasionally, getFileStatus() fails with a stack trace starting
> with {{com.amazonaws.AmazonClientException: Failed to sanitize XML document
> destined for handler class}}.