[jira] [Commented] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark
[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139950#comment-16139950 ]

Davis commented on SPARK-19123:
-------------------------------

Please add the following config entry to the SparkSession builder to skip the key decryption: "fs.azure.account.keyprovider..blob.core.windows.net" with value "org.apache.hadoop.fs.azure.SimpleKeyProvider".

> KeyProviderException when reading Azure Blobs from Apache Spark
> ---------------------------------------------------------------
>
>                 Key: SPARK-19123
>                 URL: https://issues.apache.org/jira/browse/SPARK-19123
>             Project: Spark
>          Issue Type: Question
>          Components: Input/Output, Java API
>    Affects Versions: 2.0.0
>         Environment: Apache Spark 2.0.0 running on Azure HDInsight cluster version 3.5 with Hadoop version 2.7.3
>            Reporter: Saulo Ricci
>            Priority: Minor
>
> I created a Spark job and it's intended to read a set of JSON files from an Azure Blob container. I set the key and reference to my storage and I'm reading the files as shown in the snippet below:
> {code:java}
> SparkSession sparkSession = SparkSession.builder().appName("Pipeline")
>     .master("yarn")
>     .config("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
>     .config("fs.azure.account.key..blob.core.windows.net", "")
>     .getOrCreate();
> Dataset txs = sparkSession.read().json("wasb://path_to_files");
> {code}
> The point is that I'm unfortunately getting an `org.apache.hadoop.fs.azure.KeyProviderException` when reading the blobs from the Azure storage.
> According to the trace shown below it seems the header is too long, but I'm still trying to figure out what exactly that means:
> {code:java}
> 17/01/07 19:28:39 ERROR ApplicationMaster: User class threw exception:
> org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
> 140473279682200:error:0D07207B:asn1 encoding routines:ASN1_get_object:header too long:asn1_lib.c:157:
> 140473279682200:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
> 140473279682200:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
> org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
> 140473279682200:error:0D07207B:asn1 encoding routines:ASN1_get_object:header too long:asn1_lib.c:157:
> 140473279682200:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
> 140473279682200:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
> 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:953)
> 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:450)
> 	at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1209)
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
> 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
> 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
> 	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:366)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:364)
> 	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
> 	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
> 	at scala.collection.immutable.List.foreach(List.scala:381)
> 	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
> 	at scala.collection.immutable.List.flatMap(List.scala:344)
> 	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:364)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
> 	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:294)
> 	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:249)
> 	at taka.pipelines.AnomalyTrainingPipeline.main(AnomalyTrainingPipeline.java:35)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
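The workaround suggested in the comment above can be sketched as follows. This is a hypothetical illustration, not code from the issue: the account name {{myaccount}}, the container {{mycontainer}}, and the key value are placeholders (the issue elides the real account name from the property names), and registering {{SimpleKeyProvider}} tells hadoop-azure to use the configured account key as-is instead of attempting to decrypt it, which is what fails with the S/MIME error in the trace.

{code:java}
import org.apache.spark.sql.SparkSession;

public class PipelineWithSimpleKeyProvider {
    public static void main(String[] args) {
        // "myaccount", "mycontainer", and the key value below are placeholders.
        SparkSession spark = SparkSession.builder()
            .appName("Pipeline")
            .master("yarn")
            .config("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
            // Plain-text storage account key (placeholder value).
            .config("fs.azure.account.key.myaccount.blob.core.windows.net",
                    "<account key>")
            // Pass-through key provider: skips the key-decryption step.
            .config("fs.azure.account.keyprovider.myaccount.blob.core.windows.net",
                    "org.apache.hadoop.fs.azure.SimpleKeyProvider")
            .getOrCreate();

        spark.read()
             .json("wasb://mycontainer@myaccount.blob.core.windows.net/path_to_files")
             .show();
    }
}
{code}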
[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15809096#comment-15809096 ]

Sean Owen commented on SPARK-19123:
-----------------------------------

This doesn't look like a Spark issue per se.
[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15808663#comment-15808663 ]

Shuai Lin commented on SPARK-19123:
-----------------------------------

IIUC {{KeyProviderException}} means the storage account key is not configured properly. Are you sure the way you specify the key is correct? Have you checked the Azure developer docs for it?

BTW I don't think this is a "critical issue", so I changed it to "minor".
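On the point above about how the key is specified: the key property that hadoop-azure looks up embeds the storage account name, i.e. {{fs.azure.account.key.<account>.blob.core.windows.net}}, and the snippet in the issue has an empty account name in that property. A minimal sketch of setting it on the underlying Hadoop configuration instead, with hypothetical names ({{myaccount}}, {{mycontainer}}) standing in for the elided ones:

{code:java}
import org.apache.spark.sql.SparkSession;

public class CheckAzureKeyConfig {
    public static void main(String[] args) {
        // "myaccount" and "mycontainer" are placeholders, not values from the issue.
        SparkSession spark = SparkSession.builder()
            .appName("Pipeline")
            .master("yarn")
            .getOrCreate();

        // The property name must contain the real storage account name;
        // the value is the account key from the Azure portal.
        spark.sparkContext().hadoopConfiguration().set(
            "fs.azure.account.key.myaccount.blob.core.windows.net",
            "<account key>");

        // wasb URIs name the container and account explicitly.
        spark.read()
             .json("wasb://mycontainer@myaccount.blob.core.windows.net/path_to_files")
             .show();
    }
}
{code}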