[jira] [Commented] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark

2017-08-24 Thread Davis (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139950#comment-16139950 ]

Davis commented on SPARK-19123:
---

Please add the following config entry to the SparkSession builder to skip the
key decryption:

"fs.azure.account.keyprovider.<account name>.blob.core.windows.net"

with the value

"org.apache.hadoop.fs.azure.SimpleKeyProvider"


> KeyProviderException when reading Azure Blobs from Apache Spark
> ---
>
> Key: SPARK-19123
> URL: https://issues.apache.org/jira/browse/SPARK-19123
> Project: Spark
>  Issue Type: Question
>  Components: Input/Output, Java API
> Affects Versions: 2.0.0
> Environment: Apache Spark 2.0.0 running on Azure HDInsight cluster
> version 3.5 with Hadoop version 2.7.3
> Reporter: Saulo Ricci
> Priority: Minor
>
> I created a Spark job that is intended to read a set of JSON files from an
> Azure Blob container. I set the key and the reference to my storage account, and
> I'm reading the files as shown in the snippet below:
> {code:java}
> SparkSession sparkSession = SparkSession.builder()
>         .appName("Pipeline")
>         .master("yarn")
>         .config("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
>         .config("fs.azure.account.key.<account name>.blob.core.windows.net", "<account key>")
>         .getOrCreate();
> Dataset<Row> txs = sparkSession.read().json("wasb://path_to_files");
> {code}
> The problem is that I'm unfortunately getting an
> `org.apache.hadoop.fs.azure.KeyProviderException` when reading the blobs from
> the Azure storage. According to the trace shown below, it seems the header is
> too long, but I'm still trying to figure out what exactly that means:
> {code:java}
> 17/01/07 19:28:39 ERROR ApplicationMaster: User class threw exception: 
> org.apache.hadoop.fs.azure.AzureException: 
> org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException 
> exitCode=2: Error reading S/MIME message
> 140473279682200:error:0D07207B:asn1 encoding 
> routines:ASN1_get_object:header too long:asn1_lib.c:157:
> 140473279682200:error:0D0D106E:asn1 encoding 
> routines:B64_READ_ASN1:decode error:asn_mime.c:192:
> 140473279682200:error:0D0D40CB:asn1 encoding 
> routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
> org.apache.hadoop.fs.azure.AzureException: 
> org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException 
> exitCode=2: Error reading S/MIME message
> 140473279682200:error:0D07207B:asn1 encoding 
> routines:ASN1_get_object:header too long:asn1_lib.c:157:
> 140473279682200:error:0D0D106E:asn1 encoding 
> routines:B64_READ_ASN1:decode error:asn_mime.c:192:
> 140473279682200:error:0D0D40CB:asn1 encoding 
> routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
>   at 
> org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:953)
>   at 
> org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:450)
>   at 
> org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1209)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
>   at 
> org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:366)
>   at 
> org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:364)
>   at 
> scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at 
> scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 
> scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
>   at scala.collection.immutable.List.flatMap(List.scala:344)
>   at 
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:364)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
>   at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:294)
>   at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:249)
>   at 
> taka.pipelines.AnomalyTrainingPipeline.main(AnomalyTrainingPipeline.java:35)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(Native

[jira] [Commented] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark

2017-01-08 Thread Sean Owen (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15809096#comment-15809096 ]

Sean Owen commented on SPARK-19123:
---

This doesn't look like a Spark issue per se.


[jira] [Commented] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark

2017-01-07 Thread Shuai Lin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15808663#comment-15808663 ]

Shuai Lin commented on SPARK-19123:
---

IIUC {{KeyProviderException}} means the storage account key is not configured
properly. Are you sure the way you specify the key is correct? Have you checked
the Azure developer docs for it?

BTW I don't think this is a "critical issue", so I changed it to "minor".
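
For illustration, a minimal sketch of how to check what key (if any) the Hadoop layer actually resolves. Here "<account name>" is a placeholder, not a value from this issue:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.sql.SparkSession;

// Minimal sketch, assuming a <account name> placeholder for the real storage
// account. Prints whether the WASB account key is visible to the Hadoop
// configuration that Spark uses.
public class CheckAzureKey {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("CheckAzureKey")
                .master("local[*]")
                .getOrCreate();
        Configuration hadoopConf = spark.sparkContext().hadoopConfiguration();
        String key = hadoopConf.get("fs.azure.account.key.<account name>.blob.core.windows.net");
        System.out.println(key == null ? "account key not set" : "account key is set");
        spark.stop();
    }
}
{code}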
