Hi Daniel,

As Peter pointed out, you need the hadoop-azure JAR as well as the Azure Storage 
SDK for Java (com.microsoft.azure:azure-storage). Even though the WASB driver 
is built for Hadoop 2.7, I was still able to use the hadoop-azure JAR with Spark 
built for older Hadoop versions, back to 2.4 I think.

Also, be sure to set your Storage Account key in your Spark Hadoop config, 
typically in core-site.xml:

<property>
  <name>fs.azure.account.key.{accountname}.blob.core.windows.net</name>
  <value>{storage key here}</value>
</property>
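
If editing core-site.xml is not convenient, the same setting can probably be applied from code instead. Below is a minimal PySpark-flavored sketch, not a tested recipe for HDInsight. The helper names (wasb_hadoop_conf, wasb_uri, apply_to_spark) are made up for illustration, the explicit fs.wasb.impl mapping is an assumption that may be unnecessary on some builds, and sc._jsc is PySpark's internal handle to the JVM context, so treat it as a workaround rather than a stable API.

```python
# Sketch: build the Hadoop settings and the URI needed to read from WASB.
# All helper names and sample values here are hypothetical.

def wasb_hadoop_conf(account, key):
    """Return the Hadoop properties for one storage account."""
    return {
        # Same property as the core-site.xml entry.
        "fs.azure.account.key.%s.blob.core.windows.net" % account: key,
        # Assumption: explicit scheme-to-class mapping, in case the build
        # does not register the wasb scheme on its own.
        "fs.wasb.impl": "org.apache.hadoop.fs.azure.NativeAzureFileSystem",
    }


def wasb_uri(container, account, path):
    """Build a wasb:// URI for a blob path inside a container."""
    return "wasb://%s@%s.blob.core.windows.net/%s" % (
        container, account, path.lstrip("/"))


def apply_to_spark(sc, conf):
    """Copy the settings into a live SparkContext's Hadoop configuration."""
    hadoop_conf = sc._jsc.hadoopConfiguration()  # internal PySpark handle
    for name, value in conf.items():
        hadoop_conf.set(name, value)
```

With a running context, usage would look roughly like apply_to_spark(sc, wasb_hadoop_conf("myaccount", "<storage key>")) followed by sc.textFile(wasb_uri("mycontainer", "myaccount", "data/input.txt")). The hadoop-azure and azure-storage JARs still have to be on the classpath for this to work.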

As a heads-up, I have a couple of projects for Spark on Azure. One pushes data 
to the Power BI service (both batch and streaming), and I’m finishing up 
another project for using Event Hubs as well. The Power BI library is up at 
http://spark-packages.org/package/granturing/spark-power-bi and the Event Hubs 
library should be up soon.

Thanks,
Silvio

From: Daniel Haviv
Date: Thursday, June 25, 2015 at 1:37 PM
To: user@spark.apache.org
Subject: Using Spark on Azure Blob Storage

Hi,
I'm trying to use Spark on Azure HDInsight, but the spark-shell fails when 
starting:
java.io.IOException: No FileSystem for scheme: wasb
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)

Is Azure's blob storage supported?

Thanks,
Daniel
