Below is a link to a step-by-step guide on how to set up and use Spark in HDInsight.
https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-spark-install/

Jacob

From: Daniel Haviv [mailto:daniel.ha...@veracity-group.com]
Sent: Thursday, June 25, 2015 3:19 PM
To: Silvio Fiorito
Cc: user@spark.apache.org
Subject: Re: Using Spark on Azure Blob Storage

Thank you guys for the helpful answers.

Daniel

On June 25, 2015, at 21:23, Silvio Fiorito <silvio.fior...@granturing.com> wrote:

Hi Daniel,

As Peter pointed out, you need the hadoop-azure JAR as well as the Azure storage SDK for Java (com.microsoft.azure:azure-storage). Even though the WASB driver is built for Hadoop 2.7, I was still able to use the hadoop-azure JAR with Spark built for older Hadoop versions, back to 2.4 I think.

Also, be sure to set your Storage Account key in your Spark Hadoop config, typically in core-site.xml:

<property>
  <name>fs.azure.account.key.{accountname}.blob.core.windows.net</name>
  <value>{storage key here}</value>
</property>

As a heads up, I have a couple of projects for Spark on Azure. One is to push data to the Power BI service (both batch and streaming), and I'm finishing up another project for using Event Hubs as well. The Power BI library is up at http://spark-packages.org/package/granturing/spark-power-bi and the Event Hubs library should be up soon.

Thanks,
Silvio

From: Daniel Haviv
Date: Thursday, June 25, 2015 at 1:37 PM
To: "user@spark.apache.org"
Subject: Using Spark on Azure Blob Storage

Hi,

I'm trying to use Spark over Azure's HDInsight, but the spark-shell fails when starting:

java.io.IOException: No FileSystem for scheme: wasb
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)

Is Azure's blob storage supported?

Thanks,
Daniel
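
For anyone following along, here is a minimal sketch in Python of the core-site.xml entry Silvio describes in his reply. It only renders the <property> element from the key pattern quoted in the thread; it does not touch Spark or Azure. The helper names (account_key_property, wasb_uri) and the sample values ("myaccount", "data") are hypothetical, and the wasb:// URI layout is an assumption that does not come from the thread, so check it against the hadoop-azure documentation for your version.

```python
import xml.etree.ElementTree as ET


def account_key_property(account_name: str, storage_key: str) -> str:
    """Render the core-site.xml <property> element for one storage account.

    Hypothetical helper: it follows the key pattern quoted in the thread,
    fs.azure.account.key.{accountname}.blob.core.windows.net
    """
    prop = ET.Element("property")
    name = ET.SubElement(prop, "name")
    name.text = f"fs.azure.account.key.{account_name}.blob.core.windows.net"
    value = ET.SubElement(prop, "value")
    value.text = storage_key
    # encoding="unicode" makes tostring return a str instead of bytes
    return ET.tostring(prop, encoding="unicode")


def wasb_uri(container: str, account_name: str, path: str) -> str:
    """Build a wasb:// URI for a blob path.

    Assumption (not stated in the thread): the usual layout is
    wasb://{container}@{account}.blob.core.windows.net/{path}
    """
    return (
        f"wasb://{container}@{account_name}.blob.core.windows.net/"
        f"{path.lstrip('/')}"
    )


if __name__ == "__main__":
    # Placeholder values only; substitute your own account name and key.
    print(account_key_property("myaccount", "{storage key here}"))
    print(wasb_uri("data", "myaccount", "/logs/part-0000"))
```

The printed <property> element would go inside the <configuration> block of core-site.xml. Generating it from a script is optional; the point is just that the account name in the property key and in the URI host must match.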