See this thread for some info: http://apache-spark-user-list.1001560.n3.nabble.com/DynamoDB-input-source-td8814.html
I don't think the situation has changed that much - if you're using Spark on EMR, then I think the InputFormat is available in a JAR (though I haven't tested that). Otherwise you'll need to try to get the JAR and see if you can get it to work outside of EMR.

I'm afraid this thread (https://forums.aws.amazon.com/thread.jspa?threadID=168506) does not appear encouraging, even for using Spark on EMR to read from DynamoDB using the InputFormat!

It's a pity AWS doesn't open source the InputFormat.

On Mon, Nov 16, 2015 at 5:00 AM, Charles Cobb <charc...@seas.upenn.edu> wrote:

> Hi,
>
> What is the best practice for reading from DynamoDB from Spark? I know I
> can use the Java API, but this doesn't seem to take data locality into
> consideration at all.
>
> I was looking for something along the lines of the cassandra connector:
> https://github.com/datastax/spark-cassandra-connector
>
> Thanks,
> CJ