I tried it myself a few weeks ago and saw that it "just works" for the very simple test I ran, too. I did see some error messages when running from sbt after the job completed successfully and the SparkContext was closing. I assume this has to do with resources held within the AccumuloInputFormat? This was with Accumulo 1.5.0. I haven't had time to look into it, but I was going to contribute an Accumulo example to https://github.com/apache/incubator-spark/tree/master/examples/src/main/scala/org/apache/spark/examples if I could get these messages cleared up.
...
14/01/15 10:29:13 INFO spark.SparkContext: Successfully stopped SparkContext
java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method)
	at org.apache.accumulo.core.client.impl.ThriftTransportPool$Closer.run(ThriftTransportPool.java:129)
	at java.lang.Thread.run(Thread.java:680)
java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method)
	at org.apache.accumulo.core.client.impl.ThriftTransportPool$Closer.run(ThriftTransportPool.java:129)
	at java.lang.Thread.run(Thread.java:680)
14/01/15 10:29:13 ERROR zookeeper.ClientCnxn: Event thread exiting due to interruption
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1996)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
14/01/15 10:29:13 INFO zookeeper.ClientCnxn: EventThread shut down
[success] Total time: 3 s, completed Jan 15, 2014 10:29:13 AM
> 14/01/15 10:29:23 WARN zookeeper.ClientCnxn: Session 0x14396efc3be0022 for server vm/192.168.221.2:2181, unexpected error, closing socket connection and attempting reconnect
java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:343)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

On Mon, Jan 13, 2014 at 10:37 AM, Matthew Molek <mmo...@clearedgeit.com> wrote:

> I just tried using AccumuloInputFormat as a data source for Spark running
> in standalone mode on a single node 'cluster'. Everything seems to work
> fine out of the box, as advertised. (Spark is supposed to work with any
> hadoop InputFormat)
>
> Just properly configure the AccumuloInputFormat, and pass it off to
> JavaSparkContext.newAPIHadoopRDD(...) to load the data into an RDD.
>
> The versions I tested with were Accumulo 1.5, Hadoop 1.2.1, and Spark
> 0.8.1.
>
> Is anyone else using Spark with Accumulo?
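For anyone following along, the approach Matthew describes (configure AccumuloInputFormat, then hand it to newAPIHadoopRDD) might look roughly like the sketch below in Scala, against the Accumulo 1.5 / Spark 0.8.x APIs. This is not a tested example; the instance name, ZooKeeper host, table, user, and password are all placeholders, not values from this thread, and it needs a live Accumulo cluster to actually run:

```scala
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat
import org.apache.accumulo.core.client.security.tokens.PasswordToken
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.accumulo.core.security.Authorizations
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.SparkContext

object AccumuloSparkSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "AccumuloSparkSketch")

    // The Job is used only as a carrier for the InputFormat's configuration.
    val job = new Job(new Configuration())

    // Placeholder connection details -- substitute your own cluster's values.
    AccumuloInputFormat.setZooKeeperInstance(job, "myInstance", "localhost:2181")
    AccumuloInputFormat.setConnectorInfo(job, "root", new PasswordToken("secret"))
    AccumuloInputFormat.setInputTableName(job, "mytable")
    AccumuloInputFormat.setScanAuthorizations(job, new Authorizations())

    // Each record comes back as an Accumulo (Key, Value) pair.
    val rdd = sc.newAPIHadoopRDD(job.getConfiguration,
      classOf[AccumuloInputFormat], classOf[Key], classOf[Value])

    println("Read " + rdd.count() + " entries")
    sc.stop()
  }
}
```

The Scala SparkContext.newAPIHadoopRDD is the same call as the JavaSparkContext one mentioned above; since the result is an ordinary RDD[(Key, Value)], any further map/filter/reduce works as usual.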