Hi Josh, thanks for the help!

> To your original question: you'd want to look at the method,
>
> `AccumuloInputFormat.setConnectorInfo(Job, String, AuthenticationToken)`

I found the functions to call pretty quickly; the puzzle was how to actually call them, since my existing code works with Configurations rather than Jobs. I've settled on creating a Job from my Configuration, invoking the new API calls on it, and then overwriting my prior Configuration with the new values from the Job, since the Job class copies its incoming Configuration rather than modifying it in place (contrary to what I had first assumed). This makes me feel slightly icky...
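In rough outline, the pattern is something like the sketch below. It's a simplified sketch rather than my actual code: the principal is a placeholder, and the copy-back loop is just the most literal version of that last step.

import java.io.File
import scala.collection.JavaConverters._
import org.apache.accumulo.core.client.mapreduce.AbstractInputFormat
import org.apache.accumulo.core.client.security.tokens.KerberosToken
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job

// In my real code this is the Configuration I already have.
val conf = new Configuration()

// Job.getInstance copies the incoming Configuration, so nothing set through
// the Job shows up in `conf` afterwards -- hence the copy-back at the end.
val job = Job.getInstance(conf)

// Placeholder principal; assumes the UserGroupInformation keytab login has
// already happened (visible in the log output below). Passing a KerberosToken
// is what triggers the automatic DelegationToken fetch.
val principal = "user@EXAMPLE.COM"
val token = new KerberosToken(principal)

// Called on AbstractInputFormat directly, since Scala can't see the inherited
// static method through AccumuloInputFormat.
AbstractInputFormat.setConnectorInfo(job, principal, token)

// Copy everything the InputFormat wrote into the Job back into my Configuration.
job.getConfiguration.asScala.foreach(e => conf.set(e.getKey, e.getValue))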
> (the implementation is actually on AbstractInputFormat if you're curious..)

Yup, and I call it directly since I'm using Scala.

> You would construct a KerberosToken via normal methods (Instance +
> ClientConfiguration) and pass that to this method. When you do this, the
> implementation automatically fetches delegation tokens for you (tl;dr on
> delegation tokens: short-lived password sufficient to identify you that
> prevents us from having to distribute your Kerberos credentials across the
> cluster).

Yup, that part seems to work fine:

scala> val rdd = spatialRDDProvider.rdd(new Configuration, sc, params, q)
17/05/19 21:30:49 INFO UserGroupInformation: Login successful for user [email protected] using keytab file /tmp/accumulo.headless.keytab
17/05/19 21:30:49 INFO UserGroupInformation: Login successful for user [email protected] using keytab file /tmp/accumulo.headless.keytab
17/05/19 21:30:50 INFO ENGINE: dataFileCache open start
17/05/19 21:30:51 INFO AccumuloInputFormat: Received KerberosToken, attempting to fetch DelegationToken
17/05/19 21:30:52 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 384.8 KB, free 365.9 MB)
17/05/19 21:30:52 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.1 KB, free 365.9 MB)
17/05/19 21:30:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.85.100:39803 (size: 28.1 KB, free: 366.3 MB)
17/05/19 21:30:52 INFO SparkContext: Created broadcast 0 from newAPIHadoopRDD at AccumuloSpatialRDDProvider.scala:130
17/05/19 21:30:52 INFO GeoMesaSparkKryoRegistratorEndpoint$: kryo-schema rpc endpoint registered on driver 192.168.85.100:35861
rdd: org.locationtech.geomesa.spark.SpatialRDD = SpatialRDD[2] at RDD at GeoMesaSpark.scala:58

However, I seem to get this when trying to use the DelegationToken:

scala> rdd.count()
17/05/19 21:30:55 INFO UserGroupInformation: Login successful for user [email protected] using keytab file /tmp/accumulo.headless.keytab
java.lang.NullPointerException
  at org.apache.accumulo.core.client.mapreduce.lib.impl.ConfiguratorBase.unwrapAuthenticationToken(ConfiguratorBase.java:493)
  at org.apache.accumulo.core.client.mapreduce.AbstractInputFormat.validateOptions(AbstractInputFormat.java:390)
  at org.apache.accumulo.core.client.mapreduce.AbstractInputFormat.getSplits(AbstractInputFormat.java:668)
  at org.locationtech.geomesa.jobs.mapreduce.GeoMesaAccumuloInputFormat.getSplits(GeoMesaAccumuloInputFormat.scala:174)
  at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:121)

Looking over the code, I can't see an obvious reason it would be null on those lines. Any help is much appreciated!

James
