Hi,

My Storm topology, which reads from Kafka and writes to Hadoop HDFS, fails exactly 24 hours after it is deployed!
I suspect the problem is that the topology is not able to renew its Kerberos tokens, or cannot find the keytab to renew from. Please share your thoughts and help me fix the issue. Below is the code used to configure the HDFS bolt.

Config object:
===========

// Build a map with the HDFS security configuration (keytab + principal)
Map<String, Object> hdfsSecConfigMap = new HashMap<String, Object>();
hdfsSecConfigMap.put("hdfs.keytab.file", ktPath);
hdfsSecConfigMap.put("hdfs.kerberos.principal", ktPrincipal);

// Build a map with the HBase security configuration
Map<String, Object> hbaseConfigMap = new HashMap<String, Object>();
hbaseConfigMap.put("hbase.rootdir", hbaseRootDir);
hbaseConfigMap.put("storm.keytab.file", ktPath);
hbaseConfigMap.put("storm.kerberos.principal", ktPrincipal);

Config configured = new Config();
configured.setDebug(true);
configured.put(hdfsConfKey, hdfsSecConfigMap);
configured.put(hbaseConfKey, hbaseConfigMap);
configured.setNumWorkers(2);
configured.setMaxSpoutPending(300);
configured.setNumAckers(30);
configured.setMessageTimeoutSecs(1200);
configured.put(HdfsSecurityUtil.STORM_KEYTAB_FILE_KEY, ktPath);
configured.put(HdfsSecurityUtil.STORM_USER_NAME_KEY, ktPrincipal);
configured.put(HBaseSecurityUtil.STORM_KEYTAB_FILE_KEY, ktPath);
configured.put(HBaseSecurityUtil.STORM_USER_NAME_KEY, ktPrincipal);

Building the HDFS bolt:
===========

HdfsBolt hdfsbolt = new HdfsBolt()
        .withFsUrl(hdfsuri)
        .withRecordFormat(recFormat)
        .withFileNameFormat(fileNameWithPath)
        .withRotationPolicy(fileRotationSize)
        .withSyncPolicy(syncPolicy)
        // this key must point at the Config entry holding the
        // hdfs.keytab.file / hdfs.kerberos.principal map above
        .withConfigKey(secBypassConfigKey);

TopologyBuilder setup:
===========

builder.setBolt("hdfsBolt", hdfsbolt, 1)
        .setNumTasks(1)
        .shuffleGrouping("kafka-spout");
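For reference, my understanding is that the keytab has to be present at ktPath on every worker node, and that the HDFS client should re-login from it when the TGT expires; the default ticket lifetime is often 24 hours, which matches exactly when my topology starts failing. Below is a rough sketch of the re-login I would expect to happen, using the standard Hadoop UserGroupInformation API (the class and method names are just illustrative; ktPrincipal and ktPath are the same values used in the configuration above):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

// Illustrative sketch only: log in from the keytab once, then refresh
// the TGT periodically so HDFS calls keep working past ticket expiry.
public class KerberosRelogin {

    public static void login(String ktPrincipal, String ktPath) throws IOException {
        Configuration hadoopConf = new Configuration();
        hadoopConf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(hadoopConf);

        // Initial login from the keytab file on the local worker node.
        UserGroupInformation.loginUserFromKeytab(ktPrincipal, ktPath);
    }

    public static void refreshIfNeeded() throws IOException {
        // Re-logs in from the keytab only when the TGT is close to
        // expiry; otherwise this is a cheap no-op, so it should be
        // safe to call before every flush/sync.
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
    }
}

Is that roughly what the bolt is supposed to do with the keytab settings above, and if so, is there anything in my configuration that would prevent the re-login?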
The exception I am facing is below:

java.io.IOException: IOException flush: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "**********"; destination host is: "***************":8020;
        at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2082) ~[stormjar.jar:?]
        at org.apache.hadoop.hdfs.DFSOutputStream.hsync(DFSOutputStream.java:1969) ~[stormjar.jar:?]
        at org.apache.hadoop.hdfs.client.HdfsDataOutputStream.hsync(HdfsDataOutputStream.java:95) ~[stormjar.jar:?]
        at org.apache.storm.hdfs.bolt.HdfsBolt.execute(HdfsBolt.java:100) [stormjar.jar:?]
        at backtype.storm.daemon.executor$fn__3697$tuple_action_fn__3699.invoke(executor.clj:670) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.daemon.executor$mk_task_receiver$fn__3620.invoke(executor.clj:426) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.disruptor$clojure_handler$reify__3196.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.daemon.executor$fn__3697$fn__3710$fn__3761.invoke(executor.clj:808) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.util$async_loop$fn__544.invoke(util.clj:475) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_73]

Thanks a lot in advance for your valuable thoughts.

Regards,
Raja Aravapalli