I am running Spark on DSE Cassandra with multiple analytics data centers. My understanding is that with this setup you should have a separate CFS file system for each data center. I was able to create an additional CFS file system as described here:

http://docs.datastax.com/en/latest-dse/datastax_enterprise/ana/anaCFS.html

I verified that the additional CFS file system was created properly.
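(For reference, a minimal way to check this from a node in the second data center is sketched below; the host name is a placeholder, and I am assuming the additional CFS is backed by a keyspace of the same name.)

    # DESCRIBE prints the CREATE KEYSPACE statement, including the per-data-center replication
    $ cqlsh <node_in_second_analytics_datacenter>
    cqlsh> DESCRIBE KEYSPACE <additional_cfs_name>;

The replication settings in that output list a non-zero factor for both analytics data centers.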
I am now following the instructions here to configure Spark on the second data center to use its own CFS:

http://docs.datastax.com/en/latest-dse/datastax_enterprise/spark/sparkConfHistoryServer.html

However, running:

    dse hadoop fs -mkdir <additional_cfs_name>:/spark/events

fails with:

    WARN You are going to access CFS keyspace: cfs in data center: <second_analytics_datacenter>. It will not work because the replication factor for this keyspace in this data center is 0.
    ....
    Bad connection to FS. command aborted. exception: UnavailableException()

That is, it appears that the <additional_cfs_name>: prefix in the hadoop command is being ignored and the command is trying to connect to cfs: rather than <additional_cfs_name>:.
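For what it is worth, the replication that the warning complains about can also be confirmed from a node in the second data center (a sketch only; the host is a placeholder, and system.schema_keyspaces is the schema table exposed by the Cassandra 2.x versions underlying current DSE, so adjust if your version differs):

    # strategy_options lists the per-data-center replication factors for the default cfs keyspace
    $ cqlsh <node_in_second_analytics_datacenter>
    cqlsh> SELECT keyspace_name, strategy_options FROM system.schema_keyspaces WHERE keyspace_name = 'cfs';

As the warning says, cfs has no entry (effectively a factor of 0) for <second_analytics_datacenter>, which is why it looks like the URI prefix is simply being ignored rather than there being a problem with the new keyspace itself.

Has anybody else run into this?

Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini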