Hi Spark Experts,
We are trying to streamline the development lifecycle of our data
scientists taking algorithms from the lab into production. Currently the
tool of choice for our data scientists is R. Historically, our engineers
have had to manually convert the R-based algorithms to Java or
'this 2-node replication is mainly for failover in case the receiver dies
while data is in flight. there's still chance for data loss as there's no
write ahead log on the hot path, but this is being addressed.'
Can you comment a little on how this will be addressed? Will there be a
durable WAL?
Hi Yan,
That is a good suggestion. I believe non-ZooKeeper offset management will
be a feature in the upcoming Kafka 0.8.2 release, tentatively scheduled for
September.
https://cwiki.apache.org/confluence/display/KAFKA/Inbuilt+Consumer+Offset+Management
That should make this fairly easy to