Re: Change Data Capture (CDC) with Kudu

2017-09-29 Thread Todd Lipcon
e from git and ran this command: find . -name \*.proto | grep -v 'rtest.proto' | xargs cat | grep '^ rpc ' | sort) and this is an initial list of the RPC methods that I think would have to be captured: - AlterSchema - AlterTable - CreateTable - CreateTablet - DeleteTable

Re: Change Data Capture (CDC) with Kudu

2017-09-23 Thread Franco Venturi
g Sent: Friday, September 22, 2017 5:32:41 PM Subject: Re: Change Data Capture (CDC) with Kudu Franco, I just realized that I suggested something you mentioned in your initial email. My mistake for not reading through to the end. It is probably the least-worst approach right now and it's

Re: Change Data Capture (CDC) with Kudu

2017-09-22 Thread Mike Percy
Franco, I just realized that I suggested something you mentioned in your initial email. My mistake for not reading through to the end. It is probably the least-worst approach right now and it's probably what I would do if I were you. Mike On Fri, Sep 22, 2017 at 2:29 PM, Mike Percy

Re: Change Data Capture (CDC) with Kudu

2017-09-22 Thread Mike Percy
CDC is something that I would like to see in Kudu, but we aren't there yet with the underlying support in the Raft Consensus implementation. Once we have higher availability re-replication support (KUDU-1097) we will be a bit closer to a solution involving traditional WAL streaming to an external

Re: Change Data Capture (CDC) with Kudu

2017-09-22 Thread Adar Lieber-Dembo
Franco, Thanks for the detailed description of your problem. I'm afraid there's no such mechanism in Kudu today. Mining the WALs seems like a path fraught with land mines. Kudu GCs WAL segments aggressively so I'd be worried about a listening mechanism missing out on some row operations. Plus

Change Data Capture (CDC) with Kudu

2017-09-21 Thread Franco Venturi
We are planning for a 50-100TB Kudu installation (about 200 tables or so). One of the requirements that we are working on is to have a secondary copy of our data in a Disaster Recovery data center in a different location. Since we are going to have inserts, updates, and deletes (for