+1 on Robert's proposal.

On Sat, Nov 18, 2017 at 6:21 AM, Robert Metzger <[email protected]> wrote:
> I'm really sorry that I didn't respond yet.
>
> Regarding the question "should the code be in the bahir-flink repo, or the
> kudu repo"
> I have the feeling that the kudu repo might actually be the better spot,
> because flume, mapreduce, hive, spark, and others are already there. It
> seems that the Kudu project is accepting such contributions and is also
> willing to maintain them.
>
> If the Kudu project rejects the contribution for now, I would suggest
> providing a script that builds the flink-kudu connector, starts kudu (maybe
> from a docker image?) and then runs a few test jobs.
>
>
> On Fri, Nov 3, 2017 at 11:45 AM, Nacho Garcia Fernandez <[email protected]>
> wrote:
>
> > Can somebody out there please reply to my last question? Thanks in
> > advance :D
> >
> > On 27 October 2017 at 14:23, Nacho Garcia Fernandez <[email protected]>
> > wrote:
> >
> > > Hello all.
> > >
> > > I'm a little bit stuck with one issue that I hope you can help me with.
> > >
> > > I'm developing a flink-connector-kudu extension that allows reading
> > > from Kudu and writing to Flink and Kudu. This connector addresses the
> > > issue [BAHIR-99] and is a full re-implementation of
> > > https://github.com/apache/bahir-flink/pull/17.
> > >
> > > I'm struggling with testing: how is it supposed to be handled when the
> > > data storage (Kudu) does not provide an embedded driver?
> > >
> > > Kudu does not provide any embedded Java-based driver yet, and I need a
> > > built Kudu to test against; otherwise I cannot test this connector
> > > end-to-end with a "real" data storage.
> > >
> > > Because of that I see three main possibilities for this scenario:
> > >
> > > * Create a mock for the Kudu classes (KuduSession, KuduTable,
> > >   KuduClient, etc.).
> > >
> > > * Use Kudu's MiniKuduCluster utility to instantiate a local cluster:
> > >   this is not possible because it needs a real build of Kudu on the
> > >   local machine.
> > >
> > > * Update travis.yml to install a Kudu server: this would fix the
> > >   problem for CI, but tests would still fail locally. Moreover,
> > >   building Kudu takes a long time (more than 20 minutes), which is not
> > >   feasible for CI.
> > >
> > > * Ignore testing: not an option :)
> > >
> > >
> > > In the case of Kudu, I saw that connectors for other distributed
> > > analytics platforms (e.g. Spark) are implemented directly in the Kudu
> > > repo (https://github.com/apache/kudu/tree/master/java/kudu-spark)
> > > instead of using bahir-spark. I think this is good because when you
> > > execute the tests you have a real build of Kudu to test against.
> > >
> > > What is the best place (kudu vs bahir) for this connector if we take
> > > the above-mentioned issues into consideration?
> > >
> > > If the answer is bahir-flink, how should I proceed with my tests? :)
> > >
> > > Thanks in advance.
