Re: JDBCInputFormat preparation with Flink 1.1-SNAPSHOT and Scala 2.11

Chesnay Schepler Wed, 09 Mar 2016 04:36:13 -0800

Now that I look back at my mail, I may have given you the wrong idea about the prototype. To make sure we are on the same page: the only thing it enables is using the JDBCInputFormat without providing a separate TypeInformation. It still works with tuples, not POJOs.

You can find the prototype here: https://github.com/zentol/flink/tree/3445_jdbc

The JDBCInputFormat there implements ResultTypeQueryable. Within getProducedType it executes a dummy query, reads the ResultSetMetaData, and generates a TypeInfo from it.
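
For illustration only (this is not taken from the prototype branch), the core of such a metadata-driven approach could look roughly like the sketch below. It maps `java.sql.Types` codes to Java classes using only the JDK. The class and method names (`SqlTypeMapping`, `javaClassFor`, `fieldClasses`) are made up for this example. In a real format the codes would presumably come from `ResultSetMetaData.getColumnType(i)` for columns 1 through `getColumnCount()`, and the resulting classes would then be turned into a tuple type description.

```java
import java.sql.Types;

// Hypothetical helper, not part of Flink or of the prototype.
final class SqlTypeMapping {

    // Maps one java.sql.Types code to the Java class commonly used for it.
    static Class<?> javaClassFor(int sqlType) {
        switch (sqlType) {
            case Types.CHAR:
            case Types.VARCHAR:
            case Types.LONGVARCHAR:
                return String.class;
            case Types.SMALLINT:
                return Short.class;
            case Types.INTEGER:
                return Integer.class;
            case Types.BIGINT:
                return Long.class;
            case Types.REAL:
                return Float.class;
            case Types.FLOAT:
            case Types.DOUBLE:
                return Double.class;
            case Types.BIT:
            case Types.BOOLEAN:
                return Boolean.class;
            default:
                throw new IllegalArgumentException("Unsupported SQL type code: " + sqlType);
        }
    }

    // Maps the type codes of all columns, in column order.
    static Class<?>[] fieldClasses(int[] sqlTypes) {
        Class<?>[] result = new Class<?>[sqlTypes.length];
        for (int i = 0; i < sqlTypes.length; i++) {
            result[i] = javaClassFor(sqlTypes[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed example: a two-column result (a VARCHAR and an INTEGER column).
        for (Class<?> c : fieldClasses(new int[] {Types.VARCHAR, Types.INTEGER})) {
            System.out.println(c.getSimpleName());
        }
    }
}
```

The number of returned classes fixes the arity, which is why this approach still produces tuples of a defined size.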


On 09.03.2016 12:46, Prez Cannady wrote:

I suspected as much (the tuple size limitation). Creating my own InputFormat seems to be the best solution, but before I go down that rabbit hole I wanted to see at least a semi-trivial working example of JDBCInputFormat with Scala 2.11.
I’d appreciate a look at that prototype if it's publicly available (even if it is Java). I might glean a hint from it.
Prez Cannady
p: 617 500 3378
e: revp...@opencorrelate.org
GH: https://github.com/opencorrelate
LI: https://www.linkedin.com/in/revprez
On Mar 9, 2016, at 3:25 AM, Chesnay Schepler <ches...@apache.org> wrote:
You can always create your own InputFormat, but there is no AbstractJDBCInputFormat, if that's what you were looking for.
When you say arbitrary tuple size, do you mean a) a size greater than 25, or b) tuples of different sizes?
If a) unless you are fine with using nested tuples, you won't get around the tuple size limitation. Since the user has to be aware of the nesting (the fields are accessed directly via tuple.f0 etc.), this can't really be done in a general-purpose fashion.
If b) this will straight-up not work with tuples.
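
To make the nesting point in a) concrete, here is a small stand-alone sketch. `Pair` is a stand-in written for this example with public positional fields `f0` and `f1`; it is not Flink's tuple class, and the row values are invented. It only shows why the caller has to know how the columns were split.

```java
// Hypothetical illustration; Pair only mimics positional tuple fields.
final class NestedTupleSketch {

    static final class Pair<A, B> {
        public final A f0;
        public final B f1;

        Pair(A f0, B f1) {
            this.f0 = f0;
            this.f1 = f1;
        }
    }

    // A four-column row packed as two nested pairs: ((name, age), (city, score)).
    static Pair<Pair<String, Integer>, Pair<String, Double>> row() {
        Pair<String, Integer> left = new Pair<>("alice", 30);
        Pair<String, Double> right = new Pair<>("boston", 1.5);
        return new Pair<>(left, right);
    }

    public static void main(String[] args) {
        Pair<Pair<String, Integer>, Pair<String, Double>> r = row();
        // The third logical column is not r.f2; it has to be reached as r.f1.f0.
        System.out.println(r.f1.f0);
    }
}
```

Any code that reads the row has to hard-code paths such as `r.f1.f0`, so a generic input format cannot hide the nesting from the user.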

You could use POJOs, though. Then you could also group by column names.
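
As a rough sketch of what such a POJO might look like: the `Person` class, its `age` field, and the `fromRow` helper below are assumptions for illustration (the query in this thread only selects `name`). The shape follows the usual POJO conventions as far as I recall them: a public class, a public no-argument constructor, and public fields (or getters and setters).

```java
// Hypothetical example; names and columns are assumed, not from the thread.
final class PojoSketch {

    public static class Person {
        public String name;
        public int age;

        // Public no-argument constructor so the object can be created reflectively.
        public Person() {}

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Converts one row, given as values in column order (name, age), into a Person.
    static Person fromRow(Object[] row) {
        return new Person((String) row[0], (Integer) row[1]);
    }

    public static void main(String[] args) {
        Person p = fromRow(new Object[] {"alice", 30});
        System.out.println(p.name + " " + p.age);
    }
}
```

With named fields like these, a grouping key can refer to `name` rather than to a tuple position.
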
I'm not sure about Scala, but in the Java Stream API you can pass the InputFormat and the TypeInformation into createInput.
I recently did a prototype where the input type is determined automatically by querying the database. If this is a problem for you, feel free to ping me.
On 09.03.2016 03:17, Prez Cannady wrote:
I’m attempting to create a stream using JDBCInputFormat. Objective is to convert each record into a tuple and then serialize for input into a Kafka topic. Here’s what I have so far.
```
val env = StreamExecutionEnvironment.getExecutionEnvironment

val inputFormat = JDBCInputFormat.buildJDBCInputFormat()
      .setDrivername("org.postgresql.Driver")
      .setDBUrl("jdbc:postgresql:test")
      .setQuery("select name from persons")
      .finish()

val stream : DataStream[Tuple1[String]] = env.createInput(...)
```
I think this is essentially what I want to do. It would be nice if I could return tuples of arbitrary length, but reading the code suggests I have to commit to a defined arity. So I have some questions.
1. Is there a better way to read from a database (i.e., defining my own `InputFormat` using Slick)?
2. To get the above example working, what should I supply to `createInput`?
Prez Cannady
