We're also trying to work around the current limitations of the Table API: we read DataSets with purpose-built input formats that return a POJO Row containing the list of values (but we read all values as String...). We would also need a way to abstract the composition of Flink operators and UDFs, so that a transformation can be composed from a graphical UI or from a script. During the Stratosphere project there were Meteor and Sopremo, which allowed that [1], but they were dropped in favour of a Pig integration, and I don't know whether that was ever completed. Some days ago I discovered the Piglet project [2], which allows using Pig with Spark and Flink, but I don't know how well it works (the Flink integration is also very recent and not documented anywhere).
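To make the "read everything as String, then infer" idea concrete, here is a minimal sketch in plain Java. It deliberately has no Flink dependency and is not the Table API or our actual input format. The class `CsvSchemaSketch`, its methods and the three-type lattice (LONG, DOUBLE, STRING) are all hypothetical names chosen for illustration, and it assumes a simple CSV with a header line and no quoted or escaped delimiters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch: infer column names and types from the first lines of a CSV. */
final class CsvSchemaSketch {

    /** Candidate column types, ordered from most to least specific. */
    enum ColumnType { LONG, DOUBLE, STRING }

    /** Splits on a literal delimiter and keeps trailing empty cells. */
    static String[] split(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1);
    }

    /** Narrowest type that can represent a single raw value. */
    static ColumnType typeOf(String raw) {
        String v = raw.trim();
        if (v.isEmpty()) {
            return ColumnType.STRING; // assumption: an empty cell forces STRING
        }
        try {
            Long.parseLong(v);
            return ColumnType.LONG;
        } catch (NumberFormatException ignored) {
            // not an integer, try a floating-point value next
        }
        try {
            Double.parseDouble(v);
            return ColumnType.DOUBLE;
        } catch (NumberFormatException e) {
            return ColumnType.STRING;
        }
    }

    /** The less specific of two types, e.g. LONG and DOUBLE give DOUBLE. */
    static ColumnType widen(ColumnType a, ColumnType b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    /** Field names taken from the first line. */
    static List<String> header(List<String> lines, String delimiter) {
        List<String> names = new ArrayList<>();
        for (String name : split(lines.get(0), delimiter)) {
            names.add(name.trim());
        }
        return names;
    }

    /** Infers one type per column from at most sampleSize data rows. */
    static List<ColumnType> inferTypes(List<String> lines, String delimiter, int sampleSize) {
        int columns = split(lines.get(0), delimiter).length;
        ColumnType[] types = new ColumnType[columns];
        int last = Math.min(lines.size(), 1 + sampleSize);
        for (int i = 1; i < last; i++) {
            String[] cells = split(lines.get(i), delimiter);
            // Cells missing from a short row are simply not sampled.
            for (int c = 0; c < Math.min(columns, cells.length); c++) {
                ColumnType seen = typeOf(cells[c]);
                types[c] = (types[c] == null) ? seen : widen(types[c], seen);
            }
        }
        List<ColumnType> result = new ArrayList<>();
        for (ColumnType t : types) {
            result.add(t == null ? ColumnType.STRING : t); // no sample: fall back to STRING
        }
        return result;
    }

    /** Converts one line using the inferred types; throws NumberFormatException on a mismatch. */
    static Object[] parseRow(String line, String delimiter, List<ColumnType> types) {
        String[] cells = split(line, delimiter);
        Object[] row = new Object[types.size()];
        for (int c = 0; c < row.length; c++) {
            String v = c < cells.length ? cells[c].trim() : "";
            switch (types.get(c)) {
                case LONG:   row[c] = Long.parseLong(v); break;
                case DOUBLE: row[c] = Double.parseDouble(v); break;
                default:     row[c] = v;
            }
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("id,name,score", "1,alice,3.5", "2,bob,4");
        List<String> names = header(lines, ",");
        List<ColumnType> types = inferTypes(lines, ",", 100);
        System.out.println(names + " -> " + types);
        for (String line : lines.subList(1, lines.size())) {
            System.out.println(Arrays.toString(parseRow(line, ",", types)));
        }
    }
}
```

The inference only looks at a sample, so a value further down the file that does not fit the inferred type makes `parseRow` fail. Falling back to STRING for such columns would be the safer choice in practice.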
Best,
Flavio

[1] http://stratosphere.eu/assets/papers/Sopremo_Meteor%20BigData.pdf
[2] https://github.com/ksattler/piglet

On Thu, Apr 21, 2016 at 2:41 PM, Fabian Hueske <[email protected]> wrote:

> Hi Simone,
>
> in Flink 1.0.x, the Table API does not support reading external data,
> i.e., it is not possible to read a CSV file directly from the Table API.
> Tables can only be created from DataSet or DataStream which means that the
> data is already converted into "Flink types".
>
> However, the Table API is currently under heavy development as part of the
> efforts to add SQL support.
> This work is taking place on the master branch and I am currently working
> on interfaces to scan external data sets or ingest external data streams.
> The interface will be quite generic such that it should be possible to
> define a table source that reads the first lines of a file to infer
> attribute names and types.
> You can have a look at the current state of the API design here [1].
>
> Feedback is welcome and can be very easily included in this phase of the
> development ;-)
>
> Cheers, Fabian
>
> [1]
> https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0
> <https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0/edit#>
>
> 2016-04-21 14:26 GMT+02:00 Simone Robutti <[email protected]>:
>
>> Hello,
>>
>> I would like to know if it's possible to create a Flink Table from an
>> arbitrary CSV (or any other form of tabular data) without doing type-safe
>> parsing with explicitly typed classes/POJOs.
>>
>> To my knowledge this is not possible but I would like to know if I'm
>> missing something. My requirement is to be able to read a CSV file and
>> manipulate it reading the field names from the file and inferring data
>> types.
>>
>> Thanks,
>>
>> Simone
>>
