Re: Explanation on limitations of the Flink Table API

2016-04-21 Thread Flavio Pompermaier
We're also trying to work around the current limitations of the Table API:
we're reading DataSets with purpose-built input formats that return a POJO
Row containing the list of values (but we're reading all values as
Strings...).
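Stripped of the Flink-specific InputFormat plumbing, that workaround could look roughly like the minimal sketch below. The `StringRow` class and its accessors are hypothetical illustrations, not an actual Flink type: every field is held as a raw String and converted only when the consumer needs it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the POJO Row described above: all fields are
// kept as raw Strings; type conversion is deferred to the consumer.
public class StringRow {
    private final List<String> values;

    public StringRow(String... values) {
        this.values = Arrays.asList(values);
    }

    public String getString(int i) {
        return values.get(i);
    }

    // Conversions happen lazily, at access time.
    public int getInt(int i) {
        return Integer.parseInt(values.get(i));
    }

    public double getDouble(int i) {
        return Double.parseDouble(values.get(i));
    }

    public int arity() {
        return values.size();
    }

    public static void main(String[] args) {
        // One parsed CSV line, all fields read as Strings.
        StringRow row = new StringRow("42", "3.14", "hello");
        System.out.println(row.getInt(0));
        System.out.println(row.getDouble(1));
        System.out.println(row.getString(2));
    }
}
```

The obvious downside, as noted above, is that type errors only surface at access time rather than at parse time.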
Actually, we would also need a way to abstract the composition of Flink
operators and UDFs, so that a transformation can be composed from a
graphical UI or from a script. During the Stratosphere project, Meteor and
Sopremo allowed that [1], but they were dismissed in favour of a Pig
integration, though I don't know whether it was ever completed. Some days
ago I discovered the Piglet project [2], which allows using Pig with Spark
and Flink, but I don't know how well it works (the Flink integration is
also very recent and not documented anywhere).

Best,
Flavio

[1] http://stratosphere.eu/assets/papers/Sopremo_Meteor%20BigData.pdf
[2] https://github.com/ksattler/piglet

On Thu, Apr 21, 2016 at 2:41 PM, Fabian Hueske  wrote:

> Hi Simone,
>
> in Flink 1.0.x, the Table API does not support reading external data,
> i.e., it is not possible to read a CSV file directly from the Table API.
> Tables can only be created from DataSet or DataStream which means that the
> data is already converted into "Flink types".
>
> However, the Table API is currently under heavy development as part of
> the efforts to add SQL support.
> This work is taking place on the master branch and I am currently working
> on interfaces to scan external data sets or ingest external data streams.
> The interface will be quite generic such that it should be possible to
> define a table source that reads the first lines of a file to infer
> attribute names and types.
> You can have a look at the current state of the API design here [1].
>
> Feedback is welcome and can be very easily included in this phase of the
> development ;-)
>
> Cheers, Fabian
>
> [1]
> https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0
> 
>
> 2016-04-21 14:26 GMT+02:00 Simone Robutti :
>
>> Hello,
>>
>> I would like to know if it's possible to create a Flink Table from an
>> arbitrary CSV (or any other form of tabular data) without doing type-safe
>> parsing with explicitly typed classes/POJOs.
>>
>> To my knowledge this is not possible but I would like to know if I'm
>> missing something. My requirement is to be able to read a CSV file and
>> manipulate it reading the field names from the file and inferring data
>> types.
>>
>> Thanks,
>>
>> Simone
>>
>
>
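The table source Fabian describes above, one that reads the first lines of a file to infer attribute names and types, could be sketched as follows. This is purely illustrative and outside any real Flink API: the `CsvSchemaSniffer` class, its `FieldType` enum, and the regex-based inference are all assumptions, showing one plausible way to take field names from a header line and guess a type per column from sample rows.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not Flink's actual API): infer attribute names from
// the header line and attribute types from a handful of sample rows.
public class CsvSchemaSniffer {

    enum FieldType { INT, DOUBLE, STRING }

    // Field names come straight from the first (header) line.
    static String[] inferNames(String headerLine) {
        return headerLine.split(",");
    }

    // The narrowest type that accepts every sampled value wins;
    // String is the fallback when nothing else parses.
    static FieldType inferType(List<String> samples) {
        FieldType best = FieldType.INT;
        for (String s : samples) {
            if (best == FieldType.INT && !s.matches("-?\\d+")) {
                best = FieldType.DOUBLE;
            }
            if (best == FieldType.DOUBLE && !s.matches("-?\\d+(\\.\\d+)?")) {
                best = FieldType.STRING;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] lines = {
            "id,price,name",
            "1,9.99,apple",
            "2,0.5,banana"
        };
        String[] names = inferNames(lines[0]);
        // Collect the sample values column by column, then infer each type.
        for (int col = 0; col < names.length; col++) {
            List<String> samples = new ArrayList<>();
            for (int r = 1; r < lines.length; r++) {
                samples.add(lines[r].split(",")[col]);
            }
            System.out.println(names[col] + ": " + inferType(samples));
        }
    }
}
```

A real implementation would also need to handle quoting, escaping, nulls, and the possibility that sampled rows are not representative, which is presumably why such inference is only "possible to define" on top of the generic interface rather than built in.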


