Re: Reading csv-files in parallel

2018-05-09 Thread Fabian Hueske
L together in Scala code? > > > > Best, Esa > > > > *From:* Fabian Hueske <fhue...@gmail.com> > *Sent:* Tuesday, May 8, 2018 10:26 PM > > *To:* Esa Heikkinen <esa.heikki...@student.tut.fi> > *Cc:* user@flink.apache.org > *Subject:* Re: Reading csv-files in pa

RE: Reading csv-files in parallel

2018-05-09 Thread Esa Heikkinen
.org Subject: Re: Reading csv-files in parallel Hi, the Table API / SQL and the DataSet API can be used together in the same program. So you could read the data with a custom input format or a TextInputFormat and a custom MapFunction parser and hand it to SQL afterwards. The program would be a r

Re: Reading csv-files in parallel

2018-05-08 Thread Fabian Hueske
> > > > Esa > > > > *From:* Fabian Hueske <fhue...@gmail.com> > *Sent:* Tuesday, May 8, 2018 2:00 PM > > *To:* Esa Heikkinen <esa.heikki...@student.tut.fi> > *Cc:* user@flink.apache.org > *Subject:* Re: Reading csv-files in parallel > > >

RE: Reading csv-files in parallel

2018-05-08 Thread Esa Heikkinen
(state-machine-based) logic for reading csv-files in a certain order. Esa From: Fabian Hueske <fhue...@gmail.com> Sent: Tuesday, May 8, 2018 2:00 PM To: Esa Heikkinen <esa.heikki...@student.tut.fi> Cc: user@flink.apache.org Subject: Re: Reading csv-files in parallel Hi, the easiest approac

Re: Reading csv-files in parallel

2018-05-08 Thread Fabian Hueske
at read csv-files > parallel? > > > > Best, Esa > > > > *From:* Fabian Hueske <fhue...@gmail.com> > *Sent:* Monday, May 7, 2018 3:48 PM > *To:* Esa Heikkinen <esa.heikki...@student.tut.fi> > *Cc:* user@flink.apache.org > *Subject:* Re: Reading csv-

RE: Reading csv-files in parallel

2018-05-08 Thread Esa Heikkinen
Monday, May 7, 2018 3:48 PM To: Esa Heikkinen <esa.heikki...@student.tut.fi> Cc: user@flink.apache.org Subject: Re: Reading csv-files in parallel Hi Esa, you can certainly read CSV files in parallel. This works very well in a batch query. For streaming queries that expect data to be ingested

Re: Reading csv-files in parallel

2018-05-07 Thread Fabian Hueske
Hi Esa, you can certainly read CSV files in parallel. This works very well in a batch query. For streaming queries that expect data to be ingested in timestamp order, this is much more challenging, because you 1) need to read the files in the right order and 2) cannot split files (unless you