Yes, I've read it! Will it also support the HBase TableInputFormat (HTable and Scan are no longer serializable), or does the HBase addon basically become useless?

On Nov 12, 2014 9:10 PM, "Fabian Hueske" <[email protected]> wrote:
> Hi,
>
> just want to let you know that we opened a JIRA (FLINK-1236) to support
> local split assignment for the HadoopInputFormat.
> At least this performance issue should be easy to solve :-)
>
> 2014-11-11 12:44 GMT+01:00 Fabian Hueske <[email protected]>:
>
>> First of all, split locality can make a huge difference.
>> It will also enable a tighter integration, API-wise as well as for the
>> execution, for example by pushing filters or projections directly into
>> the data source and thereby reducing the data to be read from the file
>> system.
>>
>> 2014-11-11 12:30 GMT+01:00 Flavio Pompermaier <[email protected]>:
>>
>>> Maybe this is a dumb question, but could you explain the benefits of a
>>> dedicated Flink IF vs. the one available by default through the Hadoop
>>> IF wrapper?
>>> Is it just because of data locality of task slots?
>>>
>>> On Tue, Nov 11, 2014 at 12:16 PM, Fabian Hueske <[email protected]>
>>> wrote:
>>>
>>>> Hi Flavio,
>>>>
>>>> I am not aware of a Flink InputFormat for Parquet. However, it should
>>>> hopefully be covered by the Hadoop IF wrapper.
>>>> A dedicated Flink IF would be great though, IMO.
>>>>
>>>> Best, Fabian
>>>>
>>>> 2014-11-11 12:10 GMT+01:00 Flavio Pompermaier <[email protected]>:
>>>>
>>>>> Hi to all,
>>>>>
>>>>> I'd like to know whether Flink is able to exploit the Parquet format
>>>>> to read data efficiently from HDFS.
>>>>> Is there any example available?
>>>>>
>>>>> Best,
>>>>> Flavio
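[For readers of the archive: reading Parquet through the Hadoop IF wrapper, as Fabian suggests above, might look roughly like the sketch below. This is an illustration, not a tested recipe: it assumes `flink-hadoop-compatibility` and `parquet-avro` are on the classpath, the package and class names are from memory of the Flink 0.x-era Java API and older parquet-mr releases (e.g. `parquet.avro.AvroParquetInputFormat` moved packages in later versions), and the input path is hypothetical. The unchecked cast is needed because older AvroParquetInputFormat versions were not generic.]

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.Job;
import parquet.avro.AvroParquetInputFormat;

public class ParquetReadSketch {

  @SuppressWarnings({"unchecked", "rawtypes"})
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    Job job = Job.getInstance();
    // Tell the Parquet input format where to read from (hypothetical path).
    AvroParquetInputFormat.addInputPath(job, new Path("hdfs:///path/to/data.parquet"));

    // Wrap the Hadoop mapreduce AvroParquetInputFormat in Flink's Hadoop IF wrapper.
    // Keys are Void; values are Avro GenericRecords materialized from Parquet rows.
    HadoopInputFormat<Void, GenericRecord> parquetIF =
        new HadoopInputFormat<Void, GenericRecord>(
            (InputFormat) new AvroParquetInputFormat(),
            Void.class, GenericRecord.class, job);

    DataSet<Tuple2<Void, GenericRecord>> records = env.createInput(parquetIF);
    records.print();
  }
}
```

Note that, as discussed above, the wrapper at this point does not do locality-aware split assignment (see FLINK-1236), and filters or projections are applied after reading rather than pushed into the Parquet reader.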
