Re: How & Where does Flink store data for aggregations.

2017-11-24 Thread Piotr Nowojski
Hi, Flink will have to maintain the state of the defined aggregations for each window and key (the more names you have, the bigger the state). Flink's state backend will be used for that (for example, memory or RocksDB). However, in most cases the state will be small and not dependent on the length of t…
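For illustration, a minimal Java sketch (not from the thread; the field names, HDFS path, and window size are made up) of a keyed window aggregation whose per-key, per-window state is kept in a RocksDB state backend:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class WindowAggregationSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // The state backend holds the per-key, per-window aggregates
            // (here RocksDB, checkpointing to an example HDFS directory).
            env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

            env.fromElements(Tuple2.of("alice", 1L), Tuple2.of("bob", 2L), Tuple2.of("alice", 3L))
               .keyBy(0)                      // one state entry per key ("name") ...
               .timeWindow(Time.minutes(5))   // ... and per window
               .sum(1)                        // incremental aggregate, so the state stays small
               .print();

            env.execute("window aggregation sketch");
        }
    }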

Re: external checkpoints

2017-11-24 Thread Fabian Hueske
Hi Aviad, sorry for the late reply. You can configure the checkpoint directory (which is also used for externalized checkpoints) when you create the state backend: env.setStateBackend(new RocksDBStateBackend("hdfs:///checkpoints-data/")); This configures the checkpoint directory to be hdfs:///checkpoints-data/ …
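To make the snippet above concrete, a minimal sketch assuming Flink 1.3/1.4-era APIs; the HDFS path and checkpoint interval are just examples:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ExternalizedCheckpointsSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Checkpoint data goes under this directory, which is also used
            // for externalized (retained) checkpoints.
            env.setStateBackend(new RocksDBStateBackend("hdfs:///checkpoints-data/"));

            env.enableCheckpointing(60_000);  // checkpoint every 60 seconds
            env.getCheckpointConfig().enableExternalizedCheckpoints(
                    ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

            // trivial pipeline so the sketch actually runs
            env.fromElements(1, 2, 3).print();
            env.execute("externalized checkpoints sketch");
        }
    }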

Dataset read csv file problem

2017-11-24 Thread ebru
Hello all, we are trying to read CSV files whose fields contain the \n character, while \n is also the line delimiter. We used the parseQuotedStrings('\"') method, but it only ignores field delimiters inside quotes, so we couldn't parse the fields that contain the \n character. How can we solve this problem …
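For context, a minimal sketch of how such a read is typically wired up (the path and field types here are made up); as Fabian's reply below notes, the current CsvInputFormat still cannot handle quoted fields that contain the line delimiter itself:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class CsvQuotedReadSketch {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            DataSet<Tuple2<String, String>> records = env
                    .readCsvFile("hdfs:///path/to/input.csv")  // hypothetical path
                    .parseQuotedStrings('"')   // ignores field delimiters inside quotes, but not line delimiters
                    .fieldDelimiter(",")
                    .types(String.class, String.class);

            records.print();
        }
    }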

Re: Accessing Cassandra for reading and writing

2017-11-24 Thread Fabian Hueske
Hi Andre, do you have a batch or streaming use case? Flink provides Cassandra Input- and OutputFormats for DataSet (batch) jobs and a Cassandra Sink for DataStream applications. There is no Cassandra source for DataStream applications. Regarding your error, this looks more like a Zeppelin configuration …
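A minimal sketch of the DataStream Cassandra sink mentioned above, assuming the flink-connector-cassandra dependency; the keyspace, table, host, and query are made up:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.cassandra.CassandraSink;

    public class CassandraSinkSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Tuple2<String, Long>> results =
                    env.fromElements(Tuple2.of("alice", 1L), Tuple2.of("bob", 2L));

            // Write each tuple via a parameterized INSERT statement.
            CassandraSink.addSink(results)
                    .setQuery("INSERT INTO my_keyspace.my_table (name, value) VALUES (?, ?);")
                    .setHost("127.0.0.1")
                    .build();

            env.execute("cassandra sink sketch");
        }
    }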

Re: ElasticSearch 6

2017-11-24 Thread Fabian Hueske
Hi Fritz, the ElasticSearch connector has not been updated for ES6 yet. There is a JIRA issue [1] to add support for ES6, and it seems somebody is working on it. Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-8101 2017-11-18 2:24 GMT+01:00 Fritz Budiyanto : > Hi, > > I've tried Fl…

Re: Dataset read csv file problem

2017-11-24 Thread Fabian Hueske
Hi Ebru, this case is not supported by Flink's CsvInputFormat. The problem is that such a file cannot be read in parallel, because it is not possible to identify record boundaries if you start reading in the middle of the file. We have a new CsvInputFormat under development that follows the RFC…

Re: Impersonate user for hdfs

2017-11-24 Thread Vishal Santoshi
Thank you Timo, I should have thought this through. We have done this for Oozie, so it is definitely an avenue to explore. Will let you know. On Wed, Nov 22, 2017 at 9:05 AM, Timo Walther wrote: > Hi Vishal, > > shouldn't it be possible to configure a proxy user via core-site.xml? > Flink is also using t…
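For reference, a hedged sketch of the standard Hadoop proxy-user properties that Timo's core-site.xml suggestion refers to; "flink" here stands in for whichever superuser actually runs the Flink processes, and the wildcard values are examples:

    <!-- core-site.xml on the Hadoop cluster: allow the "flink" superuser to impersonate others -->
    <property>
      <name>hadoop.proxyuser.flink.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.flink.groups</name>
      <value>*</value>
    </property>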