Re: Parameters to Control Intra-node Parallelism

2016-07-06 Thread Saliya Ekanayake
I tried to run more than one task manager per node by duplicating the slave IPs. At startup it says, for example, "[INFO] 1 instance(s) of taskmanager are already running on j-011. Starting taskmanager daemon on host j-011.", but I only see 1 task manager process running. Is there anything else I

Re: Late arriving events

2016-07-06 Thread Chen Qin
Jamie, Sorry for the late reply; some of my thoughts are inline. -Chen > > Another way to do this is to kick off a parallel job to do the backfill > from the previous savepoint without stopping the current "realtime" job. > This way you would not have to have a "blackout". This assumes your final >

Re: Data point goes missing within iteration

2016-07-06 Thread Ufuk Celebi
I couldn't tell anything from the code. I would suggest reducing it to a minimal example with Integers that has the same flow structure (with simple types), and then let's check it again. On Wed, Jul 6, 2016 at 9:35 AM, Biplob Biswas wrote: > Thanks a lot,

Re: Record has Long.MIN_VALUE timestamp (= no timestamp marker). Is the time characteristic set to 'ProcessingTime', or did you forget to call 'DataStream.assignTimestampsAndWatermarks(...)'?

2016-07-06 Thread Kostas Kloudas
Hi David, You are using Tumbling event time windows, but you set the timeCharacteristic to processing time. If you want processing time, then you should use TumblingProcessingTimeWindows and remove the timestampAssigner. If you want event time, then you need to set the timeCharacteristic to

Re: Flink and Calcite

2016-07-06 Thread Márton Balassi
Hey Radu, It is in master; you can find the related module under flink-libraries/flink-table in the directory structure. [1] [1] https://github.com/apache/flink/blob/master/flink-libraries/flink-table/pom.xml#L77-L81 Best, Marton On Wed, Jul 6, 2016 at 3:49 PM, Radu Tudoran

Flink and Calcite

2016-07-06 Thread Radu Tudoran
Hi, Can someone point me to the repository where the integration of Calcite with Flink is available? Does this come with the master branch (as indicated by the link in the blog post)? https://github.com/apache/flink/tree/master Thanks Dr. Radu Tudoran Research Engineer - Big Data Expert IT R

Record has Long.MIN_VALUE timestamp (= no timestamp marker). Is the time characteristic set to 'ProcessingTime', or did you forget to call 'DataStream.assignTimestampsAndWatermarks(...)'?

2016-07-06 Thread David Olsen
I have two streams. One will produce a single record, and the other has a list of records, and I want to do a left join. So for example, Stream A: record1 record2 ... Stream B: single-record After joining, record1, single-record record2, single-record ... However with the following streaming

Streaming Exception error message Explanation

2016-07-06 Thread subash basnet
Hello all, I have been trying to read the stock data as a stream and perform outlier detection upon it. My problem is mainly due to the absence of 'withBroadcastSet()' in the DataStream API, so I used a global variable and DataStreamUtils to read the variable *loop*. But I get cast exceptions and others.

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-06 Thread Aljoscha Krettek
Hi, I cleaned up the document a bit and added sections to address comments on the doc: https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit?usp=sharing (I also marked proposed features that are already implemented as [done].) The main thing that remains to be

Re: Tumbling time window cannot group events properly

2016-07-06 Thread Yukun Guo
You're right, I forgot to check that the "events in this window" line actually showed the number of events inside each window was what I expected, despite being printed a bit out of order. Thank you for the help! On 5 July 2016 at 17:37, Aljoscha Krettek wrote: > The order

Re: Checkpointing very large state in RocksDB?

2016-07-06 Thread Aljoscha Krettek
Hi, I think there is no disadvantage other than the fact that in the JobManager dashboard the checkpoint will be shown as "taking longer". Some people might be confused by this if they don't know that during the whole time the job keeps processing data. I think async snapshotting might be

Re: Data point goes missing within iteration

2016-07-06 Thread Biplob Biswas
Thanks a lot, would really appreciate it. Also, please let me know if you don't understand it well; the documentation in the code is not really great at the moment. -- View this message in context:

Re: Graph with stream of updates

2016-07-06 Thread agentmilindu
Hi Vasia and Ankur, I have the same need as Ankur, where I want to create a graph from a Twitter stream and query it. I got the streaming graph built from the Twitter stream thanks to your Gelly Stream. Now I want to run queries (different graph algorithms) on this streaming graph from time to time, while