Re: Data Source Generator emits 4 instances of the same tuple
Yes Thanks a lot, also the fact that I was using ParallelSourceFunction was problematic. So as suggested by Fabian and Robert, I used Source Function and then in the flink job, i set the output of map with a parallelism of 4 to get the desired result. Thanks again. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392p7513.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.
Re: Data Source Generator emits 4 instances of the same tuple
We solved this problem yesterday at the Flink Hackathon. The issue was that the source function was started with parallelism 4 and each function read the whole file. Cheers, Fabian 2016-06-06 16:53 GMT+02:00 Biplob Biswas : > Hi, > > I tried streaming the source data 2 ways > > 1. Is a simple straight forward way of sending data without using the > serving speed concept > http://pastebin.com/cTv0Pk5U > > > 2. The one where I use the TaxiRide source which is exactly similar except > loading the data in the proper data structures. > http://pastebin.com/NenvXShH > > > I hope to get a solution out of it. > > Thanks and Regards > Biplob Biswas > > > > > > -- > View this message in context: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392p7405.html > Sent from the Apache Flink User Mailing List archive. mailing list archive > at Nabble.com. >
Re: Data Source Generator emits 4 instances of the same tuple
Hi, I tried streaming the source data 2 ways 1. Is a simple straight forward way of sending data without using the serving speed concept http://pastebin.com/cTv0Pk5U 2. The one where I use the TaxiRide source which is exactly similar except loading the data in the proper data structures. http://pastebin.com/NenvXShH I hope to get a solution out of it. Thanks and Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392p7405.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.
Re: Data Source Generator emits 4 instances of the same tuple
Can you please post your code as well? The duplication might happen in your part of the code and not the TaxiRideSource. – Ufuk On Mon, Jun 6, 2016 at 10:30 AM, Biplob Biswas wrote: > Hi, > > I am using a Data Source Generator, which very closely resembles the one > here on the dataartisans github page > > https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/exercises/datastream_java/sources/TaxiRideSource.java > > This essentially reads from a file and then emits the data which is then > consumed by my flink program. > > But when I run this, I get 4 instances of the same tuple, so if I have 10 > samples, I am getting 40 samples. > > I am not really sure what's the problem here, if needed I can post my code > as well. > > Thanks and Regards > Biplob Biswas > > > > -- > View this message in context: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392.html > Sent from the Apache Flink User Mailing List archive. mailing list archive at > Nabble.com.
Data Source Generator emits 4 instances of the same tuple
Hi, I am using a Data Source Generator, which very closely resembles the one here on the dataartisans github page https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/exercises/datastream_java/sources/TaxiRideSource.java This essentially reads from a file and then emits the data which is then consumed by my flink program. But when I run this, I get 4 instances of the same tuple, so if I have 10 samples, I am getting 40 samples. I am not really sure what's the problem here, if needed I can post my code as well. Thanks and Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.