Hey, I hope y'all are doing well. I've started a project as part of a 4-month internship in Data Engineering. It's a totally new field for me, and I have to present 2 months from now. I went with Java (I have no prior experience with it) and started with the Flink documentation. We have to follow the Flink Operations Playground lab, and our dev environment will be the same, so we'll take that lab and adapt it to our needs: a real-time event-processing Flink app that calculates the Net Promoter Score (NPS) of an agency from a group of client satisfaction surveys (reviews, delivered as messages), instead of the Clicks used in the Operations Playground. Here's a link that describes the lab: https://nightlies.apache.org/flink/flink-docs-master/docs/try-flink/flink-operations-playground/#starting-the-playground
Instead of generating Clicks, we should generate survey scores and then calculate the Net Promoter Score. My first question is about the data. The job would consume scores from an input topic, and I've thought about declaring the following parameters: a score from 0-10, ID of the client's reply, number of replies, number of participants, number of detractors (score in the 0-6 range), number of passives (score in the 7-8 range), number of promoters (score in the 9-10 range), percentage of promoters, percentage of detractors, client's full name, and date of birth. Do I need to add more parameters? It's supposed to be a simulation, and I still don't quite understand what "dummy data" I should generate in order to calculate the NPS.

My plan is to use a single survey first, just to test that my code is correct, and then move on to handling responses to n surveys and translating scores that may use different scales. The idea the agency wants to implement is a dynamic scale: instead of fixing NPS to a 0-10 scale, clients could choose the scale they feel most comfortable responding on, for example 0-5, -5 to 4, 0-25, or 0-100. That raises my second question: would it be better to ask participants (the agency's clients) for a response in one survey and drop the scales from the other surveys, or should they respond to different surveys using different scales?
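To make the question concrete, here is a minimal sketch (plain Java, no Flink) of how I currently understand the calculation: NPS = percentage of promoters minus percentage of detractors. The names `NpsSketch`, `toZeroToTen`, `classify` and `nps` are made up for illustration, and the linear rescaling of other scales onto 0-10 is purely my assumption, not something the agency has confirmed.

```java
import java.util.List;

public class NpsSketch {

    enum Category { DETRACTOR, PASSIVE, PROMOTER }

    // Assumption: a score on any scale [min, max] is mapped linearly onto 0-10.
    static double toZeroToTen(double score, double min, double max) {
        if (max <= min || score < min || score > max) {
            throw new IllegalArgumentException("score outside of scale");
        }
        return (score - min) * 10.0 / (max - min);
    }

    // Standard NPS buckets on the 0-10 scale: 0-6, 7-8, 9-10.
    // Rescaled scores can be fractional, so anything below 7 counts as a detractor
    // and anything from 7 up to (but not including) 9 counts as a passive.
    static Category classify(double scoreZeroToTen) {
        if (scoreZeroToTen >= 9.0) return Category.PROMOTER;
        if (scoreZeroToTen >= 7.0) return Category.PASSIVE;
        return Category.DETRACTOR;
    }

    // NPS = % promoters - % detractors, so the result lies in [-100, 100].
    // Returning 0.0 for an empty list is an arbitrary choice for this sketch.
    static double nps(List<Double> scoresZeroToTen) {
        if (scoresZeroToTen.isEmpty()) return 0.0;
        long promoters = scoresZeroToTen.stream()
                .filter(s -> classify(s) == Category.PROMOTER).count();
        long detractors = scoresZeroToTen.stream()
                .filter(s -> classify(s) == Category.DETRACTOR).count();
        return 100.0 * (promoters - detractors) / scoresZeroToTen.size();
    }

    public static void main(String[] args) {
        // Five replies on the 0-10 scale: 3 promoters, 1 passive, 1 detractor.
        System.out.println(nps(List.of(10.0, 9.0, 9.0, 8.0, 3.0)));
        // A reply of 4 on a 0-5 scale, rescaled onto 0-10.
        System.out.println(classify(toZeroToTen(4, 0, 5)));
    }
}
```

If this is roughly right, then counts and percentages would not need to be fields of each generated message; they would be derived by the job from the individual scores. Please correct me if I've got that wrong.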
The problem is that we don't have a lot of time left and we're stuck, because we have no prior experience in either Flink or Data Engineering. We were more or less forced to learn the minimum so that we can propose a solution and pass our internship, and nobody is helping us, guiding our attempts, or reviewing our work. We only have two pieces of code to work on, ClickEventGenerator and ClickEventCount, both from the Operations Playground lab, and we need to turn them into new ones that we're going to name ScoreGenerator and ScoreCount.

One thing we understood is that we want a real-time solution, so, unlike the lab code, we're not going to use windowing or watermark operators. Beyond that, we feel like we don't even know how to start coding our solution, apart from setting up an execution environment and the other elements that are always present no matter what the logic is. We haven't found any resources to look at and we're running out of time. We're ready to work overtime; the most important thing for us is to learn and to succeed in our internship.

I just want to ask y'all for help if someone is familiar with this, because we still have to visualize the NPS in a real-time dashboard with Grafana, try to use Prometheus later on, and deploy the solution to Kubernetes. Please help us with that. We really want to learn Flink and data stream processing as a whole, as it's a great skill to have. Thank you for reading our request, and I wish you all a great day.

Best Regards,
Zakaria
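P.S. To show where we are, here is a rough sketch in plain Java (no Flink dependency) of the logic we imagine for ScoreGenerator and ScoreCount: generate random dummy replies, and keep running counters so the NPS is updated on every incoming score instead of per window. Everything here is hypothetical: the `SurveyScore` fields, the `client-N` IDs, and the `RunningNps` class are names we made up. We assume the counters would eventually live in some kind of Flink state, but we don't know yet how to wire that up.

```java
import java.util.Random;

public class ScoreSketch {

    // Hypothetical event shape: one survey reply from one client.
    static final class SurveyScore {
        final String clientId;
        final int score;             // 0-10
        final long timestampMillis;

        SurveyScore(String clientId, int score, long timestampMillis) {
            this.clientId = clientId;
            this.score = score;
            this.timestampMillis = timestampMillis;
        }
    }

    // Stand-in for ScoreGenerator: builds one random dummy reply.
    static SurveyScore randomScore(Random rnd) {
        return new SurveyScore(
                "client-" + rnd.nextInt(100),   // made-up client ID
                rnd.nextInt(11),                // uniform score in 0..10
                System.currentTimeMillis());
    }

    // Stand-in for ScoreCount: running counters, updated per event.
    static final class RunningNps {
        private long promoters;
        private long passives;
        private long detractors;

        // Folds one 0-10 score into the counters and returns the updated NPS.
        double add(int score) {
            if (score < 0 || score > 10) {
                throw new IllegalArgumentException("score must be in 0..10");
            }
            if (score >= 9) promoters++;
            else if (score >= 7) passives++;
            else detractors++;
            return current();
        }

        // NPS = % promoters - % detractors; 0.0 before any reply has arrived.
        double current() {
            long total = promoters + passives + detractors;
            return total == 0 ? 0.0 : 100.0 * (promoters - detractors) / total;
        }
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        RunningNps nps = new RunningNps();
        for (int i = 0; i < 5; i++) {
            SurveyScore e = randomScore(rnd);
            System.out.println(e.clientId + " score=" + e.score + " -> NPS=" + nps.add(e.score));
        }
    }
}
```

Is this the right direction, and how would something like `RunningNps` translate into a Flink job?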