Hey,

I hope y'all are doing well. I've started a project as part of a 4-month 
internship in Data Engineering. It's a totally new field for me, and I have to 
present 2 months from now. I went with Java (I have no prior experience with 
it) and started with the Flink documentation. We have to follow the Operations 
Playground lab, and our dev environment will be the same, so we'll use that lab 
and adapt it to our needs: programming a real-time event processing Flink app 
that calculates the Net Promoter Score (NPS) of an agency from a group of 
client satisfaction surveys (reviews, in the form of messages) instead of the 
clicks that the Operations Playground lab uses. Here's a link that describes 
the lab:
https://nightlies.apache.org/flink/flink-docs-master/docs/try-flink/flink-operations-playground/#starting-the-playground

Instead of generating clicks, we should generate survey scores and then 
calculate the Net Promoter Score. My question concerns the parameters I've 
thought about declaring for those scores:
The job consumes scores from the input topic, each with a score from 0-10, the 
id of the client's reply, number of replies, number of participants, number of 
detractors (score in the 0-6 range), number of passives (score in the 7-8 
range), number of promoters (score in the 9-10 range), percentage of promoters, 
percentage of detractors, client full name, date of birth.
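
To make these parameters concrete, here is a minimal sketch in plain Java (no 
Flink yet) of how I understand the calculation: NPS is the percentage of 
promoters minus the percentage of detractors. The names (NpsCounter, add, 
total, nps) are made up for illustration and are not from the playground code, 
and returning 0 when there are no replies yet is just an assumption.

```java
/** Hypothetical helper, not part of the Flink playground: keeps running NPS counts. */
public class NpsCounter {
    private long promoters;
    private long passives;
    private long detractors;

    /** Classifies one score on the standard 0-10 scale and updates the counts. */
    public void add(int score) {
        if (score < 0 || score > 10) {
            throw new IllegalArgumentException("score must be in 0-10, got " + score);
        }
        if (score <= 6) {
            detractors++;      // 0-6
        } else if (score <= 8) {
            passives++;        // 7-8
        } else {
            promoters++;       // 9-10
        }
    }

    /** Number of replies seen so far. */
    public long total() {
        return promoters + passives + detractors;
    }

    /** NPS = % promoters - % detractors, in the range -100..100. */
    public double nps() {
        long total = total();
        if (total == 0) {
            return 0.0;        // assumption: report 0 until the first reply arrives
        }
        return 100.0 * (promoters - detractors) / total;
    }

    public static void main(String[] args) {
        NpsCounter counter = new NpsCounter();
        for (int score : new int[] {10, 9, 8, 7, 3}) {
            counter.add(score);
        }
        // 2 promoters, 2 passives, 1 detractor out of 5 replies
        System.out.println(counter.nps()); // prints 20.0
    }
}
```

Since the counts are updated one score at a time, the percentages can be 
derived from the three counters and would not need to be sent in each message.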

Do I need to add more parameters? It's supposed to be a simulation, and I still 
don't quite understand what "dummy data" I should generate to then calculate 
the NPS. I'm going to use one survey first, just to test that my code is 
correct, and then look into handling responses to 'n' surveys and translating 
those survey scores, which could be using different NPS scales. The idea the 
agency wants to implement is a dynamic type of scale: instead of the usual 0-10 
NPS scale, use different scales (for example 0-5, -5 to 4, 0-25, 0-100) so that 
clients can choose the scale they feel most comfortable responding on. That 
raises a question about the logic: would it be better to ask participants 
(agency clients) for a response in one survey and then remove the scales from 
the other surveys, or should they respond to different surveys using different 
scales?
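
One option I can think of for the different scales (just an assumption on my 
side, not something the agency has decided) is to linearly rescale every answer 
to the usual 0-10 range before classifying it. A rough sketch in plain Java, 
with made-up names (ScaleNormalizer, toZeroToTen):

```java
/** Hypothetical helper: maps a score from an arbitrary scale onto 0-10. */
public class ScaleNormalizer {

    /** Linearly maps a score from [min, max] onto [0, 10]. */
    public static double toZeroToTen(double score, double min, double max) {
        if (max <= min) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        if (score < min || score > max) {
            throw new IllegalArgumentException("score is outside the scale");
        }
        return (score - min) / (max - min) * 10.0;
    }

    public static void main(String[] args) {
        System.out.println(toZeroToTen(4, 0, 5));    // answer 4 on a 0-5 scale
        System.out.println(toZeroToTen(4, -5, 4));   // answer 4 on a -5 to 4 scale
        System.out.println(toZeroToTen(90, 0, 100)); // answer 90 on a 0-100 scale
    }
}
```

With this mapping a 4 on the 0-5 scale lands on 8 (passive) and only a 5 lands 
in the promoter range, so how the detractor/passive/promoter boundaries carry 
over to each scale would still have to be decided.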

The problem is that we don't have a lot of time left and we're stuck, because 
we have no prior experience in either Flink or Data Engineering. We were kind 
of forced to learn the minimum so that we can propose a solution and pass our 
internship, and nobody is helping us, guiding our attempts, or reviewing our 
work. We only have two programs from the Operations Playground lab to work 
from, ClickEventGenerator and ClickEventCount, which we want to turn into new 
ones that we'll name ScoreGenerator and ScoreCount. One thing we understood is 
that we want a real-time solution, so unlike the lab code we're not going to 
use windowing or watermarks. Beyond creating an execution environment and the 
other elements that are always present no matter what the logic is, we feel 
like we don't even know how to start coding our solution. We didn't find any 
resources to look at, and we're running out of time. We're ready to work 
overtime; the most important thing is to succeed and learn in our internship.

I just want to ask y'all for help, if someone is familiar with this, because we 
still have to visualize the NPS in a real-time dashboard with Grafana, try to 
use Prometheus later on, and deploy the solution to Kubernetes. Please help us 
with that. We really want to learn Flink and data stream processing as a whole, 
as it's a great skill to have. Thank you for reading our request, and I wish 
you all a great day,

Best Regards,

Zakaria