Hey Wayang team,

After my warm-up exercise with the CsvRowConverter, I want to start
planning a work package that gives the project some new functionality and
gives me a deeper understanding of the codebase.

To recap the idea: I started working on the CsvRowConverter so that it can
handle multiple separators, chosen either from the data itself or by a
decision of the developer.

That led me to the JavaCSVTableSource, where I pressed my STOP button
because I lack the architectural knowledge to go further.

So far I understand that the following could be goals for an implementation:

(1) Migrate the JavaCSVTableSource to a place where it has a better home.
(2) Configure the default separator of the JavaCSVTableSource via the
config file (I have to learn how the config is handled during the life
cycle of a job).
(3) Create a JavaCSVTableSource with a well-known separator
programmatically.
(4) Create a JavaCSVTableSource and allow autodetection of the separator.
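
To make (3) and (4) a bit more concrete, here is a rough sketch of the
detection part as I imagine it. Please note that SeparatorDetector, its
detect method, and the candidate list are hypothetical names of mine and
not existing Wayang code. The default separator passed in stands for
whatever (2) or (3) would provide. Quoted fields are not handled here, so
this is only a starting point for discussion.

```java
import java.util.List;

/** Hypothetical helper (not part of Wayang): guesses the separator from a few sample lines. */
final class SeparatorDetector {

    // Candidate separators; earlier entries win when two candidates score the same.
    private static final char[] CANDIDATES = {',', ';', '\t', '|'};

    /**
     * Returns the candidate that occurs the same non-zero number of times on
     * every non-empty sample line, preferring the one with the most occurrences.
     * Falls back to defaultSeparator when no candidate qualifies.
     */
    static char detect(List<String> sampleLines, char defaultSeparator) {
        char best = defaultSeparator;
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int expected = -1;          // occurrences seen on the first non-empty line
            boolean consistent = true;
            for (String line : sampleLines) {
                if (line.isEmpty()) {
                    continue;           // blank lines carry no information
                }
                int count = countOccurrences(line, candidate);
                if (expected == -1) {
                    expected = count;
                } else if (count != expected) {
                    consistent = false; // column count differs between lines
                    break;
                }
            }
            if (consistent && expected > bestCount) {
                best = candidate;
                bestCount = expected;
            }
        }
        return best;
    }

    private static int countOccurrences(String line, char c) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == c) {
                count++;
            }
        }
        return count;
    }
}
```

A source created for (4) could call SeparatorDetector.detect(firstLines, ',')
on the first few lines of the file and then parse with the result, while a
source created for (3) would skip the detection and use the given
separator directly.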

Question: Is the JavaCSVTableSource the right class to start with, or is
the functionality I refer to defined at a higher level in the framework?

For now I am only asking for hints, so that I can work toward a solution
while I learn to navigate the code base.

I envision a JavaCSVTableSource component with a set of tests that show
that CSV / TSV files can be loaded from local files or even via the
"remote file source", where data is loaded from an HTTP server.

*The use case I have in mind is this:*
We have a data asset in our local processing context which should be joined
with a dataset that is provided in a remote data pod. I can read the small
"lookup table" into memory and handle the larger local data set using the
scalable platform. That way I can avoid all data management steps for the
small lookup table, which is hosted outside my environment.

Would that make sense? I think if we have a clear target scenario, and if
it is aligned with existing ideas, it could become a great learning path
for a newcomer. And if the use case I have in mind is totally against the
core idea, no problem, there is still a lot to learn from it.

Cheers,
Mirko



Dr. rer. nat. Mirko Kämpf
Müchelner Str. 23
06259 Frankleben
