Hi Daniel

+1

I am extremely happy to see this thread about rebooting Kibble :-)

I think anything that makes it easier to contribute is a good thing. We have been talking about making the visualisations better (and more modern!), so adding the ability to improve them and make them more flexible is great. They say a picture says a thousand words, so while discussion is good, perhaps an architecture diagram similar to what our friends at DevLake <https://devlake.apache.org/docs/Overview/Architecture> have done would help.

The data part did need some re-organisation, and I remember the discussion we had previously <https://github.com/apache/kibble/pull/8> about trying to separate the source and the types, so I am happy to see this is still part of the roadmap to tidy up.

I like the idea of not abandoning the mailing list, Subversion and other data sources in favour of only Git and GitHub, as there are many projects out there that still use them.

The mono repo idea is good - we already tried keeping two repos so let's see if one works better.

The only other thing I would bring up now is the possibility of making a release of Kibble-1 to give people coming along something to look at, work with and download. I will start a new thread for that. If necessary I'd be willing to have a go at organising that (with any other help I can get!) :-)

Thanks
Sharan


On 2022-09-11 18:36, Daniel Gruno wrote:
Hi folks,
a while back we attempted a complete redesign of the Kibble platform, which unfortunately fizzled out. I'd like to restart this process, but perhaps simplify and condense our goals a bit, so as to lower the bar for participation and implementation.

I'd like to propose we divide Kibble into three components:

- A management service that purely exists to manage sources, data access, and delegate jobs
- A scanner service that uses the sources/jobs from the above and gathers data points
- An optional visualization service that can latch onto the database and visualize the data gathered by the scanners. This would be optional in that the base server and scanners would work independently from this, and any other visualization platform (jupyter, devlake?? kibana??) could be utilized instead of the default option.

For the data part of it all, I'd also like to propose we do a split between source types and data types. That is to say, a source type could be a git repo, a github repo, a subversion repo, a pony mail list, a jira instance etc, whereas a data type could be a commit, an email, a ticket, etc. Data types could have one or more associated sources, and these sources would themselves have individual ways of obtaining the required data. Thus, an issue scanner could essentially be source-type-agnostic, as the data type plugin itself would supply the API for grabbing a pre-determined base set of common data for that data type. As an example, a ticket scanner could work off GitHub Issues, JIRA, and Bugzilla alike. The scanner module would not need to know how to handle these individual calls, as the data type module would abstract that part and provide a standardized interface. This would also mean we could expand into subversion/mercurial/etc territory by abstracting the repository calls and providing a unified way of interacting with commits of any repository type.
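
To make the source type / data type split a bit more concrete, here is a minimal sketch of how it might look in Python. Everything in it is an assumption for illustration only: the names `Ticket`, `TicketSource`, `GitHubIssuesSource`, `JiraSource` and `count_open_tickets` are hypothetical and not part of Kibble, the base set of ticket fields is made up, and the sources read simplified in-memory records rather than calling any real API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Ticket:
    """Hypothetical common base fields for the 'ticket' data type."""
    key: str
    title: str
    is_open: bool


class TicketSource(ABC):
    """Hypothetical interface every ticket-capable source type implements."""

    @abstractmethod
    def fetch_tickets(self) -> Iterator[Ticket]:
        """Yield tickets in the standardized form."""


class GitHubIssuesSource(TicketSource):
    """Maps simplified, made-up GitHub-style records to Ticket."""

    def __init__(self, raw_issues: List[Dict]) -> None:
        self._raw = raw_issues  # stand-in for data a real API call would return

    def fetch_tickets(self) -> Iterator[Ticket]:
        for issue in self._raw:
            yield Ticket(
                key=f"#{issue['number']}",
                title=issue["title"],
                is_open=issue["state"] == "open",
            )


class JiraSource(TicketSource):
    """Maps simplified, made-up JIRA-style records to Ticket."""

    def __init__(self, raw_issues: List[Dict]) -> None:
        self._raw = raw_issues  # stand-in for data a real API call would return

    def fetch_tickets(self) -> Iterator[Ticket]:
        for issue in self._raw:
            yield Ticket(
                key=issue["key"],
                title=issue["summary"],
                is_open=issue["status"] != "Closed",
            )


def count_open_tickets(sources: Iterable[TicketSource]) -> int:
    """A source-type-agnostic 'scanner': it only sees the Ticket data type."""
    return sum(1 for src in sources for t in src.fetch_tickets() if t.is_open)


sources = [
    GitHubIssuesSource([
        {"number": 1, "title": "Fix dashboard", "state": "open"},
        {"number": 2, "title": "Update docs", "state": "closed"},
    ]),
    JiraSource([
        {"key": "KIBBLE-1", "summary": "Scanner crash", "status": "Open"},
        {"key": "KIBBLE-2", "summary": "Old bug", "status": "Closed"},
    ]),
]
print(count_open_tickets(sources))  # prints 2
```

The scanner function never checks which kind of source it is talking to, so adding another ticket source in this sketch would mean writing one more `TicketSource` subclass and leaving the scanner untouched.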

Provided this is agreeable, I am willing to spend both time and resources on this (both my own and my company's).

WDYT?

With regards,
Daniel.

PS: I'd propose we start off with a mono-repo strategy, to ease the deployment and release workflow. If we later feel that a split into server/client is better, then we can do that.
