Hi there, first of all apologies for cross-posting. My name is Thiago and I
have been using spotlight for a couple of years now. Although I think it is
a great product and has a great community of people involved, I'd like to
discuss some issues related to the maintainability of its codebase.
I think that there are a lot of amazing developers out there who spend their
evenings contributing PRs to spotlight, but once they submit their work,
and maybe a discussion is going on, the PRs are not merged and it is not
clear why. Is it because the PR lacks tests? Or is it because it's a bad
idea? Or is it because people are a bit scared of merging, since they want
to keep the spotlight branch as clean/stable as possible, so experimental
work is unwelcome?
To be honest, I don't know what the best way to approach this is. But to
illustrate the point further, here are a couple of PRs that just fix some
functionality or make the code cleaner and more maintainable:
https://github.com/dbpedia-spotlight/dbpedia-spotlight/pull/289
https://github.com/dbpedia-spotlight/dbpedia-spotlight/pull/312
Another PR that is really cool is Dirk's Scala 2.10 + Factorie NER
https://github.com/dbpedia-spotlight/dbpedia-spotlight/pull/306
but in this case, one could argue that 2 PRs should be submitted, one
for the Scala migration, another for the Factorie introduction.
In any case, I think the following things should be in place, if the
contributions from the community are welcome:
1. Some form of release methodology (a development branch which PRs should be
submitted against, a release branch that goes through proper load testing)
2. Automatic evaluation against a variety of datasets (these seem really
useful https://github.com/diegoceccarelli/dexter-eval
https://github.com/marcocor/bat-framework and
http://acube.di.unipi.it/bat-framework/)
3. More clarity on why a PR is not merged yet.
If things don't improve, my worry is that people will either stop
contributing or else focus on their own forks (which creates a problem of
syncing master branches and what not).
Comments, thoughts and suggestions are welcome and appreciated.
All the best,
Thiago Galery
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users