On 30/05/2013 17:03, Tiago Tresoldi wrote:
> 2013/5/30 Giulio Paci <[email protected]>
>
> From an acopost perspective, I think the structure that is now in place
> is fine. With respect to what I usually do when I can decide what to do,
> the only difference is that scripts are now in /bin instead of
> /src/scripts.
>
>
> Moving them to /src/scripts would make sense, IMO. I was not comfortable
> with /bin holding just scripts, which is why I had thought of moving the
> compiled programs into it. But removing /bin entirely seems more
> appropriate given that, as per below, the compiled program will stay in
> /src.
By default, the compiled programs are generated in the directory of the
Makefile.am that builds them.
> > /data -- default files (tbt and et rules)
> > /data/{language} -- pre-trained models, such as Marco Baroni's Italian
> > and Ulrik's pre-1948 Danish; as a complete set of trained models is not
> > complex, everything should ideally be kept in a single directory
>
> I like the name /examples, because it makes it clear they are just
> examples. Unless we are planning to release data as well. In this case I
> think we should set up a sub-module for data and clearly separate data
> from software.
>
> I think that what I was suggesting to put in /data (such as model tbt
> and et rules) should be distributed in the main package, installing it
> in /usr/share/acopost. Even reading the source, it is far from obvious
> what such "configuration" files are expected to look like, and as our end
> user will probably be training his/her own models, it is more practical to
> distribute them along with the binaries than to point in the
> documentation to an additional package to download.
>
> I had suggested putting everything in /data just to reduce the number of
> directories and to be consistent (/bin with binary files, /src with
> source files, /data with user-experience files, etc.)
>
> But sub-modules for the actual language models, such as the available
> Italian and Danish, are a great idea; they could even include a wrapper
> to the voting tagger which would automatically load the correct model
> files (so that a user could do, say, "sudo apt-get install
> acopost-english && echo "This is just an example." | acopost-english").
If this is your goal, then /data is fine. I would keep them in a
separate submodule anyway: mixing data and software is likely to create
issues in the long term.
> > /maintain -- development&debug scripts, C testing, etc.
> > /maintain/data -- data for testing, fake language corpus and possibly
> > its pre-trained models, etc.
>
> In my opinion we should put in /maintain what is interesting only for
> package maintainers, the main purpose of this directory should be to
> avoid source tree pollution with scripts and stuff that are useful only
> to a few of us.
>
>
> Maybe a /maintain for package maintainers and a /devel for those who
> want to compile, study and extend acopost, but who don't care about
> the packaging? (but see answer below)
I think that /maintain and /devel serve the same purpose, so I would go
for /devel. However, as that directory would be empty (see below), I
suggest not creating either of them.
> /tests serves a completely different purpose: it contains tests that are
> useful to check acopost integrity on the end-user machine: the data
> should be distributed and should be used as a standard step of the
> installation procedures.
>
>
> Which pretty much sums up my idea of the /devel directory. So, have we
> decided on a /maintain and a /tests?
I think /tests is enough. However, I want to make clear that testing
integrity and experimenting with software are two completely different
tasks and should stay in separate directories:
in /tests there should be unit testing stuff and nothing more.
In /devel it would also be possible to have experimental code or some
developer-oriented scripts.
> > /src -- should only keep the source of the distributed programs;
> > acopost_test should be moved to /maintain
>
> Probably true. I have not yet understood what the purpose of
> acopost_test is, but I think it would probably be useful to make
> it part of the test suite as well, so /test/src or even /src would be a
> good place for it. We only need to make sure that "make install" does
> not install it.
>
>
> First of all, I had forgotten lextest.c, which is similar to acopost_test.c
> and would follow it.
>
> The idea of acopost_test is to test/stress (and in an ugly way document)
> the "library" of acopost (mem.c, array.c, hash.c...), before I actually
> test the taggers. I know from experience, for example, that `met` was
> always coredumping when the number of tags was >= 40 or 50, which looked
> more like a matter of memory leaking than memory shortage. I would also
> like to test, for example, if there are better hashing functions than
> Knuth's, given our characteristics (strings are usually very short, tags
> are usually very alike, etc.), if we can work better with the collisions
> in hash.c, etc. In short, it is supposed to be both a workbench and a
> unit testing (in the long run). It is what I usually do, but, once more,
> I am certainly not your regular C-guru... ;)
>
> But I now agree with Ulrik about keeping it where it is: moving
> acopost_test.c would also mean lots of ugly "../src" in the #includes.
> Let's keep it the way it is, making sure that it is not installed.
I agree: keep it where it is.
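
As a rough illustration of the kind of hash comparison mentioned above, here
is a minimal sketch that counts bucket collisions for two string hashes over
a list of tags. Everything in it is an assumption made for illustration: I
have not checked which function hash.c actually uses, so hash_dek is just the
rotate-and-XOR hash commonly attributed to Knuth, hash_fnv1a is the standard
32-bit FNV-1a, and the names hash_fn and count_collisions are hypothetical,
not part of acopost.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature for a string hash; not taken from hash.c. */
typedef uint32_t (*hash_fn)(const char *key);

/* Rotate-and-XOR hash commonly attributed to Knuth ("DEK hash").
   Assumption: hash.c may well use something different. */
uint32_t hash_dek(const char *key)
{
    uint32_t h = (uint32_t)strlen(key);
    for (; *key; key++)
        h = ((h << 5) ^ (h >> 27)) ^ (unsigned char)*key;
    return h;
}

/* Standard 32-bit FNV-1a: XOR each byte in, then multiply by the prime. */
uint32_t hash_fnv1a(const char *key)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* Count how many keys land in a bucket that is already occupied.
   Returns -1 if nbuckets is 0 or the allocation fails. */
long count_collisions(hash_fn fn, const char *const *keys,
                      size_t n, size_t nbuckets)
{
    unsigned char *used;
    long collisions = 0;
    size_t i;

    assert(fn != NULL);
    if (nbuckets == 0)
        return -1;
    used = calloc(nbuckets, 1);
    if (used == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        size_t b = fn(keys[i]) % nbuckets;
        if (used[b])
            collisions++;
        else
            used[b] = 1;
    }
    free(used);
    return collisions;
}
```

Something like this could be called from acopost_test.c with the tag list of
a real corpus and the table size used by hash.c, to see whether a different
function is worth the trouble for short, very similar strings.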
> > BTW, I am not completely sure about the name `maintain`, perhaps we
> > could go for `devel`.
>
> /devel is a good name for it. But unless we ALL agree to move
> maintenance scripts there, I do not think it is a priority to have this
> directory as we only have 3 scripts that are candidates to move there
> and, as Ulrik reported, autogen.sh usually resides in the top directory
> of the source tree.
>
>
> I agree, autogen.sh should reside in the top directory. Extending what I
> said above, it would be a matter of having one directory named /maintain
> and a second one named either /devel or /tests, if you agree on
> separating the "packaging" from the "hacking around". For the time
> being, however, I would keep it the way it is: acopost_test and company
> in /src, the scripts in the top directory.
I agree to keep the scripts in the top directory.
Summarizing, I think that we have basically agreed on these points:
1) move /bin to /src/scripts
2) move /example to /data
3) leave everything else as it is
4) possibly create /devel, /maintain or /experimental when we need it
Is that right?
_______________________________________________
acopost-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/acopost-devel