Nick Howell <nlhow...@gmail.com> wrote:

>> I'm not sure if corpora-xxx in the github is the right way to go though.
>>
>> I think it would be better to store them on a web server and either:
>>
>> 1) Have apertium-xxx/text that has a script that will download the corpus
>> from the server and a gitignore to not have it in the repo.
>> 2) Use something like git-annex (this is a bit more involved)
>
> git-annex is essentially designed for exactly our use-case. Github and
> Gitlab natively speak a protocol called "git LFS" which git-annex supports.
> So I would be highly supportive of moving in that direction.
>
> I would be happy to help put together a proposal for what that would look
> like, but probably not before the end of the month. Potential problems I
> can see with such a plan are:
> - git-annex has a heavy build dependency set (Haskell)
As with git itself, I hope we don't expect people to build it? :)

> - git-annex depends on stable hashes of the corpus data
> - git-annex packages can be out-of-date outside of debian

Ubuntu 19.10: 7.20190912-1
Fedora 30: 7.20191114
CentOS 7: 5.20140221 (so probably newer than most CentOS software)
OS X: dmg's "autobuilt" from every git version if I read
https://git-annex.branchable.com/install/OSX/ correctly
Windows: "beta" says https://git-annex.branchable.com/install/Windows/
but there are pre-built packages.

That doesn't seem too bad?

-Kevin
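
For comparison, option 1 from the quoted message (a download script in apertium-xxx/text, with the corpus itself listed in .gitignore) could look roughly like the sketch below. It is only an illustration: fetch_corpus, sha256_of, and the example URL and file name are hypothetical, and the SHA-256 check is an assumption added here to mirror the stable-hash concern raised about git-annex, not something proposed in the thread.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_corpus(url: str, dest: Path, expected_sha256: str) -> Path:
    """Download url to dest unless a copy with the expected hash already exists.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    if dest.exists() and sha256_of(dest) == expected_sha256:
        return dest  # already downloaded and intact
    dest.parent.mkdir(parents=True, exist_ok=True)
    partial = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, partial.open("wb") as out:
        shutil.copyfileobj(response, out)
    actual = sha256_of(partial)
    if actual != expected_sha256:
        partial.unlink()  # do not leave a corrupt file behind
        raise ValueError(f"checksum mismatch for {url}: got {actual}")
    partial.replace(dest)  # move into place only after verification
    return dest
```

A call might then be fetch_corpus("https://example.org/corpora/xxx.txt.gz", Path("text/xxx.txt.gz"), "<known sha256>"), with text/xxx.txt.gz added to .gitignore so the data never enters the repository. The URL and path here are placeholders.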
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff