Hi dev@,

JanI from Infra has nearly finished provisioning a brand spanking new Ubuntu VM which we can use for the Tika service!!!! YAY

Some things he requires first though...

*- what is the external name used by users? tika-vm.a.o is solely for ssh, not for public*

This is entirely up to you guys. Over in Any23 we were lucky enough to have someone on the project team own any23.org... what about service.tika.apache.org? Or something else maybe?

*- do you intend to use https? if yes the traffic will go via a proxy server*

I don't think that this is required but you guys may think differently. In the setup of the Any23 VM, I provisioned the mod_proxy_ajp module to proxy incoming traffic from Apache HTTPD to Tomcat, where the Any23 web application was running. Tika Server is a standalone server though, right? It's not packaged as a war.

*- does your software use login? Then you must use https*

AFAIK the Tika service does not have a security layer. Can someone confirm?

*- Which users should have access?*

It's up to you whether you want to put your name(s) here (PMC only) and I will transfer them on to INFRA-7751, or else you can add them there yourself.

*- Which (if any) of the above users need sudo? remark we are very restrictive with sudo.*

This final one really comes down to anyone who is willing to log in and occasionally maintain the service, e.g. if Apache HTTPD needs to be restarted or something.

I've fully documented the Any23 service... the documentation can be found at the link below. These docs can be more or less copied and adapted to the configuration of the Tika server and service... they are essential for a complete server rebuild should anything go catastrophically wrong and we were to lose the server and everything running on it.

https://svn.apache.org/repos/infra/infrastructure/trunk/docs/services/any23-vm.txt

I'll be keeping a close eye on the ticket and will try to drive it on. Part of this involves getting information to JanI in a timely fashion. The info does not need to be 100% complete, but we at least need some people to volunteer to maintain the service.

BTW I also have a script which we run over on Any23 as a cron job. It uses jsawk [0] to consume nightly stable SNAPSHOTs of the Any23 server... these are then loaded into Tomcat and replace the previous stable SNAPSHOT. Users and devs alike can use the same development SNAPSHOT code for testing. Something similar should also allow Tika to better test new features, as it lets more users try out new functionality, esp. for parsers.

Thanks
Lewis

[0] https://github.com/micha/jsawk#readme

-- 
*Lewis*
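
P.S. For anyone wondering what the nightly SNAPSHOT job boils down to, here is a rough Python sketch of the "pick the latest stable artifact" step. It is not the actual Any23 script (that one uses jsawk). The JSON layout (a Jenkins-style build document with `url` and `artifacts` entries), the ci.example.org URL, the job name, the file names and the `latest_snapshot_url` helper are all assumptions made up for illustration.

```python
import json
from typing import Optional


def latest_snapshot_url(build_json: str, suffix: str = ".jar") -> Optional[str]:
    """Return the download URL of the first artifact whose file name ends
    with `suffix`, or None if the build document lists no such artifact.

    Assumes a Jenkins-style document: a top-level "url" for the build and
    an "artifacts" list whose entries carry "fileName" and "relativePath".
    """
    build = json.loads(build_json)
    base = build["url"].rstrip("/")
    for artifact in build.get("artifacts", []):
        if artifact["fileName"].endswith(suffix):
            return f"{base}/artifact/{artifact['relativePath']}"
    return None


# Hypothetical sample of what a "last stable build" document might look like.
SAMPLE = json.dumps({
    "url": "https://ci.example.org/job/tika-trunk/42/",
    "artifacts": [
        {"fileName": "pom.xml",
         "relativePath": "tika-server/pom.xml"},
        {"fileName": "tika-server-X.Y-SNAPSHOT.jar",
         "relativePath": "tika-server/target/tika-server-X.Y-SNAPSHOT.jar"},
    ],
})

if __name__ == "__main__":
    print(latest_snapshot_url(SAMPLE))
```

A cron job would fetch the real document, download the returned URL and restart the service with the new jar. Since Tika Server is standalone, the default suffix here is `.jar` rather than `.war`.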