EC2 CI nodes rebooted for security upgrades
Hi folks! All EC2 instances have been upgraded and rebooted, including the Jenkins head node hosting ci.bigtop.apache.org. Lemme know if you see any issue! Luca
Jenkins upgraded
Hi folks! I have just upgraded Jenkins to 2.452, everything seems to work fine but please let me know if you see anything out of the ordinary. The upgrade was due to some security issues, and I also took the chance to upgrade some plugins for the same reason. Luca
Re: [Discuss] Bigtop release 3.3
Hi Masatake,

Not sure if you saw https://issues.apache.org/jira/browse/BIGTOP-4058 but we could take some time to check also the experimental smoke test pipelines for branch-3.3, lemme know what you think :)

Luca

On Tue, Mar 19, 2024 at 6:50 AM Masatake Iwasaki wrote:
>
> I'm updating CI jobs for master branch now.
> If the result of smoke-tests looks good,
> I will create branch-3.3 and set up CI jobs creating release.
>
> Masatake Iwasaki
>
> On 2024/03/01 10:40, chenqiang2080 wrote:
> > Hi,all
> > The tasks of next Bigtop release 3.3 version(BIGTOP-3909) are
> > basically completed.
> > https://issues.apache.org/jira/browse/BIGTOP-3909
> >
> > May I ask when Bigtop v3.3 release will start? And who is the
> > release manager?
> >
> > As a contributor who finished the work support openEuler OS, i can
> > do some work for Bigtop pre-release testing.
> >
> > Best Regards!
Re: Installed plugin for pipelines
Hi folks, getting back to the subject if anyone is interested in giving some feedback. In https://issues.apache.org/jira/browse/BIGTOP-4058 there are some examples of Jenkins pipelines that I have created to see if we can get rid of the Matrix reloaded plugin, but before proceeding I'd need to know if it is something that people would use/like or not :) The main use case for Matrix reloaded seems to be (IIUC) having a way to selectively trigger builds and smoke tests while doing a release (to bypass transient failures to specific components that would require to trigger all the builds again). In theory we could do the same with pipelines, but they are a little less flexible for the moment (at least from a UI perspective). Getting rid of the Matrix plugin would avoid some old/unsecure code and it would align us with more recent Jenkins standards. Lemme know! Thanks in advance, Luca On Mon, Jan 15, 2024 at 12:06 PM Luca Toscano wrote: > > As promised: https://issues.apache.org/jira/browse/BIGTOP-4058 > > I started adding some jobs to show the idea, feedback welcome :) > > Luca > > On Thu, Jan 11, 2024 at 3:56 PM Luca Toscano wrote: > > > > Nevermind I fixed the apt issue, but the current problem is that the > > COMPONENTS of the smoke test cannot be grouped together. IIUC we want > > to be able to kick off a run with COMPONENTS set to all or a subset of > > daemons (yarn/hdfs/etc..), that will bootstrap the hadoop cluster via > > Docker. I'll open a task and work on it, hopefully there is an easy > > solution :) > > > > On Thu, Jan 11, 2024 at 12:11 PM Luca Toscano > > wrote: > > > > > > I have also created: https://ci.bigtop.apache.org/job/pipeline-smoke-test/ > > > > > > It currently doesn't work since there seems to be an error when apt > > > runs inside the Docker container, but it is a generic script to give > > > you the idea. 
Release version and branch have a separate parameter, at > > > the moment we have 3.2.0 and 3.3.0, so a single script could in theory > > > be used multiple times with minimal changes (IIUC at the moment for > > > every release we create new smoke tests etc..).\ > > > > > > To test it: > > > > > > 1) Run build > > > 2) Open the build run's "Console output" > > > 3) Hit "Input Requested" and you'll see a series of dropdown options. > > > Once you select the right ones, the build should start. > > > > > > If you want to see an example: > > > https://ci.bigtop.apache.org/job/pipeline-smoke-test/5/console > > > > > > This is just an idea to gather feedback, it can surely be improved, > > > but it is a starting point :) > > > > > > Thanks! > > > > > > Luca > > > > > > On Wed, Jan 10, 2024 at 9:44 PM Luca Toscano > > > wrote: > > > > > > > > Hi folks, > > > > > > > > I created https://ci.bigtop.apache.org/job/test-elukey as an example > > > > of how Jenkins pipelines can be used to replace the Matrix Reloaded > > > > plugin. The prototype is limited for the moment to: > > > > - It uses scripted pipelines, not declarative pipelines, since the > > > > former is more flexible in my opinion. > > > > - It mimics the Smoke Tests for Debian OSes, used in our release > > > > process IIUC (like > > > > https://ci.bigtop.apache.org/view/3.2.0-smoke-tests/job/Bigtop-3.2.0-debian-10-smoke-tests/1/) > > > > - It only echoes values, no build is done. > > > > - It allows dynamic parameters (namely to restrict OS/ARCH/COMPONENT > > > > at runtime), but the inputs need to be added in the build's console > > > > output rather than having specific "Build with params" in Jenkins UI. > > > > This is probably something that we can circumvent, but the current > > > > solution avoids to copy/paste values multiple times etc.. > > > > - Compared to Matrix reloaded, the flexibility of selecting dynamic > > > > parameters is less, but it should be sufficient for the release use > > > > case. 
> > > > > > > > We'd need to deprecate Matrix Reloaded > > > > (https://plugins.jenkins.io/matrix-reloaded/) since the plugin was > > > > abandoned since a long time ago, and no more fixes are provided from > > > > upstream. > > > > > > > > Do you think that this process is viable? If so we could use pipelines > > > > as experimental step during the next release, to see if they can
Re: Installed plugin for pipelines
As promised: https://issues.apache.org/jira/browse/BIGTOP-4058 I started adding some jobs to show the idea, feedback welcome :) Luca On Thu, Jan 11, 2024 at 3:56 PM Luca Toscano wrote: > > Nevermind I fixed the apt issue, but the current problem is that the > COMPONENTS of the smoke test cannot be grouped together. IIUC we want > to be able to kick off a run with COMPONENTS set to all or a subset of > daemons (yarn/hdfs/etc..), that will bootstrap the hadoop cluster via > Docker. I'll open a task and work on it, hopefully there is an easy > solution :) > > On Thu, Jan 11, 2024 at 12:11 PM Luca Toscano wrote: > > > > I have also created: https://ci.bigtop.apache.org/job/pipeline-smoke-test/ > > > > It currently doesn't work since there seems to be an error when apt > > runs inside the Docker container, but it is a generic script to give > > you the idea. Release version and branch have a separate parameter, at > > the moment we have 3.2.0 and 3.3.0, so a single script could in theory > > be used multiple times with minimal changes (IIUC at the moment for > > every release we create new smoke tests etc..).\ > > > > To test it: > > > > 1) Run build > > 2) Open the build run's "Console output" > > 3) Hit "Input Requested" and you'll see a series of dropdown options. > > Once you select the right ones, the build should start. > > > > If you want to see an example: > > https://ci.bigtop.apache.org/job/pipeline-smoke-test/5/console > > > > This is just an idea to gather feedback, it can surely be improved, > > but it is a starting point :) > > > > Thanks! > > > > Luca > > > > On Wed, Jan 10, 2024 at 9:44 PM Luca Toscano wrote: > > > > > > Hi folks, > > > > > > I created https://ci.bigtop.apache.org/job/test-elukey as an example > > > of how Jenkins pipelines can be used to replace the Matrix Reloaded > > > plugin. 
The prototype is limited for the moment to: > > > - It uses scripted pipelines, not declarative pipelines, since the > > > former is more flexible in my opinion. > > > - It mimics the Smoke Tests for Debian OSes, used in our release > > > process IIUC (like > > > https://ci.bigtop.apache.org/view/3.2.0-smoke-tests/job/Bigtop-3.2.0-debian-10-smoke-tests/1/) > > > - It only echoes values, no build is done. > > > - It allows dynamic parameters (namely to restrict OS/ARCH/COMPONENT > > > at runtime), but the inputs need to be added in the build's console > > > output rather than having specific "Build with params" in Jenkins UI. > > > This is probably something that we can circumvent, but the current > > > solution avoids to copy/paste values multiple times etc.. > > > - Compared to Matrix reloaded, the flexibility of selecting dynamic > > > parameters is less, but it should be sufficient for the release use > > > case. > > > > > > We'd need to deprecate Matrix Reloaded > > > (https://plugins.jenkins.io/matrix-reloaded/) since the plugin was > > > abandoned since a long time ago, and no more fixes are provided from > > > upstream. > > > > > > Do you think that this process is viable? If so we could use pipelines > > > as experimental step during the next release, to see if they can be > > > adopted. Let me know your thoughts and if I am missing something big > > > (probably happening, apologies in advance). > > > > > > Thanks! > > > > > > Luca > > > > > > On Sun, Jan 7, 2024 at 3:38 PM Luca Toscano > > > wrote: > > > > > > > > Hi folks, > > > > > > > > just letting you know that I have installed the Pipelines plugin in > > > > the Jenkins' master, to test if we can define our jobs in there and > > > > avoid the Matrix reloaded plugin. It would be also very nice to have > > > > all pipelines checked out in the BigTop's repo eventually. 
> > > > > > > > I'll open a jira to track all the work after some quick tests :) > > > > > > > > Let me know if you see anything weird, I've just restarted Jenkins to > > > > pick up the new plugin. > > > > > > > > Luca
[jira] [Created] (BIGTOP-4058) Replace Jenkins' Matrix reloaded plugin
Luca Toscano created BIGTOP-4058:

Summary: Replace Jenkins' Matrix reloaded plugin
Key: BIGTOP-4058
URL: https://issues.apache.org/jira/browse/BIGTOP-4058
Project: Bigtop
Issue Type: Improvement
Reporter: Luca Toscano

We are currently using Matrix reloaded (https://plugins.jenkins.io/matrix-reloaded/) as part of the build/release process to selectively re-run some builds. For example, this is a use case outlined by Masatake:

> We are using "Matrix Reloaded" to rerun the part of failed configurations.
> Since packaging and smoke-tests intermittently fails due to network
> issue and flaky test cases, it is crucial based on the experience of
> release process of Bigtop 3.2.1.

I started https://ci.bigtop.apache.org/job/pipeline-smoke-test/ as an attempt to migrate the Smoke Test job to Jenkins Pipelines, which seem to be the best tool for the job. There are two options:
- Declarative pipelines
- Scripted pipelines

The former is more readable and better for simple jobs, but the latter gives us the possibility to write Groovy scripts. They offer a "matrix" statement, see these articles for more info:
https://www.jenkins.io/blog/2019/11/22/welcome-to-the-matrix/
https://www.jenkins.io/blog/2019/12/02/matrix-building-with-scripted-pipeline/

I took inspiration from the latter to create a prototype for the Smoke tests.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
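To make the "re-run only part of the matrix" use case concrete, here is a minimal Python sketch. It is not the Groovy code of the prototype job, only an illustration of the idea: build every OS/ARCH/COMPONENT combination, then keep the cells that match a selection made at runtime. The axis values and the name `select_cells` are hypothetical.

```python
from itertools import product

# Hypothetical axis values, for illustration only.
AXES = {
    "os": ["debian-11", "ubuntu-22.04"],
    "arch": ["amd64", "aarch64"],
    "component": ["hdfs", "yarn"],
}


def select_cells(axes, **wanted):
    """Return the matrix cells matching the requested axis values.

    An axis that is missing from `wanted` is not restricted.
    """
    names = list(axes)
    cells = [dict(zip(names, combo)) for combo in product(*axes.values())]
    return [
        cell
        for cell in cells
        if all(cell[axis] in values for axis, values in wanted.items())
    ]


if __name__ == "__main__":
    # Re-run only the aarch64 cells for the hdfs component.
    for cell in select_cells(AXES, arch=["aarch64"], component=["hdfs"]):
        print(cell)
```

With no restriction the full matrix is returned, which corresponds to a normal run; restricting one or more axes corresponds to re-triggering only the failed configurations.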
Re: Installed plugin for pipelines
Nevermind I fixed the apt issue, but the current problem is that the COMPONENTS of the smoke test cannot be grouped together. IIUC we want to be able to kick off a run with COMPONENTS set to all or a subset of daemons (yarn/hdfs/etc..), that will bootstrap the hadoop cluster via Docker. I'll open a task and work on it, hopefully there is an easy solution :) On Thu, Jan 11, 2024 at 12:11 PM Luca Toscano wrote: > > I have also created: https://ci.bigtop.apache.org/job/pipeline-smoke-test/ > > It currently doesn't work since there seems to be an error when apt > runs inside the Docker container, but it is a generic script to give > you the idea. Release version and branch have a separate parameter, at > the moment we have 3.2.0 and 3.3.0, so a single script could in theory > be used multiple times with minimal changes (IIUC at the moment for > every release we create new smoke tests etc..).\ > > To test it: > > 1) Run build > 2) Open the build run's "Console output" > 3) Hit "Input Requested" and you'll see a series of dropdown options. > Once you select the right ones, the build should start. > > If you want to see an example: > https://ci.bigtop.apache.org/job/pipeline-smoke-test/5/console > > This is just an idea to gather feedback, it can surely be improved, > but it is a starting point :) > > Thanks! > > Luca > > On Wed, Jan 10, 2024 at 9:44 PM Luca Toscano wrote: > > > > Hi folks, > > > > I created https://ci.bigtop.apache.org/job/test-elukey as an example > > of how Jenkins pipelines can be used to replace the Matrix Reloaded > > plugin. The prototype is limited for the moment to: > > - It uses scripted pipelines, not declarative pipelines, since the > > former is more flexible in my opinion. > > - It mimics the Smoke Tests for Debian OSes, used in our release > > process IIUC (like > > https://ci.bigtop.apache.org/view/3.2.0-smoke-tests/job/Bigtop-3.2.0-debian-10-smoke-tests/1/) > > - It only echoes values, no build is done. 
> > - It allows dynamic parameters (namely to restrict OS/ARCH/COMPONENT > > at runtime), but the inputs need to be added in the build's console > > output rather than having specific "Build with params" in Jenkins UI. > > This is probably something that we can circumvent, but the current > > solution avoids to copy/paste values multiple times etc.. > > - Compared to Matrix reloaded, the flexibility of selecting dynamic > > parameters is less, but it should be sufficient for the release use > > case. > > > > We'd need to deprecate Matrix Reloaded > > (https://plugins.jenkins.io/matrix-reloaded/) since the plugin was > > abandoned since a long time ago, and no more fixes are provided from > > upstream. > > > > Do you think that this process is viable? If so we could use pipelines > > as experimental step during the next release, to see if they can be > > adopted. Let me know your thoughts and if I am missing something big > > (probably happening, apologies in advance). > > > > Thanks! > > > > Luca > > > > On Sun, Jan 7, 2024 at 3:38 PM Luca Toscano wrote: > > > > > > Hi folks, > > > > > > just letting you know that I have installed the Pipelines plugin in > > > the Jenkins' master, to test if we can define our jobs in there and > > > avoid the Matrix reloaded plugin. It would be also very nice to have > > > all pipelines checked out in the BigTop's repo eventually. > > > > > > I'll open a jira to track all the work after some quick tests :) > > > > > > Let me know if you see anything weird, I've just restarted Jenkins to > > > pick up the new plugin. > > > > > > Luca
Re: Installed plugin for pipelines
I have also created: https://ci.bigtop.apache.org/job/pipeline-smoke-test/ It currently doesn't work since there seems to be an error when apt runs inside the Docker container, but it is a generic script to give you the idea. Release version and branch have a separate parameter, at the moment we have 3.2.0 and 3.3.0, so a single script could in theory be used multiple times with minimal changes (IIUC at the moment for every release we create new smoke tests etc..).\ To test it: 1) Run build 2) Open the build run's "Console output" 3) Hit "Input Requested" and you'll see a series of dropdown options. Once you select the right ones, the build should start. If you want to see an example: https://ci.bigtop.apache.org/job/pipeline-smoke-test/5/console This is just an idea to gather feedback, it can surely be improved, but it is a starting point :) Thanks! Luca On Wed, Jan 10, 2024 at 9:44 PM Luca Toscano wrote: > > Hi folks, > > I created https://ci.bigtop.apache.org/job/test-elukey as an example > of how Jenkins pipelines can be used to replace the Matrix Reloaded > plugin. The prototype is limited for the moment to: > - It uses scripted pipelines, not declarative pipelines, since the > former is more flexible in my opinion. > - It mimics the Smoke Tests for Debian OSes, used in our release > process IIUC (like > https://ci.bigtop.apache.org/view/3.2.0-smoke-tests/job/Bigtop-3.2.0-debian-10-smoke-tests/1/) > - It only echoes values, no build is done. > - It allows dynamic parameters (namely to restrict OS/ARCH/COMPONENT > at runtime), but the inputs need to be added in the build's console > output rather than having specific "Build with params" in Jenkins UI. > This is probably something that we can circumvent, but the current > solution avoids to copy/paste values multiple times etc.. > - Compared to Matrix reloaded, the flexibility of selecting dynamic > parameters is less, but it should be sufficient for the release use > case. 
> > We'd need to deprecate Matrix Reloaded > (https://plugins.jenkins.io/matrix-reloaded/) since the plugin was > abandoned since a long time ago, and no more fixes are provided from > upstream. > > Do you think that this process is viable? If so we could use pipelines > as experimental step during the next release, to see if they can be > adopted. Let me know your thoughts and if I am missing something big > (probably happening, apologies in advance). > > Thanks! > > Luca > > On Sun, Jan 7, 2024 at 3:38 PM Luca Toscano wrote: > > > > Hi folks, > > > > just letting you know that I have installed the Pipelines plugin in > > the Jenkins' master, to test if we can define our jobs in there and > > avoid the Matrix reloaded plugin. It would be also very nice to have > > all pipelines checked out in the BigTop's repo eventually. > > > > I'll open a jira to track all the work after some quick tests :) > > > > Let me know if you see anything weird, I've just restarted Jenkins to > > pick up the new plugin. > > > > Luca
Re: Installed plugin for pipelines
Hi folks,

I created https://ci.bigtop.apache.org/job/test-elukey as an example of how Jenkins pipelines can be used to replace the Matrix Reloaded plugin. For the moment the prototype has the following characteristics and limitations:
- It uses scripted pipelines, not declarative pipelines, since the former are more flexible in my opinion.
- It mimics the Smoke Tests for Debian OSes, used in our release process IIUC (like https://ci.bigtop.apache.org/view/3.2.0-smoke-tests/job/Bigtop-3.2.0-debian-10-smoke-tests/1/).
- It only echoes values, no build is done.
- It allows dynamic parameters (namely to restrict OS/ARCH/COMPONENT at runtime), but the inputs need to be provided from the build's console output rather than via a specific "Build with params" page in the Jenkins UI. This is probably something that we can work around, but the current solution avoids copy/pasting values multiple times etc.
- Compared to Matrix Reloaded, there is less flexibility in selecting dynamic parameters, but it should be sufficient for the release use case.

We'd need to deprecate Matrix Reloaded (https://plugins.jenkins.io/matrix-reloaded/) since the plugin was abandoned a long time ago, and no more fixes are provided by upstream.

Do you think that this process is viable? If so we could use pipelines as an experimental step during the next release, to see if they can be adopted. Let me know your thoughts and if I am missing something big (probably happening, apologies in advance).

Thanks!

Luca

On Sun, Jan 7, 2024 at 3:38 PM Luca Toscano wrote:
>
> Hi folks,
>
> just letting you know that I have installed the Pipelines plugin in
> the Jenkins' master, to test if we can define our jobs in there and
> avoid the Matrix reloaded plugin. It would be also very nice to have
> all pipelines checked out in the BigTop's repo eventually.
>
> I'll open a jira to track all the work after some quick tests :)
>
> Let me know if you see anything weird, I've just restarted Jenkins to
> pick up the new plugin.
>
> Luca
Installed plugin for pipelines
Hi folks, just letting you know that I have installed the Pipelines plugin in the Jenkins' master, to test if we can define our jobs in there and avoid the Matrix reloaded plugin. It would be also very nice to have all pipelines checked out in the BigTop's repo eventually. I'll open a jira to track all the work after some quick tests :) Let me know if you see anything weird, I've just restarted Jenkins to pick up the new plugin. Luca
[jira] [Created] (BIGTOP-4041) Move httpd on ci.bigtop.apache.org to mpm worker/event
Luca Toscano created BIGTOP-4041:

Summary: Move httpd on ci.bigtop.apache.org to mpm worker/event
Key: BIGTOP-4041
URL: https://issues.apache.org/jira/browse/BIGTOP-4041
Project: Bigtop
Issue Type: Task
Reporter: Luca Toscano

We are currently using the prefork MPM for ci.bigtop.apache.org's httpd, which is not compatible with modules like mod_h2 (for HTTP/2 support) that we have enabled. We should probably move to something like mpm_event (or worker).
Re: Ongoing maintenance for ci.bigtop.apache.org
Maintenance done! The Let's Encrypt cert is now fully managed by mod_md (so by httpd itself), and we reload the config once a week via a systemd timer (when a new cert is issued, httpd will pick it up automatically within a week at most). Let me know if anything looks wrong or is not working!

Luca

On Thu, Nov 9, 2023 at 5:20 PM Luca Toscano wrote:
>
> Hi folks,
>
> working on https://issues.apache.org/jira/browse/BIGTOP-4038, so
> ci.bigtop.apache.org may not be working for a couple of hours.
>
> Let me know if this is a problem or not.
>
> Thanks!
>
> Luca
Ongoing maintenance for ci.bigtop.apache.org
Hi folks, working on https://issues.apache.org/jira/browse/BIGTOP-4038, so ci.bigtop.apache.org may not be working for a couple of hours. Let me know if this is a problem or not. Thanks! Luca
Replace certbot with mod_md for ci.bigtop.apache.org
Hi folks! I added an idea in https://issues.apache.org/jira/browse/BIGTOP-4038 about how to automate the let's-encrypt cert renewal for ci.bigtop.apache.org. Let me know what you think and if you like the idea! Luca
Re: Issues for ci.bigtop.apache.org
Issue fixed! Apologies for the trouble :)

The Jenkins master node has been upgraded to the latest kernel and packages, and I'll do the same with the worker nodes during the next few days.

Luca

On Sun, Nov 5, 2023 at 9:24 AM Luca Toscano wrote:
>
> Created https://issues.apache.org/jira/browse/BIGTOP-4039 to track the
> issue, still unable to make Jenkins start :(
>
> Luca
>
> On Sat, Nov 4, 2023 at 7:39 PM Luca Toscano wrote:
> >
> > Hi folks,
> >
> > I am working on the host behind ci.bigtop.apache.org, after a reboot +
> > os package upgrade Jenkins refuses to start. I'll keep this list
> > posted, sorry for the inconvenience!
> >
> > Luca
Re: Issues for ci.bigtop.apache.org
Created https://issues.apache.org/jira/browse/BIGTOP-4039 to track the issue, still unable to make Jenkins start :(

Luca

On Sat, Nov 4, 2023 at 7:39 PM Luca Toscano wrote:
>
> Hi folks,
>
> I am working on the host behind ci.bigtop.apache.org, after a reboot +
> os package upgrade Jenkins refuses to start. I'll keep this list
> posted, sorry for the inconvenience!
>
> Luca
[jira] [Created] (BIGTOP-4039) Jenkins not starting for ci.bigtop.apache.org after reboot
Luca Toscano created BIGTOP-4039:

Summary: Jenkins not starting for ci.bigtop.apache.org after reboot
Key: BIGTOP-4039
URL: https://issues.apache.org/jira/browse/BIGTOP-4039
Project: Bigtop
Issue Type: Task
Reporter: Luca Toscano

After a yum upgrade + reboot of the Jenkins master node (no upgrade of its version), Jenkins fails to start due to:

2023-11-05 08:13:03.977+ [id=25]SEVERE hudson.util.BootFailure#publish: Failed to initialize Jenkins
com.thoughtworks.xstream.mapper.CannotResolveClassException: hudson.security.GlobalMatrixAuthorizationStrategy

You can see the full stacktrace via docker logs. I tried the following:
* start different versions of Jenkins (so different containers, killing the old ones etc..)
* copy matrix-auth plugin manually in the plugin dir (but it doesn't seem to work since the plugin is under /var/jenkins_home/war/WEB-INF/detached-plugins/matrix-auth.hpi)
* Checked all permissions.
* Checked packages upgraded (check /var/log/yum.log as ec2-user) but didn't find anything that could justify this.

One interesting log entry:

2023-11-05 08:09:55.541+ [id=33]INFO jenkins.InitReactorRunner$1#onAttained: Started initialization
2023-11-05 08:09:55.557+ [id=32]INFO hudson.PluginManager#loadDetachedPlugins: Upgraded Jenkins from version 2.264 to version 2.428. Loaded detached plugins (and dependencies): []

I am wondering if for some reason Jenkins can't find the detached plugins under /var/jenkins_home/war/WEB-INF/detached-plugins ?
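One way to check the hypothesis above is to look for the plugin archive in each directory where Jenkins could load it from. The following is a small Python sketch under stated assumptions: the helper name `find_plugin` is made up, the first directory assumes the usual `$JENKINS_HOME/plugins` layout, and the second is the path quoted in the report. It would have to run inside the container (or against the mounted volume).

```python
from pathlib import Path


def find_plugin(short_name, plugin_dirs):
    """Return every .hpi/.jpi archive named `short_name` found in `plugin_dirs`."""
    hits = []
    for directory in map(Path, plugin_dirs):
        for suffix in (".hpi", ".jpi"):
            candidate = directory / f"{short_name}{suffix}"
            if candidate.is_file():
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    dirs = [
        "/var/jenkins_home/plugins",  # assumption: $JENKINS_HOME/plugins
        "/var/jenkins_home/war/WEB-INF/detached-plugins",  # path from the report
    ]
    found = find_plugin("matrix-auth", dirs)
    print(found if found else "matrix-auth archive not found")
```

An empty result for both directories would be consistent with the empty "Loaded detached plugins" list in the log entry, although it would not by itself explain why the archive went missing.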
Issues for ci.bigtop.apache.org
Hi folks, I am working on the host behind ci.bigtop.apache.org, after a reboot + os package upgrade Jenkins refuses to start. I'll keep this list posted, sorry for the inconvenience! Luca
[jira] [Created] (BIGTOP-4038) Replace certbot with apache httpd's mod_md
Luca Toscano created BIGTOP-4038:

Summary: Replace certbot with apache httpd's mod_md
Key: BIGTOP-4038
URL: https://issues.apache.org/jira/browse/BIGTOP-4038
Project: Bigtop
Issue Type: Improvement
Reporter: Luca Toscano

We are currently using certbot to manage the TLS certificate for ci.bigtop.apache.org, with some downsides:
1) The process is manual.
2) It needs to be done every couple of months, stopping httpd first.
3) Sometimes we forget and users get TLS cert validation errors (until we renew).

With https://github.com/icing/mod_md we should be able to automate the process with a battle tested httpd module.
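Independently of which tool renews the certificate, the "sometimes we forget" failure mode can be caught early by computing how many days are left before expiry. Below is a minimal Python sketch, not part of the proposal itself: the helper name `days_until_expiry` and the 14-day threshold are assumptions, and the date string is only sample input in the `notAfter` format printed by `openssl x509 -enddate`.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter date.

    `not_after` uses the format "Jan  5 09:34:43 2018 GMT";
    `now` is a Unix timestamp and defaults to the current time.
    """
    expiry = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expiry - now) / 86400


if __name__ == "__main__":
    # Sample notAfter value, not the real certificate's date.
    remaining = days_until_expiry("Jan  5 09:34:43 2018 GMT")
    if remaining < 14:  # hypothetical warning threshold
        print(f"certificate needs attention: {remaining:.1f} days left")
```

A negative result means the certificate has already expired, which is the situation users would see as a TLS validation error.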
Jenkins upgraded for security fixes
Hi everybody, Jenkins running behind ci.bigtop.apache.org has been upgraded to apply security fixes. If you see anything strange in jobs during the next few days please let me know. Cheers, Luca
Re: [ANNOUNCE] New Bigtop PMC member: Luca Toscano
Thanks a lot for all the messages, I really appreciated the warm welcome :) Luca On Tue, Sep 26, 2023, 19:48 Olaf Flebbe wrote: > On behalf of the Apache Bigtop PMC, I am pleased to announce that > Luca Toscano (elukey) has accepted the invitation to join the > Bigtop Project Management Committee. > > Please join me in congratulating Luca! >
Re: Arm workers need to be updated
Nice! Thank you!

Luca

On Wed, Jun 28, 2023 at 4:30 AM Yuqi Gu wrote:
>
> Hi Luca,
>
> >> https://ci.bigtop.apache.org/computer/docker-slave-arm-4/configure
>
> I added "JavaPath" and 'JAVA_HOME' to the relevant field in the configuration.
> It seems that the two Arm nodes are working properly:
>
> https://ci.bigtop.apache.org/computer/docker-slave-arm-4/log
> https://ci.bigtop.apache.org/computer/docker-slave-arm-5/log
>
> Thanks.
>
> BRs,
> Yuqi
>
> On Tue, Jun 27, 2023 at 18:05, Luca Toscano wrote:
>
> > Hi Yuqi,
> >
> > I don't have access to the ARM nodes afaics, I can only configure
> > stuff like
> > https://ci.bigtop.apache.org/computer/docker-slave-arm-4/configure
> > and launch the agent remotely from the Jenkins UI. I think that it
> > expects "java" to point to Java 17, so probably setting alternatives
> > should work. Lemme know :)
> >
> > Luca
> >
> > On Tue, Jun 27, 2023 at 10:51 AM Yuqi Gu wrote:
> > >
> > > Hi Luca,
> > >
> > > >> Couldn't figure out the Java version of /home/jenkins/jdk/bin/java
> > > >> bash: /home/jenkins/jdk/bin/java: No such file or directory
> > >
> > > It seems the default JAVA_HOME was mistakenly set to /home/jenkins/jdk/.
> > > Could you please try to "source /home/jenkins/.bashrc" before launching the
> > > agent?
> > >
> > > BRs,
> > > Yuqi
> > >
> > > On Tue, Jun 27, 2023 at 15:06, Luca Toscano wrote:
> > >
> > > > Hi Yuqi!
> > > >
> > > > I tried to re-launch the agent on the ARM nodes but I get:
> > > >
> > > > Checking Java version in the PATH
> > > > openjdk version "1.8.0_332"
> > > > OpenJDK Runtime Environment (build 1.8.0_332-8u332-ga-1~deb9u1-b09)
> > > > OpenJDK 64-Bit Server VM (build 25.332-b09, mixed mode)
> > > > [06/27/23 07:04:33] [SSH] Checking java version of /home/jenkins/jdk/bin/java
> > > > Couldn't figure out the Java version of /home/jenkins/jdk/bin/java
> > > > bash: /home/jenkins/jdk/bin/java: No such file or directory
> > > >
> > > > [06/27/23 07:04:34] [SSH] Checking java version of java
> > > > [06/27/23 07:04:34] [SSH] java -version returned 1.8.0_332.
> > > >
> > > > So something is still not configured correctly :(
> > > >
> > > > Luca
> > > >
> > > > On Tue, Jun 27, 2023 at 5:30 AM Yuqi Gu wrote:
> > > > >
> > > > > Hi Luca,
> > > > >
> > > > > The default version of the JVM has been set to OpenJDK 17 on Arm nodes.
> > > > > Please inform me if there are any other issues on Arm workers.
> > > > > Thanks.
> > > > >
> > > > > BRs,
> > > > > Yuqi
> > > > >
> > > > > On Sun, 25 Jun 2023 at 19:00, Yuqi Gu wrote:
> > > > >
> > > > > > Hi Luca,
> > > > > >
> > > > > > Sorry for the late reply.
> > > > > > Let me try to set the default JVM version to 11 as you mentioned.
> > > > > > I will ping you once JDK 11 has been properly configured on the Arm nodes.
> > > > > > Thanks.
> > > > > >
> > > > > > BRs,
> > > > > > Yuqi
> > > > > >
> > > > > > On Sun, Jun 25, 2023 at 15:17, Luca Toscano wrote:
> > > > > >
> > > > > > > Hi folks!
> > > > > > >
> > > > > > > Sorry to ping you again, but the ARM workers are still down. Anybody
> > > > > > > with access credentials that can fix them?
> > > > > > >
> > > > > > > Thanks :)
> > > > > > >
> > > > > > > Luca
> > > > > > >
> > > > > > > On Fri, Jun 16, 2023 at 5:50 PM Luca Toscano <toscano.l...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi folks,
> > > > > > > >
> > > > > > > > sending an email in here for visibility. After the Jenkins upgrade I
> > > > > > > > see the following errors on the Arm workers:
> > > > > > > >
> > > > > > > > Exception in thread "main" java.lang.UnsupportedClassVersionError:
> > > > > > > > hudson/remoting/Launcher has been compiled by a more recent version of
> > > > > > > > the Java Runtime (class file version 55.0), this version of the Java
> > > > > > > > Runtime only recognizes class file versions up to 52.0
> > > > > > > >
> > > > > > > > I resolved the issue in the other nodes, setting the default JVM
> > > > > > > > version to 11+, since the above error msg suggests that we need it
> > > > > > > > from now on to run the Jenkins agents. I don't have access, afaics, on
> > > > > > > > the Arm nodes, can somebody check/fix please?
> > > > > > > >
> > > > > > > > Thanks in advance :)
> > > > > > > >
> > > > > > > > Luca
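The class file versions in the UnsupportedClassVersionError quoted in this thread map directly to Java releases: the major version is the release number plus 44, so 52 is Java 8 and 55 is Java 11. A minimal Python sketch of that arithmetic follows; the function names are made up for illustration.

```python
def java_release_for_class_version(major):
    """Map a class file major version to the Java release that produces it."""
    if major < 45:
        raise ValueError(f"not a valid class file major version: {major}")
    return major - 44


def can_load(runtime_max_major, class_major):
    """True if a JVM recognizing versions up to `runtime_max_major` can load the class."""
    return runtime_max_major >= class_major


if __name__ == "__main__":
    compiled, supported = 55, 52  # values taken from the error message
    print(f"agent classes need Java {java_release_for_class_version(compiled)}+")
    print(f"the worker's JVM is Java {java_release_for_class_version(supported)}")
    print("agent can start:", can_load(supported, compiled))
```

This matches the thread: the agent was compiled for Java 11, while `java -version` on the Arm workers still returned 1.8.0_332, that is Java 8.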
Re: Arm workers need to be updated
Hi Yuqi,

I don't have access to the ARM nodes afaics, I can only configure stuff like https://ci.bigtop.apache.org/computer/docker-slave-arm-4/configure and launch the agent remotely from the Jenkins UI. I think that it expects "java" to point to Java 17, so probably setting alternatives should work. Lemme know :)

Luca

On Tue, Jun 27, 2023 at 10:51 AM Yuqi Gu wrote:
>
> Hi Luca,
>
> >> Couldn't figure out the Java version of /home/jenkins/jdk/bin/java
> >> bash: /home/jenkins/jdk/bin/java: No such file or directory
>
> It seems the default JAVA_HOME was mistakenly set to /home/jenkins/jdk/.
> Could you please try to "source /home/jenkins/.bashrc" before launching the
> agent?
>
> BRs,
> Yuqi
>
> On Tue, Jun 27, 2023 at 15:06, Luca Toscano wrote:
>
> > Hi Yuqi!
> >
> > I tried to re-launch the agent on the ARM nodes but I get:
> >
> > Checking Java version in the PATH
> > openjdk version "1.8.0_332"
> > OpenJDK Runtime Environment (build 1.8.0_332-8u332-ga-1~deb9u1-b09)
> > OpenJDK 64-Bit Server VM (build 25.332-b09, mixed mode)
> > [06/27/23 07:04:33] [SSH] Checking java version of /home/jenkins/jdk/bin/java
> > Couldn't figure out the Java version of /home/jenkins/jdk/bin/java
> > bash: /home/jenkins/jdk/bin/java: No such file or directory
> >
> > [06/27/23 07:04:34] [SSH] Checking java version of java
> > [06/27/23 07:04:34] [SSH] java -version returned 1.8.0_332.
> >
> > So something is still not configured correctly :(
> >
> > Luca
> >
> > On Tue, Jun 27, 2023 at 5:30 AM Yuqi Gu wrote:
> > >
> > > Hi Luca,
> > >
> > > The default version of the JVM has been set to OpenJDK 17 on Arm nodes.
> > > Please inform me if there are any other issues on Arm workers.
> > > Thanks.
> > >
> > > BRs,
> > > Yuqi
> > >
> > > On Sun, 25 Jun 2023 at 19:00, Yuqi Gu wrote:
> > >
> > > > Hi Luca,
> > > >
> > > > Sorry for the late reply.
> > > > Let me try to set the default JVM version to 11 as you mentioned.
> > > > I will ping you once JDK 11 has been properly configured on the Arm nodes.
> > > > Thanks.
> > > >
> > > > BRs,
> > > > Yuqi
> > > >
> > > > On Sun, Jun 25, 2023 at 15:17, Luca Toscano wrote:
> > > >
> > > > > Hi folks!
> > > > >
> > > > > Sorry to ping you again, but the ARM workers are still down. Anybody
> > > > > with access credentials that can fix them?
> > > > >
> > > > > Thanks :)
> > > > >
> > > > > Luca
> > > > >
> > > > > On Fri, Jun 16, 2023 at 5:50 PM Luca Toscano wrote:
> > > > > >
> > > > > > Hi folks,
> > > > > >
> > > > > > sending an email in here for visibility. After the Jenkins upgrade I
> > > > > > see the following errors on the Arm workers:
> > > > > >
> > > > > > Exception in thread "main" java.lang.UnsupportedClassVersionError:
> > > > > > hudson/remoting/Launcher has been compiled by a more recent version of
> > > > > > the Java Runtime (class file version 55.0), this version of the Java
> > > > > > Runtime only recognizes class file versions up to 52.0
> > > > > >
> > > > > > I resolved the issue in the other nodes, setting the default JVM
> > > > > > version to 11+, since the above error msg suggests that we need it
> > > > > > from now on to run the Jenkins agents. I don't have access, afaics, on
> > > > > > the Arm nodes, can somebody check/fix please?
> > > > > >
> > > > > > Thanks in advance :)
> > > > > >
> > > > > > Luca
Renewed TLS cert for ci.bigtop.apache.org
Hi folks, I followed https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide#BigtopCISetupGuide-Renewingthecert and renewed the certificate since it expired. Luca
Re: Arm workers need to be updated
Hi Yuqi! I tried to re-launch the agent on the ARM nodes but I get: Checking Java version in the PATH openjdk version "1.8.0_332" OpenJDK Runtime Environment (build 1.8.0_332-8u332-ga-1~deb9u1-b09) OpenJDK 64-Bit Server VM (build 25.332-b09, mixed mode) [06/27/23 07:04:33] [SSH] Checking java version of /home/jenkins/jdk/bin/java Couldn't figure out the Java version of /home/jenkins/jdk/bin/java bash: /home/jenkins/jdk/bin/java: No such file or directory [06/27/23 07:04:34] [SSH] Checking java version of java [06/27/23 07:04:34] [SSH] java -version returned 1.8.0_332. So something is still not configured correctly :( Luca On Tue, Jun 27, 2023 at 5:30 AM Yuqi Gu wrote: > > Hi Luca, > > The default version of the JVM has been set to OpenJDK 17 on Arm nodes. > Please inform me if there are any other issues on Arm workers. > Thanks. > > BRs, > Yuqi > > On Sun, 25 Jun 2023 at 19:00, Yuqi Gu wrote: > > > Hi Luca, > > > > Sorry for the late reply. > > Let me try to set the default JVM version to 11 as you mentioned. > > I will ping you once JDK 11 has been properly configured on the Arm nodes. > > Thanks. > > > > BRs, > > Yuqi > > > > > > Luca Toscano 于2023年6月25日周日 15:17写道: > > > > > Hi folks! > > > > > > Sorry to ping you again, but the ARM workers are still down. Anybody > > > with access credentials that can fix them? > > > > > > Thanks :) > > > > > > Luca > > > > > > On Fri, Jun 16, 2023 at 5:50 PM Luca Toscano > > > wrote: > > > > > > > > Hi folks, > > > > > > > > sending an email in here for visibility. 
After the Jenkins upgrade I > > > > see the following errors on the Arm workers: > > > > > > > > Exception in thread "main" java.lang.UnsupportedClassVersionError: > > > > hudson/remoting/Launcher has been compiled by a more recent version of > > > > the Java Runtime (class file version 55.0), this version of the Java > > > > Runtime only recognizes class file versions up to 52.0 > > > > > > > > I resolved the issue in the other nodes, setting the default JVM > > > > version to 11+, since the above error msg suggests that we need it > > > > from now on to run the Jenkins agents. I don't have access, afaics, on > > > > the Arm nodes, can somebody check/fix please? > > > > > > > > Thanks in advance :) > > > > > > > > Luca > > > > >
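A small sketch of the check the agent launcher log above is effectively doing: read the `java -version` banner and decide whether the JVM is new enough. The function name and the minimum of 11 are assumptions for illustration (11 comes from the "11+" mentioned in the thread); this is not the Jenkins SSH launcher's actual code.

```python
import re

# Assumption for illustration: the thread says the agents need Java 11 or newer.
MIN_AGENT_JAVA = 11


def java_major_version(version_output: str) -> int:
    """Extract the major Java release from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    first, second = match.group(1), match.group(2)
    # Releases up to Java 8 report themselves as 1.x (for example 1.8.0_332).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


# Banner line copied from the agent log quoted in the thread.
banner = 'openjdk version "1.8.0_332"'
major = java_major_version(banner)
print(major, "ok" if major >= MIN_AGENT_JAVA else "too old for the agent")
```

With the banner from the ARM node this reports Java 8, which matches the "java -version returned 1.8.0_332" line: the `java` found in the PATH was still the old JDK.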
Re: Arm workers need to be updated
Hi folks! Sorry to ping you again, but the ARM workers are still down. Anybody with access credentials that can fix them? Thanks :) Luca On Fri, Jun 16, 2023 at 5:50 PM Luca Toscano wrote: > > Hi folks, > > sending an email in here for visibility. After the Jenkins upgrade I > see the following errors on the Arm workers: > > Exception in thread "main" java.lang.UnsupportedClassVersionError: > hudson/remoting/Launcher has been compiled by a more recent version of > the Java Runtime (class file version 55.0), this version of the Java > Runtime only recognizes class file versions up to 52.0 > > I resolved the issue in the other nodes, setting the default JVM > version to 11+, since the above error msg suggests that we need it > from now on to run the Jenkins agents. I don't have access, afaics, on > the Arm nodes, can somebody check/fix please? > > Thanks in advance :) > > Luca
Arm workers need to be updated
Hi folks, sending an email in here for visibility. After the Jenkins upgrade I see the following errors on the Arm workers: Exception in thread "main" java.lang.UnsupportedClassVersionError: hudson/remoting/Launcher has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0 I resolved the issue in the other nodes, setting the default JVM version to 11+, since the above error msg suggests that we need it from now on to run the Jenkins agents. I don't have access, afaics, on the Arm nodes, can somebody check/fix please? Thanks in advance :) Luca
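For readers not used to the class file numbers in that error: Java 8 writes class file major version 52 and Java 11 writes 55, with each release bumping the number by one. A minimal sketch of decoding the message follows; the helper names are made up for illustration.

```python
import re


def java_release_for_class_version(major: int) -> int:
    """Map a class file major version to the Java release that produces it."""
    if major < 45:
        raise ValueError("class file major versions start at 45")
    # 52 -> Java 8, 55 -> Java 11, 61 -> Java 17: one step per release.
    return major - 44


def explain_class_version_error(message: str) -> tuple:
    """Return (java_needed, java_running) from an UnsupportedClassVersionError text."""
    compiled = re.search(r"class file version (\d+)\.\d+", message)
    supported = re.search(r"versions up to (\d+)\.\d+", message)
    if compiled is None or supported is None:
        raise ValueError("not an UnsupportedClassVersionError message")
    return (
        java_release_for_class_version(int(compiled.group(1))),
        java_release_for_class_version(int(supported.group(1))),
    )


# Text of the error reported for the Arm workers.
error = (
    "hudson/remoting/Launcher has been compiled by a more recent version of "
    "the Java Runtime (class file version 55.0), this version of the Java "
    "Runtime only recognizes class file versions up to 52.0"
)
needed, running = explain_class_version_error(error)
print(f"agent needs Java {needed}+, node runs Java {running}")
```

So the message says the agent jar was built for Java 11 while the worker still runs Java 8, consistent with the fix applied on the other nodes.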
Re: [DISCUSSION] Propose to remove Sqoop
+1! Maybe in the future Gobblin could be a valid alternative in Bigtop (IIRC it recently graduated as a full Apache project). The Wikimedia foundation replaced sqoop with Gobblin and so far we are very happy about the move. Luca On Mon, Aug 8, 2022 at 11:39 AM Yuqi Gu wrote: > > Hi folks, > > Apache Sqoop was retired last year. ( > https://lists.apache.org/thread/pkylvs0qrpmpxdlgfdmdj32rjk9q24x4) > I propose to remove it from Bigtop stack. > I'd like to proceed with a PR If there are no objections. > > BRs, > Yuqi > retired
Re: Preparation status of the 3.0.1 release
Hi Kengo! What is the plan for Nexus? IIUC we have a container running on every ec2 VM at the moment, is the plan to create a more central VM that all instances use? (Curious this is why I am asking). Let me know if you need any help for the release! Luca On Wed, Mar 9, 2022 at 1:03 AM Kengo Seki wrote: > > Thanks a lot, Luca! I'll start the release builds after testing and merging > BIGTOP-3649. > (I may need a few days to set up an EC2 instance for Nexus and configure CI > jobs to leverage it after merging) > > Kengo Seki apache.org> > > > On Mon, Mar 7, 2022 at 11:42 PM Luca Toscano wrote: > > > Hi everybody, > > > > Infra answered and unblocked the IPs. More details in > > https://issues.apache.org/jira/browse/INFRA-22953 > > > > Luca > > > > > > On Thu, Mar 3, 2022 at 11:36 AM Luca Toscano > > wrote: > > > > > > Hi everybody, > > > > > > Still no answers from Infra, I sent another email this morning, I'll > > > update the list as soon as I receive something :) > > > > > > Luca > > > > > > On Tue, Mar 1, 2022 at 2:25 PM Kengo Seki wrote: > > > > > > > > Hi everyone, > > > > > > > > As discussed in [1], I'd like to publish the 3.0.1 release based on > > > > branch-3.0 soon, > > > > which mainly focuses on addressing the Log4Shell vulnerabilities. > > > > > > > > I've created all CI jobs which are required to that release, > > > > but some of the CI nodes are currently blacklisted from the ASF sites. > > > > Luca has already asked the ASF Infra to remove that restriction [2], > > > > but they still seem to be banned as of now. > > > > > > > > Once they become available again, I'm going to start the release > > process. > > > > (but it will take a few weeks due to long build/test/transfer time, as > > > > usual) > > > > So let me know if there are any fixes that you'd like to include into > > it. 
> > > > > > > > [1]: https://lists.apache.org/thread/vor809oqqcjxso98nxrml5s3lthrlws5 > > > > [2]: > > https://issues.apache.org/jira/browse/BIGTOP-3644#comment-17498543 > > > > > > > > Kengo Seki apache.org> > >
Re: Preparation status of the 3.0.1 release
Hi everybody, Infra answered and unblocked the IPs. More details in https://issues.apache.org/jira/browse/INFRA-22953 Luca On Thu, Mar 3, 2022 at 11:36 AM Luca Toscano wrote: > > Hi everybody, > > Still no answers from Infra, I sent another email this morning, I'll > update the list as soon as I receive something :) > > Luca > > On Tue, Mar 1, 2022 at 2:25 PM Kengo Seki wrote: > > > > Hi everyone, > > > > As discussed in [1], I'd like to publish the 3.0.1 release based on > > branch-3.0 soon, > > which mainly focuses on addressing the Log4Shell vulnerabilities. > > > > I've created all CI jobs which are required to that release, > > but some of the CI nodes are currently blacklisted from the ASF sites. > > Luca has already asked the ASF Infra to remove that restriction [2], > > but they still seem to be banned as of now. > > > > Once they become available again, I'm going to start the release process. > > (but it will take a few weeks due to long build/test/transfer time, as > > usual) > > So let me know if there are any fixes that you'd like to include into it. > > > > [1]: https://lists.apache.org/thread/vor809oqqcjxso98nxrml5s3lthrlws5 > > [2]: https://issues.apache.org/jira/browse/BIGTOP-3644#comment-17498543 > > > > Kengo Seki apache.org>
[jira] [Created] (BIGTOP-3650) Modernize deb packages
Luca Toscano created BIGTOP-3650: Summary: Modernize deb packages Key: BIGTOP-3650 URL: https://issues.apache.org/jira/browse/BIGTOP-3650 Project: Bigtop Issue Type: Improvement Affects Versions: 3.0.0, 3.1.0 Reporter: Luca Toscano We should improve the configuration of the Debian packages now that we are supporting only Debian 10 and 11. The first step should be to raise compat to something like 12 and fix anything that breaks the builds. Moreover, a look at lintian's errors/warnings would be good to improve the overall packaging following upstream's suggestions. Last but not least, it would be great to start adding systemd units to Debian packages instead of init.d configs. This step is complex and involves some code (for example, we could create the equivalent of init.d.tmpl for systemd) but the Debian packages could be the first configured for it. -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: Preparation status of the 3.0.1 release
Hi everybody, Still no answers from Infra, I sent another email this morning, I'll update the list as soon as I receive something :) Luca On Tue, Mar 1, 2022 at 2:25 PM Kengo Seki wrote: > > Hi everyone, > > As discussed in [1], I'd like to publish the 3.0.1 release based on > branch-3.0 soon, > which mainly focuses on addressing the Log4Shell vulnerabilities. > > I've created all CI jobs which are required to that release, > but some of the CI nodes are currently blacklisted from the ASF sites. > Luca has already asked the ASF Infra to remove that restriction [2], > but they still seem to be banned as of now. > > Once they become available again, I'm going to start the release process. > (but it will take a few weeks due to long build/test/transfer time, as > usual) > So let me know if there are any fixes that you'd like to include into it. > > [1]: https://lists.apache.org/thread/vor809oqqcjxso98nxrml5s3lthrlws5 > [2]: https://issues.apache.org/jira/browse/BIGTOP-3644#comment-17498543 > > Kengo Seki apache.org>
Re: Release during the next weeks?
Hi Kengo and Masatake, Definitely 3.0.1 is ok! Having new packages out with the log4j fix is enough in my opinion, and then 3.1 can follow later on with more changes. Thanks! On Tue, Feb 22, 2022 at 3:33 AM Kengo Seki wrote: > > I'd also release 3.0.1 based on branch-3.0 first, since 3.1.0 drops Debian > 9 (BIGTOP-3629) but its users may need a patch release for addressing > Log4Shell vulnerabilities. > Is it OK for you, Luca? And if it's OK, do you have any fixes to be > included in 3.0.1 (Debian 11 support, for example)? > > Kengo Seki apache.org> > > > On Mon, Feb 21, 2022 at 6:02 PM Masatake Iwasaki < > iwasak...@oss.nttdata.co.jp> wrote: > > > Hi Luca, > > > > How about releasing 3.0.1 first? > > I think issues under BIGTOP-3613 have landed in both master and branch-3.0. > > > > I want BIGTOP-3606 to be in 3.1, but it is not yet resolved. > > After Bigtop 3.0.1, we can release Bigtop 3.1.0 even without BIGTOP-3606. > > If it takes more time on the Hadoop side, we can address the Hadoop 3.2.3 in > > Bigtop 3.1.1. > > > > Thanks, > > Masatake Iwasaki > > > > On 2022/02/21 17:08, Luca Toscano wrote: > > > Hi everybody, > > > > > > I am wondering if it would be good to release 3.1 during the coming > > > weeks to address the log4j vulnerabilities. Thoughts? No idea how > > > tough the process is, but I'll help if needed! > > > > > > Luca > >
Release during the next weeks?
Hi everybody, I am wondering if it would be good to release 3.1 during the coming weeks to address the log4j vulnerabilities. Thoughts? No idea how tough the process is, but I'll help if needed! Luca
Re: Followup on yesterday
Thanks for the explanation! I was able to use nsenter on docker-worker-7 to check Nexus logs, and I see artifacts being retrieved (mostly Maven related). I also see some logs like the following (always for apache.snapshots), repeated every hour: 2022-02-20 05:57:06,677+ WARN [ar-7-thread-4] *TASK org.sonatype.nexus.proxy.maven.routing.internal.RemoteContentDiscovererImpl - Remote strategy prefix-file error on M2Repository(id=apache.snapshots): org.apache.http.conn.ConnectTimeoutException: Connect to repository.apache.org:443 [repository.apache.org/136.243.146.148] failed: connect timed out I also filed https://issues.apache.org/jira/browse/BIGTOP-3644 (and https://github.com/apache/bigtop/pull/865) as an attempt to improve the Gradle download (that IIUC doesn't use any proxy but that could use some retry logic). Will keep investigating, if anybody has suggestions please let me know :) Luca On Sat, Feb 19, 2022 at 7:21 PM Olaf Flebbe wrote: > > Hi, > > glad you are asking how to make sure that a caching server is used: > > https://github.com/apache/bigtop/blob/8c323c4f12534508b6ffb45603db7cbf6e0a145f/build.gradle#L477 > > <https://github.com/apache/bigtop/blob/8c323c4f12534508b6ffb45603db7cbf6e0a145f/build.gradle#L477> > > The gradle task "configure-nexus“ > is configuring $HOME/.m2/settings.xml file for maven to download for instance > maven central via nexus instead of downloading from maven central directly. > > However this will only work for maven, not handling gradle or ivy builds > AFAIK. > > Best > Olaf > > > > On 19.02.2022 at 14:23, Luca Toscano wrote: > > > > Hi everybody, > > > > I have one doubt related to Nexus, namely when it is used. 
I tried to > > check [1] as random example, and I see that we trigger at some point > > the configure-nexus gradle code, but when the debuild script kicks in, > > I see stuff like: > > > > + mvn clean install -DskipTests -Dhadoop.version=3.2.2 > > -Dmaven.buildNumber.revisionOnScmFailure=v2.4.1 -Phadoop-3 -Pyarn > > -Dmaven.repo.local=/var/lib/jenkins/.m2/repository > > > > Is the mvn command launched inside debuild using the nexus cache? > > > > Luca > > > > > > [1] > > https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/COMPONENTS=alluxio,OS=debian-11/lastBuild/consoleFull > > > > On Thu, Feb 17, 2022 at 9:48 PM Olaf Flebbe wrote: > >> > >> Hi everyone, > >> > >> Yesterday Luca Toscano and me had a call to look into improving the > >> situation of artifact downloads by caching . > >> > >> I was a bit surprised that the „nexus“ code is still in place and still > >> seem to work somehow. > >> > >> Since having a repository server (a specialized proxy for maven repos) is > >> technically a much cleaner solution than messing with the .m2/repository > >> maven cache -- since it can be shared across architectures and os and even > >> support more built tools -- I would like to step back from my proposal to > >> use docker volumes to share the raw m2 cache between instances. > >> > >> What need to be done is to either update to a current version of nexus or > >> switch to a different maven proxy which can be setup , updated and > >> configured more easily. > >> > >> I asked a search machine for alternatives and tripped over this project > >> https://github.com/jenkins-x/bucketrepo > >> which promises to be a low-footprint minimal replacement for nexus, which > >> could even use S3 as a backing store. > >> > >> There was a configuration example for nginx as well : > >> https://github.com/lkiesow/weblog.lkiesow.de/blob/master/20170413-nginx-as-fast-maven-repository-proxy.md > >> > >> Best > >> Olaf >
[jira] [Created] (BIGTOP-3644) Improve dowload artifact resilency in Gradle/Maven/etc..
Luca Toscano created BIGTOP-3644: Summary: Improve dowload artifact resilency in Gradle/Maven/etc.. Key: BIGTOP-3644 URL: https://issues.apache.org/jira/browse/BIGTOP-3644 Project: Bigtop Issue Type: Improvement Reporter: Luca Toscano There seem to be a lot of failures in building trunk packages in CI related to downloading artifacts. One of the issues seems to be using the Gradle download plugin (IIUC outside the realm of our NEXUS cache proxy) usually ending up in error messages like: * What went wrong: Execution failed for task ':hbase-download'. > Could not download file My understanding is that we are using the Gradle plugin de.undercouch.download, version 3.2.0. From 4.0+, a nice retry functionality was added: https://github.com/michel-kraemer/gradle-download-task/commit/c6d616a1184b935d16dd3be9f694d56ced1c01df And afaics from https://github.com/michel-kraemer/gradle-download-task#migrating-from-version-3x-to-4x we should be compatible with a 4.x version.
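To make the "retry" idea concrete, here is a generic sketch of retrying a flaky download with exponential backoff. It is only an illustration of the technique: it is not the gradle-download-task implementation, and `retry`, `flaky_download` and the parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run action, retrying on network-style errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            # ConnectionError and TimeoutError are both subclasses of OSError.
            if attempt == attempts:
                raise  # out of attempts, surface the last failure
            sleep(delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Simulated download that times out twice before succeeding (hypothetical).
calls = {"n": 0}


def flaky_download() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connect timed out")
    return "artifact downloaded"


waits = []  # record the backoff delays instead of really sleeping
result = retry(flaky_download, attempts=4, delay=0.5, sleep=waits.append)
print(result, waits)
```

Passing `sleep` in as a parameter keeps the example fast and makes the backoff schedule easy to inspect; a real build would leave the default `time.sleep`.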
Re: Followup on yesterday
Hi everybody, I have one doubt related to Nexus, namely when it is used. I tried to check [1] as random example, and I see that we trigger at some point the configure-nexus gradle code, but when the debuild script kicks in, I see stuff like: + mvn clean install -DskipTests -Dhadoop.version=3.2.2 -Dmaven.buildNumber.revisionOnScmFailure=v2.4.1 -Phadoop-3 -Pyarn -Dmaven.repo.local=/var/lib/jenkins/.m2/repository Is the mvn command launched inside debuild using the nexus cache? Luca [1] https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/COMPONENTS=alluxio,OS=debian-11/lastBuild/consoleFull On Thu, Feb 17, 2022 at 9:48 PM Olaf Flebbe wrote: > > Hi everyone, > > Yesterday Luca Toscano and me had a call to look into improving the situation > of artifact downloads by caching . > > I was a bit surprised that the „nexus“ code is still in place and still seem > to work somehow. > > Since having a repository server (a specialized proxy for maven repos) is > technically a much cleaner solution than messing with the .m2/repository > maven cache -- since it can be shared across architectures and os and even > support more built tools -- I would like to step back from my proposal to use > docker volumes to share the raw m2 cache between instances. > > What need to be done is to either update to a current version of nexus or > switch to a different maven proxy which can be setup , updated and configured > more easily. > > I asked a search machine for alternatives and tripped over this project > https://github.com/jenkins-x/bucketrepo > which promises to be a low-footprint minimal replacement for nexus, which > could even use S3 as a backing store. > > There was a configuration example for nginx as well : > https://github.com/lkiesow/weblog.lkiesow.de/blob/master/20170413-nginx-as-fast-maven-repository-proxy.md > > Best > Olaf
Re: Jenkins upgrade scheduled for Feb 18th 9:00 CET
Upgraded Jenkins to 2.334 and upgraded the Pipeline Groovy plugin (both for security issues). Everything looks good, lemme know if you notice anything strange! Luca On Thu, Feb 17, 2022 at 9:48 AM Luca Toscano wrote: > > Hi everybody, > > I am planning to upgrade Jenkins tomorrow (Feb 18th) at around 9:00 > CET if nobody opposes it. I will stop the current package builds > (mostly trunk + old backlog queue when the ppc node was down), since > they will be restarted on Saturday (part of the weekly builds). > > Let me know if others already planned to do it or if it is better to wait. > > Thanks! > > Luca
Jenkins upgrade scheduled for Feb 18th 9:00 CET
Hi everybody, I am planning to upgrade Jenkins tomorrow (Feb 18th) at around 9:00 CET if nobody opposes it. I will stop the current package builds (mostly trunk + old backlog queue when the ppc node was down), since they will be restarted on Saturday (part of the weekly builds). Let me know if others already planned to do it or if it is better to wait. Thanks! Luca
Re: Network failures while maven runs
Hi Konstantin, I am seeing two types of failures: 1) Maven download errors that don't show (afaics) any hint about what failed. Example: https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/COMPONENTS=kafka,OS=centos-7-ppc64le/lastBuild/console 2) Connect timeouts while downloading poms, example: https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/COMPONENTS=hive,OS=debian-10-ppc64le/lastBuild/console Usually 2) seems to be related to apache.org-related repo servers. I am no Maven expert but I was wondering how Bigtop releases were cut with such a degree of unpredictability in builds (if some secret option was used etc..). I naively thought that adding a basic retry to Maven would have improved the end result, but over time a lot of people (as Olaf mentioned) tried to resolve this problem so I preferred to ask first, before making silly suggestions :) Thanks, Luca On Mon, Jan 31, 2022 at 2:21 PM Konstantin Boudnik wrote: > > Hey Luca. > > Did you notice if this is happening with some specific repo server? > > Thanks! > Cos > > On Sat, Jan 29, 2022 at 08:31AM, Luca Toscano wrote: > > Hi everybody, > > > > I have been seeing build failures for trunk packages due to maven > > network failures (mostly connection timeouts). Is there a workaround > > for this? For example retrying x times etc.. I am asking mostly > > because of ignorance, I am wondering what we do when cutting a release > > for example (to avoid re-running the main package build job due to > > some package build failures). > > > > Thanks! > > > > Luca
Re: Network failures while maven runs
Hi Olaf, On Sat, Jan 29, 2022 at 9:18 AM Olaf Flebbe wrote: > > Hi Luca, > > I think you are referring to random failures of maven downloads? Exactly yes, maven download tasks and also random connect timeouts when downloading poms. > > My next proposal would be to use docker volumes (not bind mounts) for that, > so we won't have any permission problems any more and we are left with the > server issues going out of business, eventually. To mitigate that it would be > easy to simply recreate docker volumes rather than go to all filesystems as > root removing directories we needed for the bind mount workaround. To better understand your proposal - the Docker volume would be a shared maven cache that we'll mount to containers when building? Or something else? > We could eventually pair to try to implement that, if you are not too > familiar with it. > https://docs.docker.com/storage/volumes/ Definitely, I'd be happy to! Never used volumes up to now, only bind mounts. Luca
Re: docker worker ppc-2 down
Hi Amir, Thanks a lot, everything works now! Luca On Fri, Jan 28, 2022 at 7:16 PM MrAsanjar wrote: > > Hi team, > it is running now. Please next time email me directly to asan...@apache.com > or amir.san...@ibm.com > > On Fri, Jan 28, 2022 at 4:32 AM Luca Toscano wrote: > > > Hi everybody, > > > > In a previous email I mentioned the fact that the docker ppc-2 worker > > has been down for a while. It was mentioned that Amir is the point of > > contact for the node, so I am sending a more visible email seeking for > > help :) > > > > Luca > >
Network failures while maven runs
Hi everybody, I have been seeing build failures for trunk packages due to maven network failures (mostly connection timeouts). Is there a workaround for this? For example retrying x times etc.. I am asking mostly because of ignorance, I am wondering what we do when cutting a release for example (to avoid re-running the main package build job due to some package build failures). Thanks! Luca
[jira] [Created] (BIGTOP-3636) Remove slave terminology from the project
Luca Toscano created BIGTOP-3636: Summary: Remove slave terminology from the project Key: BIGTOP-3636 URL: https://issues.apache.org/jira/browse/BIGTOP-3636 Project: Bigtop Issue Type: Improvement Reporter: Luca Toscano Hi folks, I am aware that the Bigtop community considers the use of the "slave" terminology in our docker images etc.. as a technical term, and not something offensive to others, but we should try to replace it with a more inclusive word like "worker" or "replica" in my opinion. This will cause some impact to users, since all our Docker images will need to be renamed, but it seems worth doing. Let me know your thoughts :)
docker worker ppc-2 down
Hi everybody, In a previous email I mentioned the fact that the docker ppc-2 worker has been down for a while. It was mentioned that Amir is the point of contact for the node, so I am sending a more visible email seeking help :) Luca
Re: Docker arm-5 node not reachable by Jenkins
Hi Yuqi! Down again, can you check when you have a moment? :( Luca On Mon, Jan 17, 2022 at 9:59 AM Yuqi Gu wrote: > > Luca,slave-arm-5 comes back. Sorry for the inconvenience. > > BRs, > Yuqi > > On Mon, 17 Jan 2022 at 07:51, Kengo Seki wrote: > > > Hi Luca, > > > > Amir is taking care of the PowerPC server. > > It still seems to be unavailable. Would you take a look, Amir? > > > > Kengo Seki > > > > On Sat, Jan 15, 2022 at 7:39 PM Luca Toscano > > wrote: > > > > > > https://ci.bigtop.apache.org/computer/docker%2Dslave%2Dppc%2D2/ seems > > > down as well, not sure who to contact :) > > > > > > Marking both nodes as temporarily down in Jenkins.. > > > > > > Luca > > > > > > On Sat, Jan 15, 2022 at 11:37 AM Luca Toscano > > wrote: > > > > > > > > Hi folks, > > > > > > > > I see again the same problem in > > > > https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ :( > > > > > > > > Luca > > > > > > > > On Wed, Jan 5, 2022 at 10:14 AM Luca Toscano > > wrote: > > > > > > > > > > Hi to both, > > > > > > > > > > Thanks a lot for the fix, I see the host up and running now :) > > > > > > > > > > Luca > > > > > > > > > > On Wed, Jan 5, 2022 at 2:44 AM Yuqi Gu wrote: > > > > > > > > > > > > Hi Luca, > > > > > > > > > > > > *docker-slave-arm-5* comes back. > > > > > > Please feel free to reach out to me if there is still any problem. > > > > > > > > > > > > Thanks. > > > > > > > > > > > > BRs, > > > > > > Yuqi > > > > > > > > > > > > Jun HE 于2022年1月4日周二 09:43写道: > > > > > > > > > > > > > Hi Luca, > > > > > > > > > > > > > > Sorry for the late response. Yuqi and I are responsible for > > these Arm CI > > > > > > > nodes. I'll check with this and update you later. 
> > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > Luca Toscano 于2021年12月28日周二 16:48写道: > > > > > > > > > > > > > > > Hi everybody, > > > > > > > > > > > > > > > > I've set > > https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ > > > > > > > > temporarily offline in the Jenkins UI, it seems not reachable > > via ssh > > > > > > > > from the Jenkin's perspective. > > > > > > > > > > > > > > > > What is the procedure to follow for these nodes? > > > > > > > > > > > > > > > > Luca > > > > > > > > > > > > > > > > >
[jira] [Created] (BIGTOP-3633) Update build.gradle and our docs
Luca Toscano created BIGTOP-3633: Summary: Update build.gradle and our docs Key: BIGTOP-3633 URL: https://issues.apache.org/jira/browse/BIGTOP-3633 Project: Bigtop Issue Type: Improvement Reporter: Luca Toscano On the user@ mailing list it came up that build.gradle mentions, in its docs/examples, very old OSes as supported. We should update our docs and tools to reflect what is supported now, to avoid confusing users.
Re: ci.bigtop.apache.org currently under upgrade
Jenkins up and running, there are a couple of things left to do after the upgrade but nothing blocking (more details in the task). Luca On Mon, Jan 17, 2022 at 5:56 PM Luca Toscano wrote: > > Hi everybody, > > I reached out to Olaf and Kengo to upgrade Jenkins and some plugins, > and then executed the plan > (https://issues.apache.org/jira/browse/BIGTOP-3611) some minutes ago. > There is a problem with a plugin (sshslave afaics) that is not able > to launch the agent on the various workers. > > It should be a combination of usual luck while upgrading plus some new > setting to add, I hope to get everything back in running mode soon. > > Really sorry for the trouble! > > Luca
ci.bigtop.apache.org currently under upgrade
Hi everybody, I reached out to Olaf and Kengo to upgrade Jenkins and some plugins, and then executed the plan (https://issues.apache.org/jira/browse/BIGTOP-3611) some minutes ago. There is a problem with a plugin (sshslave afaics) that is not able to launch the agent on the various workers. It should be a combination of usual luck while upgrading plus some new setting to add, I hope to get everything back in running mode soon. Really sorry for the trouble! Luca
Re: Docker arm-5 node not reachable by Jenkins
https://ci.bigtop.apache.org/computer/docker%2Dslave%2Dppc%2D2/ seems down as well, not sure who to contact :) Marking both nodes as temporarily down in Jenkins.. Luca On Sat, Jan 15, 2022 at 11:37 AM Luca Toscano wrote: > > Hi folks, > > I see the same problem again in > https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ :( > > Luca > > On Wed, Jan 5, 2022 at 10:14 AM Luca Toscano wrote: > > > > Hi to both, > > > > Thanks a lot for the fix, I see the host up and running now :) > > > > Luca > > > > On Wed, Jan 5, 2022 at 2:44 AM Yuqi Gu wrote: > > > > > > Hi Luca, > > > > > > *docker-slave-arm-5* is back. > > > Please feel free to reach out to me if there is still any problem. > > > > > > Thanks. > > > > > > BRs, > > > Yuqi > > > > > > On Tue, Jan 4, 2022 at 09:43, Jun HE wrote: > > > > > > > Hi Luca, > > > > > > > > Sorry for the late response. Yuqi and I are responsible for these Arm CI > > > > nodes. I'll check this and update you later. > > > > > > > > Regards, > > > > > > > > Jun > > > > > > > > On Tue, Dec 28, 2021 at 16:48, Luca Toscano wrote: > > > > > > > > > Hi everybody, > > > > > > > > > > I've set https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ > > > > > temporarily offline in the Jenkins UI, it seems not reachable via ssh > > > > > from Jenkins' perspective. > > > > > > > > > > What is the procedure to follow for these nodes? > > > > > > > > > > Luca > > > > > > > > >
Re: Docker arm-5 node not reachable by Jenkins
Hi folks, I see the same problem again in https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ :( Luca On Wed, Jan 5, 2022 at 10:14 AM Luca Toscano wrote: > > Hi to both, > > Thanks a lot for the fix, I see the host up and running now :) > > Luca > > On Wed, Jan 5, 2022 at 2:44 AM Yuqi Gu wrote: > > > > Hi Luca, > > > > *docker-slave-arm-5* is back. > > Please feel free to reach out to me if there is still any problem. > > > > Thanks. > > > > BRs, > > Yuqi > > > > On Tue, Jan 4, 2022 at 09:43, Jun HE wrote: > > > > > Hi Luca, > > > > > > Sorry for the late response. Yuqi and I are responsible for these Arm CI > > > nodes. I'll check this and update you later. > > > > > > Regards, > > > > > > Jun > > > > > > On Tue, Dec 28, 2021 at 16:48, Luca Toscano wrote: > > > > > > > Hi everybody, > > > > > > > > I've set https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ > > > > temporarily offline in the Jenkins UI, it seems not reachable via ssh > > > > from Jenkins' perspective. > > > > > > > > What is the procedure to follow for these nodes? > > > > > > > > Luca > > > > > > >
Re: Broken package builds for Trunk
Hi everybody, I opened a jira for the issue: https://issues.apache.org/jira/browse/BIGTOP-3631 As described in the task, a quick option is to follow https://docs.aws.amazon.com/corretto/latest/corretto-8-ug/amazon-linux-install.html and install Corretto 8 (jdk8) on the new worker nodes where Corretto 17 is used (that is IIUC the root cause of the problem). I am going to try it during the next few hours if nobody opposes it. If you think that there is a different root cause please let me know :) Luca On Tue, Jan 11, 2022 at 9:07 AM Luca Toscano wrote: > > Hi everybody, > > I checked the trunk packages build matrix and I noticed that a lot of > packages are failing to build. An example is: > > https://ci.bigtop.apache.org/job/Bigtop-trunk-packages/762/COMPONENTS=kibana,OS=debian-10/console > > [...] > Using 4 worker leases. > Starting Build > java.lang.NoClassDefFoundError: Could not initialize class > org.codehaus.groovy.vmplugin.v7.Java7 > at > org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43) > at > org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35) > at org.codehaus.groovy.reflection.ClassInfo.(ClassInfo.java:109) > [..] > > From what I can see the ones failing run > /usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/java, maybe it is > related to the move to the new instances? Do we need to bump the > gradle's version to something like 7.3 [1]? > > Luca > > [1]: https://docs.gradle.org/current/userguide/compatibility.html
[jira] [Created] (BIGTOP-3631) Gradlew failing while starting on new docker workers
Luca Toscano created BIGTOP-3631: Summary: Gradlew failing while starting on new docker workers Key: BIGTOP-3631 URL: https://issues.apache.org/jira/browse/BIGTOP-3631 Project: Bigtop Issue Type: Bug Reporter: Luca Toscano Hi everybody, I am seeing failures in trunk package builds, all of them when starting gradlew: {code} + ./gradlew realclean -Pnexus=true -POS=fedora-33 -Pprefix=trunk zookeeper-pkg-ind --info Initialized native services in: /home/jenkins/.gradle/native Removing 0 daemon stop events from registry Previous Daemon (5844) stopped at Sat Jan 08 05:37:29 UTC 2022 by user or operating system Starting a Gradle Daemon, 1 stopped Daemon could not be reused, use --status for details Starting process 'Gradle build daemon'. Working directory: /home/jenkins/.gradle/daemon/5.6.4 Command: /usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/java --add-opens java.base/java.util=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.lang.invoke=ALL-UNNAMED --add-opens java.prefs/java.util.prefs=ALL-UNNAMED -XX:MaxMetaspaceSize=256m -XX:+HeapDumpOnOutOfMemoryError -Xms256m -Xmx512m -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -cp /home/jenkins/.gradle/wrapper/dists/gradle-5.6.4-bin/c9880aa85176bf8c458862eb99f7e0a9/gradle-5.6.4/lib/gradle-launcher-5.6.4.jar org.gradle.launcher.daemon.bootstrap.GradleDaemon 5.6.4 Successfully started process 'Gradle build daemon' An attempt to start the daemon took 1.686 secs. The client will now receive all logging from the daemon (pid: 9326). The daemon log file: /home/jenkins/.gradle/daemon/5.6.4/daemon-9326.out.log Starting build in new daemon [memory: 536.9 MB] Closing daemon's stdin at end of input. The daemon will no longer process any standard input. Using 4 worker leases. 
Starting Build java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7 at org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43) at org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35) at org.codehaus.groovy.reflection.ClassInfo.(ClassInfo.java:109) at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95) {code} https://ci.bigtop.apache.org/job/Bigtop-trunk-packages/COMPONENTS=zookeeper,OS=fedora-33/lastBuild/console In the new docker nodes there seems to be a more up-to-date jdk version, java-17-amazon-corretto, that doesn't seem to be compatible with the gradle version set in the bigtop repository (see https://docs.gradle.org/current/userguide/compatibility.html). We could follow this guide to install java8: https://docs.aws.amazon.com/corretto/latest/corretto-8-ug/amazon-linux-install.html The alternative is to upgrade gradle but it seems more invasive. -- This message was sent by Atlassian Jira (v8.20.1#820001)
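The mismatch described in this report can be checked mechanically before a build starts. What follows is only a minimal Python sketch, not something that exists in the Bigtop repository: the helper names are hypothetical, and the minimum-Gradle table is an assumption transcribed from the Gradle compatibility page linked above (only the 7.3 figure for Java 17 appears in this thread), so please verify it before relying on it.

```python
import re

# Assumed minimum Gradle version able to run on each Java major version.
# Transcribed from the Gradle compatibility matrix; verify before use.
MIN_GRADLE_FOR_JAVA = {8: (2, 0), 11: (5, 0), 17: (7, 3)}


def parse_version(text):
    """Turn '5.6.4' into (5, 6, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def java_major_from_path(java_path):
    """Extract the Java major version from a Corretto-style path (hypothetical helper)."""
    match = re.search(r"java-(\d+)(?:\.(\d+))?", java_path)
    if match is None:
        raise ValueError(f"no Java version found in {java_path!r}")
    major, minor = match.group(1), match.group(2)
    # Legacy '1.8' naming means Java 8.
    return int(minor) if major == "1" and minor else int(major)


def gradle_supports_java(gradle_version, java_major):
    """True if the given Gradle version is new enough for that Java version."""
    return parse_version(gradle_version) >= MIN_GRADLE_FOR_JAVA[java_major]
```

With the values from the log above (Gradle 5.6.4 started by /usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/java) the check fails, while the same Gradle version on Java 8 passes, which is consistent with the proposal to install Corretto 8 on the new workers.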
[jira] [Created] (BIGTOP-3630) Some repositories yield to HTTP 403
Luca Toscano created BIGTOP-3630: Summary: Some repositories yield to HTTP 403 Key: BIGTOP-3630 URL: https://issues.apache.org/jira/browse/BIGTOP-3630 Project: Bigtop Issue Type: Bug Reporter: Luca Toscano Hi everybody, at Wikimedia we use reprepro to manage our internal Debian APT repository, in which we sync/copy packages from Bigtop's. Today I noticed a weird issue, namely: {code} aptmethod error receiving 'http://repos.bigtop.apache.org/releases/1.5.0/debian/10/amd64/dists/bigtop/InRelease': '403 Forbidden [IP: 52.216.83.32 80]' {code} If I swap the config from repos.bigtop.apache.org to repo.bigtop.apache.org.s3.amazonaws.com, everything works fine. The value repo.bigtop.apache.org.s3.amazonaws.com comes from the S3 config, and it is the first CNAME of repos.bigtop.apache.org: {code} $ dig repos.bigtop.apache.org +short repo.bigtop.apache.org.s3.amazonaws.com. s3-1-w.amazonaws.com. s3-w.us-east-1.amazonaws.com. 52.216.132.251 {code} Could it be that our new S3 config needs to be called repos.bigtop.apache.org.s3.amazonaws.com (see the extra s - repoS) to make everything work? -- This message was sent by Atlassian Jira (v8.20.1#820001)
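The hypothesis in this report can be written down as a tiny check. This is a hedged Python sketch with hypothetical function names; it assumes S3 virtual-hosted-style addressing, where the bucket is selected from the requested host name, so a custom host name only resolves to a bucket that carries exactly the same name. It does no DNS lookup itself and only compares the first CNAME target as printed by dig.

```python
def expected_s3_target(hostname):
    """Endpoint a CNAME for `hostname` would point to if the bucket is named like the host."""
    return f"{hostname}.s3.amazonaws.com"


def cname_matches_hostname(hostname, cname_target):
    """Compare the first CNAME target (as printed by dig) with the expected endpoint."""
    # dig prints fully qualified names with a trailing dot, so strip it first.
    return cname_target.rstrip(".").lower() == expected_s3_target(hostname).lower()
```

Fed with the dig output above, the check reports a mismatch for repos.bigtop.apache.org (the CNAME target starts with "repo." rather than "repos."), which is the discrepancy the report asks about.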
Broken package builds for Trunk
Hi everybody, I checked the trunk packages build matrix and I noticed that a lot of packages are failing to build. An example is: https://ci.bigtop.apache.org/job/Bigtop-trunk-packages/762/COMPONENTS=kibana,OS=debian-10/console [...] Using 4 worker leases. Starting Build java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7 at org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43) at org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35) at org.codehaus.groovy.reflection.ClassInfo.(ClassInfo.java:109) [..] From what I can see the ones failing run /usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/java, maybe it is related to the move to the new instances? Do we need to bump the gradle's version to something like 7.3 [1]? Luca [1]: https://docs.gradle.org/current/userguide/compatibility.html
[jira] [Created] (BIGTOP-3629) Drop Debian 9 support - second attempt
Luca Toscano created BIGTOP-3629: Summary: Drop Debian 9 support - second attempt Key: BIGTOP-3629 URL: https://issues.apache.org/jira/browse/BIGTOP-3629 Project: Bigtop Issue Type: Improvement Reporter: Luca Toscano Fix For: 3.1.0 In BIGTOP-3530 it was decided to postpone the deprecation of Debian 9 until the 3.1 release, when Debian 11 support would be added. We now support Debian 11, so it is time to deprecate 9 :) -- This message was sent by Atlassian Jira (v8.20.1#820001)
Changes to the Docker provisioner
Hi everybody, I filed a code review for the Docker provisioner in https://github.com/apache/bigtop/pull/851. Masatake has already reviewed it, but since a lot of people use it I wanted to bring it to your attention. The main issue that I found is that on recent OSes like Debian 11, mounting /sys/fs/cgroup into the containers causes systemd and dbus to fail when starting. The only explanation that I can give is that cgroups v2 is enabled by default, and recent enough versions of Docker (20.10+) support it natively. If you have a better and more precise explanation please let me know, otherwise I'd like to merge the pull request during the next few days. The idea is to have, as an experimental feature, a separate docker-compose.yml config file that doesn't contain the /sys/fs/cgroup mountpoint, and experiment with it. Let me know what you think :) Luca
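To make the idea a bit more concrete, here is a minimal Python sketch of how a host could be probed before choosing a compose file. It is not taken from the pull request: it assumes that a cgroup v2 (unified) hierarchy exposes a cgroup.controllers file at its root, and both compose file names are hypothetical placeholders.

```python
import os


def is_cgroup_v2(cgroup_root="/sys/fs/cgroup"):
    """Return True if `cgroup_root` looks like a cgroup v2 unified hierarchy."""
    # Assumption: only the v2 unified hierarchy has this file at its root.
    return os.path.isfile(os.path.join(cgroup_root, "cgroup.controllers"))


def pick_compose_file(cgroup_root="/sys/fs/cgroup"):
    """Choose a compose file; both names are hypothetical, for illustration only."""
    if is_cgroup_v2(cgroup_root):
        # Variant without the /sys/fs/cgroup mount (the experimental config).
        return "docker-compose-cgroupv2.yml"
    return "docker-compose.yml"
```

A probe like this would keep the default config untouched on older hosts while letting Debian 11 style hosts opt into the experimental file automatically.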
Re: Docker arm-5 node not reachable by Jenkins
Hi to both, Thanks a lot for the fix, I see the host up and running now :) Luca On Wed, Jan 5, 2022 at 2:44 AM Yuqi Gu wrote: > > Hi Luca, > > *docker-slave-arm-5* comes back. > Please feel free to reach out to me if there is still any problem. > > Thanks. > > BRs, > Yuqi > > On Tue, Jan 4, 2022 at 9:43 AM Jun HE wrote: > > > Hi Luca, > > > > Sorry for the late response. Yuqi and I are responsible for these Arm CI > > nodes. I'll check with this and update you later. > > > > Regards, > > > > Jun > > > > On Tue, Dec 28, 2021 at 4:48 PM Luca Toscano wrote: > > > > > Hi everybody, > > > > > > I've set https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ > > > temporarily offline in the Jenkins UI, it seems not reachable via ssh > > > from Jenkins' perspective. > > > > > > What is the procedure to follow for these nodes? > > > > > > Luca > > > > >
[jira] [Created] (BIGTOP-3626) Patch log4j version of ycsb
Luca Toscano created BIGTOP-3626: Summary: Patch log4j version of ycsb Key: BIGTOP-3626 URL: https://issues.apache.org/jira/browse/BIGTOP-3626 Project: Bigtop Issue Type: Improvement Affects Versions: 1.5.0, 3.1.0 Reporter: Luca Toscano There seem to be no commits in the upstream repo over the past months, but a pull request seems to take care of the issue: https://github.com/brianfrankcooper/YCSB/pull/1583 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BIGTOP-3625) Fix Livy's build failure
Luca Toscano created BIGTOP-3625: Summary: Fix Livy's build failure Key: BIGTOP-3625 URL: https://issues.apache.org/jira/browse/BIGTOP-3625 Project: Bigtop Issue Type: Bug Affects Versions: 3.1.0 Reporter: Luca Toscano There seems to be an issue when building Livy: {code} [ERROR] The build could not read 1 project -> [Help 1] [ERROR] [ERROR] The project org.apache.livy:livy-scala-api_2.12:0.7.1-incubating (/bigtop/build/livy/rpm/BUILD/livy-0.7.1/scala-api/scala-2.12/pom.xml) has 1 error [ERROR] 'dependencies.dependency.version' for io.netty:netty-all:jar must be a valid version but is '${netty.spark-2.12.version}'. @ org.apache.livy:livy-main:0.7.1-incubating, /bigtop/build/livy/rpm/BUILD/livy-0.7.1/pom.xml, line 329, column 18 {code} More info: https://ci.bigtop.apache.org/job/Bigtop-trunk-packages/759/COMPONENTS=livy,OS=centos-8-aarch64/console In BIGTOP-3489 a patch was added based on https://github.com/apache/incubator-livy/pull/289, that IIUC was not merged. Eventually https://github.com/apache/incubator-livy/commit/97cf2f75929ef6c152afc468adbead269bd0758f was merged, should we swap patches? -- This message was sent by Atlassian Jira (v8.20.1#820001)
Docker arm-5 node not reachable by Jenkins
Hi everybody, I've set https://ci.bigtop.apache.org/computer/docker-slave-arm-5/ temporarily offline in the Jenkins UI, it seems not reachable via ssh from Jenkins' perspective. What is the procedure to follow for these nodes? Luca
[jira] [Created] (BIGTOP-3624) Bump Alluxio's log4j dependencies to 2.17.0
Luca Toscano created BIGTOP-3624: Summary: Bump Alluxio's log4j dependencies to 2.17.0 Key: BIGTOP-3624 URL: https://issues.apache.org/jira/browse/BIGTOP-3624 Project: Bigtop Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Luca Toscano Fix For: 3.1.0 https://github.com/Alluxio/alluxio/pull/14665 -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: Some ideas about Jenkins
Hi everybody, cleanups done, the master is now using ~240G of disk space (24% of the root partition). Luca On Sun, Dec 19, 2021 at 9:16 AM Luca Toscano wrote: > > Hi Kengo, > > Thanks for the detailed explanation! I'll try to do some cleanup > today/tomorrow, I am 100% in favor of having more EBS space if needed > (if backed up regularly), more free space to dedicate to other > projects if needed :) > > Luca > > On Sun, Dec 19, 2021 at 1:46 AM Kengo Seki wrote: > > > > Thank you for cleaning up the Jenkins master! As you said, it often > > runs short of disk capacity. > > Your suggestion (keeping only recent build results of actively > > developing/maintaining branches and discarding others) > > sounds reasonable to me, because all of our release artifacts are on > > ASF's distribution server and S3, > > and we can also rerun the Jenkins job, if needed. > > > > On the other hand, after migrating to the new account, the situation > > will get better. > > We've used m3.xlarge (4vCPUs, 15GiB mem) for the Jenkins master, which > > costs $0.308/h. > > If we replace it with m6a.xlarge (4vCPUs, 16GiB mem) in the new > > account, it costs only $0.1728/h, so we can save $97.344/mo. > > The price of gp3 EBS volume is $0.08/mo for 1GB, so I'm estimating we > > can add 1.2TB extra capacity to the master node. > > > > Kengo Seki > > > > On Sat, Dec 18, 2021 at 6:45 PM Luca Toscano wrote: > > > > > > Hi Kengo! > > > > > > Thanks a lot for all the info, I don't have time this weekend too, so > > > I'll let you do the work without messing up the current environment. I > > > am more than happy to help if you need anything next week, feel free > > > to drop me an email in case! > > > > > > One thing that I'd like to sort out with you and others, before > > > proceeding, is the retention of the Jenkins build logs/files/etc.. We > > > have a huge partition on the master instance at the moment, that I > > > believe is/was filled up by old files that we don't really use. 
For > > > example, can we clean up old build logs after we cut a release? > > > Ideally, in my opinion, if those are not needed afterwards we could: > > > - Keep 1 or 2 recent builds for Trunk (for all the various jobs). The > > > job that builds packages in trunk, for example, uses a ton of GBs for > > > every round of builds (so every week). > > > - Clean up all the rest (even manually, build logs for 1.5.0, 3.0.0, > > > 1.3.0, etc..) > > > > > > I already cleaned up a bit the other week since the master instance's > > > partition was filled up, I'd be happy to finish the work this weekend > > > if you agree :) > > > > > > Lemme know! > > > > > > Thanks, > > > > > > Luca > > > > > > On Fri, Dec 17, 2021 at 2:13 PM Kengo Seki wrote: > > > > > > > > Hi Luca, thank you for working on them! (and thank you for helping him, > > > > Olaf!) > > > > Let me share our current situation about the CI environment. > > > > > > > > Our CI infrastructure is provided by courtesy of AWS, and for > > > > addressing their security request, > > > > we're going to integrate our environment (mainly EC2 instances and > > > > files on S3) > > > > to another AWS account in this month (I'm sorry for being late to > > > > share this information). > > > > So BIGTOP-3612 is very helpful for us, because we can take over the > > > > contents under /home/jenkins > > > > by sharing the EBS snapshot between the old and new accounts. > > > > > > > > On the other hand, BIGTOP-3611 is not necessarily required, because > > > > new EC2 instances > > > > for Jenkins and workers will be launched in the new account within one > > > > or two weeks. > > > > But if you could upgrade Jenkins, it's still be helpful, because we > > > > can check if all of the Jenkins plugins > > > > we're currently using are compatible with the latest version in advance. 
> > > > > > > > I'm a bit busy until this weekend, so I'm planning to start the > > > > migration work > > > > (copying files between S3 buckets, launching Jenkins and worker nodes, > > > > etc.) next week. > > > > So, if you are going to work on the issues above in this weekend, > > > > would y
[jira] [Created] (BIGTOP-3622) Remove usage of $(PWD) in Debian rules files
Luca Toscano created BIGTOP-3622: Summary: Remove usage of $(PWD) in Debian rules files Key: BIGTOP-3622 URL: https://issues.apache.org/jira/browse/BIGTOP-3622 Project: Bigtop Issue Type: Bug Affects Versions: 3.0.0, 1.5.0 Reporter: Luca Toscano Fix For: 1.5.1, 3.1.0 In Debian rules the usage of $(PWD) is discouraged, $(CURDIR) is preferred to avoid build failures. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BIGTOP-3621) Review Oozie 4.x and 5.x configs for CVE-2021-44228
Luca Toscano created BIGTOP-3621: Summary: Review Oozie 4.x and 5.x configs for CVE-2021-44228 Key: BIGTOP-3621 URL: https://issues.apache.org/jira/browse/BIGTOP-3621 Project: Bigtop Issue Type: Bug Affects Versions: 3.0.0, 1.5.0 Reporter: Luca Toscano Fix For: 1.5.1, 3.0.1 In Bigtop 1.5 Oozie seems to include log4j 2.6.x jars: {code} $ dpkg -L oozie | egrep *log4j.*2.6.* /usr/lib/oozie/lib/log4j-api-2.6.2.jar /usr/lib/oozie/lib/log4j-core-2.6.2.jar /usr/lib/oozie/lib/log4j-slf4j-impl-2.6.2.jar /usr/lib/oozie/lib/log4j-web-2.6.2.jar {code} On vanilla Oozie branch-4.3's dependency:tree I can find a reference to the lib, but from the Bigtop's build log it seems pulled in by the hcatalog pom.xml. I quickly tried to exclude the log4j dependency and it worked (no extra log4j jars in the .deb), but it is probably not the right fix since the hive dependencies may need a more up-to-date log4j version. We should also review Oozie's 5.x version for Bigtop 3.x -- This message was sent by Atlassian Jira (v8.20.1#820001)
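A file listing like the dpkg -L output above can be screened automatically. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, only log4j-core jars are flagged (to my understanding that is the artifact carrying the vulnerable lookup code), and the 2.17.0 threshold is simply borrowed from BIGTOP-3624 rather than being an authoritative statement about which versions are safe.

```python
import re

# Assumption: anything older than 2.17.0 (the target used in BIGTOP-3624) needs review.
MINIMUM_VERSION = (2, 17, 0)
JAR_PATTERN = re.compile(r"log4j-core-(\d+)\.(\d+)\.(\d+)\.jar$")


def outdated_log4j_core(paths, minimum=MINIMUM_VERSION):
    """Return (path, version) pairs for log4j-core jars older than `minimum`."""
    findings = []
    for path in paths:
        match = JAR_PATTERN.search(path)
        if match is None:
            continue  # not a log4j-core jar (e.g. log4j-api), skip it
        version = tuple(int(group) for group in match.groups())
        if version < minimum:
            findings.append((path, version))
    return findings
```

Run against the four Oozie jars listed above it would single out /usr/lib/oozie/lib/log4j-core-2.6.2.jar; the same helper could be pointed at the file lists of other packages while reviewing 3.x.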
Re: Some ideas about Jenkins
Hi Kengo, Thanks for the detailed explanation! I'll try to do some cleanup today/tomorrow, I am 100% in favor of having more EBS space if needed (if backed up regularly), more free space to dedicate to other projects if needed :) Luca On Sun, Dec 19, 2021 at 1:46 AM Kengo Seki wrote: > > Thank you for cleaning up the Jenkins master! As you said, it often > runs short of disk capacity. > Your suggestion (keeping only recent build results of actively > developing/maintaining branches and discarding others) > sounds reasonable to me, because all of our release artifacts are on > ASF's distribution server and S3, > and we can also rerun the Jenkins job, if needed. > > On the other hand, after migrating to the new account, the situation > will get better. > We've used m3.xlarge (4vCPUs, 15GiB mem) for the Jenkins master, which > costs $0.308/h. > If we replace it with m6a.xlarge (4vCPUs, 16GiB mem) in the new > account, it costs only $0.1728/h, so we can save $97.344/mo. > The price of gp3 EBS volume is $0.08/mo for 1GB, so I'm estimating we > can add 1.2TB extra capacity to the master node. > > Kengo Seki > > On Sat, Dec 18, 2021 at 6:45 PM Luca Toscano wrote: > > > > Hi Kengo! > > > > Thanks a lot for all the info, I don't have time this weekend too, so > > I'll let you do the work without messing up the current environment. I > > am more than happy to help if you need anything next week, feel free > > to drop me an email in case! > > > > One thing that I'd like to sort out with you and others, before > > proceeding, is the retention of the Jenkins build logs/files/etc.. We > > have a huge partition on the master instance at the moment, that I > > believe is/was filled up by old files that we don't really use. For > > example, can we clean up old build logs after we cut a release? > > Ideally, in my opinion, if those are not needed afterwards we could: > > - Keep 1 or 2 recent builds for Trunk (for all the various jobs). 
The > > job that builds packages in trunk, for example, uses a ton of GBs for > > every round of builds (so every week). > > - Clean up all the rest (even manually, build logs for 1.5.0, 3.0.0, > > 1.3.0, etc..) > > > > I already cleaned up a bit the other week since the master instance's > > partition was filled up, I'd be happy to finish the work this weekend > > if you agree :) > > > > Lemme know! > > > > Thanks, > > > > Luca > > > > On Fri, Dec 17, 2021 at 2:13 PM Kengo Seki wrote: > > > > > > Hi Luca, thank you for working on them! (and thank you for helping him, > > > Olaf!) > > > Let me share our current situation about the CI environment. > > > > > > Our CI infrastructure is provided by courtesy of AWS, and for > > > addressing their security request, > > > we're going to integrate our environment (mainly EC2 instances and files > > > on S3) > > > to another AWS account in this month (I'm sorry for being late to > > > share this information). > > > So BIGTOP-3612 is very helpful for us, because we can take over the > > > contents under /home/jenkins > > > by sharing the EBS snapshot between the old and new accounts. > > > > > > On the other hand, BIGTOP-3611 is not necessarily required, because > > > new EC2 instances > > > for Jenkins and workers will be launched in the new account within one > > > or two weeks. > > > But if you could upgrade Jenkins, it's still be helpful, because we > > > can check if all of the Jenkins plugins > > > we're currently using are compatible with the latest version in advance. > > > > > > I'm a bit busy until this weekend, so I'm planning to start the migration > > > work > > > (copying files between S3 buckets, launching Jenkins and worker nodes, > > > etc.) next week. > > > So, if you are going to work on the issues above in this weekend, > > > would you share the result after that? 
> > > > > > Kengo Seki > > > > > > On Thu, Dec 16, 2021 at 1:30 AM Olaf Flebbe wrote: > > > > > > > > hi luca > > > > > > > > we can do the update together in the next days (evening local time in > > > > eu). > > > > > > > > best > > > > olaf > > > > > > > > > Am 15.12.2021 um 09:37 schrieb Luca Toscano : > > > > > > > > > > Hi everybody, > > > > > > > > > > A
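For reference, the cost estimate quoted in this thread can be reproduced with a few lines of Python. This is only a worked check of the arithmetic: the prices are the ones quoted above, while the 30-day month is my assumption (it is the value that reproduces the quoted $97.344/mo figure), and the function names are made up for the example.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, matches the quoted saving


def monthly_saving(old_hourly, new_hourly, hours=HOURS_PER_MONTH):
    """Monthly saving in dollars when moving from one hourly price to another."""
    return (old_hourly - new_hourly) * hours


def extra_gb(saving_per_month, price_per_gb_month):
    """How many GB of EBS the saving would pay for at the given per-GB monthly price."""
    return saving_per_month / price_per_gb_month


saving = monthly_saving(0.308, 0.1728)  # m3.xlarge vs m6a.xlarge, prices from the thread
capacity = extra_gb(saving, 0.08)       # gp3 price from the thread
print(f"saving: ${saving:.3f}/mo, extra gp3 capacity: {capacity:.1f} GB")
```

The result is roughly 1217 GB, i.e. the "1.2TB extra capacity" mentioned in the quoted estimate.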
Re: ci.bigtop.apache.org's TLS cert expired
Thanks Kengo! Certs renewed :) One nit - the name of the docker container to stop is "jenkins-master-8080", not "jenkins-master" (maybe it doesn't even need to be stopped? Is stopping httpd sufficient?) Luca On Sat, Dec 18, 2021 at 12:54 PM Kengo Seki wrote: > > That document is up-to-date and should work ;) > (Only docker container name may be different, IIRC) > > Kengo Seki > > On Sat, Dec 18, 2021 at 7:12 PM Luca Toscano wrote: > > > > Hi everybody, > > > > the TLS cert for ci.bigtop.apache.org is expired, I found > > https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide#BigtopCISetupGuide-Renewingthecert > > but before executing it I'd like to get some confirmation that the > > guide is up-to-date :) > > > > Luca
ci.bigtop.apache.org's TLS cert expired
Hi everybody, the TLS cert for ci.bigtop.apache.org is expired, I found https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide#BigtopCISetupGuide-Renewingthecert but before executing it I'd like to get some confirmation that the guide is up-to-date :) Luca
Re: Some ideas about Jenkins
Hi Kengo! Thanks a lot for all the info, I don't have time this weekend too, so I'll let you do the work without messing up the current environment. I am more than happy to help if you need anything next week, feel free to drop me an email in case! One thing that I'd like to sort out with you and others, before proceeding, is the retention of the Jenkins build logs/files/etc.. We have a huge partition on the master instance at the moment, that I believe is/was filled up by old files that we don't really use. For example, can we clean up old build logs after we cut a release? Ideally, in my opinion, if those are not needed afterwards we could: - Keep 1 or 2 recent builds for Trunk (for all the various jobs). The job that builds packages in trunk, for example, uses a ton of GBs for every round of builds (so every week). - Clean up all the rest (even manually, build logs for 1.5.0, 3.0.0, 1.3.0, etc..) I already cleaned up a bit the other week since the master instance's partition was filled up, I'd be happy to finish the work this weekend if you agree :) Lemme know! Thanks, Luca On Fri, Dec 17, 2021 at 2:13 PM Kengo Seki wrote: > > Hi Luca, thank you for working on them! (and thank you for helping him, Olaf!) > Let me share our current situation about the CI environment. > > Our CI infrastructure is provided by courtesy of AWS, and for > addressing their security request, > we're going to integrate our environment (mainly EC2 instances and files on > S3) > to another AWS account in this month (I'm sorry for being late to > share this information). > So BIGTOP-3612 is very helpful for us, because we can take over the > contents under /home/jenkins > by sharing the EBS snapshot between the old and new accounts. > > On the other hand, BIGTOP-3611 is not necessarily required, because > new EC2 instances > for Jenkins and workers will be launched in the new account within one > or two weeks. 
> But if you could upgrade Jenkins, it's still be helpful, because we > can check if all of the Jenkins plugins > we're currently using are compatible with the latest version in advance. > > I'm a bit busy until this weekend, so I'm planning to start the migration work > (copying files between S3 buckets, launching Jenkins and worker nodes, > etc.) next week. > So, if you are going to work on the issues above in this weekend, > would you share the result after that? > > Kengo Seki > > On Thu, Dec 16, 2021 at 1:30 AM Olaf Flebbe wrote: > > > > hi luca > > > > we can do the update together in the next days (evening local time in eu). > > > > best > > olaf > > > > > Am 15.12.2021 um 09:37 schrieb Luca Toscano : > > > > > > Hi everybody, > > > > > > Any feedback? I'd like to upgrade, if everybody agrees, Jenkins during > > > the next days. If anybody can review the procedure and give me a +1/-1 > > > I'd be grateful :) > > > Moreover, if anybody wants to be online with me when I do the upgrade > > > it would be really great, so if anything goes wrong there will be more > > > people watching. The upgrade itself shouldn't last long (10/15 mins if > > > everything goes fine). > > > > > > Thanks in advance! > > > > > > Luca > > > > > >> On Sat, Dec 11, 2021 at 10:01 AM Luca Toscano > > >> wrote: > > >> > > >> Hi everybody, > > >> > > >> I opened a couple of Jiras for Jenkins: > > >> - https://issues.apache.org/jira/browse/BIGTOP-3611 - Upgrade Jenkins > > >> to the latest upstream > > >> - https://issues.apache.org/jira/browse/BIGTOP-3612 - Add a backup for > > >> Jenkins' /home/jenkins dir > > >> > > >> I didn't find anything open for these topics, apologies in advance in > > >> case there is known work in progress. > > >> > > >> Let me know your thoughts :) > > >> > > >> Luca > >
Re: Some ideas about Jenkins
Hi everybody, Any feedback? I'd like to upgrade, if everybody agrees, Jenkins during the next days. If anybody can review the procedure and give me a +1/-1 I'd be grateful :) Moreover, if anybody wants to be online with me when I do the upgrade it would be really great, so if anything goes wrong there will be more people watching. The upgrade itself shouldn't last long (10/15 mins if everything goes fine). Thanks in advance! Luca On Sat, Dec 11, 2021 at 10:01 AM Luca Toscano wrote: > > Hi everybody, > > I opened a couple of Jiras for Jenkins: > - https://issues.apache.org/jira/browse/BIGTOP-3611 - Upgrade Jenkins > to the latest upstream > - https://issues.apache.org/jira/browse/BIGTOP-3612 - Add a backup for > Jenkins' /home/jenkins dir > > I didn't find anything open for these topics, apologies in advance in > case there is known work in progress. > > Let me know your thoughts :) > > Luca
[jira] [Created] (BIGTOP-3614) Docker provisioner fails to start a cluster on Debian 10/11 images
Luca Toscano created BIGTOP-3614: Summary: Docker provisioner fails to start a cluster on Debian 10/11 images Key: BIGTOP-3614 URL: https://issues.apache.org/jira/browse/BIGTOP-3614 Project: Bigtop Issue Type: Bug Reporter: Luca Toscano Fix For: 3.1.0 I am trying to run smoke tests for the trunk packages, but I keep seeing a failure: {code} ./docker-hadoop.sh --create 1 --image bigtop/puppet:trunk-debian-11 --memory 8g --repo file:///bigtop-home/output/apt --disable-gpg-check --stack hdfs,yarn,mapreduce,hbase --smoke-tests hbase {code} {code} Failed to connect to bus: No such file or directory {code} I used nsenter -m to connect to the container, created the /run/dbus directory, and ran ./docker-hadoop.sh --provision, and the issue disappeared (I am seeing others right now but I will probably open other tasks). IIUC Docker compose should take care of the /run/dbus directory (since it is used to store the dbus system socket), but I can't find a way to do it properly (in theory it should be mounted as tmpfs). -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BIGTOP-3613) Review log4j configurations for CVE-2021-44228
Luca Toscano created BIGTOP-3613: Summary: Review log4j configurations for CVE-2021-44228 Key: BIGTOP-3613 URL: https://issues.apache.org/jira/browse/BIGTOP-3613 Project: Bigtop Issue Type: Sub-task Affects Versions: 3.1.0 Reporter: Luca Toscano Due to CVE-2021-44228, it would be great to avoid shipping 3.1 with the affected log4j versions, or alternatively to apply the workarounds to patch the issue (like -Dlog4j2.formatMsgNoLookups=true etc..) More info: https://github.com/advisories/GHSA-jfh8-c2jp-5v3q -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: Power (ppc64le) CI/CD upgrade
Hi Olaf, makes sense! I am unable to ssh to the ppc node though (I have some trouble decrypting the jenkins ssh key), if you have time later on could you please try to ssh and docker login? Thanks! Luca On Sat, Dec 11, 2021 at 9:43 AM Olaf Flebbe wrote: > > hi > > after setup a new node you need to do a „docker login“ on the jenkins slave > account once. IIUC this node has been setup only a few days before > > olaf > > > Am 11.12.2021 um 09:17 schrieb Luca Toscano : > > > > Hi! > > > > I am very new to the project so I don't have a lot of context on the > > ppc64 node, but today I tried to rebuild the docker images and all the > > jobs for ppc64 failed: > > https://ci.bigtop.apache.org/job/Docker-Puppet-Trunk/34/ > > > > It is weird since, from the logs, it seems that the node works but it > > fails at the end when trying to push to dockerhub: > > > > https://ci.bigtop.apache.org/job/Docker-Puppet-Trunk/DISTRO=debian-10,PLATFORM=ppc64le-slave/lastBuild/console > > > > Is it possible that some credentials are not available anymore? Does > > anybody else have context on this failure? > > > > Thanks! > > > > Luca > > > >> On Thu, Dec 2, 2021 at 9:15 PM MrAsanjar wrote: > >> > >> done, please verify > >> > >>> On Thu, Dec 2, 2021 at 12:33 PM MrAsanjar wrote: > >>> > >>> Hi > >>> I have good news and bad news :) The good news is IBM has upgraded the > >>> Power CI/CD server to a faster processor with 16 cores and NVMe drives. > >>> The bad news is during the process, somehow, cloning had failed :( I'll > >>> try to build a new Jenkins slave with the same IP. > >>> > >>> > >>>> On Fri, Nov 19, 2021 at 7:13 AM MrAsanjar wrote: > >>> > >>>> Hi team, > >>>> IBM is planning to upgrade and relocate the apache bigtop server next > >>>> week, again. > >>>> There will be a short downtime. Hopefully, the IP addresses will not > >>>> change. I'll keep you posted > >>>> > >>>> On Sun, Oct 24, 2021 at 11:29 AM Evans Ye wrote: > >>>> > >>>>> Thanks for the notice, Amir. 
> >>>>> > >>>>> On Fri, Oct 22, 2021 at 11:57 PM MrAsanjar wrote: > >>>>> > >>>>>> Hi team, > >>>>>> IBM plans to upgrade the current Jenkins slave to high-end Power9 with > >>>>> SSD > >>>>>> or nvme drives, perhaps next week. There will be a short interruption > >>>>> in > >>>>>> the availability. > >>>>>> > >>>>> > >>>> >
Some ideas about Jenkins
Hi everybody, I opened a couple of Jiras for Jenkins: - https://issues.apache.org/jira/browse/BIGTOP-3611 - Upgrade Jenkins to the latest upstream - https://issues.apache.org/jira/browse/BIGTOP-3612 - Add a backup for Jenkins' /home/jenkins dir I didn't find anything open for these topics, apologies in advance in case there is known work in progress. Let me know your thoughts :) Luca
[jira] [Created] (BIGTOP-3612) Move Jenkins master's /home/jenkins to a dedicated EBS volume
Luca Toscano created BIGTOP-3612: Summary: Move Jenkins master's /home/jenkins to a dedicated EBS volume Key: BIGTOP-3612 URL: https://issues.apache.org/jira/browse/BIGTOP-3612 Project: Bigtop Issue Type: Improvement Reporter: Luca Toscano Hi everybody, would it be feasible, in your opinion, to move /home/jenkins to a dedicated EBS volume on the Jenkins master? It would require some downtime and some extra resources dedicated to it, but we'd have a simple and easy way to keep a backup of the volume to restore in case of failures (not sure if we have backups now). I didn't find anything already opened for it, apologies for the extra Jira if there is one :) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BIGTOP-3611) Upgrade Jenkins to latest upstream
Luca Toscano created BIGTOP-3611:

Summary: Upgrade Jenkins to latest upstream
Key: BIGTOP-3611
URL: https://issues.apache.org/jira/browse/BIGTOP-3611
Project: Bigtop
Issue Type: Improvement
Reporter: Luca Toscano

Hi everybody,

I noticed that our Jenkins UI suggests upgrading. From https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide#BigtopCISetupGuide-SetupaJenkinsmaster it seems relatively easy:

docker inspect #jenkins-container > backup_container.log
docker image inspect jenkins/jenkins > backup_image.log
docker stop #jenkins-container
docker pull jenkins/jenkins:latest
docker run etc..

Should we do it?

--
This message was sent by Atlassian Jira (v8.20.1#820001)
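One detail in the steps listed in BIGTOP-3611 is worth spelling out: `#jenkins-container` is a placeholder, and in a shell script a word starting with `#` begins a comment, so it has to be replaced with the real container name or ID before running anything. A minimal sketch of the backup and pull steps follows. The function name `backup_and_pull` and its arguments are hypothetical, and the final `docker run` is left out because the ticket does not give its arguments.

```shell
#!/bin/sh
# Sketch only. backup_and_pull is a hypothetical helper; the container name is
# passed in by the caller and replaces the "#jenkins-container" placeholder.
# The final "docker run" step is omitted because the ticket elides its arguments.
set -eu

backup_and_pull() {
    container="$1"      # name or ID of the running Jenkins container
    backup_dir="$2"     # directory that receives the inspect output

    # Record the current container and image settings before changing anything.
    docker inspect "$container" > "$backup_dir/backup_container.log"
    docker image inspect jenkins/jenkins > "$backup_dir/backup_image.log"

    # Stop the old container and fetch the newer image.
    docker stop "$container"
    docker pull jenkins/jenkins:latest
}
```

Calling it as `backup_and_pull jenkins /root/jenkins-backup` would be one possible use, where both the container name and the directory are assumptions to adapt to the actual host.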
Re: Set up CI for Debian 11 Bullseye
Update: I have deleted old build files from the Jenkins UI, mostly related to old Bigtop releases (1.4/1.5), keeping only the last 3. The Jenkins master has now ~470GB of free disk space, that is more than enough to keep building the current projects without incurring in disk space issues. Question: should we delete all old builds, keeping only data for the projects that are actually being used (like Bigtop 3.x, trunk, etc..) ? Is there any value in keeping logs/etc.. about what has been done in the past? What I mean is if we have use cases that need us to go back in time to inspect what was done :) Thanks! Luca On Mon, Dec 6, 2021 at 9:57 PM Luca Toscano wrote: > > Update: I have deleted (via the UI) the failed builds that I tried in > these days for Trunk packages, and I freed ~100G. Is there any > specific reason to keep more than 2/3 builds for each project by > default? > > The idea would be to do something like: > https://plugins.jenkins.io/build-discarder/#plugin-content-getting-started > > My understanding is that old builds are kept for no real reason, all > packages are stored in the Apache archive and we don't really need to > keep build logs/config around. If nobody opposes it, I'd add a basic > global config to keep 3 builds for each project and discard the rest. > > Let me know what you think! > > Luca > > > Luca > > On Sun, Dec 5, 2021 at 10:02 AM Luca Toscano wrote: > > > > Thanks a lot Olaf, totally missed the credentials for AWS, I was able > > to get the new hostnames and ssh in. The master's root partition seems > > to be full afaics, and under /home/jenkins/jobs there is a ton of old > > build data (even from Bigtop 1.1.0 etc..). I am not very expert in > > jenkins but I am wondering if a Global build discarder (found in > > https://ci.bigtop.apache.org/configure) could be something worth > > testing. Was it done in the past? Is there any downside in using it? 
> > For example, we could keep the last 6 months (or any time window) > > worth of build data. Let me know what you think! > > > > Luca > > > > On Sat, Dec 4, 2021 at 9:08 PM Olaf Flebbe wrote: > > > > > > Hi Luca, > > > > > > the name changed. There is now some space available. > > > > > > Check with the EC2 console to see the current public IP adresses . > > > https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running > > > > > > Best > > > Olaf > > > > > > > Am 04.12.2021 um 18:23 schrieb Luca Toscano : > > > > > > > > Hi :) > > > > > > > > https://ci.bigtop.apache.org/log/all seems to indicate that we are > > > > again in trouble, docker-slave-02 seems full. I got the credentials to > > > > access the node but I can reach it (my ssh session hangs > > > > indefinitely), is there anybody that can take a look? > > > > > > > > Thanks, > > > > > > > > Luca > > > > > > > > On Sat, Nov 27, 2021 at 8:49 PM Olaf Flebbe wrote: > > > >> > > > >> Hi, > > > >> > > > >> I had a quick look around. Job History was excessive: availlable back > > > >> to 2017, I deleted all until 2019 and set the maximum life time to one > > > >> year. > > > >> I see that not any job from bigtopstore have been removed so far. Now > > > >> having 52995 jobs on disk. > > > >> I removed all bug the last ~3000 job dirs. > > > >> > > > >> Can someone please look at the config of „bigpetstore“, please? > > > >> > > > >> At least we have some air to breath now again on jenkins. 
> > > >> > > > >> Olaf > > > >> > > > >> > > > >>> Am 27.11.2021 um 20:04 schrieb Olaf Flebbe : > > > >>> > > > >>> Regarding setup of jenkins, you can find a lot of info here: > > > >>> > > > >>> https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide > > > >>> > > > >>> <https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide> > > > >>> > > > >>> Olaf > > > >>> > > > >>> > > > >>>> Am 27.11.2021 um 19:46 schrieb Olaf Flebbe > > >>>> <mailto:o...@oflebbe.de>>: > > > >>>> > > > >>>> I am working with luca to get him access .. > > > >
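The retention idea discussed in this thread (keep the last 3 builds per project and discard the rest) can be previewed before enabling a build discarder. The sketch below only lists what such a policy would drop and deletes nothing. It assumes the usual `<job>/builds/<number>` directory layout under the Jenkins home, and `builds_to_discard` is a hypothetical helper name.

```shell
#!/bin/sh
# Dry-run sketch: list the build directories that a "keep the newest N" policy
# would discard. Assumes numbered build directories; nothing is deleted.
set -eu

builds_to_discard() {
    builds_dir="$1"   # a job's builds/ directory (path supplied by the caller)
    keep="$2"         # number of most recent builds to keep

    # Keep only purely numeric names, sort newest (highest number) first,
    # then skip the first $keep entries and print the rest.
    ls "$builds_dir" | grep -E '^[0-9]+$' | sort -rn | tail -n +"$((keep + 1))"
}
```

Running it on a copy of one job directory first is a cheap way to check that the numbers look sensible; the actual cleanup is better left to Jenkins itself, as was done via the UI in the messages above.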
Re: Power (ppc64le) CI/CD upgrade
Hi! I am very new to the project so I don't have a lot of context on the ppc64 node, but today I tried to rebuild the docker images and all the jobs for ppc64 failed: https://ci.bigtop.apache.org/job/Docker-Puppet-Trunk/34/ It is weird since, from the logs, it seems that the node works but it fails at the end when trying to push to dockerhub: https://ci.bigtop.apache.org/job/Docker-Puppet-Trunk/DISTRO=debian-10,PLATFORM=ppc64le-slave/lastBuild/console Is it possible that some credentials are not available anymore? Does anybody else have context on this failure? Thanks! Luca On Thu, Dec 2, 2021 at 9:15 PM MrAsanjar wrote: > > done, please verify > > On Thu, Dec 2, 2021 at 12:33 PM MrAsanjar wrote: > > > Hi > > I have good news and bad news :) The good news is IBM has upgraded the > > Power CI/CD server to a faster processor with 16 cores and NVMe drives. > > The bad news is during the process, somehow, cloning had failed :( I'll > > try to build a new Jenkins slave with the same IP. > > > > > > On Fri, Nov 19, 2021 at 7:13 AM MrAsanjar wrote: > > > >> Hi team, > >> IBM is planning to upgrade and relocate the apache bigtop server next > >> week, again. > >> There will be a short downtime. Hopefully, the IP addresses will not > >> change. I'll keep you posted > >> > >> On Sun, Oct 24, 2021 at 11:29 AM Evans Ye wrote: > >> > >>> Thanks for the notice, Amir. > >>> > >>> MrAsanjar 於 2021年10月22日 週五 下午11:57寫道: > >>> > >>> > Hi team, > >>> > IBM plans to upgrade the current Jenkins slave to high-end Power9 with > >>> SSD > >>> > or nvme drives, perhaps next week. There will be a short interruption > >>> in > >>> > the availability. > >>> > > >>> > >>
[jira] [Created] (BIGTOP-3610) Hadoop's debian packaging fails due to Zookeeper jars
Luca Toscano created BIGTOP-3610: Summary: Hadoop's debian packaging fails due to Zookeeper jars Key: BIGTOP-3610 URL: https://issues.apache.org/jira/browse/BIGTOP-3610 Project: Bigtop Issue Type: Sub-task Reporter: Luca Toscano The following build failure was found while testing the build of the hadoop package on Debian 10 x86: {code} BUILD FAILED in 46m 30s + ln -fs /usr/lib/hadoop/lib/snappy-java-1.0.5.jar debian/tmp//usr/lib/hadoop/client/snappy-java-1.0.5.jar + ln -fs /usr/lib/hadoop/lib/snappy-java-1.0.5.jar debian/tmp//usr/lib/hadoop/client/snappy-java.jar + continue 2 + for file in `cat ${BUILD_DIR}/hadoop-client.list` + for dir in ${HADOOP_DIR}/{lib,} ${HDFS_DIR}/{lib,} ${YARN_DIR}/{lib,} ${MAPREDUCE_DIR}/{lib,} + '[' -e debian/tmp//usr/lib/hadoop/lib/stax2-api-3.1.4.jar ']' + ln -fs /usr/lib/hadoop/lib/stax2-api-3.1.4.jar debian/tmp//usr/lib/hadoop/client/stax2-api-3.1.4.jar + ln -fs /usr/lib/hadoop/lib/stax2-api-3.1.4.jar debian/tmp//usr/lib/hadoop/client/stax2-api.jar + continue 2 + for file in `cat ${BUILD_DIR}/hadoop-client.list` + for dir in ${HADOOP_DIR}/{lib,} ${HDFS_DIR}/{lib,} ${YARN_DIR}/{lib,} ${MAPREDUCE_DIR}/{lib,} + '[' -e debian/tmp//usr/lib/hadoop/lib/token-provider-1.0.1.jar ']' + ln -fs /usr/lib/hadoop/lib/token-provider-1.0.1.jar debian/tmp//usr/lib/hadoop/client/token-provider-1.0.1.jar + ln -fs /usr/lib/hadoop/lib/token-provider-1.0.1.jar debian/tmp//usr/lib/hadoop/client/token-provider.jar + continue 2 + for file in `cat ${BUILD_DIR}/hadoop-client.list` + for dir in ${HADOOP_DIR}/{lib,} ${HDFS_DIR}/{lib,} ${YARN_DIR}/{lib,} ${MAPREDUCE_DIR}/{lib,} + '[' -e debian/tmp//usr/lib/hadoop/lib/woodstox-core-5.0.3.jar ']' + ln -fs /usr/lib/hadoop/lib/woodstox-core-5.0.3.jar debian/tmp//usr/lib/hadoop/client/woodstox-core-5.0.3.jar + ln -fs /usr/lib/hadoop/lib/woodstox-core-5.0.3.jar debian/tmp//usr/lib/hadoop/client/woodstox-core.jar + continue 2 # Forcing Zookeeper dependency to be on the packaged jar ln -sf /usr/lib/zookeeper/zookeeper.jar 
debian/tmp/usr/lib/hadoop/lib/zookeeper*.jar ln: target 'debian/tmp/usr/lib/hadoop/lib/zookeeper-jute-3.5.9.jar' is not a directory make[1]: *** [debian/rules:57: override_dh_auto_install] Error 1 make[1]: Leaving directory '/ws/output/hadoop/hadoop-3.2.2' make: *** [debian/rules:27: binary] Error 2 dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2 {code} The main reason seems to be that there are two jars in the output directory related to zookeper: {code} $ ls output/hadoop/hadoop-3.2.2/debian/tmp/usr/lib/hadoop/lib/zookeeper-* output/hadoop/hadoop-3.2.2/debian/tmp/usr/lib/hadoop/lib/zookeeper-3.5.9.jar output/hadoop/hadoop-3.2.2/debian/tmp/usr/lib/hadoop/lib/zookeeper-jute-3.5.9.jar {code} The failing command "ln" assumes, in my opinion, that there is only one file matching the zookeeper-* pattern in the output dir, but now we have two. We can make the command smarter but I am wondering if the new jar is needed at runtime (so if needs to be copied/linked as well). Something like the following seems to work: {code} diff --git a/bigtop-packages/src/deb/hadoop/rules b/bigtop-packages/src/deb/hadoop/rules index fe0f3017..e57194b6 100755 --- a/bigtop-packages/src/deb/hadoop/rules +++ b/bigtop-packages/src/deb/hadoop/rules @@ -66,7 +66,7 @@ override_dh_auto_install: --native-build-string=${native_dir} \ --installed-lib-dir=/usr/lib/hadoop # Forcing Zookeeper dependency to be on the packaged jar - ln -sf /usr/lib/zookeeper/zookeeper.jar debian/tmp/usr/lib/hadoop/lib/zookeeper*.jar + ln -sf /usr/lib/zookeeper/zookeeper.jar debian/tmp/usr/lib/hadoop/lib/zookeeper-[0-9.]+.jar # Workaround for BIGTOP-583 rm -f debian/tmp/usr/lib/hadoop-*/lib/slf4j-log4j12-*.jar # FIXME: BIGTOP-463 {code} But of course it doesn't take into account the jute jar. 
The interesting thing is that I can see the following for hadoop.spec as well: {code} # Forcing Zookeeper dependency to be on the packaged jar %__ln_s -f /usr/lib/zookeeper/zookeeper.jar $RPM_BUILD_ROOT/%{lib_hadoop}/lib/zookeeper*.jar {code} Does it happen also for rpm packages? If not, why :) ? -- This message was sent by Atlassian Jira (v8.20.1#820001)
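The `ln` failure reported in BIGTOP-3610 can be reproduced in a scratch directory, which also shows a point about the proposed pattern: `[0-9.]+` is regular-expression syntax, while the shell expands globs, where `+` is a literal character. The sketch below uses a temporary directory and empty placeholder files, so the paths are illustrative only, and `zookeeper-[0-9]*.jar` is one possible glob, not necessarily the fix the project adopted.

```shell
#!/bin/sh
# Reproduction sketch in a scratch directory; file names mirror the ticket,
# everything else is illustrative.
set -eu
lib=$(mktemp -d)
touch "$lib/zookeeper-3.5.9.jar" "$lib/zookeeper-jute-3.5.9.jar"

# Original pattern: expands to two paths, so ln treats the last one as a
# target directory and fails with "is not a directory".
if ln -sf /usr/lib/zookeeper/zookeeper.jar "$lib"/zookeeper*.jar 2>/dev/null; then
    wildcard_result=ok
else
    wildcard_result=failed
fi

# Pattern from the diff: as a glob it needs a literal "+" in the file name,
# so it matches neither jar and is passed to the command unexpanded.
set -- "$lib"/zookeeper-[0-9.]+.jar
if [ -e "$1" ]; then regex_like_matches=yes; else regex_like_matches=no; fi

# A glob that requires a digit right after "zookeeper-" selects only the
# main jar and leaves the jute jar untouched.
for jar in "$lib"/zookeeper-[0-9]*.jar; do
    ln -sf /usr/lib/zookeeper/zookeeper.jar "$jar"
done
```

If the diff's pattern is used as written, `ln` no longer fails, but only because it creates a link with the literal pattern as its name instead of replacing the bundled jar. Whether the jute jar also needs a link is the open question raised in the ticket and is not answered by this sketch.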
Re: Set up CI for Debian 11 Bullseye
Update: I have deleted (via the UI) the failed builds that I tried in these days for Trunk packages, and I freed ~100G. Is there any specific reason to keep more than 2/3 builds for each project by default? The idea would be to do something like: https://plugins.jenkins.io/build-discarder/#plugin-content-getting-started My understanding is that old builds are kept for no real reason, all packages are stored in the Apache archive and we don't really need to keep build logs/config around. If nobody opposes it, I'd add a basic global config to keep 3 builds for each project and discard the rest. Let me know what you think! Luca Luca On Sun, Dec 5, 2021 at 10:02 AM Luca Toscano wrote: > > Thanks a lot Olaf, totally missed the credentials for AWS, I was able > to get the new hostnames and ssh in. The master's root partition seems > to be full afaics, and under /home/jenkins/jobs there is a ton of old > build data (even from Bigtop 1.1.0 etc..). I am not very expert in > jenkins but I am wondering if a Global build discarder (found in > https://ci.bigtop.apache.org/configure) could be something worth > testing. Was it done in the past? Is there any downside in using it? > For example, we could keep the last 6 months (or any time window) > worth of build data. Let me know what you think! > > Luca > > On Sat, Dec 4, 2021 at 9:08 PM Olaf Flebbe wrote: > > > > Hi Luca, > > > > the name changed. There is now some space available. > > > > Check with the EC2 console to see the current public IP adresses . > > https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running > > > > Best > > Olaf > > > > > Am 04.12.2021 um 18:23 schrieb Luca Toscano : > > > > > > Hi :) > > > > > > https://ci.bigtop.apache.org/log/all seems to indicate that we are > > > again in trouble, docker-slave-02 seems full. I got the credentials to > > > access the node but I can reach it (my ssh session hangs > > > indefinitely), is there anybody that can take a look? 
> > > > > > Thanks, > > > > > > Luca > > > > > > On Sat, Nov 27, 2021 at 8:49 PM Olaf Flebbe wrote: > > >> > > >> Hi, > > >> > > >> I had a quick look around. Job History was excessive: availlable back to > > >> 2017, I deleted all until 2019 and set the maximum life time to one year. > > >> I see that not any job from bigtopstore have been removed so far. Now > > >> having 52995 jobs on disk. > > >> I removed all bug the last ~3000 job dirs. > > >> > > >> Can someone please look at the config of „bigpetstore“, please? > > >> > > >> At least we have some air to breath now again on jenkins. > > >> > > >> Olaf > > >> > > >> > > >>> Am 27.11.2021 um 20:04 schrieb Olaf Flebbe : > > >>> > > >>> Regarding setup of jenkins, you can find a lot of info here: > > >>> > > >>> https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide > > >>> > > >>> <https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide> > > >>> > > >>> Olaf > > >>> > > >>> > > >>>> Am 27.11.2021 um 19:46 schrieb Olaf Flebbe > >>>> <mailto:o...@oflebbe.de>>: > > >>>> > > >>>> I am working with luca to get him access .. > > >>>> > > >>>> Olaf > > >>>> > > >>>>> Am 27.11.2021 um 18:29 schrieb Luca Toscano > >>>>> <mailto:toscano.l...@gmail.com>>: > > >>>>> > > >>>>> Keep reporting issues, sorry :) > > >>>>> > > >>>>> I see from https://ci.bigtop.apache.org/log/all > > >>>>> <https://ci.bigtop.apache.org/log/all> that we are having > > >>>>> disk space issues on the Jenkins node, builds are stopped for the > > >>>>> moment. No idea if there is a guide/tutorial for the cleanup, but I > > >>>>> guess it means accessing to the VM/hosts that are running Jenkins and > > >>>>> I have never done it (not even sure where to put my ssh key etc, what > > >>>>> is the hostname, etc..). > > >>>>> > > >>>>> Thanks! > > >>>>> > > >>>>> Luca > > >>>>> > > >>>
Re: Set up CI for Debian 11 Bullseye
Thanks a lot Olaf, totally missed the credentials for AWS, I was able to get the new hostnames and ssh in. The master's root partition seems to be full afaics, and under /home/jenkins/jobs there is a ton of old build data (even from Bigtop 1.1.0 etc..). I am not very expert in jenkins but I am wondering if a Global build discarder (found in https://ci.bigtop.apache.org/configure) could be something worth testing. Was it done in the past? Is there any downside in using it? For example, we could keep the last 6 months (or any time window) worth of build data. Let me know what you think! Luca On Sat, Dec 4, 2021 at 9:08 PM Olaf Flebbe wrote: > > Hi Luca, > > the name changed. There is now some space available. > > Check with the EC2 console to see the current public IP adresses . > https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running > > Best > Olaf > > > Am 04.12.2021 um 18:23 schrieb Luca Toscano : > > > > Hi :) > > > > https://ci.bigtop.apache.org/log/all seems to indicate that we are > > again in trouble, docker-slave-02 seems full. I got the credentials to > > access the node but I can reach it (my ssh session hangs > > indefinitely), is there anybody that can take a look? > > > > Thanks, > > > > Luca > > > > On Sat, Nov 27, 2021 at 8:49 PM Olaf Flebbe wrote: > >> > >> Hi, > >> > >> I had a quick look around. Job History was excessive: availlable back to > >> 2017, I deleted all until 2019 and set the maximum life time to one year. > >> I see that not any job from bigtopstore have been removed so far. Now > >> having 52995 jobs on disk. > >> I removed all bug the last ~3000 job dirs. > >> > >> Can someone please look at the config of „bigpetstore“, please? > >> > >> At least we have some air to breath now again on jenkins. 
> >> > >> Olaf > >> > >> > >>> Am 27.11.2021 um 20:04 schrieb Olaf Flebbe : > >>> > >>> Regarding setup of jenkins, you can find a lot of info here: > >>> > >>> https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide > >>> <https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide> > >>> > >>> Olaf > >>> > >>> > >>>> Am 27.11.2021 um 19:46 schrieb Olaf Flebbe >>>> <mailto:o...@oflebbe.de>>: > >>>> > >>>> I am working with luca to get him access .. > >>>> > >>>> Olaf > >>>> > >>>>> Am 27.11.2021 um 18:29 schrieb Luca Toscano >>>>> <mailto:toscano.l...@gmail.com>>: > >>>>> > >>>>> Keep reporting issues, sorry :) > >>>>> > >>>>> I see from https://ci.bigtop.apache.org/log/all > >>>>> <https://ci.bigtop.apache.org/log/all> that we are having > >>>>> disk space issues on the Jenkins node, builds are stopped for the > >>>>> moment. No idea if there is a guide/tutorial for the cleanup, but I > >>>>> guess it means accessing to the VM/hosts that are running Jenkins and > >>>>> I have never done it (not even sure where to put my ssh key etc, what > >>>>> is the hostname, etc..). > >>>>> > >>>>> Thanks! > >>>>> > >>>>> Luca > >>>>> > >>>>> On Fri, Nov 26, 2021 at 7:10 PM Olaf Flebbe >>>>> <mailto:o...@oflebbe.de>> wrote: > >>>>>> > >>>>>> Hi Luca, > >>>>>> > >>>>>> you are right, 201 should be valid. However there seems to be some > >>>>>> compatibility problem with at least this repo. So comment it out for > >>>>>> now if it resolves the isse. Or remove the whole proxy stuff > >>>>>> altoghether. We had stability issues with the internet connectivity at > >>>>>> that time. The proxy resolved that mostly. > >>>>>> > >>>>>> Olaf > >>>>>> > >>>>>> > >>>>>>> Am 26.11.2021 um 19:06 schrieb Luca Toscano >>>>>>> <mailto:toscano.l...@gmail.com>>: > >>>>>>> > >>>>>>> Hi Olaf, > >>>>>>> > >>>>>>> I see that https://repository.jboss.org/nexus/content/groups/public/ > >&
Re: Set up CI for Debian 11 Bullseye
Hi :) https://ci.bigtop.apache.org/log/all seems to indicate that we are in trouble again: docker-slave-02 seems full. I got the credentials to access the node but I cannot reach it (my ssh session hangs indefinitely). Is there anybody who can take a look? Thanks, Luca On Sat, Nov 27, 2021 at 8:49 PM Olaf Flebbe wrote: > > Hi, > > I had a quick look around. Job history was excessive: available back to > 2017, I deleted all until 2019 and set the maximum lifetime to one year. > I see that no jobs from bigtopstore have been removed so far. Now having > 52995 jobs on disk. > I removed all but the last ~3000 job dirs. > > Can someone please look at the config of „bigpetstore“? > > At least we have some air to breathe again on jenkins. > > Olaf > > > > On 27.11.2021 at 20:04, Olaf Flebbe wrote: > > > > Regarding setup of jenkins, you can find a lot of info here: > > > > https://cwiki.apache.org/confluence/display/BIGTOP/Bigtop+CI+Setup+Guide > > > > Olaf > > > > > >> On 27.11.2021 at 19:46, Olaf Flebbe <o...@oflebbe.de> wrote: > >> > >> I am working with luca to get him access .. > >> > >> Olaf > >> > >>> On 27.11.2021 at 18:29, Luca Toscano <toscano.l...@gmail.com> wrote: > >>> > >>> Keep reporting issues, sorry :) > >>> > >>> I see from https://ci.bigtop.apache.org/log/all that we are having > >>> disk space issues on the Jenkins node, builds are stopped for the > >>> moment. No idea if there is a guide/tutorial for the cleanup, but I > >>> guess it means accessing the VM/hosts that are running Jenkins and > >>> I have never done it (not even sure where to put my ssh key etc, what > >>> is the hostname, etc..). > >>> > >>> Thanks!
> >>> > >>> Luca > >>> > >>> On Fri, Nov 26, 2021 at 7:10 PM Olaf Flebbe >>> <mailto:o...@oflebbe.de>> wrote: > >>>> > >>>> Hi Luca, > >>>> > >>>> you are right, 201 should be valid. However there seems to be some > >>>> compatibility problem with at least this repo. So comment it out for now > >>>> if it resolves the isse. Or remove the whole proxy stuff altoghether. We > >>>> had stability issues with the internet connectivity at that time. The > >>>> proxy resolved that mostly. > >>>> > >>>> Olaf > >>>> > >>>> > >>>>> Am 26.11.2021 um 19:06 schrieb Luca Toscano >>>>> <mailto:toscano.l...@gmail.com>>: > >>>>> > >>>>> Hi Olaf, > >>>>> > >>>>> I see that https://repository.jboss.org/nexus/content/groups/public/ > >>>>> <https://repository.jboss.org/nexus/content/groups/public/> > >>>>> is valid though (http redirects to https), but I don't get why we > >>>>> should delete the line (also the code returns a HTTP 201 afaics so it > >>>>> seems not failing to find it). > >>>>> > >>>>> Lemme know, I am probably missing something, too many new things :) > >>>>> > >>>>> Luca > >>>>> > >>>>> On Fri, Nov 26, 2021 at 6:24 PM Olaf Flebbe >>>>> <mailto:o...@oflebbe.de>> wrote: > >>>>>> > >>>>>> Hi Luca, > >>>>>> > >>>>>> Seems like jboss.org <http://jboss.org/> <http://jboss.org/ > >>>>>> <http://jboss.org/>> repository does not exist any more. > >>>>>> > >>>>>> Try to delete this line: > >>>>>> https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462 > >>>>>> > >>>>>> <https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462> > >>>>>> > >>>>>> <https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462 > >>>>>> > >>>>>> <https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462>> > >>>
[jira] [Created] (BIGTOP-3609) Hive package build failure for CentOS 7
Luca Toscano created BIGTOP-3609:

Summary: Hive package build failure for CentOS 7
Key: BIGTOP-3609
URL: https://issues.apache.org/jira/browse/BIGTOP-3609
Project: Bigtop
Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Luca Toscano

https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/754/COMPONENTS=hive,OS=centos-7/console

RPM build errors:
> Task :hive-rpm FAILED
+ export 'MAVEN_OPTS= -Xmx1500m -Xms1500m'
:hive-rpm (Thread[Execution worker for ':' Thread 2,5,main]) completed. Took 2.111 secs.
+ MAVEN_OPTS=' -Xmx1500m -Xms1500m'
11 actionable tasks: 11 executed
+ mvn -Dhbase.version=2.2.6 -Dzookeeper.version=3.5.9 -Dhadoop.version=3.2.2 -DskipTests -Dtez.version=0.10.0 -Dspark.version=3.0.1 -Dscala.binary.version=2.12 -Dscala.version=2.12.13 -Dguava.version=27.0-jre clean install -Pdist
/bigtop/build/hive/rpm/SOURCES/do-component-build: line 45: mvn: command not found
error: Bad exit status from /var/tmp/rpm-tmp.ky8iAk (%build)
Bad exit status from /var/tmp/rpm-tmp.ky8iAk (%build)

This seems to be the issue:

/bigtop/build/hive/rpm/SOURCES/do-component-build: line 45: mvn: command not found

--
This message was sent by Atlassian Jira (v8.20.1#820001)
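The failure in BIGTOP-3609 is a missing tool in the build image rather than a Hive compilation problem. A preflight check at the top of a build script makes that kind of failure explicit before any work starts. This is a minimal sketch: `require_cmd` is a hypothetical helper and is not part of Bigtop's `do-component-build`.

```shell
#!/bin/sh
# Hypothetical helper, not part of Bigtop: fail early with a clear message
# when a required build tool is not on PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: required command '$1' not found in PATH" >&2
        return 1
    fi
}
```

A script could then start with a line such as `require_cmd mvn || exit 1`, so the log shows the missing tool first instead of an RPM `%build` exit status.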
[jira] [Created] (BIGTOP-3608) HBase zookeeper build failure
Luca Toscano created BIGTOP-3608:

Summary: HBase zookeeper build failure
Key: BIGTOP-3608
URL: https://issues.apache.org/jira/browse/BIGTOP-3608
Project: Bigtop
Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Luca Toscano

There seems to be a problem in building HBase:

[https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/754/COMPONENTS=hbase,OS=debian-10/console]

[INFO] --- maven-compiler-plugin:3.8.1:compile (default-compile) @ hbase-zookeeper ---
[INFO] Compiling 23 source files to /bigtop/output/hbase/hbase-2.2.6/hbase-zookeeper/target/classes
[INFO] /bigtop/output/hbase/hbase-2.2.6/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java: Some input files use or override a deprecated API.
[INFO] /bigtop/output/hbase/hbase-2.2.6/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java: Recompile with -Xlint:deprecation for details.
[INFO] -
[ERROR] COMPILATION ERROR :
[INFO] -
[ERROR] /bigtop/output/hbase/hbase-2.2.6/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/HQuorumPeer.java:[89,23] unreported exception org.apache.zookeeper.server.admin.AdminServer.AdminServerException; must be caught or declared to be thrown
[ERROR] /bigtop/output/hbase/hbase-2.2.6/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/HQuorumPeer.java:[94,23] unreported exception org.apache.zookeeper.server.admin.AdminServer.AdminServerException; must be caught or declared to be thrown

--
This message was sent by Atlassian Jira (v8.20.1#820001)
Re: Set up CI for Debian 11 Bullseye
Keep reporting issues, sorry :) I see from https://ci.bigtop.apache.org/log/all that we are having disk space issues on the Jenkins node, builds are stopped for the moment. No idea if there is a guide/tutorial for the cleanup, but I guess it means accessing to the VM/hosts that are running Jenkins and I have never done it (not even sure where to put my ssh key etc, what is the hostname, etc..). Thanks! Luca On Fri, Nov 26, 2021 at 7:10 PM Olaf Flebbe wrote: > > Hi Luca, > > you are right, 201 should be valid. However there seems to be some > compatibility problem with at least this repo. So comment it out for now if > it resolves the isse. Or remove the whole proxy stuff altoghether. We had > stability issues with the internet connectivity at that time. The proxy > resolved that mostly. > > Olaf > > > > Am 26.11.2021 um 19:06 schrieb Luca Toscano : > > > > Hi Olaf, > > > > I see that https://repository.jboss.org/nexus/content/groups/public/ > > is valid though (http redirects to https), but I don't get why we > > should delete the line (also the code returns a HTTP 201 afaics so it > > seems not failing to find it). > > > > Lemme know, I am probably missing something, too many new things :) > > > > Luca > > > > On Fri, Nov 26, 2021 at 6:24 PM Olaf Flebbe wrote: > >> > >> Hi Luca, > >> > >> Seems like jboss.org <http://jboss.org/> repository does not exist any > >> more. > >> > >> Try to delete this line: > >> https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462 > >> > >> <https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462> > >> > >> Olaf > >> > >> > >>> Am 26.11.2021 um 14:23 schrieb Luca Toscano : > >>> > >>> Hi Olaf, > >>> > >>> I restarted the jobs and some packages started to build, thanks! I > >>> found another weird error though: > >>> > >>> """ > >>> * What went wrong: > >>> Execution failed for task ':configure-nexus-repository.jboss.org'. 
> >>>> Failed to configure Nexus proxy repository.jboss.org with http code 201 > >>>> returned. Run with --info option to see the executed command > >>> """ > >>> More info: > >>> https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/745/COMPONENTS=flink,OS=debian-9/console > >>> > >>> I am wondering if https://github.com/apache/bigtop/pull/834 could help :) > >>> > >>> Luca > >>> > >>> On Fri, Nov 26, 2021 at 11:44 AM Olaf Flebbe wrote: > >>>> > >>>> Hi Luca, > >>>> > >>>> I once added a nexus container as an download cache for maven > >>>> dependencies, attached to a seperate docker network. It has to be > >>>> started manually if it failed. > >>>> https://ci.bigtop.apache.org/job/nexus-restart/ > >>>> <https://ci.bigtop.apache.org/job/nexus-restart/> > >>>> > >>>> If this network is not present you might get this kind of error > >>>> So try to restart nexus. > >>>> > >>>> Best > >>>> Olaf > >>>> > >>>>> Am 26.11.2021 um 09:04 schrieb Luca Toscano : > >>>>> > >>>>> Update: > >>>>> > >>>>> - Found the config diff changes in Jenkins, so I know how to re-enable > >>>>> periodic execution. > >>>>> > >>>>> - Tried to kick off a build, but I stopped it since all the package > >>>>> builds were failing for the same error: > >>>>> ++ docker run -d --net=container:nexus bigtop/slaves:trunk-debian-10 > >>>>> /sbin/init > >>>>> docker: Error response from daemon: cannot join network of a non > >>>>> running container > >>>>> Example:https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/COMPONENTS=solr,OS=debian-10/lastBuild/console > >>>>> > >>>>> I have probably missed something, I don't exactly understand the error > >>>>> so any help would be appreciated :) > >>>>> > >>>>> Luca > >>>>> > >>>>> On Thu, Nov 25, 2021 at 12:27 PM Luca Toscano > >>>>> wrote: &g
Re: Set up CI for Debian 11 Bullseye
Hi Olaf,

I see that https://repository.jboss.org/nexus/content/groups/public/ is valid though (http redirects to https), so I don't get why we should delete the line (also the code returns an HTTP 201 afaics, so it doesn't seem to be failing to find it).

Lemme know, I am probably missing something, too many new things :)

Luca

On Fri, Nov 26, 2021 at 6:24 PM Olaf Flebbe wrote:
>
> Hi Luca,
>
> Seems like the jboss.org repository does not exist any more.
>
> Try to delete this line:
> https://github.com/apache/bigtop/blob/5f863aae467198070c1230558446a19f308a66ae/build.gradle#L462
>
> Olaf
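On the HTTP 201 point: 201 Created belongs to the 2xx success class, so by itself it does not suggest that the repository could not be found. A minimal Python sketch of that classification follows. It is only an illustration and not the check that Bigtop's build.gradle performs; `is_success` and `describe` are hypothetical names.

```python
from http import HTTPStatus


def is_success(status_code: int) -> bool:
    """Return True for any 2xx status (200 OK, 201 Created, 204 No Content, ...)."""
    return 200 <= status_code < 300


def describe(status_code: int) -> str:
    """Render a status code with its standard reason phrase and a verdict."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Not a status code known to the standard library.
        phrase = "Unknown"
    verdict = "success" if is_success(status_code) else "failure"
    return f"{status_code} {phrase}: {verdict}"
```

A caller that accepts only one specific code (for example exactly 200) would report a 201 as an error even though the request went through, which is one possible reading of the message in this thread.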
Re: Set up CI for Debian 11 Bullseye
Hi Olaf,

I restarted the jobs and some packages started to build, thanks! I found another weird error though:

"""
* What went wrong:
Execution failed for task ':configure-nexus-repository.jboss.org'.
> Failed to configure Nexus proxy repository.jboss.org with http code 201
> returned. Run with --info option to see the executed command
"""

More info:
https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/745/COMPONENTS=flink,OS=debian-9/console

I am wondering if https://github.com/apache/bigtop/pull/834 could help :)

Luca

On Fri, Nov 26, 2021 at 11:44 AM Olaf Flebbe wrote:
>
> Hi Luca,
>
> I once added a nexus container as a download cache for maven dependencies,
> attached to a separate docker network. It has to be started manually if it
> failed.
> https://ci.bigtop.apache.org/job/nexus-restart/
>
> If this network is not present you might get this kind of error.
> So try to restart nexus.
>
> Best
> Olaf
Re: Set up CI for Debian 11 Bullseye
Update:

- Found the config diff changes in Jenkins, so I know how to re-enable periodic execution.

- Tried to kick off a build, but I stopped it since all the package builds were failing with the same error:

++ docker run -d --net=container:nexus bigtop/slaves:trunk-debian-10 /sbin/init
docker: Error response from daemon: cannot join network of a non running container

Example: https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/COMPONENTS=solr,OS=debian-10/lastBuild/console

I have probably missed something, I don't exactly understand the error so any help would be appreciated :)

Luca

On Thu, Nov 25, 2021 at 12:27 PM Luca Toscano wrote:
>
> Update: I added debian-11 support, if nobody disagrees I'd kick off a
> package build. I can also enable periodic execution but I am not sure
> what I should put in the related time interval field (I didn't find it
> in the docs).
>
> Luca
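For context on the error above: `--net=container:nexus` makes the new container share the network namespace of a container named `nexus`, so Docker refuses to start it when `nexus` is not running. A minimal Python sketch of a pre-flight check is below. It assumes the `docker` CLI is on the PATH of the build host; `container_is_running` and the injectable `run` parameter are hypothetical names used for illustration and are not part of the Bigtop CI scripts.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def container_is_running(name: str, run: Callable[[Sequence[str]], str] = _run) -> bool:
    """Return True only if `docker inspect` reports the named container as running."""
    try:
        out = run(["docker", "inspect", "--format", "{{.State.Running}}", name])
    except (subprocess.CalledProcessError, FileNotFoundError):
        # The container does not exist, or the docker CLI is not installed.
        return False
    return out.strip() == "true"
```

Calling `container_is_running("nexus")` before kicking off the package builds would show whether the nexus container needs to be restarted first. The `run` parameter exists only so the function can be exercised without a Docker daemon.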
Re: Set up CI for Debian 11 Bullseye
Update: I added debian-11 support, if nobody disagrees I'd kick off a package build. I can also enable periodic execution but I am not sure what I should put in the related time interval field (I didn't find it in the docs).

Luca

On Sun, Nov 21, 2021 at 7:10 PM Luca Toscano wrote:
>
> Hi Olaf! Thanks a lot :)
>
> I managed to change CI to allow Debian 11 (minus a little error for
> Ubuntu that seems unrelated, see
> https://ci.bigtop.apache.org/view/Docker/job/Docker-Toolchain-Trunk/)
> and I was able to build the hadoop packages locally via the new Debian
> 11 images published to Dockerhub.
>
> IIUC the last step is to modify
> https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/
> and kick off a build, but I see "2021-10-02: temporarily disable
> periodic execution for preparing 3.0.0 release." so before doing
> anything I'd like some feedback (to avoid causing problems etc..).
>
> Luca
Re: [DISCUSS] Release plan of Bigtop 3.1
Hi Masatake,

one thing that I'd be worried about from an admin/operator point of view is a potential double upgrade for Bigtop 3.1, namely ZooKeeper first and then Hadoop (plus other services if needed). In my case, we would probably keep relying on Debian's zookeeper packages (3.4) since we don't use HBase and we have our own version of Kafka packaged, but others relying on Bigtop's zookeeper packages may need to upgrade them as well.

Nothing major, but I would highlight this in the release notes, so people can plan upgrades accordingly. In my experience, planning a Bigtop upgrade, even a minor version, takes time and a lot of testing :)

Hope it makes sense!

Luca

On Mon, Nov 22, 2021 at 5:25 AM Masatake Iwasaki wrote:
>
> Hi team,
>
> I filed BIGTOP-3605 as a starting point of discussion.
> https://issues.apache.org/jira/browse/BIGTOP-3605
>
> I propose to release Bigtop 3.1 shortly after Hadoop 3.2.3 is released.
> We can bump Hadoop to 3.3 (or later) on Bigtop 3.2.
>
> * Bumping to Hadoop 3.3 is big change. Providing Hadoop 3.2.3 containing
>   almost 300 fixes from 3.2.2 could be useful for users of Bigtop 3.0.
>
> * HBase 2.2.6 and Kafka 2.4.1 are bit old releases.
>   We can upgrade them by bumping ZooKeeper to 3.5.
>
> I will appreciate your feedback on BIGTOP-3605 or this thread.
>
> Thanks,
> Masatake Iwasaki
Re: Set up CI for Debian 11 Bullseye
Hi Olaf! Thanks a lot :)

I managed to change CI to allow Debian 11 (minus a little error for Ubuntu that seems unrelated, see https://ci.bigtop.apache.org/view/Docker/job/Docker-Toolchain-Trunk/) and I was able to build the hadoop packages locally via the new Debian 11 images published to Dockerhub.

IIUC the last step is to modify https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-packages/ and kick off a build, but I see "2021-10-02: temporarily disable periodic execution for preparing 3.0.0 release." so before doing anything I'd like some feedback (to avoid causing problems etc..).

Luca

On Fri, Nov 19, 2021 at 8:53 PM Olaf Flebbe wrote:
>
> Hi Luca,
>
> I gave you super powers.
>
> Best
> Olaf
Set up CI for Debian 11 Bullseye
Hi everybody! I am working on https://issues.apache.org/jira/browse/BIGTOP-3600 and IIUC I'd need to add the CI jobs now, but of course I don't have permissions. If anybody has some time to help I'd be happy to learn the process and help out :) Thanks! Luca
[jira] [Created] (BIGTOP-3600) Add support for Debian 11 Bullseye
Luca Toscano created BIGTOP-3600:

Summary: Add support for Debian 11 Bullseye
Key: BIGTOP-3600
URL: https://issues.apache.org/jira/browse/BIGTOP-3600
Project: Bigtop
Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Luca Toscano

Hi everybody! I didn't find any task related to this, but as we discussed right before the 3.0 release, it would be great to add support for Debian 11. I am interested in knowing how to do it so if there is a guide/tutorial/etc.. I can definitely offer my help to add support for it! Thanks in advance!

--
This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [DISCUSS] New Features post Bigtop 3.0
Hi Evans!

On Tue, Nov 2, 2021 at 5:35 PM Evans Ye wrote:
>
> Hi folks,
>
> With Bigtop 3.0 been released, I think it's time to discuss what's new as
> our next steps. Of course the open source ver. of unified compatible Hadoop
> Distro. is still our core product going forward. But the surrounding value
> added features might be something that can take us further beyond where we
> were at. Now, let me post some ideas to start the brainstorming.
>
> 1. Deployment on K8S: Ambari or Bigtop Puppet as K8S operators.

I am wondering how complex it is to write a Kubernetes Operator (which I assume would be a Go-based application that talks to the Kubernetes API) vs writing Helm charts (or similar). We use the latter extensively at Wikimedia (but not for any Hadoop-related configs) and it works really well. Tools like Helmfile (https://github.com/roboll/helmfile) are also very nice to bootstrap and manage different environments/clusters/configurations. The Helm+Helmfile pair seems closer to what Bigtop currently does with puppet, so it may be an alternative (before writing an Operator) to figure out how to handle configs. For example, how is the Operator going to apply/create/etc.. configurations? I worked with Istio recently (https://istio.io/), and they offer tools that basically wrap Helm configurations (via a binary client-side tool or a K8s Operator) under the hood. I've never written a K8s operator so my understanding could be completely wrong!

> 2. MLOps integrations: MLFlow, Submarine.

At Wikimedia we are using KServe/Kubeflow, it may be a good addition to the list. We are using Openstack's Swift as object storage for models since it offers an S3 API; Apache Ozone could represent a very nice alternative (I saw some traction in the Jira, I'll try to help/review if needed!).

> 3. Data Lake integrations: Hudi, Iceberg, Delta.

+1, our plan is to experiment with Apache Iceberg very soon :)

> And for some software engineering stuffs, I think we can do a clean up on
> out-dated features such as:
> 1. vagrant provisioner
> 2. docker sandbox
> 3. bigtop-ci
> 4. bigtop-data-generators
> 5. bigtop-bigpetstore

Something else that would be nice:

1) Upgrade the Puppet version where needed (I know that Bigtop needs to keep compatibility with OS Distros that offer older versions of puppet etc..)
2) Migrate init.d scripts to systemd units where possible (for example, in Distros like Debian where it is fully supported).

I understand that the above tasks are very complex and require a lot of work :) They may not be super important given the above Kubernetes work to focus on, but I thought it was good to mention them!

Thanks a lot for all the work!

Luca
Re: [VOTE] Release Bigtop version 3.0.0
This is really great, so happy to see 3.0 in RC state! We will probably not be able to test a full upgrade of our testing environment for this vote, but we'll surely report back any issue that we find when we decide to upgrade. Since this is a hadoop 2 -> hadoop 3 upgrade for Bigtop, is there any documentation/guideline/etc.. that is available to help people to migrate their configs to the new major version? Luca On Tue, Oct 19, 2021 at 1:17 AM Kengo Seki wrote: > > This is the vote for release 3.0.0 of Apache Bigtop. > > It fixes the following issues: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311420&version=12349352 > > The vote will be going for at least 72 hours and will be closed on Thursday, > October 21st, 2021 at 17:00 PDT. Please download, test and vote with > > [ ] +1, accept RC0 as the official 3.0.0 release of Apache Bigtop > [ ] +0, I don't care either way, > [ ] -1, do not accept RC0 as the official 3.0.0 release of Apache > Bigtop, because... > > Source and binary files: > https://dist.apache.org/repos/dist/dev/bigtop/bigtop-3.0.0-RC0/ > > Maven staging repo: > https://repository.apache.org/content/repositories/orgapachebigtop-1031 > > The git tag to be voted upon is release-3.0.0 > > Bigtop's KEYS file containing PGP keys we use to sign the release: > https://dist.apache.org/repos/dist/release/bigtop/KEYS > > You can see the results of packaging, deployment and smoke tests on > the CI environment in the following URLs. > Some builds had to be retried several times since some of our tests > are flaky, but all of them succeeded at last. > (I ran the smoke tests for all distros on x86_64. Regarding aarch64 and > ppc64le, > I ran them only for CentOS and Debian on the former and for Fedora and > Ubuntu on the latter, > since our computation resource is respectively limited.) > https://ci.bigtop.apache.org/view/Releases/job/Bigtop-3.0.0/ > https://ci.bigtop.apache.org/view/Test/job/Bigtop-3.0.0-smoke-tests/ > > Kengo Seki
Re: 3.0.0 release branch cutting notification
Hi Arnaud,

I'd be interested in Bullseye support as well. IIRC it was already discussed, and the plan was to cut another release after 3.0.0 to avoid waiting (at the time it wasn't clear when Debian would release it). I think it is fine to postpone Bullseye support and release 3.0 asap, maybe we can open a Jira to start the CI work together?

Luca

On Thu, Aug 26, 2021 at 8:05 AM Arnaud Launay wrote:
>
> On Thu, Aug 26, 2021 at 11:02:07AM +0900, Kengo Seki wrote:
> > and let me know if you have any other fixes that you would like to
> > merge into 3.0.0 ;)
>
> Debian 11 (bullseye) support ? :) It was released two weeks ago. Is it at least
> possible to have an automated test to see what works and what doesn't ?
>
> Arnaud.
[jira] [Created] (BIGTOP-3566) Add support for the Namenode Observer role in puppet
Luca Toscano created BIGTOP-3566:

Summary: Add support for the Namenode Observer role in puppet
Key: BIGTOP-3566
URL: https://issues.apache.org/jira/browse/BIGTOP-3566
Project: Bigtop
Issue Type: New Feature
Reporter: Luca Toscano

Hi everybody, it was discussed on dev@ that the HDFS Namenode can have three states:

- active
- standby
- observer

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ObserverNameNode.html

It may be good to add support for it in puppet :)

--
This message was sent by Atlassian Jira (v8.3.4#803005)
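To make the three Namenode states concrete, here is a small Python sketch that models them. It reflects my reading of the linked Hadoop documentation (the active Namenode handles writes and can serve reads, an observer serves reads only, a standby serves neither), so treat it as an illustration and not as the planned puppet implementation. `NameNodeState` and `parse_state` are hypothetical names, and the assumption that a state is reported as one of the three lowercase words is mine.

```python
from enum import Enum


class NameNodeState(Enum):
    """The three HA states listed in the ticket."""

    ACTIVE = "active"
    STANDBY = "standby"
    OBSERVER = "observer"

    @property
    def serves_writes(self) -> bool:
        # Only the active Namenode accepts write requests.
        return self is NameNodeState.ACTIVE

    @property
    def serves_reads(self) -> bool:
        # Reads can go to the active Namenode or to an observer.
        return self in (NameNodeState.ACTIVE, NameNodeState.OBSERVER)


def parse_state(text: str) -> NameNodeState:
    """Parse a reported state string; raises ValueError for anything unexpected."""
    return NameNodeState(text.strip().lower())
```

A role definition in puppet would need to distinguish the same three cases, for example to decide which hosts are started as observers.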
Re: Interesting blog post from Linkedin
/me apologizes, I just learned that Masatake's answers ended up in my gmail spam folder :( Going to answer those, thanks :)

On Fri, Jul 2, 2021 at 7:59 AM Luca Toscano wrote:
>
> Ping on this, I'd be curious to know people's opinion and experiences
> (it is also fine to not answer of course :)
>
> Luca
Re: Interesting blog post from Linkedin
Ping on this, I'd be curious to know people's opinions and experiences (it is also fine to not answer of course :)

Luca

On Sat, Jun 12, 2021 at 7:40 PM Roman Shaposhnik wrote:
>
> On Fri, Jun 11, 2021 at 9:53 AM Luca Toscano wrote:
>
> > Hi everybody,
> >
> > https://engineering.linkedin.com/blog/2021/the-exabyte-club--linkedin-s-journey-of-scaling-the-hadoop-distr
> > is a nice blog post reading. There are some interesting follow ups in
> > my opinion:
> >
> > - Fair vs Non-Fair locking for the HDFS Namenode. IIUC this seems to
> > be a code change rather than a jvm setting tunable, but I am wondering
> > if others have experience with different locking mechanisms in
> > production for HDFS.
> > - Observer HDFS Namenode. IIUC this was introduced in Hadoop 2.10, it
> > would be nice if we could offer it via puppet for the docker
> > provisioner (if we don't already do it, I didn't find it). Having a
> > separate Namenode to handle read requests could be interesting for
> > busy clusters. Has anybody already deployed it?
> >
>
> Very interesting indeed! Thanks for sharing.
>
> Thanks,
> Roman.
Re: Powered By Bigtop
On Thu, Jul 1, 2021 at 1:12 AM Roman Shaposhnik wrote:
>
> On Mon, Jun 28, 2021 at 10:24 PM Evans Ye wrote:
> >
> > Sure. That's a good suggestion. But to me it's hard to gather the
> > information by solely googling it. Even though there are evidences showing
> > they're using Bigtop, we can't be 100% sure so representing them might be
> > error-prone.
> >
> > I'd really love to hear if there's any suggestion that can address this :)
>
> Maybe a blast to a user@ mailing list?

I would also add a Markdown file in the git repo with the following (I didn't find it, but in case something is already there, apologies :):

- Who is using Bigtop
- Videos/presentations/blog-posts/etc..

Not sure if we have something similar elsewhere (like in the wiki), but people may be more inclined to add content if it is just a github pull request (rather than sending emails or opening jiras).

Luca