For others potentially seeing this in a mailing-list search: yes, I needed to do that, which of course required creating a (charge) account, which I wasn't using before. So I ran

    sacctmgr add account default_account
    sacctmgr add -i user $user Accounts=default_account

with an appropriate loop around for $user, and everything is working fine now.
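In case it is useful to anyone, a rough sketch of the looping I mean, assuming the user names sit in a plain text file (users.txt and default_account are just placeholders for whatever fits your site):

    #!/bin/bash
    # Create the catch-all account once (we don't actually use allocations or
    # charging), then add every user listed in users.txt to it without
    # prompting (-i commits immediately).
    sacctmgr -i add account default_account
    while read -r user; do
        sacctmgr -i add user "$user" Accounts=default_account
    done < users.txt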
Thanks everybody!

On Tue, Oct 3, 2023 at 7:44 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:

> You will probably need to.
>
> The way we handle it is that we add users when they first submit a job via the job_submit.lua script. This way the database autopopulates with active users.
>
> -Paul Edmon-
>
> On 10/3/23 9:01 AM, Davide DelVento wrote:
>
> By increasing the slurmdbd verbosity level, I got additional information, namely the following:
>
> slurmdbd: error: couldn't get information for this user (null)(xxxxxx)
> slurmdbd: debug: accounting_storage/as_mysql: as_mysql_jobacct_process_get_jobs: User xxxxxx has no associations, and is not admin, so not returning any jobs.
>
> again, where xxxxxx is the posix ID of the user who's running the query, in the slurmdbd logs.
>
> I suspect this is because our user base is small enough (we are a departmental HPC site) that we don't need to use allocations and the like, so I have not configured any associations (and have not even studied their configuration: when I was at another place which did use associations, someone else took care of Slurm administration).
>
> Anyway, I read the fantastic document by our own member at https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#associations and in fact I have not even configured Slurm users:
>
> # sacctmgr show user
>       User   Def Acct     Admin
> ---------- ---------- ---------
>       root       root Administ+
> #
>
> So is that the issue? Should I just add all users? Any suggestions on the minimal (but robust) way to do that?
>
> Thanks!
>
> On Mon, Oct 2, 2023 at 9:20 AM Davide DelVento <davide.quan...@gmail.com> wrote:
>
>> Thanks Paul, this helps.
>>
>> I don't have any PrivateData line in either config file. According to the docs, "By default, all information is visible to all users", so this should not be an issue. I tried to add a line with "PrivateData=jobs" to the conf files, just in case, but that didn't change the behavior.
>>
>> On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:
>>
>>> At least in our setup, users can see their own scripts by doing sacct -B -j JOBID
>>>
>>> I would make sure that the scripts are being stored, and check how you have PrivateData set.
>>>
>>> -Paul Edmon-
>>>
>>> On 10/2/2023 10:57 AM, Davide DelVento wrote:
>>>
>>> I deployed the job_script archival and it is working; however, it can be queried only by root.
>>>
>>> A regular user can run sacct -lj against any jobs (even those by other users, and that's okay in our setup) with no problem. However, if they run sacct -j job_id --batch-script, even against a job they own themselves, nothing is returned and I get a
>>>
>>> slurmdbd: error: couldn't get information for this user (null)(xxxxxx)
>>>
>>> where xxxxxx is the posix ID of the user who's running the query, in the slurmdbd logs.
>>>
>>> Neither config file, slurmdbd.conf nor slurm.conf, has any "permission" setting. FWIW, we use LDAP.
>>>
>>> Is that the expected behavior, in that by default only root can see the job scripts? I was assuming the users themselves should be able to debug their own jobs... Any hint on what could be changed to achieve this?
>>>
>>> Thanks!
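(For anyone landing here with the same "has no associations" error: a quick way to check whether a particular user already has an association is something along these lines, where some_user is a placeholder.)

    # List the associations known to slurmdbd for one user; an empty result
    # means the user has no association yet, and their accounting queries
    # will return nothing.
    sacctmgr show assoc where users=some_user format=Cluster,Account,User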
>>>
>>> On Fri, Sep 29, 2023 at 5:48 AM Davide DelVento <davide.quan...@gmail.com> wrote:
>>>
>>>> Fantastic, this is really helpful, thanks!
>>>>
>>>> On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon <ped...@cfa.harvard.edu> wrote:
>>>>
>>>>> Yes, it was later than that. If you are on 23.02 you are good. We've been running with storing job_scripts on for years at this point, and that part of the database only uses up 8.4G. Our entire database takes up 29G on disk, so it's about 1/3 of the database. We also have database compression, which helps with the on-disk size. Raw, uncompressed, our database is about 90G. We keep 6 months of data in our active database.
>>>>>
>>>>> -Paul Edmon-
>>>>>
>>>>> On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
>>>>>
>>>>> Sorry for the duplicate e-mail in a short time: do you (or anyone) know when the hashing was added? I was planning to enable this on 21.08, but we then had to delay our upgrade to it. I'm assuming later than that, as I believe that's when the feature was added.
>>>>>
>>>>> On Sep 28, 2023, at 13:55, Ryan Novosielski <novos...@rutgers.edu> wrote:
>>>>>
>>>>> Thank you; we'll put in a feature request for improvements in that area, and also thanks for the warning. I thought of that in passing, but the real-world experience is really useful. I could easily see wanting that stuff to be retained less often than the main records, which is what I'd ask for.
>>>>>
>>>>> I assume that archiving, in general, would also remove this stuff, since old jobs themselves will be removed?
>>>>>
>>>>> Ryan Novosielski - novos...@rutgers.edu
>>>>> Sr. Technologist, Office of Advanced Research Computing, Rutgers
>>>>>
>>>>> On Sep 28, 2023, at 13:48, Paul Edmon <ped...@cfa.harvard.edu> wrote:
>>>>>
>>>>> Slurm should take care of it when you add it.
>>>>>
>>>>> So far as horror stories: under previous versions our database size ballooned to be so massive that it actually prevented us from upgrading, and we had to drop the columns containing the job_script and job_env. This was back before Slurm started hashing the scripts so that it would only store one copy of duplicate scripts. After that point we found that the job_script part of the database stayed at a fairly reasonable size, as most users use functionally the same script each time. However, the job_env continued to grow like crazy, as there are variables in our environment that change fairly consistently depending on where the user is. Thus job_envs ended up being too massive to keep around, and so we had to drop them. Frankly, we never really used them for debugging. The job_scripts, though, are super useful and not that much overhead.
>>>>>
>>>>> In summary, my recommendation is to only store job_scripts. job_envs add too much storage for little gain, unless your job_envs are basically the same for each user in each location.
>>>>>
>>>>> Also, it should be noted that there is no way to prune out job_scripts or job_envs right now. So the only way to get rid of them if they get large is to 0 out the column in the table.
>>>>> You can ask SchedMD for the mysql command to do this, as we had to do it here to our job_envs.
>>>>>
>>>>> -Paul Edmon-
>>>>>
>>>>> On 9/28/2023 1:40 PM, Davide DelVento wrote:
>>>>>
>>>>> In my current Slurm installation (recently upgraded to Slurm v23.02.3), I only have
>>>>>
>>>>> AccountingStoreFlags=job_comment
>>>>>
>>>>> I now intend to add both
>>>>>
>>>>> AccountingStoreFlags=job_script
>>>>> AccountingStoreFlags=job_env
>>>>>
>>>>> leaving the default 4MB value for max_script_size.
>>>>>
>>>>> Do I need to do anything on the DB myself, or will Slurm take care of the additional tables if needed?
>>>>>
>>>>> Any comments/suggestions/gotchas/pitfalls/horror stories to share? I know about the additional disk space and potentially the load needed, and with our resources and typical workload I should be okay with that.
>>>>>
>>>>> Thanks!
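One note for anyone copying the lines above into slurm.conf: AccountingStoreFlags takes a single comma-separated list, so the flags discussed in this thread are normally combined on one line. A minimal sketch (the max_script_size value is only an example; omit that line to keep the 4MB default):

    # slurm.conf (sketch)
    AccountingStoreFlags=job_comment,job_script,job_env
    # Optional: raise the maximum stored batch-script size, in bytes (8 MB shown here)
    SchedulerParameters=max_script_size=8388608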