Hi Dave,

Thanks again for your guidance and for confirming the project size.

As you suggested, I have submitted a patch to [email protected]. The patch adds a new *kf_getenv* handler that bifurcates on the return value into NULL and non-NULL paths, so that the analyzer can now warn on unchecked dereferences of the *getenv* result. The handler also checks that the argument is null-terminated.

The new getenv-1.c test passes (9/9). I also ran the full analyzer.exp suite to check for regressions, and there were no new failures.

This was a helpful exercise for getting more familiar with the analyzer internals, the known-function modelling, and the GCC test workflow, as you suggested in your last email.

I have started drafting the official GSoC proposal in parallel, and I will share it for review once it's ready. I would also appreciate your feedback on the patch.

Best,
Ridham Khurana

On Sat, Mar 14, 2026 at 5:24 AM David Malcolm <[email protected]> wrote:

> On Fri, 2026-03-13 at 12:23 +0530, Ridham Khurana wrote:
> > Hi Dave,
> >
> > Thanks for the confirmation about the expected type argument, I will
> > add it in the shared layer.
> >
> > While going through the current analyzer implementation, I noticed
> > that arguments to the function calls are retrieved through
> > *call_details::get_arg_svalue()* and then handles as const svalue*,
> > rather than *tree* nodes like in the frontend and GIMPLE passes. From
> > what I can understand, the library calls behaviour is modelled through
> > *known_function* handlers interacting with the *region_model*(for
> > example through *impl_call_pre in kf_ handlers*), and then the
> > existing checks for functions like printf are mostly driven by format
> > attribute and the validation of format string arguments(for example
> > using *check_for_null_terminated_string_arg()*), instead of
> > interpreting the individual directives.
>
> That's correct.
>
> >
> > But one thing that I am not sure about is where the shared
> > string-parser show be integrated on the analyzer side.
> > Maybe it should be triggered through the attribute-based path , or it
> > is better to use it inside the individual kf_* handlers for the
> > functions like printf-style.
>
> I'm not sure. I think we want a subroutine inside the analyzer that
> can be called from either place, and then see how well each approach
> works.
>
> On the subject of known_function handlers, some other GSoC candidates
> have had success in making patches that add new known_function
> subclasses for specific POSIX/C stdlib entrypoints. This is a
> relatively easy and self-contained way to improve -fanalyzer, and it's
> a good way to demonstrate technical prowess, and to shake out any
> problems that a candidate might run into building/debugging gcc on
> their hardware. It overlaps with the format-string support, so would
> be a useful learning experience - but you'd have to choose a simpler
> API entrypoint (obviously we don't have the format-string parsing in
> convenient modular form yet).
>
> >
> > Also, before starting to draft the official proposal, I wanted to
> > confirm the expected size of this project. From my current
> > understanding, it would be 350 hours,
>
> I think 350 hours is the better choice; this is a rather ambitious
> project.
>
> > dividing this project into 2 major phases, the first phase of
> > the project to unify the parsing logic among all 3 subsystems
>
> (it would be the *2* subsystems at this time, since the analyzer
> doesn't yet support format strings)
>
> > and the second phase to be the actual work on the analyzer part.
> > Please let me know if it matches your expectations or would you
> > prefer 175 hour scope?
>
> FWIW I'm always a bit sceptical of timetables that rigidly divide
> projects into phases - it feels too much like the "waterfall" model of
> development.
> But yes, splitting out the parsing logic from the other 2
> subsystems is a prerequisite before using it in -fanalyzer (I suppose
> you could have a proof-of-concept that recognizes hardcoded strings and
> provides the analyzer with the (hardcoded) action list, but that's
> probably wasted effort compared to simply doing the refactoring work).
>
> A useful exercise would be to get familiar with running gcc's full test
> suite, and verifying that a patch doesn't regress anything, since
> that's very important during the refactoring of the existing code.
>
> Hope this is helpful and makes sense; let me know if you have any
> questions
> Dave
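
For illustration, the pattern that the getenv modelling described at the top of this mail is meant to catch can be sketched in C as below. This is only a minimal sketch under my own assumptions, not the contents of getenv-1.c: the function names are hypothetical, and the exact diagnostic the analyzer emits for the unchecked case is not shown here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: dereferences the result of getenv without a
   NULL check.  Once the return value is modelled as "NULL or
   non-NULL", this is the kind of code expected to be flagged on the
   NULL path.  */
char
first_char_unchecked (const char *name)
{
  const char *value = getenv (name);
  return value[0]; /* undefined behaviour if NAME is not set */
}

/* Hypothetical example: the checked variant.  The NULL path returns
   early, so the dereference inside strlen only happens on the
   non-NULL path.  */
size_t
env_length_checked (const char *name)
{
  const char *value = getenv (name);
  if (value == NULL)
    return 0;
  return strlen (value);
}
```

Compiling a file like this with -fanalyzer should, if the handler works as described, complain about first_char_unchecked and stay quiet for env_length_checked.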
