Re: GSoC 2026: Extend the static analysis pass

Ridham Khurana via Gcc Wed, 18 Mar 2026 07:27:08 -0700

Hi Dave,

Thanks again for your guidance and confirmation regarding the project size.


As you suggested, I have submitted a patch to [email protected]. This
patch adds a new *kf_getenv* handler, which bifurcates on the return value
into NULL and non-NULL paths, so that the analyzer can now warn on
unchecked *getenv* dereferences, and also checks the argument for
null-termination.
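
For illustration, here is a minimal sketch of the kind of code this affects.
It is not the test case from the patch: both function names are hypothetical,
and the exact wording of the diagnostic is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the patch.  getenv returns NULL
   when the variable is not set, so reading value[0] without a check is
   the "unchecked getenv dereference" pattern described above.  */
int env_is_empty_unchecked (const char *name)
{
  assert (name != NULL);
  const char *value = getenv (name);
  return value[0] == '\0';   /* value may be NULL on this path */
}

/* Hypothetical helper: the checked variant.  The NULL path returns
   early, so the dereference is only reached on the non-NULL path.  */
int env_is_unset_or_empty_checked (const char *name)
{
  assert (name != NULL);
  const char *value = getenv (name);
  if (value == NULL)
    return 1;                /* variable is not set */
  return value[0] == '\0';   /* safe: value is known to be non-NULL */
}
```

With getenv modelled as returning either NULL or non-NULL, compiling with
-fanalyzer should flag the dereference in the first function and stay quiet
on the second.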

The new getenv-1.c test passes (9/9). I also ran the full analyzer.exp
suite to check for regressions and saw no new failures.

This was a helpful exercise for getting more familiar with the analyzer
internals, the known-function modelling, and the GCC test workflow, as you
suggested in your last email.

I have started drafting the official GSoC proposal in parallel, and I will
share it for review once it's ready. I would also appreciate your
feedback on the patch.

Best,
Ridham Khurana

On Sat, Mar 14, 2026 at 5:24 AM David Malcolm <[email protected]> wrote:

> On Fri, 2026-03-13 at 12:23 +0530, Ridham Khurana wrote:
> > Hi Dave,
> >
> > Thanks for the confirmation about the expected type argument, I will
> > add it
> > in the shared layer.
> >
> > While going through the current analyzer implementation, I noticed
> > that
> > arguments to the function calls are retrieved through
> > *call_details::get_arg_svalue()* and then handled as const svalue*,
> > rather
> > than *tree* nodes like in the frontend and GIMPLE passes. From what I
> > can
> > understand, the behaviour of library calls is modelled through
> > *known_function* handlers interacting with the *region_model* (for
> > example
> > through *impl_call_pre in kf_ handlers*), and then the existing
> > checks for
> > functions like printf are mostly driven by format attribute and the
> > validation of format string arguments (for example using
> > *check_for_null_terminated_string_arg()*), instead of interpreting
> > the
> > individual directives.
>
> That's correct.
>
> >
> > But one thing that I am not sure about is where the shared string-
> > parser
> > should be integrated on the analyzer side. Maybe it should be triggered
> > through the attribute-based path, or it is better to use it inside
> > the
> > individual kf_* handlers for printf-style functions.
>
> I'm not sure.  I think we want a subroutine inside the analyzer that
> can be called from either place, and then see how well each approach
> works.
>
> On the subject of known_function handlers, some other GSoC candidates
> have had success in making patches that add new known_function
> subclasses for specific POSIX/C stdlib entrypoints.  This is a
> relatively easy and self-contained way to improve -fanalyzer, and it's
> a good way to demonstrate technical prowess, and to shake out any
> problems that a candidate might run into building/debugging gcc on
> their hardware.  It overlaps with the format-string support, so would
> be a useful learning experience - but you'd have to choose a simpler
> API entrypoint (obviously we don't have the format-string parsing in
> convenient modular form yet).
>
> >
> > Also, before starting to draft the official proposal, I wanted to
> > confirm
> > the expected size of this project. From my current understanding, it
> > would
> > be 350 hours,
>
> I think 350 hours is the better choice; this is a rather ambitious
> project.
>
>
> > dividing this project into 2 major phases, the first phase of
> > the project to unify the parsing logic among all 3 subsystems
>
> (it would be the *2* subsystems at this time, since the analyzer
> doesn't yet support format strings)
>
> > and the
> > second phase to be the actual work on the analyzer part. Please let
> > me know
> > if this matches your expectations, or whether you would prefer a 175-hour scope.
>
> FWIW I'm always a bit sceptical of timetables that rigidly divide
> projects into phases - it feels too much like the "waterfall" model of
> development.  But yes, splitting out the parsing logic from the other 2
> subsystems is a prerequisite before using it in -fanalyzer (I suppose
> you could have a proof-of-concept that recognizes hardcoded strings and
> provides the analyzer with the (hardcoded) action list, but that's
> probably wasted effort compared to simply doing the refactoring work).
>
> A useful exercise would be to get familiar with running gcc's full test
> suite, and verifying that a patch doesn't regress anything, since
> that's very important during the refactoring of the existing code.
>
> Hope this is helpful and makes sense; let me know if you have any
> questions.
> Dave
>
>
>
>
