Hello again,

Following up on my email from yesterday, I have finished drafting the second proposal I mentioned regarding support for printf-style formatted functions in -fanalyzer.

This project focuses on refactoring the existing format-string parsing logic (from c-format.cc and gimple-ssa-sprintf.cc) into a unified internal API. This would allow the analyzer to perform path-sensitive buffer overflow detection for sprintf and fprintf, addressing PR107017.

Link to Printf-style Proposal:
https://drive.google.com/file/d/1x2DvcwNoNs-we24EMOdHyas9sTAO1DhA/view?usp=sharing

Now that I have shared both the FAM (Flexible Array Member) proposal and this printf/refactoring proposal, I would appreciate your guidance on which direction the community feels is more impactful for the analyzer's roadmap. Should I focus on refining one of these exclusively, or is it helpful to continue developing both at this stage?

Thank you for your time and feedback.

Best regards,
Virginia Kodsy

On Sun, Mar 22, 2026 at 1:39 PM Virginia Hany <[email protected]> wrote:
> Hello GCC community,
>
> My name is Virginia Kodsy, and I am interested in applying for Google
> Summer of Code 2026 to work on the GCC static analyzer (-fanalyzer).
>
> Over the past few weeks, I have been exploring the analyzer’s internals
> and working on small contributions to gain familiarity with the codebase.
> Specifically, I have been implementing models for functions like getenv and
> strcmp within the known_function framework, which has helped me understand
> region_model, svalue types, and constraint handling.
>
> I have prepared a draft proposal focused on improving the detection of
> out-of-bounds accesses for Flexible Array Members (FAMs).
>
> Project Title: Improving Detection of Out-of-Bounds Accesses for FAMs in
> GCC Static Analyzer
>
> Brief Summary: The project aims to enhance symbolic capacity tracking and
> constraint propagation to better detect OOB accesses in FAMs, particularly
> in complex cases involving symbolic allocation sizes and realloc patterns
> where the current analyzer often loses track of region bounds.
>
> Draft Proposal:
> https://drive.google.com/drive/folders/1hcfYmvJ7mSvdpp7c4V7ChFZAN7sv6j8h?usp=sharing
>
> I would greatly appreciate your feedback on:
> 1. The technical feasibility of the proposed approach for tracking
> symbolic FAM sizes.
> 2. Whether the scope is appropriate for a GSoC timeline or if it should be
> narrowed/expanded.
> 3. Any specific edge cases in FAM handling that you believe should be
> prioritized.
>
> I also have an interest in improving general string handling support.
> While I am prepared to submit multiple proposals, would you recommend
> focusing my efforts on refining this FAM proposal to a higher standard
> instead?
>
> Thank you for your time and guidance.
>
> Best regards,
> Virginia Kodsy
>
