POI 3.12 In looking at the styles.xml file generated by POI, It appears that a lot of borders and fills are created that are not used. Maybe POI needs a method to trim off those unused fills and borders on save. This method would also be able to normalize the fills and borders to remove duplicates and system colors (i.e. fgColor = 64 or bgColor = 65). I notice that Excel does this when it saves a spreadsheet created by POI. Here is an example of the fills, first as created by POI, second as saved by Excel (same spreadsheet):
POI 3.12 <fills count="9"> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="darkGray" /> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="64" /> </patternFill> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="64" /> <bgColor indexed="65" /> </patternFill> </fill> <fill> <patternFill> <fgColor indexed="64" /> <bgColor indexed="65" /> </patternFill> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="44" /> </patternFill> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="44" /> <bgColor indexed="65" /> </patternFill> </fill> <fill> <patternFill patternType="solid"> <fgColor indexed="44" /> <bgColor indexed="65" /> </patternFill> </fill> <fill> <patternFill patternType="none"> <bgColor indexed="65" /> </patternFill> </fill> </fills> Excel 2010 Save As (same spreadsheet) <fills count="4"> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="gray125" /> </fill> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="solid"> <fgColor indexed="44" /> </patternFill> </fill> </fills> Note, not even Excel was perfect as it left a duplicate fill (0, and 2). Borders was even more dramatic POI generated 94, Excel saved 13. And now that I look at it, POI generated 42 styles while Excel kept 23. I understand that you are keeping all the fills borders and styles as they are specified, and re-using them as you get to a final setting, but normalizing and trimming seem to be what you are missing. That can also be applied to fonts and formats. Some additional thoughts, based on not yet looking at all the code. I can see three steps to collecting all the garbage: Normalization, Duplicate removal, Orphan removal. Normalization would involve setting default values, and removing system colors. On the fill above, the POI generated fill would look like this after normalization: <fills count="9"> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="darkGray" /> </fill> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="44" /> </patternFill> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="44" /> </patternFill> </fill> <fill> <patternFill patternType="solid"> <fgColor indexed="44" /> </patternFill> </fill> <fill> <patternFill patternType="none" /> </fill> </fills> Duplicate removal would remove the many duplicates left by normalization. This would have to be carried back to the styles themselves and potentially to individual cells in the spreadsheet that contain fill id's. After duplicate removal you would have something like the following in the fills: <fills count="4"> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="darkGray" /> </fill> <fill> <patternFill patternType="none"> <fgColor indexed="44" /> </patternFill> </fill> <fill> <patternFill patternType="solid"> <fgColor indexed="44" /> </patternFill> </fill> </fills> And I just happen to know that my spreadsheet never uses fill "none" - 44, so that could d be stripped out during orphan removal to bring the fill to: <fills count="3"> <fill> <patternFill patternType="none" /> </fill> <fill> <patternFill patternType="darkGray" /> </fill> <fill> <patternFill patternType="solid"> <fgColor indexed="44" /> </patternFill> </fill> </fills> I suspect that the same process could be used for borders, fonts, formats and styles where the order of operation would be normalization and duplicate removal for fills, borders, fonts, formats, then styles. This followed by orphan removal for styles then fills, borders, fonts, and formats.