As a note, most database systems specify the allowed memory size on a per-operator basis.
PostgreSQL calls it "work_mem" [1], MySQL calls it "tmp_table_size" [2], and Oracle used to call it "sort_area_size" but now has a newer setting called "pga_aggregate_target" [3].

-Stephen

[1] http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html
[2] http://dev.mysql.com/doc/refman/5.6/en/internal-temporary-tables.html
[3] http://download.oracle.com/docs/cd/B28359_01/server.111/b28320/initparams232.htm

On Tue, Nov 1, 2011 at 3:59 PM, Stephen Allen <[email protected]> wrote:
> All,
>
> I am working on JENA-119, and wanted to get some feedback on an external
> user-facing change.
>
> I'd like to consolidate the "spillOnDiskSortingThreshold",
> "spillOnDiskUpdateThreshold", and any potential future
> "spillOnDisk*Threshold" parameters into a single variable. Separate
> symbols for each operator do not seem to scale well; we could potentially
> have about 10 different operations that would require a setting. Also, I
> don't think that a user will really have a good notion of what to set it to.
>
> I propose the name "workCount" for the variable. I picked this because it
> captures the idea of storing that many items (mostly bindings) in memory as
> a count. In the future I think we would want something like "workMem" to
> specify the amount of memory each operator can use rather than the count of
> the items. I have a mild aversion to "spillToDiskThreshold", as I think it
> might focus too much on the implementation details, and does not indicate
> what its units of measurement are (count vs. memory size). But I want to
> know your opinions. Since this is a user-facing change, we want to make
> sure to get it right the first time, as it will be hard to change later.
>
> So two questions:
> 1) Should I consolidate the parameters?
> 2) Is "workCount" a good name?
>
> -Stephen
