Change temporary table threshold policy from count to memory size
-----------------------------------------------------------------

                 Key: JENA-126
                 URL: https://issues.apache.org/jira/browse/JENA-126
             Project: Jena
          Issue Type: Improvement
          Components: ARQ
            Reporter: Stephen Allen


The "workCount" setting for temporary table sizes is not a good configuration 
option.  Binding sizes can vary from as little as 32 bytes (8-byte ref to the 
binding + 8-byte ref to a variable + 8-byte nodeID + 8 bytes of object 
overhead) to bindings holding multi-megabyte strings.  Asking the user to know 
which case is likely, and then how that count translates into memory usage 
(the real resource we are attempting to control), is way too much IMO.
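
To make the mismatch concrete, here is a rough back-of-the-envelope sketch in 
Java.  Only the 32-byte minimum comes from the breakdown above; the class 
name, the 10,000-binding threshold, and the 2 MB string size are assumptions 
picked purely for illustration, not values taken from ARQ.

```java
public class BindingFootprint {
    // Smallest binding from the breakdown above: four 8-byte components.
    public static final long MIN_BINDING_BYTES = 8 + 8 + 8 + 8;

    // Estimated memory held by `count` bindings of `bytesPerBinding` each.
    public static long estimateBytes(long count, long bytesPerBinding) {
        return Math.multiplyExact(count, bytesPerBinding);
    }

    public static void main(String[] args) {
        long threshold = 10_000;               // hypothetical count-based threshold
        long largeBinding = 2L * 1024 * 1024;  // hypothetical binding holding a 2 MB string

        System.out.println("small bindings: "
                + estimateBytes(threshold, MIN_BINDING_BYTES) + " bytes");
        System.out.println("large bindings: "
                + estimateBytes(threshold, largeBinding) + " bytes");
    }
}
```

Under these assumed numbers the same count threshold corresponds to 320,000 
bytes in one case and 20,971,520,000 bytes in the other, which is why a count 
says very little about the memory actually consumed.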

OK, so what the user wants is a way to specify the amount of memory that can be 
used by each query operator for temporary tables [1][2][3].  Hmm, wait, no, 
maybe what he wants is a way to specify the total memory used for temporary 
tables per query?  No, maybe he wants to specify it for the whole query engine.

But that last paragraph is not accurate.  What he *really* wants is a system 
that answers all of his queries for whatever data he has as fast as possible.  
He doesn't want to have to configure any parameters.  Unfortunately, this is a 
really hard dynamic optimization problem, so we foist it off on the user, 
hoping he'll be able to come up with some value.

We need to decide on what we want to use as a config parameter.  I believe it 
should be a "workMem" or "tmpTableSize" setting that specifies the max memory 
usage of a temporary table before it is converted into an on-disk table.
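
A minimal sketch of what such a memory-based policy could look like follows.  
Everything in it is hypothetical: the WorkMemTable class, its methods, and the 
caller-supplied size estimates are illustrative assumptions and not existing 
ARQ APIs, and the on-disk table is stubbed out with a row counter.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkMemTable {
    private final long workMemBytes;      // hypothetical "workMem" limit, in bytes
    private final List<String> inMemory = new ArrayList<>();
    private long usedBytes = 0;           // running estimate of in-memory size
    private long rowsOnDisk = 0;          // stand-in for a real on-disk table
    private boolean spilled = false;

    public WorkMemTable(long workMemBytes) {
        if (workMemBytes <= 0) {
            throw new IllegalArgumentException("workMemBytes must be positive");
        }
        this.workMemBytes = workMemBytes;
    }

    // Adds a row with a caller-supplied size estimate.  The table spills
    // the first time adding a row would push the estimate over the limit.
    public void add(String row, long estimatedBytes) {
        if (!spilled && usedBytes + estimatedBytes > workMemBytes) {
            spill();
        }
        if (spilled) {
            rowsOnDisk++;                 // a real table would append the row to a file
        } else {
            inMemory.add(row);
            usedBytes += estimatedBytes;
        }
    }

    // Moves all buffered rows to the (stubbed) on-disk table.
    private void spill() {
        rowsOnDisk += inMemory.size();
        inMemory.clear();
        usedBytes = 0;
        spilled = true;
    }

    public boolean isSpilled() { return spilled; }
    public long rowsOnDisk() { return rowsOnDisk; }
    public int rowsInMemory() { return inMemory.size(); }

    public static void main(String[] args) {
        WorkMemTable table = new WorkMemTable(100);   // assumed 100-byte limit
        for (int i = 0; i < 4; i++) {
            table.add("row" + i, 32);                 // assumed 32 bytes per row
            System.out.println("after row " + i + ": spilled=" + table.isSpilled());
        }
    }
}
```

The point of the sketch is that the trigger depends on estimated bytes rather 
than on the number of rows, so a handful of very large bindings and a large 
number of tiny ones would spill at comparable memory footprints.  How to 
estimate the size of a binding cheaply is left open here.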


[1] This is what most DB systems provide; specifically, PostgreSQL and MySQL 
both have per-operator temporary table sizes.  PostgreSQL calls the setting 
"work_mem" and MySQL calls it "tmp_table_size".
[2] http://www.postgresql.org/docs/8.3/static/runtime-config-resource.html
[3] http://dev.mysql.com/doc/refman/5.0/en/internal-temporary-tables.html




        
