[ https://issues.apache.org/jira/browse/TORQUE-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839298#comment-17839298 ]
Max Philipp Wriedt edited comment on TORQUE-364 at 4/20/24 10:23 PM: --------------------------------------------------------------------- To point 3. we finally got a slightly nicer working prototype to share; {code:java} ### recordMapperBase.vm ---+ // at the top import the new runtime MappingStrategy class import org.apache.torque.om.mapper.MappingStrategy; --- ---+ // Should this be cached per RecordMapper (Thread safety/Multi query safety?) private MappingStrategy<${dbObjectClassName}> strategy; // A sample prepareStrategy implementation, still missing the new HashSet optimization! public MappingStrategy<${dbObjectClassName}> prepareStrategy(Criteria criteria, int offset) { MappingStrategy<${dbObjectClassName}> strategy = new MappingStrategy<${dbObjectClassName}>(); if (criteria == null) { return strategy; } int nextOuterOffset = offset + 1; List<Column> selectColumns = criteria.getSelectColumns(); List<Column> columnsWithoutOffset = selectColumns.subList( offset, selectColumns.size()); for (Column column : columnsWithoutOffset) { // Create a final local copy for this iteration to be passed into the lambdas final int nextOffset = nextOuterOffset; String columnExpression = column.getSqlExpression(); #set ( $else = "" ) #foreach ($columnElement in $torqueGen.getChildren("column")) #set ( $setter = $columnElement.getAttribute("setter") ) #set ( $getter = $columnElement.getAttribute("getter") ) #set ( $peerColumnName = $columnElement.getAttribute("peerColumnName") ) #set ( $fieldType = $columnElement.getAttribute("fieldType") ) ${else}if (${peerColumnName}_EXPRESSION.equals(columnExpression)) { strategy.addColumn(nextOffset, (res, inst) -> inst.${setter}(this.${getter}(res, nextOffset))); } #set ( $else = "else ") #end nextOuterOffset += 1; } strategy.finish($torqueGen.getChildren("column").size()); return strategy; } --- ---+ // in processRow() // this could theoretically replace BOTh if cases (criteria == null AND != null) // However criteria == null might still be slightly faster than this! // Wherever the MappingStrategy is ultimately stored should probably handle invalidation // Storing it in Criteria for example would allow invalidating whenever a select columns is added/modified // However, for testing purposes we just placed it directly in processRow to show viability. if (this.strategy == null) { this.strategy = prepareStrategy(criteria, offset); } this.strategy.execute(resultSet, $field); --- ### MappingStrategy.java package org.apache.torque.om.mapper; import org.apache.commons.lang3.tuple.Pair; import org.apache.commons.lang3.function.FailableBiConsumer; import org.apache.torque.TorqueException; import java.sql.ResultSet; import java.util.ArrayList; import java.util.Comparator; import java.util.List; public class MappingStrategy<T> { private final List<Pair<Integer, FailableBiConsumer<ResultSet, T, TorqueException>>> strategy; private boolean allSet; public MappingStrategy() { this.strategy = new ArrayList<>(); this.allSet = false; } public void addColumn(int offset, FailableBiConsumer<ResultSet, T, TorqueException> setter) { this.strategy.add(Pair.of(offset, setter)); } public void finish(int num_fields) { // The list should already be in the correct order because Criteria loops through the columns // in the same order in which they are added to the SQL statement but just in case something weird // is being done this gets us closer to the desired contract of ResultSet of looping over monotonically // increasing indices of columns only. this.strategy.sort(Comparator.comparing(Pair::getLeft)); this.allSet = this.strategy.size() == num_fields; } public boolean isEmpty() { return this.strategy.isEmpty(); } public boolean isAllSet() { return this.allSet; } public void execute(ResultSet result, T instance) throws TorqueException { for (Pair<Integer, FailableBiConsumer<ResultSet, T, TorqueException>> strategy : this.strategy) { strategy.getRight().accept(result, instance); } } } {code} This code adds a single MappingStrategy<T> class which holds a list of lambdas which know how to best query a ResultSet to construct a T from it as well as the necessary code for a RecordMapper to utilize this MappingStrategy to speed up object construction significantly! However, the actual strategy object is obviously tied to a given Criterias/Queries selected columns and thus can't be kept by the RecordMapperXXX objects themselves. (This would lead to exceptions with multple threads doing queries and would require extensive cache invalidation checks for all processRow invokations to create a new strategy even in single threaded execution!) As such I would like to open the discussion wether or not this implementation seems worthwhile and in case it does, how best to store the MappingStrategy object so it can be passed to the relevant RecordMapper, is kept up to date with the given Criteria/ResultSet passed to its processRow and is invalidated correctly. [~refarb] [~mwriedt] (Sorry for still not having my actual account back up yet) was (Author: JIRAUSER299058): To point 3. we finally got a slightly nicer working prototype to share; {code:java} ### recordMapperBase.vm ---+ // at the top import the new runtime MappingStrategy class import org.apache.torque.om.mapper.MappingStrategy; --- ---+ // Should this be cached per RecordMapper (Thread safety/Multi query safety?) private MappingStrategy<${dbObjectClassName}> strategy; // A sample prepareStrategy implementation, still missing the new HashSet optimization! public MappingStrategy<${dbObjectClassName}> prepareStrategy(Criteria criteria, int offset) { MappingStrategy<${dbObjectClassName}> strategy = new MappingStrategy<${dbObjectClassName}>(); if (criteria == null) { return strategy; } int nextOuterOffset = offset + 1; List<Column> selectColumns = criteria.getSelectColumns(); List<Column> columnsWithoutOffset = selectColumns.subList( offset, selectColumns.size()); for (Column column : columnsWithoutOffset) { // Create a final local copy for this iteration to be passed into the lambdas final int nextOffset = nextOuterOffset; String columnExpression = column.getSqlExpression(); #set ( $else = "" ) #foreach ($columnElement in $torqueGen.getChildren("column")) #set ( $setter = $columnElement.getAttribute("setter") ) #set ( $getter = $columnElement.getAttribute("getter") ) #set ( $peerColumnName = $columnElement.getAttribute("peerColumnName") ) #set ( $fieldType = $columnElement.getAttribute("fieldType") ) ${else}if (${peerColumnName}_EXPRESSION.equals(columnExpression)) { strategy.addColumn(nextOffset, (res, inst) -> inst.${setter}(this.${getter}(res, nextOffset))); } #set ( $else = "else ") #end nextOuterOffset += 1; } strategy.finish($torqueGen.getChildren("column").size()); return strategy; } --- ---+ // in processRow() // this could theoretically replace BOTh if cases (criteria == null AND != null) // However criteria == null might still be slightly faster than this! // Wherever the MappingStrategy is ultimately stored should probably handle invalidation // Storing it in Criteria for example would allow invalidating whenever a select columns is added/modified // However, for testing purposes we just placed it directly in processRow to show viability. if (this.strategy == null) { this.strategy = prepareStrategy(criteria, offset); } this.strategy.execute(resultSet, $field); --- ### MappingStrategy.java package org.apache.torque.om.mapper; import org.apache.commons.lang3.tuple.Pair; import org.apache.commons.lang3.function.FailableBiConsumer; import org.apache.torque.TorqueException; import java.sql.ResultSet; import java.util.ArrayList; import java.util.Comparator; import java.util.List; public class MappingStrategy<T> { private final List<Pair<Integer, FailableBiConsumer<ResultSet, T, TorqueException>>> strategy; private boolean allSet; public MappingStrategy() { this.strategy = new ArrayList<>(); this.allSet = false; } public void addColumn(int offset, FailableBiConsumer<ResultSet, T, TorqueException> setter) { this.strategy.add(Pair.of(offset, setter)); } public void finish(int num_fields) { // The list should already be in the correct order because Criteria loops through the columns // in the same order in which they are added to the SQL statement but just in case something weird // is being done this gets us closer to the desired contract of ResultSet of looping over monotonically // increasing indices of columns only. this.strategy.sort(Comparator.comparing(Pair::getLeft)); this.allSet = this.strategy.size() == num_fields; } public boolean isEmpty() { return this.strategy.isEmpty(); } public boolean isAllSet() { return this.allSet; } public void execute(ResultSet result, T instance) throws TorqueException { for (Pair<Integer, FailableBiConsumer<ResultSet, T, TorqueException>> strategy : this.strategy) { strategy.getRight().accept(result, instance); } } } {code} This code adds a single MappingStrategy<T> class which holds a list of lambdas which know how to best query a ResultSet to construct a T from it as well as the necessary code for a RecordMapper to utilize this MappingStrategy to speed up object construction significantly! However, the actual strategy object is obviously tied to a given Criterias/Queries selected columns and thus can't be kept by the RecordMapperXXX objects themselves. (This would lead to exceptions with multple threads doing queries and would require extensive cache invalidation checks for all processRow invokations to create a new strategy even in single threaded execution!) As such I would like to open the discussion wether or not this implementation seems worthwhile and in case it does, how best to store the MappingStrategy object so it can be passed to the relevant RecordMapper, is kept up to date with the given Criteria/ResultSet passed to its processRow and is invalidated correctly. [~refarb] [~mwriedt] (Sorry for still not having my actual account back up yet) > RecordMapper very slow on many columns in table > ----------------------------------------------- > > Key: TORQUE-364 > URL: https://issues.apache.org/jira/browse/TORQUE-364 > Project: Torque > Issue Type: Improvement > Components: Runtime, Templates > Affects Versions: 5.1 > Reporter: Max Philipp Wriedt > Priority: Major > Labels: criteria_api, om, performance, recordmapper, templates > > When "doSelect()" a large quantity of columns in a table the default > RecordMappers generated by Om-Templates (processRow()) cause an > !https://wikimedia.org/api/rest_v1/media/math/render/svg/4441d9689c0e6b2c47994e2f587ac5378faeefba! > Problem. (technically O(rows * columns)) > Specifically, constantly generating the SQL expression of all possible > columns for every row in the result causes excessive use of StringBuilders > which slow the mapping process to a crawl. > I currently have two ideas on how best to tackle this problem: > # Either generate all SQL column expressions once when a (template > generated) RecordMapper is first created (using final static String fields) > thus reducing the cost for every row to generating all selected column SQL > expressions once(instead of every selected column times every available > column) > # Or (in case the first approach generates unacceptably excessive number of > fields for RecordMappers) adjust the RecordMapper API to allow a > "prepare(Criteria, int offset)" method to be called once before processing > any rows and implement it on generated RecordMappers to scan the Criteria and > build two lists: One containing references to the setXXX methods of the > mapper in the order they appear in the ResultSet (via the order in the > Criteria) and a second list containing the corresponding column offsets. This > would allow the processRow method to only iterated over both lists > simultaneously and call the referenced methods with the result set and offset > immediately. (Alternatively one list using lambdas could be used but I am > currently not sure about the stance or impact of these lambdas in the Torque > project.) > credits to [~refarb] -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: torque-dev-unsubscr...@db.apache.org For additional commands, e-mail: torque-dev-h...@db.apache.org