GitHub user cramja opened a pull request:
https://github.com/apache/incubator-quickstep/pull/109
Refactored bulk insertion to the SplitRow store
The inner loop of the insert algorithm now makes only the function calls
that are strictly necessary. In addition, copies that come from another
row-store source are merged, which speeds up insertion.
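To illustrate what merging copies can look like, here is a minimal sketch. It
is not the code in this PR: CopyRun, MergeContiguousRuns and CopyRow are
hypothetical names, and the sketch assumes that each attribute copy is
described by a source offset, a destination offset and a size within a row.
Runs that are adjacent in both the source and the destination row collapse
into a single memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical descriptor (not a Quickstep type): copy `size` bytes from
// `src_offset` in the source row to `dst_offset` in the destination row.
struct CopyRun {
  std::size_t src_offset;
  std::size_t dst_offset;
  std::size_t size;
};

// Merge runs that are back-to-back in BOTH the source and destination rows,
// so that several per-attribute copies become one larger copy.
std::vector<CopyRun> MergeContiguousRuns(const std::vector<CopyRun> &runs) {
  std::vector<CopyRun> merged;
  for (const CopyRun &run : runs) {
    if (!merged.empty()) {
      CopyRun &last = merged.back();
      if (last.src_offset + last.size == run.src_offset &&
          last.dst_offset + last.size == run.dst_offset) {
        last.size += run.size;  // Extend the previous run instead of adding one.
        continue;
      }
    }
    merged.push_back(run);
  }
  return merged;
}

// Apply the (possibly merged) runs to a single row: one memcpy per run.
void CopyRow(const std::vector<CopyRun> &runs, const char *src_row,
             char *dst_row) {
  for (const CopyRun &run : runs) {
    std::memcpy(dst_row + run.dst_offset, src_row + run.src_offset, run.size);
  }
}
```

The merge is computed once per bulk insert and reused for every row, so the
inner loop issues fewer, larger copies.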
Also adds support for 'partial inserts'. A partial insert writes only a
subset of the columns at a time. Partial inserts will be used in a later
commit.
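As a rough picture of a partial insert, the sketch below fills one column of
a row-major block at a time and leaves the other columns untouched. It is an
illustration under assumptions, not the API added by this PR: RowLayout and
PartialInsertColumn are hypothetical names, and all columns are assumed to be
fixed-width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-width row layout (not a Quickstep type).
struct RowLayout {
  std::size_t row_size;                     // Bytes per row.
  std::vector<std::size_t> column_offsets;  // Byte offset of each column.
  std::vector<std::size_t> column_sizes;    // Byte width of each column.
};

// Write `num_rows` densely packed values of one column into the block.
// Bytes belonging to other columns are not modified, so a later call can
// fill them in.
void PartialInsertColumn(const RowLayout &layout, std::size_t column,
                         const void *values, std::size_t num_rows,
                         char *block) {
  assert(column < layout.column_offsets.size());
  const char *src = static_cast<const char *>(values);
  const std::size_t width = layout.column_sizes[column];
  for (std::size_t row = 0; row < num_rows; ++row) {
    std::memcpy(block + row * layout.row_size + layout.column_offsets[column],
                src + row * width, width);
  }
}
```

Calling PartialInsertColumn once per column, possibly from different sources,
eventually produces complete rows.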
*Testing*
Unit tests have been updated. The old bulkInsert tests needed to be
modified because a block is now sometimes filled only up to a threshold
value rather than completely. This reduces the runtime of the costly
inner loop at the cost of a few tuples.
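One way such a threshold could work is sketched below. This is an assumption
for illustration, not a description of the PR: RowsToInsertWithoutChecks is a
hypothetical helper, and it assumes that a conservative upper bound on the
size of a row is known up front. The number of rows is computed once, so the
inner loop needs no per-row space check, and any leftover space smaller than
one maximum-size row stays unused.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many rows can be inserted without checking the
// remaining space after every row. `max_row_bytes` is an assumed upper bound
// on the size of any single row.
std::size_t RowsToInsertWithoutChecks(std::size_t free_bytes,
                                      std::size_t max_row_bytes,
                                      std::size_t rows_available) {
  if (max_row_bytes == 0) {
    return rows_available;  // Degenerate case: rows take no space.
  }
  // Integer division rounds down, so the block may end up slightly underfull.
  return std::min(rows_available, free_bytes / max_row_bytes);
}
```

With 1000 free bytes and a 48-byte bound, 20 rows are inserted and 40 bytes
are left unused, which is the kind of small loss described above.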
*Performance*
I had a [similar PR-100
open](https://github.com/apache/incubator-quickstep/pull/100) last week. I ran
TPC-H SF100 queries 1-17 on this branch and on the branch from PR-100. They
performed within 1% of each other, so this branch is as fast as the PR-100
branch (which was 2x as fast as the base).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cramja/incubator-quickstep splitrow_insert_refactor
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-quickstep/pull/109.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #109
----
commit 4ce5acf046e0d5fce320efcae7aea648549e98e9
Author: cramja <[email protected]>
Date: 2016-10-05T21:40:30Z
Refactored bulk insertion to the SplitRow store
The inner loop of the insert algorithm has been changed to reduce
function calls to only those that are absolutely necessary. Also, we
merge copies which come from other rowstore source, speeding up
insertion time.
Also adds support for the idea of 'partial inserts'. Partial
inserts are when you are only inserting a subset of the columns at a
time. Partial inserts will be used in a later commit.
----