[
https://issues.apache.org/jira/browse/ORC-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963133#comment-15963133
]
ASF GitHub Bot commented on ORC-168:
------------------------------------
GitHub user Citrullin opened a pull request:
https://github.com/apache/orc/pull/104
Core documentation fixes
The batch.size counter is incremented in the wrong place, and the
MapColumnVector is mislabeled as ListColumnVector.
I also filed a JIRA ticket for it:
https://issues.apache.org/jira/browse/ORC-168
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Citrullin/orc master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/orc/pull/104.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #104
----
commit c1083247a917f18289761bcf13f902555b5cbda8
Author: Philipp Blum
Date: 2017-04-10T16:26:06Z
fix wrong batch.size counter
commit b4c75761115d3fc48f53ebed51bb64a6e8a204c8
Author: Philipp Blum
Date: 2017-04-10T16:29:06Z
Rename ListColumnVector to MapColumnVector
----
> Documentation Writer Example: batch.size++ has a wrong position
> ---------------------------------------------------------------
>
> Key: ORC-168
> URL: https://issues.apache.org/jira/browse/ORC-168
> Project: ORC
> Issue Type: Improvement
> Components: documentation, Java
> Reporter: Philipp Blum
> Priority: Critical
>
> There's a small mistake in the Java Core example. The for loop starts
> with a batch.size++:
> for(int r=0; r < 10000; ++r) {
>   int row = batch.size++;
>   x.vector[row] = r;
>   y.vector[row] = r * 3;
>   // If the batch is full, write it out and start over.
>   if (batch.size == batch.getMaxSize()) {
>     writer.addRowBatch(batch);
>     batch.reset();
>   }
> }
> If you start with batch.size++, the first index will be 1, so the first
> entry in the ORC file will be empty.
> The correct version is:
> for(int r=0; r < 10000; ++r) {
>   int row = batch.size;
>   x.vector[row] = r;
>   y.vector[row] = r * 3;
>   // Count the row before the full-batch check, so that row never
>   // reaches batch.getMaxSize() on the next pass.
>   batch.size++;
>   // If the batch is full, write it out and start over.
>   if (batch.size == batch.getMaxSize()) {
>     writer.addRowBatch(batch);
>     batch.reset();
>   }
> }
> I already tested it in Scala.
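
One way to check the indexing claim in the quoted description is to run the same loop pattern against a stand-in batch and record which row index each pass uses. The sketch below is only an illustration: FakeBatch and rowIndices are hypothetical names, not part of ORC, and FakeBatch.reset() stands in for the writer.addRowBatch(batch) plus batch.reset() pair. In Java, the postfix form batch.size++ evaluates to the value before the increment, so under these assumptions the first row index comes out as 0.

```java
import java.util.ArrayList;
import java.util.List;

class PostIncrementDemo {
    // Hypothetical stand-in for ORC's VectorizedRowBatch (not the real class).
    static final class FakeBatch {
        int size = 0;
        final long[] x;
        FakeBatch(int maxSize) { x = new long[maxSize]; }
        int getMaxSize() { return x.length; }
        void reset() { size = 0; }
    }

    // Runs the loop pattern from the quoted example and records the row
    // index used on each pass.
    static List<Integer> rowIndices(int rows, int maxSize) {
        FakeBatch batch = new FakeBatch(maxSize);
        List<Integer> used = new ArrayList<>();
        for (int r = 0; r < rows; ++r) {
            int row = batch.size++;   // postfix: row receives the old size
            batch.x[row] = r;
            used.add(row);
            // If the batch is full, "write it out" and start over.
            if (batch.size == batch.getMaxSize()) {
                batch.reset();
            }
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println(rowIndices(5, 3)); // prints [0, 1, 2, 0, 1]
    }
}
```

With five rows and a batch capacity of three, the recorded indices are 0, 1, 2, 0, 1: the first slot of each batch is filled, and the batch is flushed as soon as size reaches the capacity.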