[GitHub] orc pull request #311: ORC-407 - Lowerbound and upperbound support in JsonFi...

2018-09-20 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/311 ORC-407 - Lowerbound and upperbound support in JsonFileDump As part of this change JsonFileDump will now take into account lowerbound and upperbound values, specifically, if lowerbound or

[GitHub] orc issue #299: ORC-203 - Update StringStatistics to trim long strings to 10...

2018-09-05 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/299 @omalley I updated the PR with the suggested changes, updated the WriterVersion and put back README. ---

[GitHub] orc issue #299: ORC-203 - Update StringStatistics to trim long strings to 10...

2018-08-14 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/299 Hello @omalley Thanks for the review ! I made the suggested changes, the changes are on a new commit (there are two commits), I am doing this to save the review history, let me know if you

[GitHub] orc issue #292: ORC-203 - Update StringStatistics to trim long strings to 10...

2018-08-06 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/292 @omalley np, I opened a new PR https://github.com/apache/orc/pull/299 ---

[GitHub] orc pull request #299: ORC-203 - Update StringStatistics to trim long string...

2018-08-06 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/299 ORC-203 - Update StringStatistics to trim long strings to 1024 characters & record they were trimmed Reopening the PR. You can merge this pull request into a Git repository by running: $

[GitHub] orc pull request #292: ORC-203 - Update StringStatistics to trim long string...

2018-07-31 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/292#discussion_r206636579 --- Diff: java/bench/README.md --- @@ -1,3 +1,20 @@ + --- End diff -- Thanks @omalley I updated the PR with suggested changes. ---

[GitHub] orc pull request #292: ORC-203 - Update StringStatistics to trim long string...

2018-07-27 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/292#discussion_r205867014 --- Diff: java/core/src/java/org/apache/orc/impl/ColumnStatisticsImpl.java --- @@ -584,16 +642,40 @@ public void merge(ColumnStatisticsImpl other

[GitHub] orc pull request #292: ORC-203 - Update StringStatistics to trim long string...

2018-07-18 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/292 ORC-203 - Update StringStatistics to trim long strings to 1024 characters & record they were trimmed This PR adds the functionality described in ORC-203. You can merge this pull request in
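As a rough illustration of the idea in the PR title (trim long strings to 1024 characters and record that they were trimmed), here is a minimal Java sketch. It is not the code from the PR: the class name `TrimmedBound`, its fields, and the choice to count Unicode code points are all assumptions made for this example.

```java
// Hypothetical sketch, not the ORC implementation: keep at most MAX_CHARS
// characters of a statistics value and remember whether it was cut.
final class TrimmedBound {
  static final int MAX_CHARS = 1024; // limit taken from the PR title

  final String value;    // possibly shortened value
  final boolean trimmed; // true when the original exceeded MAX_CHARS

  private TrimmedBound(String value, boolean trimmed) {
    this.value = value;
    this.trimmed = trimmed;
  }

  static TrimmedBound of(String s) {
    // Count code points (an assumption) so a surrogate pair is never split.
    if (s.codePointCount(0, s.length()) <= MAX_CHARS) {
      return new TrimmedBound(s, false);
    }
    int end = s.offsetByCodePoints(0, MAX_CHARS);
    return new TrimmedBound(s.substring(0, end), true);
  }

  public static void main(String[] args) {
    TrimmedBound b = TrimmedBound.of("short value");
    System.out.println(b.value + " trimmed=" + b.trimmed);
  }
}
```

Note that a trimmed maximum is no longer a true upper bound unless it is adjusted upward; how the PR handles that is not visible in this snippet, so the sketch leaves it out.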

[GitHub] orc pull request #255: ORC-305 - Add column statistics for the size on disk

2018-04-23 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/255 ORC-305 - Add column statistics for the size on disk This PR adds column statistics for the size on disk. I have updated the Unit Tests to reflect this change, I have also manually gone

[GitHub] orc issue #213: ORC-278 - Create in memory KeyProvider class

2018-02-28 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/213 Hello @omalley The changes look good, thanks ! ---

[GitHub] orc issue #213: ORC-278 - Create in memory KeyProvider class

2018-02-21 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/213 Hello @omalley Thanks for the review, the new PR should incorporate the changes you suggested. ---

[GitHub] orc pull request #213: ORC-278 - Create in memory KeyProvider class

2018-01-23 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/213 ORC-278 - Create in memory KeyProvider class This PR addresses ORC-278 by creating an in-memory implementation of HadoopShims.KeyProvider interface which can be used for testing. You can merge

[GitHub] orc issue #208: ORC-250 - Create sha256 mask

2018-01-18 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/208 Updated the PR with suggested changes. ---

[GitHub] orc issue #208: ORC-250 - Create sha256 mask

2018-01-17 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/208 Hello @omalley Thank you for the review ! I have incorporated most of the changes you proposed, except the one about copying output bytes. The thing is I will be needing another buffer

[GitHub] orc pull request #208: ORC-250 - Create sha256 mask

2018-01-17 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/208#discussion_r162178962 --- Diff: java/core/src/java/org/apache/orc/impl/mask/SHA256MaskFactory.java --- @@ -0,0 +1,290 @@ +/* + * Licensed to the Apache Software

[GitHub] orc pull request #208: ORC-250 - Create sha256 mask

2018-01-10 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/208 ORC-250 - Create sha256 mask Masking strategy that masks String, Varchar, Char and Binary types as SHA 256 hash. **For String type:** All string type of any length will be
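For readers unfamiliar with hash-based masking, the following is a minimal Java sketch of replacing a string value with its SHA-256 digest. It uses only the JDK's `MessageDigest`; the class name `Sha256MaskSketch`, the method `maskString`, the UTF-8 encoding, and the lowercase hex output are assumptions for illustration and are not taken from `SHA256MaskFactory`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the ORC SHA256MaskFactory.
final class Sha256MaskSketch {

  /** Returns the lowercase hex SHA-256 digest of the UTF-8 bytes of value. */
  static String maskString(String value) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b)); // two hex digits per byte
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // Every compliant Java platform is required to provide SHA-256.
      throw new IllegalStateException("SHA-256 not available", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(maskString("abc"));
  }
}
```

Whatever the input length, the hex form is always 64 characters, so fixed-width types such as Char and Varchar would need extra handling that this sketch does not attempt.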

[GitHub] orc pull request #201: Orc 250 - Create sha256 mask

2018-01-10 Thread moresandeep
Github user moresandeep closed the pull request at: https://github.com/apache/orc/pull/201 ---

[GitHub] orc issue #201: Orc 250 - Create sha256 mask

2017-12-14 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/201 No idea why the cpp build is failing, there are no changes to cpp side in this PR. ---

[GitHub] orc pull request #201: Orc 250 - Create sha256 mask

2017-12-13 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/201 Orc 250 - Create sha256 mask Masking strategy that masks String, Varchar, Char and Binary types as SHA 256 hash. For String type: All string type of any length will be converted

[GitHub] orc issue #184: Orc 256 unmask range option

2017-12-07 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/184 @xndai @omalley I updated the PR with the suggested changes, let me know if you have any questions. ---

[GitHub] orc pull request #184: Orc 256 unmask range option

2017-12-07 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/184#discussion_r155629181 --- Diff: java/core/src/test/org/apache/orc/impl/mask/TestUnmaskRange.java --- @@ -0,0 +1,165 @@ +package org.apache.orc.impl.mask

[GitHub] orc pull request #184: Orc 256 unmask range option

2017-12-07 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/184#discussion_r155629073 --- Diff: java/core/src/java/org/apache/orc/impl/mask/RedactMaskFactory.java --- @@ -245,8 +271,8 @@ public void maskData(ColumnVector original

[GitHub] orc pull request #184: Orc 256 unmask range option

2017-12-07 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/184#discussion_r155629140 --- Diff: java/core/src/java/org/apache/orc/impl/mask/RedactMaskFactory.java --- @@ -619,7 +646,7 @@ public double maskDouble(double value

[GitHub] orc pull request #184: Orc 256 unmask range option

2017-12-07 Thread moresandeep
Github user moresandeep commented on a diff in the pull request: https://github.com/apache/orc/pull/184#discussion_r155628665 --- Diff: java/core/src/java/org/apache/orc/impl/mask/RedactMaskFactory.java --- @@ -114,6 +120,10 @@ private final boolean maskDate; private

[GitHub] orc issue #184: Orc 256 unmask range option

2017-12-07 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/184 @omalley I updated the PR with your suggestions, thanks for the review ! ---

[GitHub] orc issue #184: Orc 256 unmask range option

2017-11-09 Thread moresandeep
Github user moresandeep commented on the issue: https://github.com/apache/orc/pull/184 Updated the PR; the changes are as follows: 1. Fixed the FindBugs issue. 2. Merged the feature into a single commit. ---

[GitHub] orc pull request #187: ORC-260 - Fix a bug in masking data for Decimal

2017-11-03 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/187 ORC-260 - Fix a bug in masking data for Decimal You can merge this pull request into a Git repository by running: $ git pull https://github.com/moresandeep/orc ORC-260 Alternatively you can

[GitHub] orc pull request #184: Orc 256 unmask range option

2017-10-28 Thread moresandeep
GitHub user moresandeep opened a pull request: https://github.com/apache/orc/pull/184 Orc 256 unmask range option This PR contains changes that enable the unmask range option for the redact mask (ORC-256). 1. The redact mask would accept an additional option (option #3 in this