[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038568#comment-14038568 ]

Lefty Leverenz commented on HIVE-7250:
--------------------------------------

No user doc?

> Adaptive compression buffer size for wide tables in ORC
> -------------------------------------------------------
>
> Key: HIVE-7250
> URL: https://issues.apache.org/jira/browse/HIVE-7250
> Project: Hive
> Issue Type: Improvement
> Components: File Formats
> Affects Versions: 0.14.0
> Reporter: Prasanth J
> Assignee: Prasanth J
> Labels: orcfile
> Fix For: 0.14.0
>
> Attachments: HIVE-7250.1.patch, HIVE-7250.2.patch, HIVE-7250.3.patch,
> HIVE-7250.4.patch, HIVE-7250.5.patch
>
>
> If the input table is wide (in the order of 1000s), ORC compression buffer
> size overhead becomes significant causing OOM issues. To overcome this issue,
> buffer size should be adaptively chosen based on the available memory and the
> number of columns.

--
This message was sent by Atlassian JIRA (v6.2#6252)
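For readers following the thread, the idea in the issue description (choose the buffer size from the available memory and the number of columns) can be sketched roughly as below. This is only an illustration and not the code in the attached patches: the class name, the assumed number of buffers per column, the 4 KB floor, and the round-down-to-a-power-of-two step are all assumptions made for the example.

```java
// Hypothetical sketch; none of these names or constants come from the HIVE-7250 patches.
final class AdaptiveBufferSketch {

    // Assumed lower bound for a compression buffer (illustrative value).
    static final int MIN_BUFFER_SIZE = 4 * 1024;

    // Assumed number of buffers held per column (illustrative value).
    static final int BUFFERS_PER_COLUMN = 4;

    /**
     * Returns the configured buffer size when memory allows it; otherwise a
     * smaller power-of-two size that fits the per-buffer memory share,
     * never going below MIN_BUFFER_SIZE.
     */
    static int estimateBufferSize(long availableMemory, int columnCount, int configuredBufferSize) {
        if (availableMemory <= 0 || columnCount <= 0 || configuredBufferSize <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Share of memory that each buffer can use if every column gets its buffers.
        long perBuffer = availableMemory / ((long) columnCount * BUFFERS_PER_COLUMN);
        if (perBuffer >= configuredBufferSize) {
            return configuredBufferSize; // narrow table: keep the configured size
        }
        if (perBuffer <= MIN_BUFFER_SIZE) {
            return MIN_BUFFER_SIZE; // very wide table: clamp to the floor
        }
        // perBuffer is below configuredBufferSize here, so the int cast is safe.
        return Integer.highestOneBit((int) perBuffer); // round down to a power of two
    }

    public static void main(String[] args) {
        long availableMemory = 512L * 1024 * 1024; // example value only
        int configured = 256 * 1024;               // example value only
        for (int columns : new int[] {10, 15_000, 200_000}) {
            System.out.println(columns + " columns -> "
                + estimateBufferSize(availableMemory, columns, configured) + " bytes");
        }
    }
}
```

The example inputs in main are arbitrary and are not meant to reproduce the heap sizes or column counts reported elsewhere in this thread.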
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038286#comment-14038286 ]

Prasanth J commented on HIVE-7250:
----------------------------------

Patch committed to trunk.
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038284#comment-14038284 ]

Prasanth J commented on HIVE-7250:
----------------------------------

The recent patch does not change the outcome of unit tests.
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037271#comment-14037271 ]

Hive QA commented on HIVE-7250:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12651345/HIVE-7250.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5664 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/514/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/514/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-514/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12651345
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036773#comment-14036773 ]

Prasanth J commented on HIVE-7250:
----------------------------------

The qfile test was added just to make sure that wide table creation runs without any OOM under the default heap settings. I will add a unit test in the next patch that checks the buffer sizes.
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036763#comment-14036763 ]

Gunther Hagleitner commented on HIVE-7250:
------------------------------------------

+1, although I was asking on RB for some more testing of the logic through unit tests.
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036726#comment-14036726 ]

Gopal V commented on HIVE-7250:
-------------------------------

LGTM +1 (NB)
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034991#comment-14034991 ]

Prasanth J commented on HIVE-7250:
----------------------------------

I was able to load a 15K-column test dataset, similar to the qtest dataset in the patch, with the default 1 GB heap. 20K columns causes an OOM.
[jira] [Commented] (HIVE-7250) Adaptive compression buffer size for wide tables in ORC
[ https://issues.apache.org/jira/browse/HIVE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034914#comment-14034914 ]

Prasanth J commented on HIVE-7250:
----------------------------------

I tested the current patch against Hive 0.11 and Hive 0.12 for backward compatibility.