[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837354#comment-15837354 ]

Lefty Leverenz commented on HIVE-15664:
---

Doc note: This adds two configuration parameters to HiveConf.java (*hive.llap.io.encode.enabled*, *hive.llap.io.encode.vector.serde.enabled*), so they need to be documented in the wiki.
* [Configuration Properties -- LLAP | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC2.2 label.

> LLAP text cache: improve first query perf I
> ---
>
> Key: HIVE-15664
> URL: https://issues.apache.org/jira/browse/HIVE-15664
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15664.04.patch, HIVE-15664.patch
>
>
> 1) Don't use ORC dictionary.
> 2) Use VectorDeserialize.
> 3) Don't parse the columns that are not included (cannot avoid reading them).
> -4) Send VRB to the pipeline and write ORC in parallel (in background)-.
> HIVE-15672
> Also add an option to disable the encoding pipeline server-side.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
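Editor's illustration of item 3 in the issue description ("Don't parse the columns that are not included (cannot avoid reading them)"). This is a minimal sketch and not code from the patch: the class `SelectiveParseSketch`, the method `parseIncluded`, and the assumption that every included column holds a long are all hypothetical. It shows the general idea that a text row still has to be scanned delimiter by delimiter, while the comparatively expensive value conversion can be skipped for columns the query does not need.

```java
import java.util.Arrays;

public class SelectiveParseSketch {
    // Hypothetical helper: scans every field of a delimited row (the bytes must
    // still be read to find the delimiters), but converts only included columns.
    static Long[] parseIncluded(String row, char delimiter, boolean[] included) {
        Long[] out = new Long[included.length];
        int col = 0;
        int start = 0;
        for (int i = 0; i <= row.length() && col < included.length; i++) {
            if (i == row.length() || row.charAt(i) == delimiter) {
                if (included[col]) {
                    // Only included columns pay the parsing cost.
                    out[col] = Long.parseLong(row.substring(start, i));
                }
                col++;
                start = i + 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] included = {true, false, true};
        // The middle field is never parsed, so its contents do not matter.
        System.out.println(Arrays.toString(parseIncluded("10,not-a-number,30", ',', included)));
        // prints [10, null, 30]
    }
}
```

The excluded column is left as null here purely for illustration; how the actual patch represents skipped columns is not stated in this thread.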
[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837011#comment-15837011 ]

Hive QA commented on HIVE-15664:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12849183/HIVE-15664.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10996 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=93)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3159/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3159/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3159/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12849183 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836839#comment-15836839 ]

Prasanth Jayachandran commented on HIVE-15664:
--

lgtm, +1. Pending tests
[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832842#comment-15832842 ]

Hive QA commented on HIVE-15664:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848464/HIVE-15664.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10974 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3087/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3087/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12848464 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832381#comment-15832381 ]

Sergey Shelukhin commented on HIVE-15664:
-

Yes, why? :)

> LLAP text cache: improve first query perf I
> ---
>
> Key: HIVE-15664
> URL: https://issues.apache.org/jira/browse/HIVE-15664
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-15664.patch, HIVE-15664.WIP.patch
>
>
> 1) Don't use ORC dictionary.
> 2) Use VectorDeserialize.
> 3) Don't parse the columns that are not included (cannot avoid reading them).
> -4) Send VRB to the pipeline and write ORC in parallel (in background)-.
> HIVE-15672
> Also add an option to disable the encoding pipeline server-side.
[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831527#comment-15831527 ]

Matt McCline commented on HIVE-15664:
-

So, looks like you have a sparse column input VRB from the table but you need to cache the data ORC style with non sparse so you share columns with the destination (write) VRB.
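Editor's illustration of one possible reading of the comment above: the input batch keeps a slot for every table column (sparse, with only the included ones populated), while the batch handed to the writer holds just the included columns, densely packed, and reuses the same column arrays instead of copying them. This is a sketch under that assumption, not code from the patch; `SparseToDenseSketch` and `shareIncludedColumns` are hypothetical names, and plain `long[]` arrays stand in for Hive's column vectors.

```java
public class SparseToDenseSketch {
    // Hypothetical helper: builds the dense (write-side) column array by
    // pointing at the populated columns of the sparse (read-side) array.
    static long[][] shareIncludedColumns(long[][] sparseCols, boolean[] included) {
        int count = 0;
        for (boolean b : included) {
            if (b) {
                count++;
            }
        }
        long[][] dense = new long[count][];
        int next = 0;
        for (int i = 0; i < included.length; i++) {
            if (included[i]) {
                dense[next++] = sparseCols[i]; // shared reference, no data copy
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        // Table has three columns; only columns 0 and 2 are included.
        long[][] sparse = {new long[] {1, 2}, null, new long[] {5, 6}};
        long[][] dense = shareIncludedColumns(sparse, new boolean[] {true, false, true});
        System.out.println(dense.length + " dense columns, shared: " + (dense[1] == sparse[2]));
    }
}
```

Because the dense array only references the source columns, values deserialized into the input side are visible to the write side without an extra copy, which appears to be the sharing the comment describes.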
[jira] [Commented] (HIVE-15664) LLAP text cache: improve first query perf I
[ https://issues.apache.org/jira/browse/HIVE-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831194#comment-15831194 ]

Sergey Shelukhin commented on HIVE-15664:
-

[~gopalv] [~mmccline] can you take a look? Thanks. It reuses some code from VectorMapOperator as part of #2