[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099480#comment-16099480 ]

Lefty Leverenz commented on HIVE-12878:
---------------------------------------

Doc note: HIVE-16222 changes the default value of *hive.vectorized.use.row.serde.deserialize* to true in release 3.0.0.

> Support Vectorization for TEXTFILE and other formats
> ----------------------------------------------------
>
>                 Key: HIVE-12878
>                 URL: https://issues.apache.org/jira/browse/HIVE-12878
>             Project: Hive
>          Issue Type: New Feature
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 2.1.0
>
>         Attachments: HIVE-12878.01.patch, HIVE-12878.02.patch,
> HIVE-12878.03.patch, HIVE-12878.04.patch, HIVE-12878.05.patch,
> HIVE-12878.06.patch, HIVE-12878.07.patch, HIVE-12878.08.patch,
> HIVE-12878.091.patch, HIVE-12878.092.patch, HIVE-12878.093.patch,
> HIVE-12878.09.patch
>
>
> Support vectorizing when the input format is TEXTFILE and other formats for
> better Map Vertex performance.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392820#comment-15392820 ]

Shannon Ladymon commented on HIVE-12878:
----------------------------------------

Doc done. The new properties (*hive.vectorized.use.vectorized.input.format, hive.vectorized.use.vector.serde.deserialize, and hive.vectorized.use.row.serde.deserialize*) have been documented as follows:

* [Configuration Properties - hive.vectorized.use.vectorized.input.format | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.use.vectorized.input.format]
* [Configuration Properties - hive.vectorized.use.vector.serde.deserialize | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.use.vector.serde.deserialize]
* [Configuration Properties - hive.vectorized.use.row.serde.deserialize | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.use.row.serde.deserialize]

The TODOC2.1 label has also been removed.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392777#comment-15392777 ]

Lefty Leverenz commented on HIVE-12878:
---------------------------------------

The previous comment about maxColumnWidth is on the wrong jira -- it belongs on HIVE-14135. Thanks go to [~sladymon] for figuring it out and fixing the doc.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391571#comment-15391571 ]

Lefty Leverenz commented on HIVE-12878:
---------------------------------------

[~vihangk1] documented the change of default value for Beeline's maxColumnWidth from 15 to 50 in the wiki (thanks, Vihang).

* [HiveServer2 Clients -- Beeline Command Options | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions]

The new configs still need to be documented.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273794#comment-15273794 ]

Lefty Leverenz commented on HIVE-12878:
---------------------------------------

Doc note: This adds three configuration parameters (*hive.vectorized.use.vectorized.input.format*, *hive.vectorized.use.vector.serde.deserialize*, and *hive.vectorized.use.row.serde.deserialize*) to HiveConf.java, so they will need to be documented in the wiki for release 2.1.0.

* [Configuration Properties -- Vectorization | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization]

Added a TODOC2.1 label.
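For readers who want to see how the three parameters named in this doc note could be set, here is a minimal sketch. It assumes a HiveQL session on a release that includes this patch (2.1.0 or later) and a hypothetical table named `web_logs_text` stored as TEXTFILE; neither the table nor the chosen values come from the issue. `hive.vectorized.execution.enabled` is the general vectorization switch and is not one of the parameters added here. The per-flag comments reflect a general reading of the property names and should be checked against the Configuration Properties wiki.

```sql
-- Sketch only: session-level settings, not recommended values.

-- General switch for vectorized execution (pre-existing, not added by HIVE-12878).
SET hive.vectorized.execution.enabled=true;

-- Assumed meaning: allow vectorization through an input format that
-- produces vectorized batches itself.
SET hive.vectorized.use.vectorized.input.format=true;

-- Assumed meaning: allow vectorization by deserializing rows directly
-- into vectorized batches.
SET hive.vectorized.use.vector.serde.deserialize=true;

-- Assumed meaning: allow vectorization by deserializing with the regular
-- row SerDe and then copying the rows into vectorized batches.
SET hive.vectorized.use.row.serde.deserialize=true;

-- web_logs_text is a hypothetical table declared STORED AS TEXTFILE.
SELECT COUNT(*) FROM web_logs_text;
```

Whether a given query is actually vectorized depends on the table's input format and SerDe as well as on the operators in the plan, so these settings are a starting point for experimentation rather than a guarantee.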
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267772#comment-15267772 ]

Matt McCline commented on HIVE-12878:
-------------------------------------

Committed to master.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267769#comment-15267769 ]

Matt McCline commented on HIVE-12878:
-------------------------------------

New test failures (Age = 1) from the ptest run on HIVE-12878.093.patch are:

{code}
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.testGetMetaConfDefault             10 sec   1
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.testGetMetaConfDefaultEmptyString  10 sec   1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby10                            15 sec   1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant                 17 sec   1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_14                       8.3 sec  1
{code}

I ran the TestSparkCliDriver tests on my laptop and they succeeded.

The TestHiveMetaStoreGetMetaConf failures appear unrelated.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266173#comment-15266173 ]

Matt McCline commented on HIVE-12878:
-------------------------------------

The "new" build failures are:

org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable
org.apache.hive.hcatalog.listener.TestDbNotificationListener.sqlInsertPartition
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testDelegationTokenSharedStore
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp

There are 2 combined runs that produced no test results:

TestMiniTezCliDriver-insert_values_non_partitioned.q-join1.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more - did not produce a TEST-*.xml file
{code}
insert_values_non_partitioned.q,join1.q,schema_evol_orc_nonvec_mapwork_part.q,union5.q,orc_merge4.q,cbo_subq_not_in.q,cbo_semijoin.q,vector_if_expr.q,hybridgrace_hashjoin_1.q,orc_analyze.q,cbo_union.q,auto_sortmerge_join_2.q,update_where_non_partitioned.q,script_env_var2.q,auto_sortmerge_join_11.q
{code}

TestMiniTezCliDriver-vectorized_parquet.q-tez_self_join.q-vector_left_outer_join2.q-and-12-more - did not produce a TEST-*.xml file
{code}
vectorized_parquet.q,tez_self_join.q,vector_left_outer_join2.q,orc_merge_incompat1.q,cbo_gby_empty.q,schema_evol_orc_vec_mapwork_part.q,vector_groupby_mapjoin.q,auto_sortmerge_join_6.q,dynpart_sort_optimization.q,cte_mat_5.q,vectorization_nested_udf.q,cross_product_check_1.q,cte_3.q,parallel.q,transform_ppr1.q
{code}

The other build failures stretch back to prior builds.

Given how many build failures there are, the current state of master seems unstable and it doesn't seem wise to commit more stuff.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266126#comment-15266126 ]

Hive QA commented on HIVE-12878:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12801634/HIVE-12878.093.patch

{color:green}SUCCESS:{color} +1 due to 24 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 9979 tests executed
*Failed tests:*
{noformat}
TestHBaseAggrStatsCacheIntegration - did not produce a TEST-*.xml file
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-insert_values_non_partitioned.q-join1.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-tez_self_join.q-vector_left_outer_join2.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.testGetMetaConfDefault
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testAddPartitions
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.metastore.TestMetaStoreEndFunctionListener.testEndFunctionListener
org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus
org.apache.hadoop.hive.metastore.TestMetaStoreInitListener.testMetaStoreInitListener
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithCommas
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithUnicode
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithValidCharacters
org.apache.hadoop.hive.metastore.TestRetryingHMSHandler.testRetryingHMSHandler
org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestFolderPermissions.org.apache.hadoop.hive.ql.security.TestFolderPermissions
org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener.org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testDelegationTokenSharedStore
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore
org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable
org.apache.hive.hcatalog.listener.TestDbNotificationListener.sqlInsertPartition
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/147/testReport
Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/147/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-147/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12801634 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264926#comment-15264926 ]

Matt McCline commented on HIVE-12878:
-------------------------------------

Around #146 or #147.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264470#comment-15264470 ]

Hive QA commented on HIVE-12878:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12801019/HIVE-12878.092.patch

{color:green}SUCCESS:{color} +1 due to 24 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 64 failed/errored test(s), 1 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nomore_ambiguous_table_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regexp_extract
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_clustern3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_clustern4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_nonkey_groupby
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_selectDistinctStarNeg_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_subquery_shared_alias
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported1
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.testGetMetaConfDefault
org.apache.hadoop.hive.metastore.TestMetaStoreEndFunctionListener.testEndFunctionListener
org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus
org.apache.hadoop.hive.metastore.TestMetaStoreInitListener.testMetaStoreInitListener
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAddPartitionWithCommas
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAddPartitionWithUnicode
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAddPartitionWithValidPartVal
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithCommas
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithUnicode
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithValidCharacters
org.apache.hadoop.hive.metastore.TestRemoteUGIHiveMetaStoreIpAddress.testIpAddress
org.apache.hadoop.hive.metastore.TestRetryingHMSHandler.testRetryingHMSHandler
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorization
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithAcid
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithBuckets
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.insertOverwriteCreate
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testDummyTxnManagerOnAcidTable
org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoo
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262629#comment-15262629 ]

Sergey Shelukhin commented on HIVE-12878:
-----------------------------------------

+1 pending tests. I'd rather that the public fields were changed to have getters/setters...
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260153#comment-15260153 ]

Matt McCline commented on HIVE-12878:
-------------------------------------

[~sershe] Thank you for all your hard work on reviewing this. I think it is finally ready for a +1, pending another Hive QA run.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259747#comment-15259747 ]

Hive QA commented on HIVE-12878:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12800649/HIVE-12878.09.patch

{color:green}SUCCESS:{color} +1 due to 24 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 10009 tests executed
*Failed tests:*
{noformat}
TestCliDriver-tez_smb_empty.q-char_2.q-udf_date_sub.q-and-12-more - did not produce a TEST-*.xml file
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nomore_ambiguous_table_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regexp_extract
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partition_diff_num_cols
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_clustern3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_clustern4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_nonkey_groupby
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_selectDistinctStarNeg_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_subquery_shared_alias
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported1
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testSimpleTable
org.apache.hadoop.hive.ql.exec.vector.TestVectorSerDeRow.testVectorSerDeRow
org.apache.hadoop.hive.serde2.binarysortable.TestBinarySortableFast.testBinarySortableFast
org.apache.hadoop.hive.serde2.lazy.TestLazySimpleFast.testLazySimpleFast
org.apache.hadoop.hive.serde2.lazybinary.TestLazyBinaryFast.testLazyBinaryFast
{noformat}

Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/94/testReport
Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/94/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-94/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12800649 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259218#comment-15259218 ]

Sergey Shelukhin commented on HIVE-12878:
-----------------------------------------

Left some more comments on RB.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257461#comment-15257461 ]

Sergey Shelukhin commented on HIVE-12878:
-----------------------------------------

Went halfway through the recent diffs and then my head started to hurt. I will finish tomorrow...
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253033#comment-15253033 ]

Sergey Shelukhin commented on HIVE-12878:
-----------------------------------------

Left some comments on RB. I didn't review the out file results.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233731#comment-15233731 ] Hive QA commented on HIVE-12878: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797481/HIVE-12878.07.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7526/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7526/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7526/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7526/source-prep.txt + [[ false == 
\t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 9a00b2f HIVE-13437. httpserver getPort does not return the actual port when attempting to use a dynamic port. (Siddharth Seth, reviewed by Prasanth Jayachandran and Sergey Shelukhin) + git clean -f -d Removing ql/src/test/queries/clientpositive/msck_repair_batchsize.q Removing ql/src/test/results/clientpositive/msck_repair_batchsize.q.out + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 9a00b2f HIVE-13437. httpserver getPort does not return the actual port when attempting to use a dynamic port. (Siddharth Seth, reviewed by Prasanth Jayachandran and Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12797481 - PreCommit-HIVE-TRUNK-Build
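The build above stopped at the patch-apply step: the precommit script reported that the patch fit none of the strip levels p0, p1, or p2. As a rough illustration of what a strip level means, the Python sketch below drops N leading path components from each target path in a unified diff and checks whether the results name files in the source tree. This is not the logic of smart-apply-patch.sh, which is not shown in this thread. The function names are hypothetical, and the check is deliberately simplified: it only tests that target paths exist, whereas a real `patch` run also verifies hunk context, and a patch that adds new files would need different handling.

```python
from pathlib import PurePosixPath


def patched_paths(patch_text):
    """Return the target paths named on '+++ ' header lines of a unified diff."""
    paths = []
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            # Drop the optional tab-separated timestamp that may follow the path.
            target = line[4:].split("\t")[0].strip()
            if target != "/dev/null":
                paths.append(target)
    return paths


def strip_components(path, level):
    """Remove `level` leading path components, as the -p option of patch does."""
    parts = PurePosixPath(path).parts
    if level >= len(parts):
        return None  # nothing would be left of the path
    return "/".join(parts[level:])


def find_strip_level(patch_text, existing_files, levels=(0, 1, 2)):
    """Return the first strip level at which every target path exists, else None.

    `existing_files` is a set of repository-relative paths (an assumption made
    so the sketch does not touch the file system).
    """
    paths = patched_paths(patch_text)
    for level in levels:
        stripped = [strip_components(p, level) for p in paths]
        if stripped and all(s in existing_files for s in stripped):
            return level
    return None
```

A patch produced by `git diff` carries `a/` and `b/` prefixes and would resolve at level 1 under this check, while a patch cut against a tree that has since moved or renamed the touched files resolves at no level. That is one plausible reading of the error above; the usual remedy is to rebase the patch on current master.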
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181856#comment-15181856 ] Hive QA commented on HIVE-12878: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12791258/HIVE-12878.06.patch {color:green}SUCCESS:{color} +1 due to 21 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9766 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_partition_diff_num_cols org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries_with_filters org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partition_diff_num_cols org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries_with_filters org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorization org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithAcid org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithBuckets org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7171/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7171/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7171/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 21 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12791258 - PreCommit-HIVE-TRUNK-Build
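Failure lists like the one above are easier to compare across patch revisions when grouped by test driver. The Python sketch below is one way to do that, assuming the fully qualified test IDs have already been split out of the report into a list. The function names and the prefix list are illustrative assumptions, not part of the Hive ptest tooling, and batch entries of the form "... did not produce a TEST-*.xml file" are not handled and would need to be filtered out beforehand.

```python
from collections import defaultdict

# Method-name prefixes of the generated CLI driver tests seen in the report.
DRIVER_PREFIXES = ("testCliDriver_", "testNegativeCliDriver_")


def split_test_id(test_id):
    """Split 'pkg.Class.method' into (class name, query or method name)."""
    _package, class_name, method = test_id.rsplit(".", 2)
    for prefix in DRIVER_PREFIXES:
        if method.startswith(prefix):
            # For generated driver tests, the remainder is the .q file name.
            return class_name, method[len(prefix):]
    return class_name, method


def group_failed_tests(test_ids):
    """Map each test class to the sorted list of its failing queries/methods."""
    groups = defaultdict(list)
    for test_id in test_ids:
        class_name, name = split_test_id(test_id)
        groups[class_name].append(name)
    return {cls: sorted(names) for cls, names in groups.items()}
```

For example, passing the IDs for `vectorized_ptf` from the report above groups that query under both `TestCliDriver` and `TestMiniTezCliDriver`, which makes it easy to see that the same query fails under more than one execution engine.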
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158638#comment-15158638 ] Hive QA commented on HIVE-12878: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12789077/HIVE-12878.05.patch {color:green}SUCCESS:{color} +1 due to 21 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9803 tests executed *Failed tests:* {noformat} TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_partition_diff_num_cols org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries_with_filters org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partition_diff_num_cols org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries_with_filters org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorization org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithAcid org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithBuckets org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7065/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7065/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7065/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12789077 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157411#comment-15157411 ] Hive QA commented on HIVE-12878: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12788991/HIVE-12878.04.patch {color:green}SUCCESS:{color} +1 due to 20 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1195 failed/errored test(s), 9479 tests executed *Failed tests:* {noformat} TestSparkCliDriver-auto_join11.q-vector_groupby_3.q-smb_mapjoin_8.q-and-3-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join9.q-bucketmapjoin10.q-skewjoinopt19.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join_reordering_values.q-auto_sortmerge_join_7.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-avro_decimal_native.q-bucketmapjoin12.q-ppd_outer_join2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-enforce_order.q-join36.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-escape_distributeby1.q-union_remove_7.q-skewjoin_union_remove_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-skewjoin_noskew.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby4.q-timestamp_null.q-auto_join23.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_complex_types.q-vectorization_10.q-join4.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_grouping_id2.q-bucketmapjoin4.q-groupby7.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_3.q-groupby7_noskew.q-auto_join13.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-load_dyn_part12.q-nullgroup4_multi_distinct.q-union14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-load_dyn_part5.q-skewjoinopt8.q-groupby1_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_gby_join.q-stats2.q-groupby_rollup1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt15.q-bucketmapjoin3.q-auto_join10.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_15.q-auto_sortmerge_join_13.q-auto_join18_multi_distinct.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_4.q-auto_join19.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-tez_joins_explain.q-input17.q-union29.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-vector_distinct_2.q-load_dyn_part2.q-udf_percentile.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3 org.apache.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148018#comment-15148018 ] Hive QA commented on HIVE-12878: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12787933/HIVE-12878.03.patch {color:green}SUCCESS:{color} +1 due to 23 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1210 failed/errored test(s), 9469 tests executed *Failed tests:* {noformat} TestDateIntervalDayTime - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join11.q-vector_groupby_3.q-smb_mapjoin_8.q-and-3-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join9.q-bucketmapjoin10.q-skewjoinopt19.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join_reordering_values.q-auto_sortmerge_join_7.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-avro_decimal_native.q-bucketmapjoin12.q-ppd_outer_join2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-enforce_order.q-join36.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-escape_distributeby1.q-union_remove_7.q-skewjoin_union_remove_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-skewjoin_noskew.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby4.q-timestamp_null.q-auto_join23.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_complex_types.q-vectorization_10.q-join4.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_grouping_id2.q-bucketmapjoin4.q-groupby7.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_3.q-groupby7_noskew.q-auto_join13.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-load_dyn_part12.q-nullgroup4_multi_distinct.q-union14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-load_dyn_part5.q-skewjoinopt8.q-groupby1_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_gby_join.q-stats2.q-groupby_rollup1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt15.q-bucketmapjoin3.q-auto_join10.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_15.q-auto_sortmerge_join_13.q-auto_join18_multi_distinct.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_4.q-auto_join19.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-tez_joins_explain.q-input17.q-union29.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-vector_distinct_2.q-load_dyn_part2.q-udf_percentile.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27 org.apache.hadoop
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146493#comment-15146493 ] Matt McCline commented on HIVE-12878: -
Only checked TestCliDriver. Fewer failures now.
{code}
• TestCliDriver
  o Wrong Results:
    • cbo_rp_union
    • cbo_rp_views
    • cbo_union
    • cbo_views
    • constprog_type
    • date_diff
    • interval_3
    • mapjoin1
    • metadata_only_queries
    • metadata_only_queries_with_filters
    • orc_file_dump
    • orc_pushdown_predicate
    • offset_limit_global_optimizer
    • parquet_ppd_decimal
    • parquet_predicate_pushdown
    • special_character_in_tabnames_1
    • temp_table_windowing_expressions
    • windowing_distinct
    • windowing_expressions
    • windowing_multipartitioning
    • windowing_navfn
    • windowing_rank
    • udf_context_aware
    • vector_binary_join_groupby
    • vector_data_types
    • vector_elt
    • vector_orderby_5
  o Failures:
    • auto_sortmerge_join_1
    • auto_sortmerge_join_2
    • auto_sortmerge_join_3
    • auto_sortmerge_join_4
    • auto_sortmerge_join_5
    • auto_sortmerge_join_6
    • auto_sortmerge_join_7
    • auto_sortmerge_join_9
    • auto_sortmerge_join_14
    • bucketsortoptimize_insert_4
    • bucketsortoptimize_insert_5
    • interval_arithmetic
        Caused by: java.lang.IllegalStateException
          at com.google.common.base.Preconditions.checkState(Preconditions.java:134) ~[guava-15.0.jar:?]
          at org.apache.hadoop.hive.common.type.PisaTimestamp.updateFromTimestamp(PisaTimestamp.java:201) ~[hive-storage-api-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
          at org.apache.hadoop.hive.common.type.PisaTimestamp.updateFromTimestampMilliseconds(PisaTimestamp.java:216) ~[hive-storage-api-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
          at org.apache.hadoop.hive.ql.util.DateTimeMath.addMonthsToPisaTimestamp(DateTimeMath.java:104) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
          at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.TimestampColSubtractIntervalYearMonthScalar.evaluate(TimestampColSubtractIntervalYearMonthScalar.java:112) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
    • join_filters
    • orc_min_max
    • partition_wise_fileformat16
    • skew_join
    • vectorized_timestamp
{code}
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146453#comment-15146453 ] Hive QA commented on HIVE-12878: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12787853/HIVE-12878.02.patch {color:green}SUCCESS:{color} +1 due to 20 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1190 failed/errored test(s), 9452 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-auto_join30.q-cte_4.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join11.q-vector_groupby_3.q-smb_mapjoin_8.q-and-3-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join9.q-bucketmapjoin10.q-skewjoinopt19.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join_reordering_values.q-auto_sortmerge_join_7.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-avro_decimal_native.q-bucketmapjoin12.q-ppd_outer_join2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-enforce_order.q-join36.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-escape_distributeby1.q-union_remove_7.q-skewjoin_union_remove_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-skewjoin_noskew.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby4.q-timestamp_null.q-auto_join23.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_complex_types.q-vectorization_10.q-join4.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_grouping_id2.q-bucketmapjoin4.q-groupby7.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_3.q-groupby7_noskew.q-auto_join13.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-load_dyn_part12.q-nullgroup4_multi_distinct.q-union14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-load_dyn_part5.q-skewjoinopt8.q-groupby1_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_gby_join.q-stats2.q-groupby_rollup1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt15.q-bucketmapjoin3.q-auto_join10.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_15.q-auto_sortmerge_join_13.q-auto_join18_multi_distinct.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_4.q-auto_join19.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-tez_joins_explain.q-input17.q-union29.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-vector_distinct_2.q-load_dyn_part2.q-udf_percentile.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestC
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146372#comment-15146372 ] Matt McCline commented on HIVE-12878: - Fix bug in new VectorMapJoin for ORC. Fix vectorization bug in "Row Limit Per Split".
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144535#comment-15144535 ] Matt McCline commented on HIVE-12878: - Adjust to recent timestamp patch. Fix text deserialization issue.
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139766#comment-15139766 ] Matt McCline commented on HIVE-12878: -
I just went through the 1253 test failures to filter out the expected differences ("Execution mode: vectorized", statistics differences, etc.). Here are the queries with wrong results and the test failures. A rather stunning number.
{code}
TestCliDriver
  o Wrong Results:
    • add_part_multiple
    • alter_partition_coltype
    • alter_varchar2
    • analyze_tbl_part
    • auto_join18
    • auto_join18_multi_distinct
    • avro_schema_evolution_native
    • avro_timestamp
    • bucket_groupby
    • cbo_const
    • cbo_rp_lineage2
    • cbo_rp_union
    • cbo_rp_views
    • cbo_rp_windowing
    • cbo_union
    • cbo_views
    • cbo_windowing
    • complex_alias
    • constprog_type
    • correlationoptimizer14
    • correlationoptimizer2
    • correlationoptimizer8
    • ctas_colname
    • custom_input_output_format
    • date_1
    • date_3
    • date_udf
    • decimal_1
    • decimal_2
    • empty_join
    • filter_join_breaktask2
    • groupby_duplicate_key
    • groupby_grouping_window
    • groupby_sort_10
    • insert_into1
    • interval_arithmetic
    • join18
    • join18_multi_distinct
    • lineage2
    • mapjoin_test_outer
    • metadata_only_queries
    • metadata_only_queries_with_filters
    • non_ascii_literal
    • orc_dictionary_threshold
    • orc_diff_part_cols
    • orc_empty_strings
    • orc_file_dump
    • orc_int_type_promotion
    • orc_predicate_pushdown
    • offset_limit_global_optimizer
    • parquet_ppd_decimal
    • parquet_predicate_pushdown
    • partcols1
    • partition_date
    • partition_date2
    • partition_multilevels
    • partition_timestamp
    • partition_timestamp2
    • partition_varchar1
    • partition_wise_fileformat2
    • ppr_pushdown2
    • rcfile_null_value
    • selectDistinctStar
    • special_characters_in_tabnames_1
    • stats1
    • str_to_map
    • temp_table_windowing_expressions
    • test_boolean_whereclause
    • timestamp_3
    • timestamp_lazy
    • timestamp_udf
    • truncate_column
    • truncate_column_merge
    • udf_context_aware
    • udf_get_json_object
    • udf_length
    • udf_printf
    • udf_round_2
    • udtf_json_tuple
    • union6
    • union34
    • unionDistinct_1
    • vector_binary_join_groupby
    • vector_data_types
    • vector_decimal_1
    • vector_decimal_2
    • vector_orderby_5
    • windowing_distinct
    • windowing_expressions
    • windowing_multipartitioning
    • windowing_navfn
    • windowing_rank
  o Failures:
    • auto_join_reordering_values
    • auto_sortmerge_join_1
    • auto_sortmerge_join_14
    • auto_sortmerge_join_2
    • auto_sortmerge_join_3
    • auto_sortmerge_join_4
    • auto_sortmerge_join_5
    • auto_sortmerge_join_6
    • auto_sortmerge_join_7
    • auto_sortmerge_join_9
    • bucketsortoptimize_insert_2
    • bucketsortoptimize_insert_4
    • bucketsortoptimize_insert_5
    • join42
    • join_filters
    • mapjoin1
    • orc_min_max
    • partition_wise_fileformat16
    • ppd_union_view
    • skewjoin
    • vector_elt
TestContribNegativeCliDriver
  o Wrong Results:
  o Failures:
    • case_with_row_sequence
TestHBaseCliDriver
  o Wrong Results:
    • hbase_single_sourced_multi_insert
  o Failures:
TestMiniLlapCliDriver
  o Wrong Results:
    • hybridgrace_hashjoin_1
    • hybridgrace_hashjoin_2
    • tez_join_tests
    • tez_union_decimal
  o Failures:
    • bucket_map_join_tez1
    • tez_bmj_schema_evolution
    • tez_smb_main
TestMiniSparkOnYarnCliDriver
  o Wrong Results:
    • schemaAuthority2
    • vector_outer_join1
    • vector_outer_join2
    • vector_outer_join3
    • vector_outer_join4
  o Failures:
    • bucketmapjoin7
TestMiniTezCliDriver
  o Wrong Results:
    • cbo_simple_select
    • cbo_union
    • cbo_views
    • cbo_windowing
    • custom_input_output_format
    • empty_join
    • filter_join_breaktask2
    • hybridgrace_hashjoin_1
    • hybridgrace_hashjoin_2
    • insert_into1
    • mergejoin
    • metadata_only_queries
    • metadata_only_queries_with_filters
    • selectDistinctStar
    • select_dummy_source
    • tez_join_tests
    • tez_union_decimal
    • union6
    • unionDistinct_1
    • vector_binary_join_groupby
    • vector_data_types
    • vector_decimal_1
    • vector_decimal_2
    • vector_outer_join1
    • vector_outer_join2
    • vector_outer_join3
    • vector_outer_join4
    • vector_orderby_5
    • vector_when_case_null
    • vectorized_date_funcs
  o
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138119#comment-15138119 ]

Hive QA commented on HIVE-12878:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12786780/HIVE-12878.01.patch

{color:green}SUCCESS:{color} +1 due to 13 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1253 failed/errored test(s), 9718 tests executed

*Failed tests:*
{noformat}
TestSparkCliDriver-auto_join11.q-vector_groupby_3.q-smb_mapjoin_8.q-and-3-more - did not produce a TEST-*.xml file
TestSparkCliDriver-auto_join9.q-bucketmapjoin10.q-skewjoinopt19.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-auto_join_reordering_values.q-auto_sortmerge_join_7.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-avro_decimal_native.q-bucketmapjoin12.q-ppd_outer_join2.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-bucketsortoptimize_insert_7.q-enforce_order.q-join36.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-escape_distributeby1.q-union_remove_7.q-skewjoin_union_remove_2.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby2_noskew_multi_distinct.q-skewjoin_noskew.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby4.q-timestamp_null.q-auto_join23.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_complex_types.q-vectorization_10.q-join4.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-bucketmapjoin4.q-groupby7.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_cond_pushdown_3.q-groupby7_noskew.q-auto_join13.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-load_dyn_part12.q-nullgroup4_multi_distinct.q-union14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-load_dyn_part5.q-skewjoinopt8.q-groupby1_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_gby_join.q-stats2.q-groupby_rollup1.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-skewjoinopt15.q-bucketmapjoin3.q-auto_join10.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-smb_mapjoin_15.q-auto_sortmerge_join_13.q-auto_join18_multi_distinct.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-smb_mapjoin_4.q-auto_join19.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-stats13.q-groupby6_map.q-join_casesensitive.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-tez_joins_explain.q-input17.q-union29.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-vector_distinct_2.q-load_dyn_part2.q-udf_percentile.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_tbl_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_joi
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136224#comment-15136224 ]

Matt McCline commented on HIVE-12878:

The submitted patch is an experiment: it (temporarily) changes the defaults of these configuration properties to force vectorization of many queries.

{code}
hive.fetch.task.conversion=none
hive.vectorized.execution.enabled=true
{code}

The new configuration properties are set so that all vectorized queries either use the new vectorized versions of deserialize for LazySimple (i.e. TEXTFILE) and LazyBinarySerDe, or deserialize row by row to fill up a VectorizedRowBatch.

{code}
hive.vectorized.use.vectorized.input.format=false
hive.vectorized.use.vector.serde.deserialize=true
hive.vectorized.use.row.serde.deserialize=true
{code}

So MapWork tasks should no longer fail to vectorize because of the input file format (except for ACID, which is permitted only with a vectorized input format...).
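The interplay of the three new properties described in the comment above can be illustrated with a small decision function. This is only a sketch under stated assumptions: the function name `choose_vectorization_path`, its boolean parameters, the returned labels, and the order in which the paths are tried are all hypothetical and are not taken from the patch. Only the three property names, their experimental values, and the ACID restriction come from the comment.

```python
from typing import Dict, Optional

# Property names and values as listed in the comment for the experiment.
EXPERIMENT_CONF: Dict[str, bool] = {
    "hive.vectorized.use.vectorized.input.format": False,
    "hive.vectorized.use.vector.serde.deserialize": True,
    "hive.vectorized.use.row.serde.deserialize": True,
}


def choose_vectorization_path(
    conf: Dict[str, bool],
    has_vectorized_input_format: bool,
    has_vector_deserializer: bool,
    is_acid: bool = False,
) -> Optional[str]:
    """Return a label for how a map-side input could be vectorized, or None.

    Hypothetical model only: the order of the checks is an assumption,
    not a transcription of Hive's Vectorizer logic.
    """
    # Path 1: the input format itself produces vectorized batches.
    if conf.get("hive.vectorized.use.vectorized.input.format", False) and has_vectorized_input_format:
        return "vectorized_input_format"
    # Per the comment, ACID is permitted only with a vectorized input format.
    if is_acid:
        return None
    # Path 2: a vectorized deserializer exists (e.g. LazySimple or LazyBinary).
    if conf.get("hive.vectorized.use.vector.serde.deserialize", False) and has_vector_deserializer:
        return "vector_serde_deserialize"
    # Path 3: deserialize row by row and copy into the batch.
    if conf.get("hive.vectorized.use.row.serde.deserialize", False):
        return "row_serde_deserialize"
    return None
```

Under this model and the experimental settings, a TEXTFILE table would take the vector SerDe path, a format with no vectorized deserializer would fall back to row-by-row deserialization, and an ACID table would not vectorize because the vectorized input format path is switched off.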
[jira] [Commented] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101749#comment-15101749 ]

Matt McCline commented on HIVE-12878:

Rehydrated an old patch from last year. It is unclear how it interacts with the recent ORC Schema Evolution work.