[jira] [Commented] (HIVE-13365) Allow multiple llap instances with the MiniLlap cluster
[ https://issues.apache.org/jira/browse/HIVE-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215448#comment-15215448 ] Hive QA commented on HIVE-13365:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795530/HIVE-13365.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 788 failed/errored test(s), 6999 tests executed

*Failed tests:*
{noformat}
TestCliDriver-acid_vectorization.q-smb_mapjoin_2.q-exim_02_00_part_empty.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-alter_char2.q-ppd_join3.q-vectorization_14.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-alter_table_not_sorted.q-authorization_update.q-dynamic_partition_pruning.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-alter_view_rename.q-tez_bmj_schema_evolution.q-llap_uncompressed.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_1_sql_std.q-drop_index_removes_partition_dirs.q-udf_date_sub.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_cli_nonsql.q-cbo_rp_subq_in.q-rcfile_merge1.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_cli_stdconfigauth.q-vectorized_parquet.q-ba_table_union.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_create_table_owner_privs.q-create_func1.q-partition_wise_fileformat.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_role_grant2.q-bucketcontext_3.q-windowing_multipartitioning.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_view_1.q-vector_left_outer_join2.q-add_jar_pfile.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_view_disable_cbo_3.q-vector_groupby_3.q-decimal_udf.q-and-2-more - did not produce a TEST-*.xml file
TestCliDriver-authorization_view_disable_cbo_4.q-vectorization_13.q-udf_explode.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-auto_join18_multi_distinct.q-interval_udf.q-list_bucket_query_multiskew_2.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-auto_join30.q-unionall_unbalancedppd.q-lock1.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-auto_join4.q-mapjoin_decimal.q-input_dynamicserde.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-auto_join9.q-insert_into_with_schema.q-schema_evol_text_nonvec_mapwork_table.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-exim_03_nonpart_over_compat.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-autogen_colalias.q-compute_stats_date.q-union29.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-avro_add_column.q-orc_wide_table.q-query_with_semi.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-avro_decimal_native.q-alter_file_format.q-groupby3_map_skew.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-avro_joins.q-disallow_incompatible_type_change_off.q-udf_max.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-bool_literal.q-udf_hash.q-groupby4_map.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-bucket_map_join_tez1.q-ppd_random.q-vector_auto_smb_mapjoin_14.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-bucketmapjoin3.q-udf_round_3.q-udf_between.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_join1.q-enforce_order.q-bucketcontext_4.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_limit.q-show_columns.q-input31.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_stats.q-skewjoinopt16.q-rename_column.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_udf_udaf.q-groupby4_noskew.q-list_bucket_dml_11.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_views.q-cbo_rp_semijoin.q-offset_limit_ppd_optimizer.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-columnStatsUpdateForStatsOptimizer_2.q-alter_partition_clusterby_sortby.q-udf_repeat.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-columnstats_partlvl_dp.q-smb_mapjoin_21.q-udf_sha1.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-columnstats_tbllvl.q-index_compact.q-input14.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-compute_stats_string.q-load_dyn_part12.q-nullgroup4_multi_distinct.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-cp_mj_rc.q-masking_disablecbo_1.q-udf_stddev_pop.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-create_genericudf.q-ambiguitycheck.q-join13.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-create_or_replace_view.q-join_cond_pushdown_3.q-struct_in_view.q-and-12-more - did not produce a
[jira] [Commented] (HIVE-10365) First job fails with StackOverflowError [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215435#comment-15215435 ] Szehon Ho commented on HIVE-10365:

I happened to see this in a Spark executor during a query as well. Just leaving a note in case someone else hits this. The solution is to set spark.executor.extraJavaOptions to a sufficiently high -Xss value.

> First job fails with StackOverflowError [Spark Branch]
> --
> Key: HIVE-10365
> URL: https://issues.apache.org/jira/browse/HIVE-10365
> Project: Hive
> Issue Type: Bug
> Affects Versions: spark-branch
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> When running some queries on Yarn with standalone Hadoop, the first query
> fails with StackOverflowError:
> {noformat}
> java.lang.StackOverflowError
> at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
> at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1145)
> at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:464)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:412)
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
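The workaround Szehon mentions can be applied cluster-wide or per session. A minimal sketch follows; the -Xss32m value is an illustrative choice, not a figure from this thread, so tune it to your workload:

```properties
# spark-defaults.conf: raise the executor thread stack size so deep
# classloading recursion no longer overflows (example value only).
spark.executor.extraJavaOptions=-Xss32m
```

In a Hive on Spark session, the same property can be set before the first query with `set spark.executor.extraJavaOptions=-Xss32m;`.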
[jira] [Commented] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215369#comment-15215369 ] Pengcheng Xiong commented on HIVE-13372:

[~big60], could you apply the patch and try? ccing [~ashutoshc], could you please take a look? Thanks.

> Hive Macro overwritten when multiple macros are used in one column
> --
> Key: HIVE-13372
> URL: https://issues.apache.org/jira/browse/HIVE-13372
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.1
> Reporter: Lin Liu
> Assignee: Pengcheng Xiong
> Priority: Critical
> Attachments: HIVE-13372.01.patch
>
> When multiple macros are used in one column, results of the later ones are
> overwritten by that of the first.
> For example:
> Suppose we have created a table called macro_test with a single column x of
> STRING type, with data:
> "a"
> "bb"
> "ccc"
> We also create three macros:
> {code}
> CREATE TEMPORARY MACRO STRING_LEN(x string) length(x);
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1;
> CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2;
> {code}
> When we run the following query,
> {code}
> SELECT
> CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a
> FROM macro_test
> SORT BY a DESC;
> {code}
> we get the result:
> 3:3:3
> 2:2:2
> 1:1:1
> instead of the expected:
> 3:4:5
> 2:3:4
> 1:2:3
> Currently we are using Hive 1.2.1, and have applied both the HIVE-11432 and
> HIVE-12277 patches.
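For intuition, the reported 3:3:3 pattern is what you get when all three macro invocations end up resolving to a single shared evaluator. The Python sketch below is purely illustrative: it is not Hive's macro implementation, and a registry keyed only by parameter signature is a hypothetical cause invented for the demo.

```python
# Illustrative sketch only -- NOT Hive's implementation. It reproduces the
# symptom from the report: every macro call yields the first macro's result.

def string_len(x):
    return len(x)

def string_len_plus_one(x):
    return len(x) + 1

def string_len_plus_two(x):
    return len(x) + 2

# Buggy registry (hypothetical): macros are keyed only by their parameter
# signature, so the second and third registrations silently collide with
# the first one.
buggy_registry = {}
for fn in (string_len, string_len_plus_one, string_len_plus_two):
    buggy_registry.setdefault("(x string)", fn)  # first registration wins

def evaluate(registry, arg):
    # The lookup ignores which macro the query actually named.
    return registry["(x string)"](arg)

for row in ("ccc", "bb", "a"):
    print(":".join(str(evaluate(buggy_registry, row)) for _ in range(3)))
# Prints 3:3:3 / 2:2:2 / 1:1:1 -- the buggy output from the report,
# instead of the expected 3:4:5 / 2:3:4 / 1:2:3.
```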
[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13372: Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13372: Attachment: HIVE-13372.01.patch
[jira] [Updated] (HIVE-13373) Use most specific type for numerical constants
[ https://issues.apache.org/jira/browse/HIVE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13373: Status: Patch Available (was: Open)

> Use most specific type for numerical constants
> --
> Key: HIVE-13373
> URL: https://issues.apache.org/jira/browse/HIVE-13373
> Project: Hive
> Issue Type: Bug
> Components: Types
> Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-13373.patch
>
> tinyint & smallint constants are currently inferred as int if they are
> written without a postfix.
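For background, these are the standard HiveQL integral literal postfixes the issue refers to (shown as context, not content from the patch):

```sql
-- HiveQL integral literal postfixes:
SELECT 100Y,  -- tinyint  (Y postfix)
       100S,  -- smallint (S postfix)
       100L,  -- bigint   (L postfix)
       100;   -- no postfix: typed as int today; this issue proposes
              -- inferring the most specific type that fits the constant
```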
[jira] [Updated] (HIVE-13373) Use most specific type for numerical constants
[ https://issues.apache.org/jira/browse/HIVE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13373: Attachment: HIVE-13373.patch
[jira] [Updated] (HIVE-13268) Add a HA mini cluster type in MiniHS2
[ https://issues.apache.org/jira/browse/HIVE-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HIVE-13268: Attachment: HIVE-13268.3.patch

[~sershe] Thanks for the review. TestPigHBaseStorageHandler passed on my local machine. I uploaded a new patch and want to see the new Jenkins results. It includes a new unit test class cloned from TestJdbcWithMiniMR.

> Add a HA mini cluster type in MiniHS2
> -
> Key: HIVE-13268
> URL: https://issues.apache.org/jira/browse/HIVE-13268
> Project: Hive
> Issue Type: Test
> Components: Tests
> Reporter: Takanobu Asanuma
> Assignee: Takanobu Asanuma
> Priority: Minor
> Attachments: HIVE-13268.1.patch, HIVE-13268.2.patch, HIVE-13268.3.patch
>
> We need an HA mini cluster for unit tests. This jira is for implementing
> that in MiniHS2.
[jira] [Updated] (HIVE-13130) HS2 changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13130: Attachment: HIVE-13130.4.patch

> HS2 changes : API calls for retrieving primary keys and foreign keys information
> -
> Key: HIVE-13130
> URL: https://issues.apache.org/jira/browse/HIVE-13130
> Project: Hive
> Issue Type: Sub-task
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13130.1.patch, HIVE-13130.2.patch, HIVE-13130.3.patch, HIVE-13130.4.patch
>
> ODBC exposes the SQLPrimaryKeys and SQLForeignKeys API calls and JDBC exposes
> getPrimaryKeys and getCrossReference API calls. We need to provide these
> interfaces as part of PK/FK implementation in Hive.
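To illustrate the shape of the result such metadata calls return, here is a sketch using Python's sqlite3 module, chosen only because it is self-contained; Hive's own interface would be the JDBC getPrimaryKeys / ODBC SQLPrimaryKeys call, which similarly reports each table's key columns with their ordinal position in the key.

```python
import sqlite3

# Illustration only, using sqlite3 rather than HiveServer2: metadata APIs such
# as JDBC's DatabaseMetaData.getPrimaryKeys return, per table, the columns
# that participate in the primary key plus their position within it.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER,
        line_no  INTEGER,
        item     TEXT,
        PRIMARY KEY (order_id, line_no)
    )
""")

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk);
# pk > 0 marks a primary-key column and gives its 1-based position in the key.
pk_cols = [(row[1], row[5])
           for row in conn.execute("PRAGMA table_info(orders)")
           if row[5] > 0]
print(pk_cols)  # [('order_id', 1), ('line_no', 2)]
```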
[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13371: Affects Version/s: 2.0.0

> Fix test failure of testHasNull in TestColumnStatistics running on Windows
> --
> Key: HIVE-13371
> URL: https://issues.apache.org/jira/browse/HIVE-13371
> Project: Hive
> Issue Type: Test
> Affects Versions: 2.0.0
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Priority: Minor
> Fix For: 2.1.0
> Attachments: HIVE-13371.01.patch
>
> As per [~prasanth_j]'s analysis,
> {code}
> The ColumnStatistics test failures are already known to occur on Windows.
> This is mostly a file size difference, which is not a product issue and can
> be safely ignored. The reason for the failure is that the ORC stripe footer
> stores the timezone ID as a string. Since Linux and Windows produce different
> timezone ID strings, the size of the stripe footer changes accordingly, and
> because of this timezone difference the file sizes differ between Windows
> and Linux. We can either update the orc-file-has-null.out output file on
> Windows or ignore this test altogether.
> {code}
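To make the size effect concrete: the stripe footer serializes the writer's timezone ID as a string, and the platform-specific ID formats have different lengths. The two IDs below are typical examples, not values taken from the failing test:

```python
# Typical timezone identifiers on each platform (illustrative values only):
linux_tz = "America/Los_Angeles"      # IANA-style ID, common on Linux
windows_tz = "Pacific Standard Time"  # Windows-style timezone name

# Different string lengths => different serialized stripe-footer sizes =>
# different overall ORC file sizes, which is what a size-based test
# assertion trips on.
print(len(linux_tz.encode("utf-8")), len(windows_tz.encode("utf-8")))  # prints: 19 21
```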
[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13371: Fix Version/s: 2.1.0
[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13371: Resolution: Fixed Status: Resolved (was: Patch Available)
[jira] [Commented] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215084#comment-15215084 ] Hive QA commented on HIVE-13349:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795730/HIVE-13349.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:green}SUCCESS:{color} +1 due to 4 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/132/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/132/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-132/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795730 - PreCommit-HIVE-METASTORE-Test

> Metastore Changes : API calls for retrieving primary keys and foreign keys information
> --
> Key: HIVE-13349
> URL: https://issues.apache.org/jira/browse/HIVE-13349
> Project: Hive
> Issue Type: Sub-task
> Components: CBO, Logical Optimizer
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13349.1.patch
[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13349: Summary: Metastore Changes : API calls for retrieving primary keys and foreign keys information (was: Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information)
[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13349: Attachment: (was: HIVE-13349.1.patch)
[jira] [Updated] (HIVE-13349) Metastore Changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13349: Attachment: HIVE-13349.1.patch
[jira] [Commented] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215070#comment-15215070 ] Pengcheng Xiong commented on HIVE-13371: OK. Pushed to master.
[jira] [Commented] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215062#comment-15215062 ] Lin Liu commented on HIVE-13372:

Thanks, Pengcheng. Just FYI: after some investigation, we found that the error is not related to SORT BY. Even without the SORT BY clause, if the table's data is large enough to launch an MR job, the result is still wrong.
[jira] [Commented] (HIVE-13349) Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215060#comment-15215060 ] Hive QA commented on HIVE-13349:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795726/HIVE-13349.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:green}SUCCESS:{color} +1 due to 4 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/131/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/131/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-131/

Messages:
{noformat}
LXC derby found.
LXC derby is not started. Starting container...
Container started.
Preparing derby container...
Container prepared.
Calling /hive/testutils/metastore/dbs/derby/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/derby/execute.sh ...
Tests executed.
LXC mysql found.
LXC mysql is not started. Starting container...
Container started.
Preparing mysql container...
Container prepared.
Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/mysql/execute.sh ...
Tests executed.
LXC oracle found.
LXC oracle is not started. Starting container...
Container started.
Preparing oracle container...
Container prepared.
Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/oracle/execute.sh ...
Tests executed.
LXC postgres found.
LXC postgres is not started. Starting container...
Container started.
Preparing postgres container...
Container prepared.
Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ...
Server prepared.
Calling /hive/testutils/metastore/dbs/postgres/execute.sh ...
Tests executed.
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795726 - PreCommit-HIVE-METASTORE-Test
[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on Windows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13371: Summary: Fix test failure of testHasNull in TestColumnStatistics running on Windows (was: Fix test failure of testHasNull in TestColumnStatistics running on WIndows)
[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksei Statkevich updated HIVE-13372: -- Description: When multiple macros are used in one column, results of the later ones are over written by that of the first. For example: Suppose we have created a table called macro_test with single column x in STRING type, and with data as: "a" "bb" "ccc" We also create three macros: {code} CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; {code} When we ran the following query, {code} SELECT CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a FROM macro_test SORT BY a DESC; {code} We get result: 3:3:3 2:2:2 1:1:1 instead of expected: 3:4:5 2:3:4 1:2:3 Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and HIVE-12277 patches. was: When multiple macros are used in one column, results of the later ones are over written by that of the first. For example: Suppose we have created a table called macro_test with single column x in STRING type, and with data as: "a" "bb" "ccc" We also create three macros: CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; When we ran the following query, SELECT CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a FROM macro_test SORT BY a DESC; We get result: 3:3:3 2:2:2 1:1:1 instead of expected: 3:4:5 2:3:4 1:2:3 Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and HIVE-12277 patches. 
> Hive Macro overwritten when multiple macros are used in one column > -- > > Key: HIVE-13372 > URL: https://issues.apache.org/jira/browse/HIVE-13372 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Lin Liu >Assignee: Pengcheng Xiong >Priority: Critical > > When multiple macros are used in one column, results of the later ones are > over written by that of the first. > For example: > Suppose we have created a table called macro_test with single column x in > STRING type, and with data as: > "a" > "bb" > "ccc" > We also create three macros: > {code} > CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); > CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; > CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; > {code} > When we ran the following query, > {code} > SELECT > CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", > STRING_LEN_PLUS_TWO(x)) a > FROM macro_test > SORT BY a DESC; > {code} > We get result: > 3:3:3 > 2:2:2 > 1:1:1 > instead of expected: > 3:4:5 > 2:3:4 > 1:2:3 > Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and > HIVE-12277 patches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
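For reference, the expected (non-buggy) evaluation described in the report can be reproduced outside Hive. This is a minimal Python sketch of the intended macro semantics — hypothetical function names, not Hive code — showing that each macro should be evaluated independently per row:

```python
# Simulate the three temporary macros from the report. Evaluated
# correctly, each produces a different value, so the concatenation
# must differ per macro (the bug collapses them to the first result).

def string_len(x: str) -> int:
    return len(x)

def string_len_plus_one(x: str) -> int:
    return len(x) + 1

def string_len_plus_two(x: str) -> int:
    return len(x) + 2

rows = ["a", "bb", "ccc"]

# Mirrors: SELECT CONCAT(...) a FROM macro_test SORT BY a DESC
expected = sorted(
    (f"{string_len(x)}:{string_len_plus_one(x)}:{string_len_plus_two(x)}"
     for x in rows),
    reverse=True,
)
print(expected)
```

This reproduces the "expected" column of the report (3:4:5, 2:3:4, 1:2:3), whereas the bug yields the first macro's result three times.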
[jira] [Commented] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on WIndows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215051#comment-15215051 ] Ashutosh Chauhan commented on HIVE-13371: - +1. Test-only changes need not go through the Hive QA cycle. > Fix test failure of testHasNull in TestColumnStatistics running on WIndows > -- > > Key: HIVE-13371 > URL: https://issues.apache.org/jira/browse/HIVE-13371 > Project: Hive > Issue Type: Test >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Minor > Attachments: HIVE-13371.01.patch > > > As per [~prasanth_j]'s analysis, > {code} > The ColumnStatistics test failures are already known to fail in Windows. This > is mostly a file size difference which is not a product issue and can be > ignored safely. The reason for the failure is ORC stripe footer stores the > timezone ID as string in stripe footer. Since Linux and Windows produces > different timezone id string, the size of the stripe footer will change > accordingly. Because of this timezone difference the file sizes will be > different on windows and linux. We can either update the > orc-file-has-null.out output file on Windows or ignore this test altogether. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on WIndows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13371: --- Status: Patch Available (was: Open) > Fix test failure of testHasNull in TestColumnStatistics running on WIndows > -- > > Key: HIVE-13371 > URL: https://issues.apache.org/jira/browse/HIVE-13371 > Project: Hive > Issue Type: Test >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Minor > Attachments: HIVE-13371.01.patch > > > As per [~prasanth_j]'s analysis, > {code} > The ColumnStatistics test failures are already known to fail in Windows. This > is mostly a file size difference which is not a product issue and can be > ignored safely. The reason for the failure is ORC stripe footer stores the > timezone ID as string in stripe footer. Since Linux and Windows produces > different timezone id string, the size of the stripe footer will change > accordingly. Because of this timezone difference the file sizes will be > different on windows and linux. We can either update the > orc-file-has-null.out output file on Windows or ignore this test altogether. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13349) Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13349: - Attachment: HIVE-13349.1.patch > Metastore Changes : HS2 changes : API calls for retrieving primary keys and > foreign keys information > > > Key: HIVE-13349 > URL: https://issues.apache.org/jira/browse/HIVE-13349 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13349.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215045#comment-15215045 ] Pengcheng Xiong commented on HIVE-13372: Interesting. I will take a look. > Hive Macro overwritten when multiple macros are used in one column > -- > > Key: HIVE-13372 > URL: https://issues.apache.org/jira/browse/HIVE-13372 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Lin Liu >Priority: Critical > > When multiple macros are used in one column, results of the later ones are > over written by that of the first. > For example: > Suppose we have created a table called macro_test with single column x in > STRING type, and with data as: > "a" > "bb" > "ccc" > We also create three macros: > CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); > CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; > CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; > When we ran the following query, > SELECT > CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", > STRING_LEN_PLUS_TWO(x)) a > FROM macro_test > SORT BY a DESC; > We get result: > 3:3:3 > 2:2:2 > 1:1:1 > instead of expected: > 3:4:5 > 2:3:4 > 1:2:3 > Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and > HIVE-12277 patches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13349) Metastore Changes : HS2 changes : API calls for retrieving primary keys and foreign keys information
[ https://issues.apache.org/jira/browse/HIVE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13349: - Status: Patch Available (was: Open) > Metastore Changes : HS2 changes : API calls for retrieving primary keys and > foreign keys information > > > Key: HIVE-13349 > URL: https://issues.apache.org/jira/browse/HIVE-13349 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13349.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HIVE-13372: --- Description: When multiple macros are used in one column, results of the later ones are over written by that of the first. For example: Suppose we have created a table called macro_test with single column x in STRING type, and with data as: "a" "bb" "ccc" We also create three macros: CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; When we ran the following query, SELECT CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a FROM macro_test SORT BY a DESC; We get result: 3:3:3 2:2:2 1:1:1 instead of expected: 3:4:5 2:3:4 1:2:3 Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and HIVE-12277 patches. was: When multiple macros are used in one column, results of the later ones are over written by that of the first. For example: Suppose we have created a table called macro_test with single column x in STRING type, and with data as: "a" "bb" "ccc" We also create three macros: CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; When we ran the following query, SELECT CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a FROM macro_test SORT BY a DESC; We get result: 3:3:3 2:2:2 1:1:1 instead of expected: 3:4:5 2:3:4 1:2:3 Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and Hive-12277 patches. 
> Hive Macro overwritten when multiple macros are used in one column > -- > > Key: HIVE-13372 > URL: https://issues.apache.org/jira/browse/HIVE-13372 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Lin Liu >Priority: Critical > > When multiple macros are used in one column, results of the later ones are > over written by that of the first. > For example: > Suppose we have created a table called macro_test with single column x in > STRING type, and with data as: > "a" > "bb" > "ccc" > We also create three macros: > CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); > CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; > CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; > When we ran the following query, > SELECT > CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", > STRING_LEN_PLUS_TWO(x)) a > FROM macro_test > SORT BY a DESC; > We get result: > 3:3:3 > 2:2:2 > 1:1:1 > instead of expected: > 3:4:5 > 2:3:4 > 1:2:3 > Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and > HIVE-12277 patches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13372) Hive Macro overwritten when multiple macros are used in one column
[ https://issues.apache.org/jira/browse/HIVE-13372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HIVE-13372: --- Description: When multiple macros are used in one column, results of the later ones are over written by that of the first. For example: Suppose we have created a table called macro_test with single column x in STRING type, and with data as: "a" "bb" "ccc" We also create three macros: CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; When we ran the following query, SELECT CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a FROM macro_test SORT BY a DESC; We get result: 3:3:3 2:2:2 1:1:1 instead of expected: 3:4:5 2:3:4 1:2:3 Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and Hive-12277 patches. was: When multiple macros are used in one column, results of the later ones are over written by that of the first. 
For example: Suppose we have created a table called macro_test with single column x in STRING type, and with data as: "a" "bb" "ccc" We also create three macros: CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; When we ran the following query, SELECT CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", STRING_LEN_PLUS_TWO(x)) a FROM macro_test SORT BY a DESC; We get result: 3:3:3 2:2:2 1:1:1 instead of expected: 3:4:5 2:3:4 1:2:3 > Hive Macro overwritten when multiple macros are used in one column > -- > > Key: HIVE-13372 > URL: https://issues.apache.org/jira/browse/HIVE-13372 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Lin Liu >Priority: Critical > > When multiple macros are used in one column, results of the later ones are > over written by that of the first. > For example: > Suppose we have created a table called macro_test with single column x in > STRING type, and with data as: > "a" > "bb" > "ccc" > We also create three macros: > CREATE TEMPORARY MACRO STRING_LEN(x string) length(x); > CREATE TEMPORARY MACRO STRING_LEN_PLUS_ONE(x string) length(x)+1; > CREATE TEMPORARY MACRO STRING_LEN_PLUS_TWO(x string) length(x)+2; > When we ran the following query, > SELECT > CONCAT(STRING_LEN(x), ":", STRING_LEN_PLUS_ONE(x), ":", > STRING_LEN_PLUS_TWO(x)) a > FROM macro_test > SORT BY a DESC; > We get result: > 3:3:3 > 2:2:2 > 1:1:1 > instead of expected: > 3:4:5 > 2:3:4 > 1:2:3 > Currently we are using Hive 1.2.1, and have applied both HIVE-11432 and > Hive-12277 patches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12290) Native Vector ReduceSink
[ https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shannon Ladymon updated HIVE-12290: --- Labels: (was: TODOC2.0) > Native Vector ReduceSink > > > Key: HIVE-12290 > URL: https://issues.apache.org/jira/browse/HIVE-12290 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.0.0 > > Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, > HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, > HIVE-12290.06.patch > > > Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so > we incur object inspector costs. > Native vectorization will not use object inspectors and allocate memory up > front that will be reused for each batch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12290) Native Vector ReduceSink
[ https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215034#comment-15215034 ] Shannon Ladymon commented on HIVE-12290: Doc done: * [Configuration Properties - hive.vectorized.execution.reducesink.new.enabled | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.execution.reducesink.new.enabled] > Native Vector ReduceSink > > > Key: HIVE-12290 > URL: https://issues.apache.org/jira/browse/HIVE-12290 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.0.0 > > Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, > HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, > HIVE-12290.06.patch > > > Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so > we incur object inspector costs. > Native vectorization will not use object inspectors and allocate memory up > front that will be reused for each batch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215008#comment-15215008 ] Shannon Ladymon commented on HIVE-9824: --- [~mmccline], just wanted to check in once more about whether the description for *hive.vectorized.execution.mapjoin.minmax.enabled* should read "max/max" or "min/max". > LLAP: Native Vectorization of Map Join > -- > > Key: HIVE-9824 > URL: https://issues.apache.org/jira/browse/HIVE-9824 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Labels: TODOC1.2 > Fix For: 1.2.0 > > Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, > HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, > HIVE-9824.08.patch, HIVE-9824.09.patch > > > Today's VectorMapJoinOperator is a pass-through that converts each row from a > vectorized row batch in a Java Object[] row and passes it to the > MapJoinOperator superclass. > This enhancement creates specialized vectorized map join operator classes > that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-13370: Priority: Minor (was: Major) > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13371) Fix test failure of testHasNull in TestColumnStatistics running on WIndows
[ https://issues.apache.org/jira/browse/HIVE-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13371: --- Description: As per [~prasanth_j]'s analysis, {code} The ColumnStatistics test failures are already known to fail in Windows. This is mostly a file size difference which is not a product issue and can be ignored safely. The reason for the failure is ORC stripe footer stores the timezone ID as string in stripe footer. Since Linux and Windows produces different timezone id string, the size of the stripe footer will change accordingly. Because of this timezone difference the file sizes will be different on windows and linux. We can either update the orc-file-has-null.out output file on Windows or ignore this test altogether. {code} > Fix test failure of testHasNull in TestColumnStatistics running on WIndows > -- > > Key: HIVE-13371 > URL: https://issues.apache.org/jira/browse/HIVE-13371 > Project: Hive > Issue Type: Test >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Minor > > As per [~prasanth_j]'s analysis, > {code} > The ColumnStatistics test failures are already known to fail in Windows. This > is mostly a file size difference which is not a product issue and can be > ignored safely. The reason for the failure is ORC stripe footer stores the > timezone ID as string in stripe footer. Since Linux and Windows produces > different timezone id string, the size of the stripe footer will change > accordingly. Because of this timezone difference the file sizes will be > different on windows and linux. We can either update the > orc-file-has-null.out output file on Windows or ignore this test altogether. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
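The analysis above hinges on the stripe footer embedding the writer's timezone ID as a string, so file size varies with the length of that ID. A tiny Python sketch makes the mechanism concrete — the two IDs below are illustrative assumptions, not necessarily what ORC emits on either platform:

```python
# Two otherwise-identical ORC files differ in size by the difference
# in the length of the embedded timezone ID string. The IDs here are
# example values only (Linux typically uses IANA-style IDs; Windows
# uses its own display names).
linux_tz = "America/Los_Angeles"      # 19 bytes in UTF-8
windows_tz = "Pacific Standard Time"  # 21 bytes in UTF-8

size_delta = len(windows_tz.encode("utf-8")) - len(linux_tz.encode("utf-8"))
print(size_delta)
```

Any golden-file check that compares exact file sizes will therefore diverge across platforms, which is why the report suggests a Windows-specific .out file or skipping the test.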
[jira] [Commented] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214935#comment-15214935 ] Daniel Dai commented on HIVE-13370: --- +1 > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13281) Update some default configs for LLAP
[ https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13281: -- Attachment: HIVE-13281.2.patch Updated patch to fix the test failures. > Update some default configs for LLAP > > > Key: HIVE-13281 > URL: https://issues.apache.org/jira/browse/HIVE-13281 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-13281.1.patch, HIVE-13281.2.patch > > > Disable uber mode. > Enable llap.io by default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10729) Query failed when select complex columns from joinned table (tez map join only)
[ https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214922#comment-15214922 ] Matt McCline commented on HIVE-10729: - Thank you for the review. > Query failed when select complex columns from joinned table (tez map join > only) > --- > > Key: HIVE-10729 > URL: https://issues.apache.org/jira/browse/HIVE-10729 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Selina Zhang >Assignee: Matt McCline > Attachments: HIVE-10729.03.patch, HIVE-10729.04.patch, > HIVE-10729.1.patch, HIVE-10729.2.patch > > > When map join happens, if projection columns include complex data types, > query will fail. > Steps to reproduce: > {code:sql} > hive> set hive.auto.convert.join; > hive.auto.convert.join=true > hive> desc foo; > a array > hive> select * from foo; > [1,2] > hive> desc src_int; > key int > value string > hive> select * from src_int where key=2; > 2val_2 > hive> select * from foo join src_int src on src.key = foo.a[1]; > {code} > Query will fail with stack trace > {noformat} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray cannot be cast to > [Ljava.lang.Object; > at > org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:111) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:314) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:692) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644) > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676) > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386) > ... 23 more > {noformat} > Similar error when projection columns include a map: > {code:sql} > hive> CREATE TABLE test (a INT, b MAP) STORED AS ORC; > hive> INSERT OVERWRITE TABLE test SELECT 1, MAP(1, "val_1", 2, "val_2") FROM > src LIMIT 1; > hive> select * from src join test where src.key=test.a; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13111) Fix timestamp / interval_day_time wrong results with HIVE-9862
[ https://issues.apache.org/jira/browse/HIVE-13111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13111: Attachment: HIVE-13111.07.patch Reduce range of random timestamp produced due to HIVE-12531. > Fix timestamp / interval_day_time wrong results with HIVE-9862 > --- > > Key: HIVE-13111 > URL: https://issues.apache.org/jira/browse/HIVE-13111 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13111.01.patch, HIVE-13111.02.patch, > HIVE-13111.03.patch, HIVE-13111.04.patch, HIVE-13111.05.patch, > HIVE-13111.06.patch, HIVE-13111.07.patch > > > Fix timestamp / interval_day_time issues discovered when testing the > Vectorized Text patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11936) Support SQLAnywhere as a backing DB for the hive metastore
[ https://issues.apache.org/jira/browse/HIVE-11936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11936: Assignee: (was: Sushanth Sowmyan) > Support SQLAnywhere as a backing DB for the hive metastore > -- > > Key: HIVE-11936 > URL: https://issues.apache.org/jira/browse/HIVE-11936 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sushanth Sowmyan > > I've had pings from people interested in enabling the metastore to work on > top of SQLAnywhere (17+), and thus, opening this jira to track changes needed > in hive to make SQLAnywhere work as a backing db for the metastore. > I have it working and passing all tests currently in my setup, and will > upload patches as I'm able to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11936) Support SQLAnywhere as a backing DB for the hive metastore
[ https://issues.apache.org/jira/browse/HIVE-11936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214869#comment-15214869 ] Sushanth Sowmyan commented on HIVE-11936: - Opening this up as unassigned - the changes required changes to DN for a new SQLAnywhere adapter, and also more changes that I'd not been able to test enough, and haven't followed up on for a while. If someone else wants to take this on, please go ahead. > Support SQLAnywhere as a backing DB for the hive metastore > -- > > Key: HIVE-11936 > URL: https://issues.apache.org/jira/browse/HIVE-11936 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > > I've had pings from people interested in enabling the metastore to work on > top of SQLAnywhere (17+), and thus, opening this jira to track changes needed > in hive to make SQLAnywhere work as a backing db for the metastore. > I have it working and passing all tests currently in my setup, and will > upload patches as I'm able to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214840#comment-15214840 ] Mithun Radhakrishnan commented on HIVE-13370: - Thanks for adding the test, [~sushanth]. +1. Looks good. > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-13370: Status: Patch Available (was: Open) > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-13370: Attachment: HIVE-13370.patch Patch attached. > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13369) AcidUtils.getAcidState() is not paying attention to ValidTxnList when choosing the "best" base file
[ https://issues.apache.org/jira/browse/HIVE-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-13369: -- Description: The JavaDoc on getAcidState() reads, in part: "Note that because major compactions don't preserve the history, we can't use a base directory that includes a transaction id that we must exclude." which is correct but there is nothing in the code that does this. was: The JavaDoc on getAcidState() reads, in part: "Note that because major compactions don't * preserve the history, we can't use a base directory that includes a * transaction id that we must exclude." which is correct but there is nothing in the code that does this. > AcidUtils.getAcidState() is not paying attention toValidTxnList when choosing > the "best" base file > -- > > Key: HIVE-13369 > URL: https://issues.apache.org/jira/browse/HIVE-13369 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Blocker > > The JavaDoc on getAcidState() reads, in part: > "Note that because major compactions don't >preserve the history, we can't use a base directory that includes a >transaction id that we must exclude." > which is correct but there is nothing in the code that does this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
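The missing behavior described in the JavaDoc amounts to: pick the newest base directory whose transaction id is visible to the reader, skipping any base that includes an excluded transaction. A hedged Python sketch of that selection logic follows — hypothetical names, not Hive's actual implementation:

```python
# Choose the "best" base_<txnid> directory for an ACID read. Because a
# major compaction discards history, a base whose txn id the reader
# must exclude cannot be used at all; fall through to the next-newest
# valid base, or to the original files if none qualifies.
def choose_base(base_txn_ids, is_txn_valid):
    for txn in sorted(base_txn_ids, reverse=True):
        if is_txn_valid(txn):
            return txn
    return None  # no usable base; read original files + deltas

# Reader's ValidTxnList cannot see txns above 50: base_70 is skipped.
best = choose_base([10, 50, 70], lambda t: t <= 50)
```

The bug reported here is that the existing code always takes the newest base (70 in this sketch) without consulting the reader's valid-transaction list.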
[jira] [Commented] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214833#comment-15214833 ] Sushanth Sowmyan commented on HIVE-13370: - [~mithun]/[~daijy], could you please review? > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-13370: Description: HIVE-11470 added capability to handle NULL dynamic partitioning keys properly. However, it did not add a test for the case, we should have one so we don't have future regressions of the same. > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead
[ https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214825#comment-15214825 ] Sergey Shelukhin commented on HIVE-13345: - I think the problem is/was that ORC readers were created with proto objects. Anyway, I'll take a look at how complex both approaches are at some point (this week?) > LLAP: metadata cache takes too much space, esp. with bloom filters, due to > Java/protobuf overhead > - > > Key: HIVE-13345 > URL: https://issues.apache.org/jira/browse/HIVE-13345 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > We cache java objects currently; these have high overhead, average stripe > metadata takes 200-500Kb on real files, and with bloom filters blowing up > more than x5 due to being stored as list of Long-s, up to 5Mb per stripe. > That is undesirable. > We should either create better objects for ORC (might be good in general) or > store serialized metadata and deserialize when needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead
[ https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214816#comment-15214816 ] Prasanth Jayachandran commented on HIVE-13345: -- IMO we should store the serialized representation of metadata. Deserialized representation of metadata (Proto objects) are supposed to be short-lived. We have POJOs for all protobuf equivalents. BloomFilter, ColumnStatistics, StripeInformation etc. which creates POJOs from Proto objects. If we are caching the deserialized representation then we should cache the equivalent POJOs and not the proto objects. > LLAP: metadata cache takes too much space, esp. with bloom filters, due to > Java/protobuf overhead > - > > Key: HIVE-13345 > URL: https://issues.apache.org/jira/browse/HIVE-13345 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > We cache java objects currently; these have high overhead, average stripe > metadata takes 200-500Kb on real files, and with bloom filters blowing up > more than x5 due to being stored as list of Long-s, up to 5Mb per stripe. > That is undesirable. > We should either create better objects for ORC (might be good in general) or > store serialized metadata and deserialize when needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
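The overhead under discussion — bloom filter words held as a list of boxed Long objects rather than a primitive array — can be illustrated with a small sketch. The types below are simplified stand-ins, not Hive's actual cache or POJO classes: each boxed Long costs an object header plus a reference, while a long[] stores 8 bytes per word.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: repacking boxed bloom-filter words into a
// primitive array, as a POJO-style cache entry might do.
public class BloomRepackSketch {
    // Input shape: what a proto-generated object effectively holds.
    public static long[] toPrimitive(List<Long> boxedWords) {
        long[] words = new long[boxedWords.size()];
        for (int i = 0; i < words.length; i++) {
            words[i] = boxedWords.get(i); // unbox once, at cache-populate time
        }
        return words;
    }

    public static void main(String[] args) {
        List<Long> boxed = new ArrayList<>();
        for (long i = 0; i < 4; i++) boxed.add(i * 0xABCDL);
        long[] packed = toPrimitive(boxed);
        // Same bits, but ~8 bytes per word instead of a boxed object per word.
        System.out.println(packed.length);            // prints 4
        System.out.println(packed[3] == 3 * 0xABCDL); // prints true
    }
}
```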
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Attachment: (was: HIVE-9660.WIP2.patch) > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.01.patch, HIVE-9660.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Attachment: HIVE-9660.01.patch Rebased the patch and fixed some small issues. > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.01.patch, HIVE-9660.WIP2.patch, > HIVE-9660.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
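The idea in the description — recording where each row group's compressed data ends, so the reader need not over-estimate — can be sketched as follows. The arrays and slack constant here are hypothetical illustrations, not the actual ORC RowIndex protobuf layout.

```java
// Illustrative sketch: exact read ranges from stored per-row-group end
// offsets, versus padding an estimated range.
public class RowGroupRangeSketch {
    // With end offsets stored, the bytes to read for row group i are
    // simply end[i] - start[i]: no estimation slop needed.
    public static long bytesToRead(long[] startOffsets, long[] endOffsets, int rowGroup) {
        return endOffsets[rowGroup] - startOffsets[rowGroup];
    }

    // Without end offsets, readers must over-read by some worst-case slack,
    // which is the "tons of extra data" the description mentions.
    public static long bytesToReadEstimated(long[] startOffsets, int rowGroup, long slack) {
        long start = startOffsets[rowGroup];
        long nextStart = rowGroup + 1 < startOffsets.length
                ? startOffsets[rowGroup + 1] : start + slack;
        return (nextStart + slack) - start; // padded estimate
    }

    public static void main(String[] args) {
        long[] starts = {0, 10_000, 25_000};
        long[] ends = {9_500, 24_000, 30_000};
        System.out.println(bytesToRead(starts, ends, 1));           // prints 14000
        System.out.println(bytesToReadEstimated(starts, 1, 4_096)); // prints 19096
    }
}
```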
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214796#comment-15214796 ] Gopal V commented on HIVE-12049: MaxRows = 1000 !old-driver-profiles.png! The hot codepath with the new driver is {code} Stacks at 2016-03-28 01:10:19 PM (uptime 7m 58 sec) faeb41dd-3869-40cc-860b-748f505d5565 eab06890-8bb8-478f-877a-9282f5b4d64e HiveServer2-Handler-Pool: Thread-788 [RUNNABLE] *** java.util.concurrent.ConcurrentHashMap.putAll(Map) ConcurrentHashMap.java:1084 *** java.util.concurrent.ConcurrentHashMap.<init>(Map) ConcurrentHashMap.java:852 *** org.apache.hadoop.conf.Configuration.<init>(Configuration) Configuration.java:713 *** org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf) HiveConf.java:3460 *** org.apache.hive.service.cli.operation.SQLOperation.getConfigForOperation() SQLOperation.java:529 *** org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(FetchOrientation, long) SQLOperation.java:360 *** org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationHandle, FetchOrientation, long) OperationManager.java:280 org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(OperationHandle, FetchOrientation, long, FetchType) HiveSessionImpl.java:786 org.apache.hive.service.cli.CLIService.fetchResults(OperationHandle, FetchOrientation, long, FetchType) CLIService.java:452 org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(TFetchResultsReq) ThriftCLIService.java:743 org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService$Iface, TCLIService$FetchResults_args) TCLIService.java:1557 org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(Object, TBase) TCLIService.java:1542 org.apache.thrift.ProcessFunction.process(int, TProtocol, TProtocol, Object) ProcessFunction.java:39 org.apache.thrift.TBaseProcessor.process(TProtocol, TProtocol) TBaseProcessor.java:39 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TProtocol, TProtocol) TSetIpAddressProcessor.java:56 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run() TThreadPoolServer.java:286 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1142 java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:617 java.lang.Thread.run() Thread.java:745 {code} > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, > HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. 
On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
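The profile above shows a full HiveConf copy (Configuration's map copy-constructor) landing on the hot path of every fetch call. One common remedy, sketched here with stand-in types rather than the actual SQLOperation/HiveConf code, is to build the per-operation config once and reuse it across fetches:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: compute an expensive per-operation config copy
// lazily, once, and cache it, instead of copying on every fetch.
public class CachedOperationConf {
    public static int copiesMade = 0; // instrumentation for the sketch

    // Stand-in for an expensive HiveConf copy-constructor.
    public static Map<String, String> expensiveCopy(Map<String, String> base) {
        copiesMade++;
        return new ConcurrentHashMap<>(base);
    }

    private final Map<String, String> base;
    private volatile Map<String, String> cached;

    public CachedOperationConf(Map<String, String> base) {
        this.base = base;
    }

    // Double-checked locking: at most one copy per operation lifetime.
    public Map<String, String> getConfigForOperation() {
        Map<String, String> c = cached;
        if (c == null) {
            synchronized (this) {
                if (cached == null) cached = expensiveCopy(base);
                c = cached;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        Map<String, String> base = new ConcurrentHashMap<>();
        base.put("hive.query.result.fileformat", "SequenceFile");
        CachedOperationConf op = new CachedOperationConf(base);
        for (int fetch = 0; fetch < 1000; fetch++) op.getConfigForOperation();
        System.out.println(copiesMade); // prints 1: one copy, not one per fetch
    }
}
```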
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12049: --- Attachment: old-driver-profiles.png > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, > HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-1718) Implement SerDe for processing fixed length data
[ https://issues.apache.org/jira/browse/HIVE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1718: - Assignee: (was: Shreepadma Venugopalan) > Implement SerDe for processing fixed length data > > > Key: HIVE-1718 > URL: https://issues.apache.org/jira/browse/HIVE-1718 > Project: Hive > Issue Type: New Feature > Components: Serializers/Deserializers >Reporter: Carl Steinbach > > Fixed length fields are pretty common in legacy data formats. While it is > already > possible to process these files using the RegexSerDe, they could be more > efficiently > handled using a SerDe that is specifically crafted for reading/writing fixed > length > fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13350) Support Alter commands for Rely/NoRely novalidate for PK/FK constraints
[ https://issues.apache.org/jira/browse/HIVE-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214774#comment-15214774 ] Alan Gates commented on HIVE-13350: --- Does this mean you want an alter command that can add a PK or FK? Or do you want to be able to add/drop the rely/no rely options? The latter doesn't make any sense since we have no ability to validate a PK or FK. > Support Alter commands for Rely/NoRely novalidate for PK/FK constraints > > > Key: HIVE-13350 > URL: https://issues.apache.org/jira/browse/HIVE-13350 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive
[ https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214772#comment-15214772 ] Alan Gates commented on HIVE-13290: --- Could you post a version of the patch without the generated code for easier review? > Support primary keys/foreign keys constraint as part of create table command > in Hive > > > Key: HIVE-13290 > URL: https://issues.apache.org/jira/browse/HIVE-13290 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, > HIVE-13290.3.patch > > > SUPPORT for the following statements > {code} > CREATE TABLE product > ( > product_idINTEGER, > product_vendor_id INTEGER, > PRIMARY KEY (product_id), > CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES > vendor(vendor_id) > ); > CREATE TABLE vendor > ( > vendor_id INTEGER, > PRIMARY KEY (vendor_id) > ); > {code} > In the above syntax, [CONSTRAINT constraint-Name] is optional. If this is not > specified by the user, we will use system generated constraint name. For the > purpose of simplicity, we will allow CONSTRAINT option for foreign keys and > not primary key since there is only one primary key per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13364) Allow llap to work with dynamic ports for rpc, shuffle, ui
[ https://issues.apache.org/jira/browse/HIVE-13364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214771#comment-15214771 ] Hive QA commented on HIVE-13364: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795524/HIVE-13364.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9882 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthString org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthTimestamp org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearString org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearTimestamp org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7398/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7398/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7398/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing 
org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12795524 - PreCommit-HIVE-TRUNK-Build > Allow llap to work with dynamic ports for rpc, shuffle, ui > -- > > Key: HIVE-13364 > URL: https://issues.apache.org/jira/browse/HIVE-13364 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-13364.1.patch > > > At the moment - the ports specified in the configuration are the ones which > are used to register with the Zookeeper service. Setting the ports to 0 > effectively means that the services cannot be discovered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
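The core of supporting port 0, as the issue above describes for ZooKeeper-based discovery, is publishing the port the OS actually assigned rather than the configured value. A minimal self-contained sketch, with a plain ServerSocket standing in for the LLAP rpc/shuffle/ui endpoints:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustrative sketch: bind on an ephemeral port (0) and discover the
// real port afterwards, which is what must be registered for discovery.
public class DynamicPortSketch {
    public static int bindAndReportPort(int configuredPort) {
        try (ServerSocket server = new ServerSocket(configuredPort)) {
            // With configuredPort == 0 the kernel picks a free port;
            // getLocalPort() returns the port actually bound.
            return server.getLocalPort();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        int port = bindAndReportPort(0);
        // Register `port`, not 0, in the service-discovery znode.
        System.out.println(port > 0); // prints true
    }
}
```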
[jira] [Updated] (HIVE-10249) ACID: show locks should show who the lock is waiting for
[ https://issues.apache.org/jira/browse/HIVE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10249: -- Attachment: HIVE-10249.3.patch patch 3 fixes the test issues > ACID: show locks should show who the lock is waiting for > > > Key: HIVE-10249 > URL: https://issues.apache.org/jira/browse/HIVE-10249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-10249.2.patch, HIVE-10249.3.patch, HIVE-10249.patch > > > instead of just showing state WAITING, we should include what the lock is > waiting for. It will make diagnostics easier. > It would also be useful to add QueryPlan.getQueryId() so it's easy to see > which query the lock belongs to. > # need to store this in HIVE_LOCKS (additional field); this has a perf hit to > do another update on failed attempt and to clear field on successful attempt. > (Actually on success, we update anyway). How exactly would this be > displayed? Each lock can block but we acquire all parts of external lock at > once. Since we stop at first one that blocked, we’d only update that one… > # This needs a matching Thrift change to pass to client: ShowLocksResponse > # Perhaps we can start updating this info after lock was in W state for some > time to reduce perf hit. > # This is mostly useful for “Why is my query stuck” -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214754#comment-15214754 ] Alan Gates commented on HIVE-11388: --- bq. Does this need to be documented in the wiki for releases 1.3.0 and 2.1.0? I am not sure. [~ekoifman] is this all of the work needed to make it possible to run multiple initiators and cleaners, or just part of it? Have we tested running them in multiple metastores? If the answer to those is yes, then the answer to [~leftylev]'s question is: "definitely". > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, > HIVE-11388.branch-1.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead
[ https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214728#comment-15214728 ] Sergey Shelukhin commented on HIVE-13345: - [~gopalv] [~prasanth_j] [~owen.omalley] opinions on the best approach? I am leaning towards changing ORC to use POJOs instead of OrcProto stuff, but as an alternative we can change metadata cache in LLAP to store serialized metadata. The cost of deserializing every time in LLAP vs the cost of copying fields/converting some things (e.g. OrcProto stores bloom filters as List<Long>, which aside from being horrible on pure merits, offends my engineering sensibilities, so I might be biased here). > LLAP: metadata cache takes too much space, esp. with bloom filters, due to > Java/protobuf overhead > - > > Key: HIVE-13345 > URL: https://issues.apache.org/jira/browse/HIVE-13345 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > We cache java objects currently; these have high overhead, average stripe > metadata takes 200-500Kb on real files, and with bloom filters blowing up > more than x5 due to being stored as list of Long-s, up to 5Mb per stripe. > That is undesirable. > We should either create better objects for ORC (might be good in general) or > store serialized metadata and deserialize when needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13345) LLAP: metadata cache takes too much space, esp. with bloom filters, due to Java/protobuf overhead
[ https://issues.apache.org/jira/browse/HIVE-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214728#comment-15214728 ] Sergey Shelukhin edited comment on HIVE-13345 at 3/28/16 7:27 PM: -- [~gopalv] [~prasanth_j] [~owen.omalley] opinions on the best approach? I am leaning towards changing ORC to use POJOs instead of OrcProto stuff, but as an alternative we can change metadata cache in LLAP to store serialized metadata. The cost of deserializing every time in LLAP vs the cost of copying fields/converting some things (e.g. OrcProto stores bloom filters as List<Long>, which aside from being horrible on purely practical grounds, offends my engineering sensibilities, so I might be biased here). was (Author: sershe): [~gopalv] [~prasanth_j] [~owen.omalley] opinions on the best approach? I am leaning towards changing ORC to use POJOs instead of OrcProto stuff, but as an alternative we can change metadata cache in LLAP to store serialized metadata. The cost of deserializing every time in LLAP vs the cost of copying fields/converting some things (e.g. OrcProto stores bloom filters as List<Long>, which aside from being horrible on pure merits, offends my engineering sensibilities, so I might be biased here). > LLAP: metadata cache takes too much space, esp. with bloom filters, due to > Java/protobuf overhead > - > > Key: HIVE-13345 > URL: https://issues.apache.org/jira/browse/HIVE-13345 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > We cache java objects currently; these have high overhead, average stripe > metadata takes 200-500Kb on real files, and with bloom filters blowing up > more than x5 due to being stored as list of Long-s, up to 5Mb per stripe. > That is undesirable. > We should either create better objects for ORC (might be good in general) or > store serialized metadata and deserialize when needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13332) support dumping all row indexes in ORC FileDump
[ https://issues.apache.org/jira/browse/HIVE-13332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13332: Attachment: HIVE-13332.01.patch Updated the patch with the out file changes for MiniTez... > support dumping all row indexes in ORC FileDump > --- > > Key: HIVE-13332 > URL: https://issues.apache.org/jira/browse/HIVE-13332 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13332.01.patch, HIVE-13332.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Attachment: HIVE-13149.6.patch > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some like StatsTask, don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Attachment: (was: HIVE-13149.6.patch) > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some like StatsTask, don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
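The pattern the description above argues for — not opening a metastore connection in start() unless a task actually needs one — is essentially lazy initialization. A sketch with stand-in types, not the real SessionState/TaskRunner APIs:

```java
// Illustrative sketch: defer an expensive connection until first use,
// so task threads that never touch the metastore never connect.
public class LazyConnectionSketch {
    public static int connectionsOpened = 0;

    public static final class Connection {
        Connection() { connectionsOpened++; } // stand-in for an HMS connect
    }

    public static final class Session {
        private Connection conn; // deliberately NOT created in start()

        public Connection getConnection() { // created on first demand only
            if (conn == null) conn = new Connection();
            return conn;
        }
    }

    public static void main(String[] args) {
        Session mapTask = new Session();   // a task that never needs HMS: no connect
        Session statsTask = new Session(); // a StatsTask-like task that does
        statsTask.getConnection();
        statsTask.getConnection();         // reuses, does not reconnect
        System.out.println(connectionsOpened); // prints 1
    }
}
```

(The single-threaded check in getConnection() is enough for the sketch; a per-thread session as in the quoted run() would make it one lazy connection per task thread at most.)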
[jira] [Updated] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect
[ https://issues.apache.org/jira/browse/HIVE-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-12992: -- Resolution: Fixed Fix Version/s: 2.0.1 2.1.0 1.2.2 Status: Resolved (was: Patch Available) > Hive on tez: Bucket map join plan is incorrect > -- > > Key: HIVE-12992 > URL: https://issues.apache.org/jira/browse/HIVE-12992 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Labels: tez > Fix For: 1.2.2, 2.1.0, 2.0.1 > > Attachments: HIVE-12992.1.patch, HIVE-12992.2.patch > > > TPCH Query 9 fails when bucket map join is enabled: > {code} > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer > 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in > EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to > sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, > targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, > taskVertexName=Map 1, edgeVertexName=Reducer 5, > taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, > destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, > destinationVertexName=Reducer 5, java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88) > at > org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458) > at > org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386) > at > org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202) > at > 
org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172) > at > org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13361) Orc concatenation should enforce the compression buffer size
[ https://issues.apache.org/jira/browse/HIVE-13361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13361: - Attachment: HIVE-13361.1.patch Not sure why precommit did not pick up the patch. Reuploading again. > Orc concatenation should enforce the compression buffer size > > > Key: HIVE-13361 > URL: https://issues.apache.org/jira/browse/HIVE-13361 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0, 2.1.0 >Reporter: Yi Zhang >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13361.1.patch, HIVE-13361.1.patch, alltypesorc3xcols > > > With HIVE-11807 buffer size estimation happens by default. This can have > undesired effect wrt file concatenation. Consider the following table with > files > {code} > testtable > -- 00_0 (created before HIVE-11807 which has buffer size 256KB) > -- 01_0 (created before HIVE-11807 which has buffer size 256KB) > -- 02_0 (created after HIVE-11807 with buffer size chosen as 128KB) > -- 03_0 (created after HIVE-11807 with buffer size chosen as 128KB) > {code} > If we perform ALTER TABLE .. CONCATENATE on the above table with HIVE-11807, > then depending on the split arrangement 00_0 and 01_0 will be > concatenated together to new merged file. But this new merged file will have > 128KB buffer size (estimated buffer size and not requested buffer size). > Since new ORC writer size does not honor the requested buffer size the new > merged files will have smaller buffers than the required 256KB making the > file unreadable. Following exception will be thrown when reading the table > after concatenation > {code} > 2016-03-24T16:26:33,974 ERROR [a9e27a9a-37cb-411d-9708-6c58a4ce34f2 main]: > CliDriver (SessionState.java:printError(1049)) - Failed with exception > java.io.IOException:java.lang.IllegalArgumentException: Buffer size too > small. size = 131072 needed = 153187 > java.io.IOException: java.lang.IllegalArgumentException: Buffer size too > small. 
size = 131072 needed = 153187 > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:513) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:420) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:145) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1848) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:721) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:648) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
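The enforcement the issue title asks for can be sketched in a few lines (the helper name `merged_buffer_size` is hypothetical; the actual patch changes the ORC writer, not a standalone function): the merged file's compression buffer must be at least as large as the largest buffer any input file was written with, otherwise readers hit the "Buffer size too small" error shown above.

```python
def merged_buffer_size(input_buffer_sizes, estimated_size):
    # ALTER TABLE .. CONCATENATE must not shrink the compression buffer:
    # a reader needs a buffer at least as large as the biggest compressed
    # chunk any input file was written with, so take the max of the
    # estimator's choice and every input file's buffer size.
    return max([estimated_size] + list(input_buffer_sizes))

KB = 1024
# Two pre-HIVE-11807 files written with 256KB buffers; the estimator now
# proposes 128KB. Enforcing the max keeps the merged file readable.
print(merged_buffer_size([256 * KB, 256 * KB], 128 * KB))  # 262144
```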
[jira] [Commented] (HIVE-13326) HiveServer2: Make ZK config publishing configurable
[ https://issues.apache.org/jira/browse/HIVE-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214672#comment-15214672 ] Thejas M Nair commented on HIVE-13326: -- +1 Thanks for creating the tests, we can now build on this for future service discovery patches! > HiveServer2: Make ZK config publishing configurable > --- > > Key: HIVE-13326 > URL: https://issues.apache.org/jira/browse/HIVE-13326 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-13326.1.patch, HIVE-13326.2.patch > > > We should revert to older behaviour when config publishing is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect
[ https://issues.apache.org/jira/browse/HIVE-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214647#comment-15214647 ] Vikram Dixit K commented on HIVE-12992: --- The bucket_map_join test failure is related. It is a golden file update that I missed. Posting a new patch here with golden file update for it. > Hive on tez: Bucket map join plan is incorrect > -- > > Key: HIVE-12992 > URL: https://issues.apache.org/jira/browse/HIVE-12992 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Labels: tez > Attachments: HIVE-12992.1.patch, HIVE-12992.2.patch > > > TPCH Query 9 fails when bucket map join is enabled: > {code} > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer > 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in > EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to > sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, > targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, > taskVertexName=Map 1, edgeVertexName=Reducer 5, > taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, > destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, > destinationVertexName=Reducer 5, java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88) > at > org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458) > at > org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386) > at > org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202) > at > 
org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172) > at > org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
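As a rough illustration of where the NullPointerException above comes from (names here are hypothetical; the real logic lives in CustomPartitionEdge.routeDataMovementEventToDestination), a custom edge maps each source task index to the destination task indices that should receive its DataMovementEvent, and an incorrect bucket map join plan leaves that lookup empty:

```python
def route_event(routing_table, source_index):
    # A custom partition edge routes a DataMovementEvent from a source
    # task to its destination tasks. With a broken plan the lookup yields
    # nothing, and an unguarded dereference becomes the NPE in the trace;
    # a defensive implementation fails loudly instead.
    targets = routing_table.get(source_index)
    if targets is None:
        raise ValueError(f"no route for source index {source_index}")
    return targets

print(route_event({0: [4, 5]}, 0))  # [4, 5]
```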
[jira] [Updated] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect
[ https://issues.apache.org/jira/browse/HIVE-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-12992: -- Attachment: HIVE-12992.2.patch > Hive on tez: Bucket map join plan is incorrect > -- > > Key: HIVE-12992 > URL: https://issues.apache.org/jira/browse/HIVE-12992 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Labels: tez > Attachments: HIVE-12992.1.patch, HIVE-12992.2.patch > > > TPCH Query 9 fails when bucket map join is enabled: > {code} > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer > 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in > EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to > sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, > targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, > taskVertexName=Map 1, edgeVertexName=Reducer 5, > taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, > destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, > destinationVertexName=Reducer 5, java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88) > at > org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458) > at > org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386) > at > org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202) > at > org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172) > at > 
org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive
[ https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214642#comment-15214642 ] Hive QA commented on HIVE-13290: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795670/HIVE-13290.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 4 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/130/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/130/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-METASTORE-Test-130/ Messages: {noformat} LXC derby found. LXC derby is not started. Starting container... Container started. Preparing derby container... Container prepared. Calling /hive/testutils/metastore/dbs/derby/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/derby/execute.sh ... Tests executed. LXC mysql found. LXC mysql is not started. Starting container... Container started. Preparing mysql container... Container prepared. Calling /hive/testutils/metastore/dbs/mysql/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/mysql/execute.sh ... Tests executed. LXC oracle found. LXC oracle is not started. Starting container... Container started. Preparing oracle container... Container prepared. Calling /hive/testutils/metastore/dbs/oracle/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/oracle/execute.sh ... Tests executed. LXC postgres found. LXC postgres is not started. Starting container... Container started. Preparing postgres container... Container prepared. Calling /hive/testutils/metastore/dbs/postgres/prepare.sh ... Server prepared. Calling /hive/testutils/metastore/dbs/postgres/execute.sh ... Tests executed. 
{noformat} This message is automatically generated. ATTACHMENT ID: 12795670 - PreCommit-HIVE-METASTORE-Test > Support primary keys/foreign keys constraint as part of create table command > in Hive > > > Key: HIVE-13290 > URL: https://issues.apache.org/jira/browse/HIVE-13290 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, > HIVE-13290.3.patch > > > Support for the following statements > {code} > CREATE TABLE product > ( > product_id INTEGER, > product_vendor_id INTEGER, > PRIMARY KEY (product_id), > CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES > vendor(vendor_id) > ); > CREATE TABLE vendor > ( > vendor_id INTEGER, > PRIMARY KEY (vendor_id) > ); > {code} > In the above syntax, [CONSTRAINT constraint-name] is optional. If this is not > specified by the user, we will use a system-generated constraint name. For the > purpose of simplicity, we will allow the CONSTRAINT option for foreign keys and > not for primary keys, since there is only one primary key per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
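The issue does not specify how the system-generated constraint name is formed; one plausible scheme, purely as a sketch (the prefix and format are assumptions, not Hive's actual generator):

```python
import uuid

def foreign_key_name(user_supplied=None):
    # [CONSTRAINT constraint-name] is optional for foreign keys; when the
    # user omits it, the system generates a unique name. This particular
    # "fk_" + random-hex scheme is illustrative only.
    return user_supplied if user_supplied else "fk_" + uuid.uuid4().hex[:8]

print(foreign_key_name("product_fk_1"))  # product_fk_1
print(foreign_key_name())                # e.g. fk_3f9a1c2d
```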
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214643#comment-15214643 ] Thejas M Nair commented on HIVE-12049: -- [~gopalv] Thanks for profiling it! What is it like without the optimization? What is the JDBC fetchRowSize being used? > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, > HIVE-12049.9.patch, new-driver-profiles.png > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
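The saving described in the issue can be sketched with pickle standing in for Thrift serialization (all names here are illustrative, not Hive's API): final tasks serialize each row batch once at write time, and the server streams the stored blobs without touching individual rows.

```python
import pickle

def write_result_blobs(rows, batch_size):
    # Final tasks: serialize each batch of rows exactly once, at write time.
    return [pickle.dumps(rows[i:i + batch_size])
            for i in range(0, len(rows), batch_size)]

def stream_results(blobs):
    # Server side: each fetch returns a stored blob verbatim -- no per-row
    # deserialization or re-encoding on the hot path.
    yield from blobs

def client_read(blobs):
    # Only the client decodes, while building its ResultSet.
    return [row for blob in stream_results(blobs) for row in pickle.loads(blob)]

rows = [(1, "a"), (2, "b"), (3, "c")]
print(client_read(write_result_blobs(rows, 2)))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```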
[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214636#comment-15214636 ] Pengcheng Xiong commented on HIVE-12960: [~sershe], thanks for your attention. The SQL part is not removed and my analysis is posted at 09/Feb/16 22:53. Could you please scroll up and take a look? :) > Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore > > > Key: HIVE-12960 > URL: https://issues.apache.org/jira/browse/HIVE-12960 > Project: Hive > Issue Type: Sub-task > Components: Statistics >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-12960.01.patch, HIVE-12960.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10729) Query failed when selecting complex columns from joined table (tez map join only)
[ https://issues.apache.org/jira/browse/HIVE-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214635#comment-15214635 ] Sergey Shelukhin commented on HIVE-10729: - Didn't look at the test file, I assume it's the same as w/o vector :) posSingleVectorMapJoinSmallTable - assumes there are two elements in the array, right? Should there be an assert? Looks good pending tests otherwise. +1 > Query failed when selecting complex columns from joined table (tez map join > only) > --- > > Key: HIVE-10729 > URL: https://issues.apache.org/jira/browse/HIVE-10729 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Selina Zhang >Assignee: Matt McCline > Attachments: HIVE-10729.03.patch, HIVE-10729.04.patch, > HIVE-10729.1.patch, HIVE-10729.2.patch > > > When a map join happens, if projection columns include complex data types, the > query will fail. > Steps to reproduce: > {code:sql} > hive> set hive.auto.convert.join; > hive.auto.convert.join=true > hive> desc foo; > a array<int> > hive> select * from foo; > [1,2] > hive> desc src_int; > key int > value string > hive> select * from src_int where key=2; > 2 val_2 > hive> select * from foo join src_int src on src.key = foo.a[1]; > {code} > The query fails with stack trace > {noformat} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray cannot be cast to > [Ljava.lang.Object; > at > org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:111) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:314) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50) > at > 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:692) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644) > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676) > at > org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386) > ... 23 more > {noformat} > A similar error occurs when projection columns include a map: > {code:sql} > hive> CREATE TABLE test (a INT, b MAP<INT, STRING>) STORED AS ORC; > hive> INSERT OVERWRITE TABLE test SELECT 1, MAP(1, "val_1", 2, "val_2") FROM > src LIMIT 1; > hive> select * from src join test where src.key=test.a; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
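The ClassCastException above can be pictured with toy stand-ins (hypothetical classes; the real fix is in Hive's serde and vectorized map-join code): the list inspector assumes an already-materialized array, but the Tez map join output is still in lazy binary form and must be materialized first.

```python
class LazyBinaryArray:
    """Toy stand-in for the lazily decoded array a Tez map join emits."""
    def __init__(self, payload):
        self._payload = tuple(payload)
    def materialize(self):
        return list(self._payload)

def get_list(value):
    # StandardListObjectInspector.getList effectively casts straight to
    # Object[] (a plain list here). Converting lazy values first avoids
    # the ClassCastException seen in the stack trace above.
    if isinstance(value, LazyBinaryArray):
        value = value.materialize()
    if not isinstance(value, list):
        raise TypeError("expected a materialized list value")
    return value

print(get_list(LazyBinaryArray([1, 2])))  # [1, 2]
```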
[jira] [Commented] (HIVE-12937) DbNotificationListener unable to clean up old notification events
[ https://issues.apache.org/jira/browse/HIVE-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214633#comment-15214633 ] Sushanth Sowmyan commented on HIVE-12937: - None of the test failures seem to be related to this patch. [~alangates], could you please review? > DbNotificationListener unable to clean up old notification events > - > > Key: HIVE-12937 > URL: https://issues.apache.org/jira/browse/HIVE-12937 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.1.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-12937.patch > > > There is a bug in ObjectStore, where we use pm.deletePersistent instead of > pm.deletePersistentAll, which causes the persistenceManager to try and drop a > org.datanucleus.store.rdbms.query.ForwardQueryResult instead of the > appropriate associated > org.apache.hadoop.hive.metastore.model.MNotificationLog. > This results in an error that looks like this: > {noformat} > Exception in thread "CleanerThread" > org.datanucleus.api.jdo.exceptions.ClassNotPersistenceCapableException: The > class "org.datanucleus.store.rdbms.query.ForwardQueryResult" is not > persistable. This means that it either hasnt been enhanced, or that the > enhanced version of the file is not in the CLASSPATH (or is hidden by an > unenhanced version), or the Meta-Data/annotations for the class are not found. 
> at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:380) > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoDeletePersistent(JDOPersistenceManager.java:807) > at > org.datanucleus.api.jdo.JDOPersistenceManager.deletePersistent(JDOPersistenceManager.java:820) > at > org.apache.hadoop.hive.metastore.ObjectStore.cleanNotificationEvents(ObjectStore.java:7149) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy0.cleanNotificationEvents(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener$CleanerThread.run(DbNotificationListener.java:277) > NestedThrowablesStackTrace: > The class "org.datanucleus.store.rdbms.query.ForwardQueryResult" is not > persistable. This means that it either hasnt been enhanced, or that the > enhanced version of the file is not in the CLASSPATH (or is hidden by an > unenhanced version), or the Meta-Data/annotations for the class are not found. > org.datanucleus.exceptions.ClassNotPersistableException: The class > "org.datanucleus.store.rdbms.query.ForwardQueryResult" is not persistable. > This means that it either hasnt been enhanced, or that the enhanced version > of the file is not in the CLASSPATH (or is hidden by an unenhanced version), > or the Meta-Data/annotations for the class are not found. 
> at > org.datanucleus.ExecutionContextImpl.assertClassPersistable(ExecutionContextImpl.java:5698) > at > org.datanucleus.ExecutionContextImpl.deleteObjectInternal(ExecutionContextImpl.java:2495) > at > org.datanucleus.ExecutionContextImpl.deleteObjectWork(ExecutionContextImpl.java:2466) > at > org.datanucleus.ExecutionContextImpl.deleteObject(ExecutionContextImpl.java:2417) > at > org.datanucleus.ExecutionContextThreadedImpl.deleteObject(ExecutionContextThreadedImpl.java:245) > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoDeletePersistent(JDOPersistenceManager.java:802) > at > org.datanucleus.api.jdo.JDOPersistenceManager.deletePersistent(JDOPersistenceManager.java:820) > at > org.apache.hadoop.hive.metastore.ObjectStore.cleanNotificationEvents(ObjectStore.java:7149) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy0.cleanNotificationEvents(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener$CleanerThread.run(DbNotificationListener.java:277) > {noformat} > The end result of this bug is that users of DbNotificationListener will have > an evergrowing number of notification events that are not cleaned up as they > age. This is an easy enough fix, but shows that we have a lack of test > coverage here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
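The one-word nature of the fix can be illustrated with stand-in objects (a Python sketch; the real call site is ObjectStore.cleanNotificationEvents using JDO's PersistenceManager): deletePersistent was handed the query-result collection itself, while deletePersistentAll deletes the events inside it.

```python
class FakePersistenceManager:
    """Stand-in for the JDO PersistenceManager used by ObjectStore."""
    def __init__(self):
        self.deleted = []
    def delete_persistent(self, obj):
        # Buggy path: attempts to persist-delete the query-result wrapper
        # itself, the analogue of the ClassNotPersistenceCapableException.
        raise TypeError(f"{type(obj).__name__} is not persistable")
    def delete_persistent_all(self, objs):
        # Fixed path: delete each element of the collection instead.
        self.deleted.extend(objs)

pm = FakePersistenceManager()
old_events = ["event-1", "event-2"]  # stand-ins for MNotificationLog rows
pm.delete_persistent_all(old_events)
print(pm.deleted)  # ['event-1', 'event-2']
```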
[jira] [Commented] (HIVE-13365) Allow multiple llap instances with the MiniLlap cluster
[ https://issues.apache.org/jira/browse/HIVE-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214631#comment-15214631 ] Siddharth Seth commented on HIVE-13365: --- At the moment, there's no way to do that. It's also pointless since the tests are not big enough to run multiple instances. I guess we could add an option to the testrunner to control the number of instances. Will create a follow up jira for that. I intend to use the multiple instance feature in some failure handling tests in the main codebase. > Allow multiple llap instances with the MiniLlap cluster > --- > > Key: HIVE-13365 > URL: https://issues.apache.org/jira/browse/HIVE-13365 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-13365.01.patch, HIVE-13365.1.review.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214626#comment-15214626 ] Sergey Shelukhin commented on HIVE-12960: - Hmm... I don't see sql part removed. Was it just copy/pasted? > Migrate Column Stats Extrapolation and UniformDistribution to HBaseStore > > > Key: HIVE-12960 > URL: https://issues.apache.org/jira/browse/HIVE-12960 > Project: Hive > Issue Type: Sub-task > Components: Statistics >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-12960.01.patch, HIVE-12960.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13346) LLAP doesn't update metadata priority when reusing from cache; some tweaks in LRFU policy
[ https://issues.apache.org/jira/browse/HIVE-13346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214628#comment-15214628 ] Sergey Shelukhin commented on HIVE-13346: - [~sseth] maybe you can review? > LLAP doesn't update metadata priority when reusing from cache; some tweaks in > LRFU policy > - > > Key: HIVE-13346 > URL: https://issues.apache.org/jira/browse/HIVE-13346 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13346.01.patch, HIVE-13346.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13365) Allow multiple llap instances with the MiniLlap cluster
[ https://issues.apache.org/jira/browse/HIVE-13365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214621#comment-15214621 ] Sergey Shelukhin commented on HIVE-13365: - +1 pending tests. How does one trigger multiple instances? > Allow multiple llap instances with the MiniLlap cluster > --- > > Key: HIVE-13365 > URL: https://issues.apache.org/jira/browse/HIVE-13365 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-13365.01.patch, HIVE-13365.1.review.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive
[ https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13290: - Attachment: (was: HIVE-13290.2.patch) > Support primary keys/foreign keys constraint as part of create table command > in Hive > > > Key: HIVE-13290 > URL: https://issues.apache.org/jira/browse/HIVE-13290 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, > HIVE-13290.3.patch > > > Support for the following statements > {code} > CREATE TABLE product > ( > product_id INTEGER, > product_vendor_id INTEGER, > PRIMARY KEY (product_id), > CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES > vendor(vendor_id) > ); > CREATE TABLE vendor > ( > vendor_id INTEGER, > PRIMARY KEY (vendor_id) > ); > {code} > In the above syntax, [CONSTRAINT constraint-name] is optional. If this is not > specified by the user, we will use a system-generated constraint name. For the > purpose of simplicity, we will allow the CONSTRAINT option for foreign keys and > not for primary keys, since there is only one primary key per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13290) Support primary keys/foreign keys constraint as part of create table command in Hive
[ https://issues.apache.org/jira/browse/HIVE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13290: - Attachment: HIVE-13290.3.patch > Support primary keys/foreign keys constraint as part of create table command > in Hive > > > Key: HIVE-13290 > URL: https://issues.apache.org/jira/browse/HIVE-13290 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13290.1.patch, HIVE-13290.2.patch, > HIVE-13290.3.patch > > > Support for the following statements > {code} > CREATE TABLE product > ( > product_id INTEGER, > product_vendor_id INTEGER, > PRIMARY KEY (product_id), > CONSTRAINT product_fk_1 FOREIGN KEY (product_vendor_id) REFERENCES > vendor(vendor_id) > ); > CREATE TABLE vendor > ( > vendor_id INTEGER, > PRIMARY KEY (vendor_id) > ); > {code} > In the above syntax, [CONSTRAINT constraint-name] is optional. If this is not > specified by the user, we will use a system-generated constraint name. For the > purpose of simplicity, we will allow the CONSTRAINT option for foreign keys and > not for primary keys, since there is only one primary key per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12049: --- Attachment: (was: new-driver-profiles.png) > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214598#comment-15214598 ] Gopal V commented on HIVE-12049: Profiling the patch, most of the CPU in the fetchResults is now from the Session acquire and release. !new-driver-profiles.png! > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12049: --- Attachment: new-driver-profiles.png > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, > HIVE-12049.9.patch, new-driver-profiles.png > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214528#comment-15214528 ] Thejas M Nair commented on HIVE-12049: -- The "not in list of params that are allowed to be modified at runtime" error is happening because SQL std auth or Ranger is enabled, and it allows modifying configs only in a whitelist. [~gopalv] A workaround is to add the parameter as the value of hive.security.authorization.sqlstd.confwhitelist.append in HS2. [~rohitdholakia] We should add the hive.server2.thrift.resultset.serialize.in.tasks parameter to the default whitelist. It should be added to the sqlStdAuthSafeVarNames array in HiveConf.java. > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. 
Using the > pluggable {{hive.query.result.fileformat}} property, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
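The whitelist workaround described in the comment above would be applied in hive-site.xml on the HiveServer2 host. A minimal sketch, assuming the parameter introduced by this patch is named hive.server2.thrift.resultset.serialize.in.tasks (the property name should be verified against the committed patch):

```xml
<!-- hive-site.xml on the HiveServer2 host. With SQL standard authorization
     (or Ranger) enabled, only whitelisted configs may be set at runtime;
     this appends the new parameter (the value is a regex) to that whitelist. -->
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>hive\.server2\.thrift\.resultset\.serialize\.in\.tasks</value>
</property>
```

After restarting HS2, a client session should then be able to run {{set hive.server2.thrift.resultset.serialize.in.tasks=true;}} without hitting the "not in list of params that are allowed to be modified at runtime" error.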
[jira] [Comment Edited] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214528#comment-15214528 ] Thejas M Nair edited comment on HIVE-12049 at 3/28/16 5:41 PM: --- The "not in list of params that are allowed to be modified at runtime" error is happening because SQL std auth or Ranger is enabled, and it allows modifying configs only in a whitelist. [~gopalv] A workaround is to add the parameter as the value of hive.security.authorization.sqlstd.confwhitelist.append in HS2, or disable authorization. [~rohitdholakia] We should add the hive.server2.thrift.resultset.serialize.in.tasks parameter to the default whitelist. It should be added to the sqlStdAuthSafeVarNames array in HiveConf.java. was (Author: thejas): The "not in list of params that are allowed to be modified at runtime" error is happening because SQL std auth or Ranger is enabled, and it allows modifying configs only in a whitelist. [~gopalv] A workaround is to add the parameter as the value of hive.security.authorization.sqlstd.confwhitelist.append in HS2. [~rohitdholakia] We should add the hive.server2.thrift.resultset.serialize.in.tasks parameter to the default whitelist. It should be added to the sqlStdAuthSafeVarNames array in HiveConf.java. > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. 
In moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable {{hive.query.result.fileformat}} property, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13363) Add hive.metastore.token.signature property to HiveConf
[ https://issues.apache.org/jira/browse/HIVE-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214459#comment-15214459 ] Hive QA commented on HIVE-13363: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795494/HIVE-13363.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7397/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7397/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7397/ Messages: {noformat} This message was trimmed, see log for full details [WARNING] - org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim [WARNING] - 64 more... [WARNING] hive-exec-2.1.0-SNAPSHOT.jar, snappy-0.2.jar define 16 overlappping classes: [WARNING] - org.iq80.snappy.SnappyCompressor [WARNING] - org.iq80.snappy.SlowMemory [WARNING] - org.iq80.snappy.Crc32C [WARNING] - org.iq80.snappy.CorruptionException [WARNING] - org.iq80.snappy.UnsafeMemory [WARNING] - org.iq80.snappy.Memory [WARNING] - org.iq80.snappy.Snappy [WARNING] - org.iq80.snappy.Main [WARNING] - org.iq80.snappy.HadoopSnappyCodec [WARNING] - org.iq80.snappy.SnappyDecompressor [WARNING] - 6 more... 
[WARNING] hive-serde-2.1.0-SNAPSHOT.jar, hive-exec-2.1.0-SNAPSHOT.jar define 588 overlappping classes: [WARNING] - org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector [WARNING] - org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters [WARNING] - org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeXception [WARNING] - org.apache.hadoop.hive.serde2.proto.test.Complexpb$Complex [WARNING] - org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeStructBase [WARNING] - org.apache.hadoop.hive.serde.test.ThriftTestObj$ThriftTestObjTupleSchemeFactory [WARNING] - org.apache.hadoop.hive.serde2.thrift.test.Complex$1 [WARNING] - org.apache.hadoop.hive.serde2.thrift.test.MiniStruct [WARNING] - org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector [WARNING] - org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$1 [WARNING] - 578 more... [WARNING] jackson-mapper-asl-1.9.2.jar, hive-exec-2.1.0-SNAPSHOT.jar define 494 overlappping classes: [WARNING] - org.codehaus.jackson.map.ser.impl.SerializerCache$TypeKey [WARNING] - org.codehaus.jackson.map.DeserializerProvider [WARNING] - org.codehaus.jackson.map.deser.std.StdKeyDeserializer$LongKD [WARNING] - org.codehaus.jackson.node.ValueNode [WARNING] - org.codehaus.jackson.map.ser.std.CollectionSerializer [WARNING] - org.codehaus.jackson.map.ser.impl.PropertySerializerMap$Double [WARNING] - org.codehaus.jackson.map.deser.FromStringDeserializer [WARNING] - org.codehaus.jackson.map.deser.std.StdKeyDeserializer$FloatKD [WARNING] - org.codehaus.jackson.map.Deserializers [WARNING] - org.codehaus.jackson.map.ser.StdSerializers$SerializableSerializer [WARNING] - 484 more... 
[WARNING] hadoop-yarn-common-2.6.0.jar, hadoop-yarn-api-2.6.0.jar define 3 overlappping classes: [WARNING] - org.apache.hadoop.yarn.factories.package-info [WARNING] - org.apache.hadoop.yarn.util.package-info [WARNING] - org.apache.hadoop.yarn.factory.providers.package-info [WARNING] commons-beanutils-core-1.8.0.jar, commons-beanutils-1.7.0.jar, commons-collections-3.2.2.jar define 10 overlappping classes: [WARNING] - org.apache.commons.collections.FastHashMap$EntrySet [WARNING] - org.apache.commons.collections.ArrayStack [WARNING] - org.apache.commons.collections.FastHashMap$1 [WARNING] - org.apache.commons.collections.FastHashMap$KeySet [WARNING] - org.apache.commons.collections.FastHashMap$CollectionView [WARNING] - org.apache.commons.collections.BufferUnderflowException [WARNING] - org.apache.commons.collections.Buffer [WARNING] - org.apache.commons.collections.FastHashMap$CollectionView$CollectionViewIterator [WARNING] - org.apache.commons.collections.FastHashMap$Values [WARNING] - org.apache.commons.collections.FastHashMap [WARNING] hive-shims-0.23-2.1.0-SNAPSHOT.jar, hive-exec-2.1.0-SNAPSHOT.jar define 29 overlappping classes: [WARNING] - org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsFileStatusWithIdImpl [WARNING] - org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim [WARNING] - org.apache.hadoop.hive.shims.Jetty23Shims$Server [WARNING] - org.apache.hadoop.mapred.WebHCatJTShim23 [WARNING] - org.apache.hadoop.hive.shims.Hadoop23Shims$MiniTezShim [WARNING] - org.apache.hadoop.hive.shims.Jetty23Shims$1 [WARNING] - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23 [WARNING] - org.apache.hadoop.hive.shims.Jetty23Shims [WARNING] -
[jira] [Commented] (HIVE-12612) beeline always exits with 0 status when reading query from standard input
[ https://issues.apache.org/jira/browse/HIVE-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214424#comment-15214424 ] Hive QA commented on HIVE-12612: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795428/HIVE-12612.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9882 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthString org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthTimestamp org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearString org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearTimestamp org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7396/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7396/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7396/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12795428 - PreCommit-HIVE-TRUNK-Build > beeline always exits with 0 status when reading query from standard input > - > > Key: HIVE-12612 > URL: https://issues.apache.org/jira/browse/HIVE-12612 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.1.0 > Environment: CDH5.5.0 >Reporter: Paulo Sequeira >Assignee: Reuben Kuhnert >Priority: Minor > Attachments: HIVE-12612.01.patch, HIVE-12612.02.patch > > > Similar to what was reported on HIVE-6978, but now it only happens when the > query is read from the standard input. For example, the following fails as > expected: > {code} > bash$ if beeline -u "jdbc:hive2://..." -e "boo;" ; then echo "Ok?!" ; else > echo "Failed!" ; fi > Connecting to jdbc:hive2://... > Connected to: Apache Hive (version 1.1.0-cdh5.5.0) > Driver: Hive JDBC (version 1.1.0-cdh5.5.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Error: Error while compiling statement: FAILED: ParseException line 1:0 > cannot recognize input near 'boo' '' '' (state=42000,code=4) > Closing: 0: jdbc:hive2://... > Failed! > {code} > But the following does not: > {code} > bash$ if echo "boo;"|beeline -u "jdbc:hive2://..." ; then echo "Ok?!" ; else > echo "Failed!" ; fi > Connecting to jdbc:hive2://... > Connected to: Apache Hive (version 1.1.0-cdh5.5.0) > Driver: Hive JDBC (version 1.1.0-cdh5.5.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 1.1.0-cdh5.5.0 by Apache Hive > 0: jdbc:hive2://...:8> Error: Error while compiling statement: FAILED: > ParseException line 1:0 cannot recognize input near 'boo' '' '' > (state=42000,code=4) > 0: jdbc:hive2://...:8> Closing: 0: jdbc:hive2://... > Ok?! 
> {code} > This was misleading our batch scripts into always believing that the execution of > the queries succeeded, when sometimes that was not the case. > h2. Workaround > We found we can work around the issue by always using the -e or the -f > parameters, and even reading the standard input through the /dev/stdin device > (this was useful because a lot of the scripts fed the queries from here > documents), like this: > {code:title=some-script.sh} > #!/bin/sh > set -o nounset -o errexit -o pipefail > # As beeline is failing to report an error status if reading the query > # to be executed from STDIN, check whether no -f or -e option is used > # and, in that case, pretend it has to read the query from a regular > # file using -f to read from /dev/stdin > function beeline_workaround_exit_status () { > for arg in "$@" > do if [ "$arg" = "-f" -o "$arg" = "-e" ] >
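The wrapper quoted in the comment above is cut off in this message; a complete sketch of the same idea follows. It is an illustration, not the reporter's exact script: it assumes beeline is on PATH, and simply routes STDIN through -f /dev/stdin whenever no explicit -f or -e option is present, so that compile errors are reflected in the exit status.

```shell
#!/bin/sh
set -o nounset -o errexit

# Sketch of the workaround described above. If the caller already passed
# -f or -e, beeline reports failures correctly, so run it unchanged.
# Otherwise, make beeline read the query from STDIN as if it were a
# regular file, via -f /dev/stdin, so a bad query yields a non-zero status.
beeline_workaround_exit_status() {
  for arg in "$@"; do
    if [ "$arg" = "-f" ] || [ "$arg" = "-e" ]; then
      beeline "$@"
      return $?
    fi
  done
  beeline "$@" -f /dev/stdin
}
```

With this in place, a pipeline such as {{echo "boo;" | beeline_workaround_exit_status -u "jdbc:hive2://..."}} fails the calling script instead of silently returning 0.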
[jira] [Updated] (HIVE-13367) Extending HPLSQL parser
[ https://issues.apache.org/jira/browse/HIVE-13367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-13367: -- Status: Patch Available (was: Open) > Extending HPLSQL parser > --- > > Key: HIVE-13367 > URL: https://issues.apache.org/jira/browse/HIVE-13367 > Project: Hive > Issue Type: Improvement > Components: hpl/sql >Affects Versions: 2.1.0 >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > Attachments: HIVE-13367.1.patch > > > Extending HPL/SQL parser to support more procedural constructs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13367) Extending HPLSQL parser
[ https://issues.apache.org/jira/browse/HIVE-13367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-13367: -- Attachment: HIVE-13367.1.patch Patch with tests attached. > Extending HPLSQL parser > --- > > Key: HIVE-13367 > URL: https://issues.apache.org/jira/browse/HIVE-13367 > Project: Hive > Issue Type: Improvement > Components: hpl/sql >Affects Versions: 2.1.0 >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > Attachments: HIVE-13367.1.patch > > > Extending HPL/SQL parser to support more procedural constructs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12968) genNotNullFilterForJoinSourcePlan: needs to merge predicates into the multi-AND
[ https://issues.apache.org/jira/browse/HIVE-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214188#comment-15214188 ] Hive QA commented on HIVE-12968: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795426/HIVE-12968.3.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7395/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7395/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7395/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7395/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p 
maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh Chauhan) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12795426 - PreCommit-HIVE-TRUNK-Build > genNotNullFilterForJoinSourcePlan: needs to merge predicates into the > multi-AND > --- > > Key: HIVE-12968 > URL: https://issues.apache.org/jira/browse/HIVE-12968 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Minor > Attachments: HIVE-12968.1.patch, HIVE-12968.2.patch, > HIVE-12968.3.patch > > > {code} > predicate: ((cbigint is not null and cint is not null) and cint BETWEEN > 100 AND 300) (type: boolean) > {code} > does not fold the IS_NULL on cint, because of the structure of the AND clause. > For example, see {{tez_dynpart_hashjoin_1.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11601) confusing message in start/stop webhcat server
[ https://issues.apache.org/jira/browse/HIVE-11601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214185#comment-15214185 ] Hive QA commented on HIVE-11601: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795372/HIVE-11601.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7394/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7394/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7394/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7394/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p 
maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh Chauhan) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 7747458 HIVE-13358: Stats state is not captured correctly: turn off stats optimizer for sampled table (Pengcheng Xiong, reviewed by Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12795372 - PreCommit-HIVE-TRUNK-Build > confusing message in start/stop webhcat server > -- > > Key: HIVE-11601 > URL: https://issues.apache.org/jira/browse/HIVE-11601 > Project: Hive > Issue Type: Improvement > Components: WebHCat >Affects Versions: 0.13.0 >Reporter: Takashi Ohnishi >Assignee: Andrew Sears >Priority: Trivial > Attachments: HIVE-11601.patch, HIVE-11601.patch > > > HIVE-5167 makes webhcat_config.sh can output the below message > {code} > Lenght of string is non zero > {code} > This maybe misspelling. > And I think it is not easy to understand what to say. 
> How about changing it to something like > {code} > found HIVE_HOME is already set. > {code} > or remove this message? > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13326) HiveServer2: Make ZK config publishing configurable
[ https://issues.apache.org/jira/browse/HIVE-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13326: Attachment: HIVE-13326.2.patch > HiveServer2: Make ZK config publishing configurable > --- > > Key: HIVE-13326 > URL: https://issues.apache.org/jira/browse/HIVE-13326 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-13326.1.patch, HIVE-13326.2.patch > > > We should revert to older behaviour when config publishing is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214055#comment-15214055 ] Hive QA commented on HIVE-13149: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12795360/HIVE-13149.6.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 9825 tests executed *Failed tests:* {noformat} TestCliDriver-cbo_rp_stats.q-skewjoinopt16.q-rename_column.q-and-12-more - did not produce a TEST-*.xml file TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthString org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFMonthTimestamp org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearString org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTimestampExpressions.testVectorUDFYearTimestamp 
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7393/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7393/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7393/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12795360 - PreCommit-HIVE-TRUNK-Build > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some like StatsTask, don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. 
> {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3432) perform a map-only group by if grouping key matches the sorting properties of the table
[ https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-3432: - Labels: (was: TODOC10) > perform a map-only group by if grouping key matches the sorting properties of > the table > --- > > Key: HIVE-3432 > URL: https://issues.apache.org/jira/browse/HIVE-3432 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Namit Jain >Assignee: Namit Jain > Fix For: 0.10.0 > > Attachments: hive.3432.1.patch, hive.3432.2.patch, hive.3432.3.patch, > hive.3432.4.patch, hive.3432.5.patch, hive.3432.6.patch, hive.3432.7.patch, > hive.3432.8.patch > > > There should be an option to use bucketizedinputformat and use map-only group > by. There would be no need to perform a map-side aggregation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4281) add hive.map.groupby.sorted.testmode
[ https://issues.apache.org/jira/browse/HIVE-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213991#comment-15213991 ] Lefty Leverenz commented on HIVE-4281: -- Doc note: Removing the TODOC11 label because *hive.map.groupby.sorted.testmode* is now documented in the wiki (including its removal by HIVE-12325): * [Configuration Properties -- hive.map.groupby.sorted.testmode | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.map.groupby.sorted.testmode] > add hive.map.groupby.sorted.testmode > > > Key: HIVE-4281 > URL: https://issues.apache.org/jira/browse/HIVE-4281 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Namit Jain >Assignee: Namit Jain > Fix For: 0.11.0 > > Attachments: hive.4281.1.patch, hive.4281.2.patch, > hive.4281.2.patch-nohcat, hive.4281.3.patch > > > The idea behind this would be to test hive.map.groupby.sorted. > Since this is a new feature, it might be a good idea to run it in test mode, > where a query property would denote that this query plan would have changed. > If a customer wants, they can run those queries offline, compare the results > for correctness, and set hive.map.groupby.sorted only if all the results are > the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4281) add hive.map.groupby.sorted.testmode
[ https://issues.apache.org/jira/browse/HIVE-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-4281:
---------------------------------
    Labels:   (was: TODOC11)

> add hive.map.groupby.sorted.testmode
> ------------------------------------
>
>                 Key: HIVE-4281
>                 URL: https://issues.apache.org/jira/browse/HIVE-4281
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.11.0
>
>         Attachments: hive.4281.1.patch, hive.4281.2.patch,
> hive.4281.2.patch-nohcat, hive.4281.3.patch
>
>
> The idea behind this would be to test hive.map.groupby.sorted.
> Since this is a new feature, it might be a good idea to run it in test mode,
> where a query property would denote that this query plan would have changed.
> If a customer wants, they can run those queries offline, compare the results
> for correctness, and set hive.map.groupby.sorted only if all the results are
> the same.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-12325) Turn hive.map.groupby.sorted on by default
[ https://issues.apache.org/jira/browse/HIVE-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-12325:
----------------------------------
    Labels:   (was: TODOC2.0)

> Turn hive.map.groupby.sorted on by default
> ------------------------------------------
>
>                 Key: HIVE-12325
>                 URL: https://issues.apache.org/jira/browse/HIVE-12325
>             Project: Hive
>          Issue Type: Improvement
>          Components: Logical Optimizer
>            Reporter: Ashutosh Chauhan
>            Assignee: Chetna Chaudhari
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12325.1.patch
>
>
> When applicable it can avoid shuffle phase altogether for group by, which
> will be a performance win.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-12325) Turn hive.map.groupby.sorted on by default
[ https://issues.apache.org/jira/browse/HIVE-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213988#comment-15213988 ]

Lefty Leverenz commented on HIVE-12325:
---------------------------------------

Removing the TODOC2.0 label because *hive.map.groupby.sorted* and the removal of *hive.map.groupby.sorted.testmode* are now documented in the wiki:

* [Configuration Properties -- hive.map.groupby.sorted | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.map.groupby.sorted]
* [Configuration Properties -- hive.map.groupby.sorted.testmode | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.map.groupby.sorted.testmode]

> Turn hive.map.groupby.sorted on by default
> ------------------------------------------
>
>                 Key: HIVE-12325
>                 URL: https://issues.apache.org/jira/browse/HIVE-12325
>             Project: Hive
>          Issue Type: Improvement
>          Components: Logical Optimizer
>            Reporter: Ashutosh Chauhan
>            Assignee: Chetna Chaudhari
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12325.1.patch
>
>
> When applicable it can avoid shuffle phase altogether for group by, which
> will be a performance win.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
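[Editor's note] The optimization that HIVE-12325 enables by default applies when the group-by key matches a table's bucketing and sorting key, letting each mapper produce final aggregates with no shuffle. A minimal sketch (the table and column names are hypothetical):

```sql
-- With hive.map.groupby.sorted=true (the default from Hive 2.0.0, per
-- HIVE-12325), a group-by on the bucket/sort key of a bucketed, sorted
-- table can run map-only: each bucket file is already sorted on user_id,
-- so a mapper can emit finished groups without a reduce/shuffle phase.
set hive.map.groupby.sorted=true;

CREATE TABLE clicks (user_id INT, cnt BIGINT)
  CLUSTERED BY (user_id) SORTED BY (user_id) INTO 32 BUCKETS;

-- Grouping key matches the sorting properties of the table:
SELECT user_id, sum(cnt) FROM clicks GROUP BY user_id;
```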
[jira] [Commented] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce sorting insert
[ https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213949#comment-15213949 ]

Lefty Leverenz commented on HIVE-4240:
--------------------------------------

Removed the TODOC11 label because *hive.optimize.bucketingsorting* is now documented in the wiki:

* [Configuration Properties -- hive.optimize.bucketingsorting | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.bucketingsorting]

> optimize hive.enforce.bucketing and hive.enforce sorting insert
> ---------------------------------------------------------------
>
>                 Key: HIVE-4240
>                 URL: https://issues.apache.org/jira/browse/HIVE-4240
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.11.0
>
>         Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch,
> hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat
>
>
> Consider the following scenario:
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> set hive.input.format =
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.reducers.max = 1;
> set hive.merge.mapfiles=false;
> set hive.merge.mapredfiles=false;
> -- Create two bucketed and sorted tables
> CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> FROM src
> INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;
> -- Insert data into the bucketed table by selecting from another bucketed
> table
> -- This should be a map-only operation
> INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
> SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
> We should not need a reducer to perform the above operation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs
[ https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213948#comment-15213948 ]

Lefty Leverenz commented on HIVE-12331:
---------------------------------------

Configuration Properties now includes *hive.optimize.bucketingsorting*, but the TODOC2.0 label still remains because other wikidocs need to be updated.

> Remove hive.enforce.bucketing & hive.enforce.sorting configs
> ------------------------------------------------------------
>
>                 Key: HIVE-12331
>                 URL: https://issues.apache.org/jira/browse/HIVE-12331
>             Project: Hive
>          Issue Type: Improvement
>          Components: Configuration
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12331.1.patch, HIVE-12331.patch
>
>
> If table is created as bucketed and/or sorted and this config is set to
> false, you will insert data in wrong buckets and/or sort order and then if
> you use these tables subsequently in BMJ or SMBJ you will get wrong results.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
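[Editor's note] The issue description above explains the hazard that motivated removing these configs: with enforcement disabled, an insert into a bucketed/sorted table silently violates the table's declared layout, and later bucket-based joins return wrong results. A sketch of the pre-2.0.0 failure mode (hypothetical table names):

```sql
-- Pre-Hive-2.0.0 only: both properties were removed by HIVE-12331 and are
-- now effectively always on.
set hive.enforce.bucketing=false;
set hive.enforce.sorting=false;

-- Table declares 4 buckets sorted by key...
CREATE TABLE t (key INT, value STRING)
  CLUSTERED BY (key) SORTED BY (key) INTO 4 BUCKETS;

-- ...but with enforcement off, this insert is not forced to hash/sort the
-- rows, so data can land in the wrong buckets and in the wrong order.
INSERT OVERWRITE TABLE t SELECT key, value FROM src;

-- A subsequent bucket map join (BMJ) or sort-merge bucket join (SMBJ)
-- against t would then trust the declared layout and produce wrong results.
```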
[jira] [Updated] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce sorting insert
[ https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-4240:
---------------------------------
    Labels:   (was: TODOC11)

> optimize hive.enforce.bucketing and hive.enforce sorting insert
> ---------------------------------------------------------------
>
>                 Key: HIVE-4240
>                 URL: https://issues.apache.org/jira/browse/HIVE-4240
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.11.0
>
>         Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch,
> hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat
>
>
> Consider the following scenario:
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> set hive.input.format =
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.exec.reducers.max = 1;
> set hive.merge.mapfiles=false;
> set hive.merge.mapredfiles=false;
> -- Create two bucketed and sorted tables
> CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> FROM src
> INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;
> -- Insert data into the bucketed table by selecting from another bucketed
> table
> -- This should be a map-only operation
> INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
> SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';
> We should not need a reducer to perform the above operation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)