[jira] [Updated] (HIVE-10994) Hive.moveFile should not fail on a no-op move
[ https://issues.apache.org/jira/browse/HIVE-10994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-10994:
    Fix Version/s: 2.0.0

> Hive.moveFile should not fail on a no-op move
>
> Key: HIVE-10994
> URL: https://issues.apache.org/jira/browse/HIVE-10994
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: 1.2.1, 2.0.0
> Attachments: HIVE-10994.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
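The idea behind this fix can be sketched as follows. This is an illustrative Python sketch, not Hive's actual `Hive.moveFile` code (the function name and signature here are hypothetical): when source and destination normalize to the same path, the move is treated as a successful no-op instead of an error.

```python
from pathlib import PurePosixPath

def move_file(src: str, dest: str) -> bool:
    """Hypothetical helper (not Hive's actual Hive.moveFile signature):
    succeed without touching the filesystem when the move is a no-op."""
    if PurePosixPath(src) == PurePosixPath(dest):
        return True  # no-op move: nothing to do, and not a failure
    # a real implementation would rename src to dest here
    raise NotImplementedError("actual rename elided in this sketch")
```

The check happens before any filesystem call, so a degenerate move request can never fail on a rename that has nothing to rename.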
[jira] [Resolved] (HIVE-11012) LLAP: fix some tests in the branch and revert incorrectly committed changed out files (from HIVE-11014)
[ https://issues.apache.org/jira/browse/HIVE-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin resolved HIVE-11012.
    Resolution: Fixed
    Fix Version/s: llap

committed to branch

> LLAP: fix some tests in the branch and revert incorrectly committed changed
> out files (from HIVE-11014)
>
> Key: HIVE-11012
> URL: https://issues.apache.org/jira/browse/HIVE-11012
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: llap
>
> I am assigning some new issues to people and fixing whatever random issues
> from HIVE-10997. So far fixed all the TestLocationQueries/MtQueries,
> list_bucket* Kryo exception, and some tez NPE
[jira] [Updated] (HIVE-11018) Turn on cbo in more q files
[ https://issues.apache.org/jira/browse/HIVE-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-11018:
    Attachment: HIVE-11018.patch

No code changes. Only test changes.

> Turn on cbo in more q files
>
> Key: HIVE-11018
> URL: https://issues.apache.org/jira/browse/HIVE-11018
> Project: Hive
> Issue Type: Task
> Components: Tests
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-11018.patch
>
> There are a few tests in which cbo was turned off for various reasons. Those
> reasons don't exist anymore. For those tests, we should turn cbo on.
[jira] [Commented] (HIVE-11014) LLAP: MiniTez vector_binary_join_groupby, vector_outer_join1, vector_outer_join2 and cbo_windowing tests have result changes compared to master
[ https://issues.apache.org/jira/browse/HIVE-11014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587254#comment-14587254 ]

Sergey Shelukhin commented on HIVE-11014:

Feel free to create separate jiras if the changes are for different reasons.

> LLAP: MiniTez vector_binary_join_groupby, vector_outer_join1,
> vector_outer_join2 and cbo_windowing tests have result changes compared to
> master
>
> Key: HIVE-11014
> URL: https://issues.apache.org/jira/browse/HIVE-11014
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Matt McCline
[jira] [Updated] (HIVE-11012) LLAP: fix some tests in the branch and revert incorrectly committed changed out files (from HIVE-11014)
[ https://issues.apache.org/jira/browse/HIVE-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11012:
    Summary: LLAP: fix some tests in the branch and revert incorrectly committed changed out files (from HIVE-11014)
    (was: LLAP: fix some tests in the branch)

> LLAP: fix some tests in the branch and revert incorrectly committed changed
> out files (from HIVE-11014)
>
> Key: HIVE-11012
> URL: https://issues.apache.org/jira/browse/HIVE-11012
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
>
> I am assigning some new issues to people and fixing whatever random issues
> from HIVE-10997. So far fixed all the TestLocationQueries/MtQueries,
> list_bucket* Kryo exception, and some tez NPE
[jira] [Updated] (HIVE-11014) LLAP: MiniTez vector_binary_join_groupby, vector_outer_join1, vector_outer_join2 and cbo_windowing tests have result changes compared to master
[ https://issues.apache.org/jira/browse/HIVE-11014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11014:
    Summary: LLAP: MiniTez vector_binary_join_groupby, vector_outer_join1, vector_outer_join2 and cbo_windowing tests have result changes compared to master
    (was: LLAP: MiniTez vector_binary_join_groupby test has result changes compared to master)

> LLAP: MiniTez vector_binary_join_groupby, vector_outer_join1,
> vector_outer_join2 and cbo_windowing tests have result changes compared to
> master
>
> Key: HIVE-11014
> URL: https://issues.apache.org/jira/browse/HIVE-11014
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Matt McCline
[jira] [Commented] (HIVE-11008) webhcat GET /jobs retries on getting job details from history server is too aggressive
[ https://issues.apache.org/jira/browse/HIVE-11008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587229#comment-14587229 ]

Eugene Koifman commented on HIVE-11008:

There are 2 places where StatusDelegator.run() is called: Server.showJobList() and Server.showJobId(). Don't we need the same logic in both places? Can the setting of the 2 properties be moved into StatusDelegator.run(), just before ShimLoader.getHadoopShims().getWebHCatShim()?

> webhcat GET /jobs retries on getting job details from history server is too
> aggressive
>
> Key: HIVE-11008
> URL: https://issues.apache.org/jira/browse/HIVE-11008
> Project: Hive
> Issue Type: Bug
> Components: WebHCat
> Affects Versions: 1.2.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Attachments: HIVE-11008.1.patch
>
> The webhcat "jobs" API gets the list of jobs from the RM and then gets details
> from the history server.
> The RM retains a fixed number of jobs to accommodate the memory it has, while
> the HistoryServer retains jobs based on their age. As a result, jobs that the RM
> returns might not be present in the HistoryServer, and the lookup can fail. The
> HistoryServer lookup also ends up being retried on failures even when they
> happen because the job actually does not exist.
> The retries to get details from the HistoryServer in such cases are too aggressive.
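The retry behavior under discussion can be sketched as follows. This is an illustrative Python sketch with hypothetical names (`JobNotFound`, `fetch`), not WebHCat's actual code: transient failures get a capped, backed-off retry loop, while a definitive "job does not exist" answer from the history server is surfaced immediately instead of being retried.

```python
import time

class JobNotFound(Exception):
    """Job is genuinely absent from the history server (not transient)."""

def get_job_details(fetch, max_retries=3, base_delay=0.1):
    """Retry only transient failures, with capped exponential backoff.

    A JobNotFound is re-raised at once: the RM listed a job the history
    server has already purged, and no amount of retrying will bring it back.
    """
    for attempt in range(max_retries):
        try:
            return fetch()
        except JobNotFound:
            raise  # retrying cannot help; fail fast
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the transient error
            time.sleep(base_delay * (2 ** attempt))
```

Distinguishing the two failure classes is the point: the aggressiveness described in the issue comes from treating every failure as transient.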
[jira] [Commented] (HIVE-9248) Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is Hash mode
[ https://issues.apache.org/jira/browse/HIVE-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587209#comment-14587209 ]

Jason Dere commented on HIVE-9248:

+1

> Vectorization : Tez Reduce vertex not getting vectorized when GROUP BY is
> Hash mode
>
> Key: HIVE-9248
> URL: https://issues.apache.org/jira/browse/HIVE-9248
> Project: Hive
> Issue Type: Bug
> Components: Tez, Vectorization
> Affects Versions: 0.14.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-9248.01.patch, HIVE-9248.02.patch, HIVE-9248.03.patch, HIVE-9248.04.patch, HIVE-9248.05.patch, HIVE-9248.06.patch
>
> Under Tez and Vectorization, ReduceWork is not getting vectorized unless its
> GROUP BY operator is MergePartial. Add the valid cases where GROUP BY is Hash
> (and presumably there are downstream reducers that will do MergePartial).
[jira] [Commented] (HIVE-11014) LLAP: MiniTez vector_binary_join_groupby test has result changes compared to master
[ https://issues.apache.org/jira/browse/HIVE-11014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587205#comment-14587205 ]

Sergey Shelukhin commented on HIVE-11014:

Note: right now there are some incorrect changes committed there; I'm going to commit the master version again.

> LLAP: MiniTez vector_binary_join_groupby test has result changes compared to
> master
>
> Key: HIVE-11014
> URL: https://issues.apache.org/jira/browse/HIVE-11014
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Matt McCline
[jira] [Updated] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins
[ https://issues.apache.org/jira/browse/HIVE-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-10907:
    Affects Version/s: 1.0.0, 1.2.0

> Hive on Tez: Classcast exception in some cases with SMB joins
>
> Key: HIVE-10907
> URL: https://issues.apache.org/jira/browse/HIVE-10907
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.0.0, 1.2.0
> Reporter: Vikram Dixit K
> Assignee: Vikram Dixit K
> Fix For: 1.2.1
> Attachments: HIVE-10907.1.patch, HIVE-10907.2.patch, HIVE-10907.3.patch, HIVE-10907.4.patch
>
> In cases where there is a mix of Map side work and reduce side work, we get a
> classcast exception because we assume homogeneity in the code. We need to fix
> this correctly. For now this is a workaround.
[jira] [Resolved] (HIVE-10915) ORC fails to read table with a 38Gb ORC file
[ https://issues.apache.org/jira/browse/HIVE-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran resolved HIVE-10915.
    Resolution: Fixed

Fixed by HIVE-10685. Verified it against lineitem TPCH 1000 scale.

> ORC fails to read table with a 38Gb ORC file
>
> Key: HIVE-10915
> URL: https://issues.apache.org/jira/browse/HIVE-10915
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 1.3.0
> Reporter: Gopal V
>
> {code}
> hive> set mapreduce.input.fileinputformat.split.maxsize=1;
> hive> alter table lineitem concatenate;
> ..
> hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
> Found 12 items
> -rwxr-xr-x 3 gopal supergroup 41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/00_0
> -rwxr-xr-x 3 gopal supergroup 36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/01_0
> -rwxr-xr-x 3 gopal supergroup 27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/02_0
> -rwxr-xr-x 3 gopal supergroup 23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/03_0
> -rwxr-xr-x 3 gopal supergroup 21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/04_0
> -rwxr-xr-x 3 gopal supergroup 13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/05_0
> -rwxr-xr-x 3 gopal supergroup 8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/06_0
> -rwxr-xr-x 3 gopal supergroup 6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/07_0
> -rwxr-xr-x 3 gopal supergroup 4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/08_0
> -rwxr-xr-x 3 gopal supergroup 3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/09_0
> -rwxr-xr-x 3 gopal supergroup 1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/10_0
> -rwxr-xr-x 3 gopal supergroup 485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/11_0
> {code}
> Errors without PPD. Suspicious offsets in the stream information:
> {code}
> Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
>   at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
>   at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
>   at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
>   at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
>   at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
>   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
>   ... 25 more
> {code}
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587171#comment-14587171 ]

Jesus Camacho Rodriguez commented on HIVE-10996:

I can reproduce the problem in 1.2. Still investigating the issue...

> Aggregation / Projection over Multi-Join Inner Query producing incorrect
> results
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.0.0, 1.2.0, 1.1.0
> Reporter: Gautam Kowshik
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Attachments: explain_q1.txt, explain_q2.txt
>
> We see the following problem on 1.1.0 and 1.2.0 but not 0.13, which seems like
> a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
>     select purchase.s, purchase.timestamp, max(mevt.timestamp) as last_stage_timestamp
>     from (select * from purchase_history) purchase
>     join (select * from cart_history) mevt
>     on purchase.s = mevt.s
>     where purchase.timestamp > mevt.timestamp
>     group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results:
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
>     select purchase.s, purchase.timestamp, max(mevt.timestamp) as last_stage_timestamp
>     from (select * from purchase_history) purchase
>     join (select * from cart_history) mevt
>     on purchase.s = mevt.s
>     where purchase.timestamp > mevt.timestamp
>     group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1    21    20    Bob    1234
> 1    31    30    Bob    1234
> 3    51    50    Jeff   1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context,
> but what led us to this issue was that select count(distinct s) was not
> returning results. The above queries are the simplified queries that reproduce
> the issue. I will note that if I convert the inner join to a table and select
> from that, the issue does not appear.
> Update: Found that turning off hive.optimize.remove.identity.project fixes
> this issue. This optimization was introduced in
> https://issues.apache.org/jira/browse/HIVE-8435
[jira] [Updated] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-10999:
    Attachment: HIVE-10999.1-spark.patch

> Upgrade Spark dependency to 1.4 [Spark Branch]
>
> Key: HIVE-10999
> URL: https://issues.apache.org/jira/browse/HIVE-10999
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Xuefu Zhang
> Assignee: Xuefu Zhang
> Attachments: HIVE-10999.1-spark.patch, HIVE-10999.1-spark.patch
>
> Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to
> 1.4.0.
[jira] [Issue Comment Deleted] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-10999:
    Comment: was deleted

(was: {color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12739493/HIVE-10999.1-spark.patch

{color:red}ERROR:{color} -1 due to 604 failed/errored test(s), 7420 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_uber_reduce
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDrive
[jira] [Commented] (HIVE-11006) improve logging wrt ACID module
[ https://issues.apache.org/jira/browse/HIVE-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587157#comment-14587157 ]

Alan Gates commented on HIVE-11006:

Will review.

> improve logging wrt ACID module
>
> Key: HIVE-11006
> URL: https://issues.apache.org/jira/browse/HIVE-11006
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.2.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-11006.patch
>
> especially around metastore DB operations (TxnHandler) which are retried or
> fail for some reason.
[jira] [Updated] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon
[ https://issues.apache.org/jira/browse/HIVE-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-10937:
    Attachment: HIVE-10937.01.patch

simplified patch

> LLAP: make ObjectCache for plans work properly in the daemon
>
> Key: HIVE-10937
> URL: https://issues.apache.org/jira/browse/HIVE-10937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: llap
> Attachments: HIVE-10937.01.patch, HIVE-10937.patch
>
> There's a perf hit otherwise, esp. when the planner creates 1009 reducers of
> 4Mb each.
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587114#comment-14587114 ]

Jesus Camacho Rodriguez commented on HIVE-10996:

This seems to be fixed in HIVE-9613. The fix was backported to 1.0, but not to 1.1. [~hagleitn], [~brocknoland], could we backport HIVE-9613 to 1.1 to solve this issue? Thanks

> Aggregation / Projection over Multi-Join Inner Query producing incorrect
> results
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.0.0, 1.2.0, 1.1.0
> Reporter: Gautam Kowshik
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Attachments: explain_q1.txt, explain_q2.txt
[jira] [Updated] (HIVE-10107) Union All : Vertex missing stats resulting in OOM and inefficient plans
[ https://issues.apache.org/jira/browse/HIVE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Mokhtar updated HIVE-10107:
    Fix Version/s: 1.2.1

> Union All : Vertex missing stats resulting in OOM and inefficient plans
>
> Key: HIVE-10107
> URL: https://issues.apache.org/jira/browse/HIVE-10107
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 0.14.0
> Reporter: Mostafa Mokhtar
> Assignee: Pengcheng Xiong
> Fix For: 1.2.1
>
> Reducer Vertices sending data to a Union all edge are missing statistics, and
> as a result we either use very few reducers in the UNION ALL edge or decide
> to broadcast the results of UNION ALL.
> Query
> {code}
> select
>   count(*) rowcount
> from
>   (select ss_item_sk, ss_ticket_number, ss_store_sk
>    from store_sales a, store_returns b
>    where a.ss_item_sk = b.sr_item_sk
>      and a.ss_ticket_number = b.sr_ticket_number
>    union all
>    select ss_item_sk, ss_ticket_number, ss_store_sk
>    from store_sales c, store_returns d
>    where c.ss_item_sk = d.sr_item_sk
>      and c.ss_ticket_number = d.sr_ticket_number) t
> group by t.ss_store_sk, t.ss_item_sk, t.ss_ticket_number
> having rowcount > 1;
> {code}
> Plan snippet
> {code}
> Edges:
>   Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Union 3 (CONTAINS)
>   Reducer 4 <- Union 3 (SIMPLE_EDGE)
>   Reducer 7 <- Map 6 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE), Union 3 (CONTAINS)
> Reducer 4
>   Reduce Operator Tree:
>     Group By Operator
>       aggregations: count(VALUE._col0)
>       keys: KEY._col0 (type: int), KEY._col1 (type: int), KEY._col2 (type: int)
>       mode: mergepartial
>       outputColumnNames: _col0, _col1, _col2, _col3
>       Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (_col3 > 1) (type: boolean)
>         Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: COMPLETE
>         Select Operator
>           expressions: _col3 (type: bigint)
>           outputColumnNames: _col0
>           Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: COMPLETE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: COMPLETE
>             table:
>                 input format: org.apache.hadoop.mapred.TextInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Reducer 7
>   Reduce Operator Tree:
>     Merge Join Operator
>       condition map:
>            Inner Join 0 to 1
>       keys:
>         0 ss_item_sk (type: int), ss_ticket_number (type: int)
>         1 sr_item_sk (type: int), sr_ticket_number (type: int)
>       outputColumnNames: _col1, _col6, _col8, _col27, _col34
>       Filter Operator
>         predicate: ((_col1 = _col27) and (_col8 = _col34)) (type: boolean)
>         Select Operator
>           expressions: _col1 (type: int), _col8 (type: int), _col6 (type: int)
>           outputColumnNames: _col0, _col1, _col2
>           Group By Operator
>             aggregations: count()
>             keys: _col2 (type: int), _col0 (type: int), _col1 (type: int)
>             mode: hash
>             outputColumnNames: _col0, _col1, _col2, _col3
>             Reduce Output Operator
>               key expressions: _col0 (type: int), _col1 (type: int), _col2 (type: int)
>               sort order: +++
>               Map-reduce partition columns: _col0 (type: int), _col1 (type: int), _col2 (type: int)
>               value expressions: _col3 (type: bigint)
> {code}
> The full explain plan
> {code}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Union 3 (CONTAINS)
>         Reducer 4 <- Union 3 (SIMPLE_EDGE)
>         Reducer 7 <- Map 6 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE), Union 3 (CONTAINS)
>       DagName: mmokhtar_201502141
[jira] [Commented] (HIVE-11010) Accumulo storage handler queries via HS2 fail
[ https://issues.apache.org/jira/browse/HIVE-11010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587091#comment-14587091 ]

Josh Elser commented on HIVE-11010:

Thanks for filing this. Was doing some debugging with [~taksaito] with the AccumuloStorageHandler -- loaded some data in both HBase and Accumulo, ran some Hive queries against both, and found that when we ran the Accumulo queries via HiveServer2 (but not in the local client) the queries would fail on the RPC handshakes. Short story: AccumuloStorageHandler queries with Kerberos on don't work with HiveServer2.

I think what was happening is that the additions to the AccumuloStorageHandler in HIVE-10857 don't work as expected because HS2 is going to be running with its own Kerberos credentials. I think we need to change how we set up the credentials inside of AccumuloStorageHandler so that it will work regardless of a local hive client or HS2 -- running a doAs with a PROXY instead of replacing the HS2 credentials. The second half is that we'd need to make sure Accumulo itself is configured to allow HS2 to proxy on behalf of users -- not relevant for Hive code, but something to document for users to set up in Accumulo.

> Accumulo storage handler queries via HS2 fail
>
> Key: HIVE-11010
> URL: https://issues.apache.org/jira/browse/HIVE-11010
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.0, 1.2.1
> Environment: Secure
> Reporter: Takahiko Saito
> Assignee: Josh Elser
> Fix For: 1.2.1
>
> On a Kerberized cluster, the accumulo storage handler throws an error:
> "[usrname]@[principlaname] is not allowed to impersonate [username]"
[jira] [Updated] (HIVE-11010) Accumulo storage handler queries via HS2 fail
[ https://issues.apache.org/jira/browse/HIVE-11010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HIVE-11010: -- Fix Version/s: 1.2.1 > Accumulo storage handler queries via HS2 fail > - > > Key: HIVE-11010 > URL: https://issues.apache.org/jira/browse/HIVE-11010 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.0, 1.2.1 > Environment: Secure >Reporter: Takahiko Saito >Assignee: Josh Elser > Fix For: 1.2.1 > > > On Kerberized cluster, accumulo storage handler throws an error, > "[usrname]@[principlaname] is not allowed to impersonate [username]" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11010) Accumulo storage handler queries via HS2 fail
[ https://issues.apache.org/jira/browse/HIVE-11010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HIVE-11010: -- Affects Version/s: 1.2.1 > Accumulo storage handler queries via HS2 fail > - > > Key: HIVE-11010 > URL: https://issues.apache.org/jira/browse/HIVE-11010 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.0, 1.2.1 > Environment: Secure >Reporter: Takahiko Saito >Assignee: Josh Elser > Fix For: 1.2.1 > > > On Kerberized cluster, accumulo storage handler throws an error, > "[usrname]@[principlaname] is not allowed to impersonate [username]" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4239) Remove lock on compilation stage
[ https://issues.apache.org/jira/browse/HIVE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-4239: --- Attachment: HIVE-4239.07.patch Rebased the patch. Some other commit has refactored Hive out of the session class, so the issue with this change is moot > Remove lock on compilation stage > > > Key: HIVE-4239 > URL: https://issues.apache.org/jira/browse/HIVE-4239 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Query Processor >Reporter: Carl Steinbach >Assignee: Sergey Shelukhin > Attachments: HIVE-4239.01.patch, HIVE-4239.02.patch, > HIVE-4239.03.patch, HIVE-4239.04.patch, HIVE-4239.05.patch, > HIVE-4239.06.patch, HIVE-4239.07.patch, HIVE-4239.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11010) Accumulo storage handler queries via HS2 fail
[ https://issues.apache.org/jira/browse/HIVE-11010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HIVE-11010: -- Summary: Accumulo storage handler queries via HS2 fail (was: Accumulo storage handler throws "[usrname]@[principlaname] is not allowed to impersonate [username]" via beeline on kerberized cluster) > Accumulo storage handler queries via HS2 fail > - > > Key: HIVE-11010 > URL: https://issues.apache.org/jira/browse/HIVE-11010 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.0 > Environment: Secure >Reporter: Takahiko Saito >Assignee: Josh Elser > > On Kerberized cluster, accumulo storage handler throws an error, > "[usrname]@[principlaname] is not allowed to impersonate [username]" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon
[ https://issues.apache.org/jira/browse/HIVE-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587071#comment-14587071 ] Sergey Shelukhin commented on HIVE-10937: - I didn't get a repro of issues reported in the cluster > LLAP: make ObjectCache for plans work properly in the daemon > > > Key: HIVE-10937 > URL: https://issues.apache.org/jira/browse/HIVE-10937 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > Attachments: HIVE-10937.patch > > > There's perf hit otherwise, esp. when stupid planner creates 1009 reducers of > 4Mb each. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587064#comment-14587064 ] Laljo John Pullokkaran commented on HIVE-10841: --- Committed to branch 1.0. > [WHERE col is not null] does not work sometimes for queries with many JOIN > statements > - > > Key: HIVE-10841 > URL: https://issues.apache.org/jira/browse/HIVE-10841 > Project: Hive > Issue Type: Bug > Components: Query Planning, Query Processor >Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0 >Reporter: Alexander Pivovarov >Assignee: Laljo John Pullokkaran > Fix For: 1.2.1 > > Attachments: HIVE-10841.03.patch, HIVE-10841.1.patch, > HIVE-10841.2.patch, HIVE-10841.patch > > > The result from the following SELECT query is 3 rows but it should be 1 row. > I checked it in MySQL - it returned 1 row. > To reproduce the issue in Hive > 1. prepare tables > {code} > drop table if exists L; > drop table if exists LA; > drop table if exists FR; > drop table if exists A; > drop table if exists PI; > drop table if exists acct; > create table L as select 4436 id; > create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id; > create table FR as select 4436 loan_id; > create table A as select 4748 id; > create table PI as select 4415 id; > create table acct as select 4748 aid, 10 acc_n, 122 brn; > insert into table acct values(4748, null, null); > insert into table acct values(4748, null, null); > {code} > 2. run SELECT query > {code} > select > acct.ACC_N, > acct.brn > FROM L > JOIN LA ON L.id = LA.loan_id > JOIN FR ON L.id = FR.loan_id > JOIN A ON LA.aid = A.id > JOIN PI ON PI.id = LA.pi_id > JOIN acct ON A.id = acct.aid > WHERE > L.id = 4436 > and acct.brn is not null; > {code} > the result is 3 rows > {code} > 10122 > NULL NULL > NULL NULL > {code} > but it should be 1 row > {code} > 10122 > {code} > 2.1 "explain select ..." 
output for hive-1.3.0 MR > {code} > STAGE DEPENDENCIES: > Stage-12 is a root stage > Stage-9 depends on stages: Stage-12 > Stage-0 depends on stages: Stage-9 > STAGE PLANS: > Stage: Stage-12 > Map Reduce Local Work > Alias -> Map Local Tables: > a > Fetch Operator > limit: -1 > acct > Fetch Operator > limit: -1 > fr > Fetch Operator > limit: -1 > l > Fetch Operator > limit: -1 > pi > Fetch Operator > limit: -1 > Alias -> Map Local Operator Tree: > a > TableScan > alias: a > Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: id is not null (type: boolean) > Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE > Column stats: NONE > HashTable Sink Operator > keys: > 0 _col5 (type: int) > 1 id (type: int) > 2 aid (type: int) > acct > TableScan > alias: acct > Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: aid is not null (type: boolean) > Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE > Column stats: NONE > HashTable Sink Operator > keys: > 0 _col5 (type: int) > 1 id (type: int) > 2 aid (type: int) > fr > TableScan > alias: fr > Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: (loan_id = 4436) (type: boolean) > Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE > Column stats: NONE > HashTable Sink Operator > keys: > 0 4436 (type: int) > 1 4436 (type: int) > 2 4436 (type: int) > l > TableScan > alias: l > Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: (id = 4436) (type: boolean) > Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE > Column stats: NONE > HashTable Sink Operator > keys: > 0 4436 (type: int) > 1 4436 (type: int) > 2 4436 (type: int) > pi > TableScan > alias: pi > Statistics: Num rows: 1 Data size: 4 Basic stat
[jira] [Commented] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
[ https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587060#comment-14587060 ] Prasanth Jayachandran commented on HIVE-10940: -- +1 > HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader > call > - > > Key: HIVE-10940 > URL: https://issues.apache.org/jira/browse/HIVE-10940 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.2.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-10940.01.patch, HIVE-10940.patch > > > {code} > String filterText = filterExpr.getExprString(); > String filterExprSerialized = Utilities.serializeExpression(filterExpr); > {code} > the serializeExpression initializes Kryo and produces a new packed object for > every split. > HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters. > And Kryo is very slow to do this for a large filter clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
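The {code} fragment quoted in HIVE-10940 pays the Kryo serialization cost on every getRecordReader call. A natural fix shape is to serialize once and reuse the bytes. The sketch below is a self-contained illustration of that memoization only: plain Java serialization stands in for Kryo, the cache key is hypothetical, and this is not the actual patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: serialize a filter expression once and hand out the cached bytes on
// every later getRecordReader-style call, instead of re-serializing per split.
public class FilterExprCache {
    public static final AtomicInteger SERIALIZATIONS = new AtomicInteger();
    private static final Map<String, byte[]> CACHE = new ConcurrentHashMap<>();

    // Stand-in for Utilities.serializeExpression(filterExpr); counts calls so
    // the saving is observable.
    public static byte[] serialize(Serializable expr) {
        SERIALIZATIONS.incrementAndGet();
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(expr);
            oos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Hypothetical cache key: the expression's string form. Each distinct
    // filter is serialized at most once, however many splits are opened.
    public static byte[] serializedFor(String exprString, Serializable expr) {
        return CACHE.computeIfAbsent(exprString, k -> serialize(expr));
    }

    public static void main(String[] args) {
        String filterText = "(key > 10)";
        for (int split = 0; split < 1000; split++) {   // one call per split
            serializedFor(filterText, filterText);
        }
        System.out.println("serializations: " + SERIALIZATIONS.get());
    }
}
```

The design choice matches the complaint in the description: the expensive step (Kryo setup plus packing) moves off the per-split path, and repeat calls become a map lookup.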
[jira] [Commented] (HIVE-11006) improve logging wrt ACID module
[ https://issues.apache.org/jira/browse/HIVE-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587045#comment-14587045 ] Eugene Koifman commented on HIVE-11006: --- [~sushanth], could we get this into 1.2.1? It's only logging changes but will make diagnostics easier. > improve logging wrt ACID module > --- > > Key: HIVE-11006 > URL: https://issues.apache.org/jira/browse/HIVE-11006 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11006.patch > > > especially around metastore DB operations (TxnHandler) which are retried or > fail for some reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
[ https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10940: Attachment: HIVE-10940.01.patch > HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader > call > - > > Key: HIVE-10940 > URL: https://issues.apache.org/jira/browse/HIVE-10940 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.2.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-10940.01.patch, HIVE-10940.patch > > > {code} > String filterText = filterExpr.getExprString(); > String filterExprSerialized = Utilities.serializeExpression(filterExpr); > {code} > the serializeExpression initializes Kryo and produces a new packed object for > every split. > HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters. > And Kryo is very slow to do this for a large filter clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10685) Alter table concatenate oparetor will cause duplicate data
[ https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10685: - Attachment: (was: HIVE-10685.1.patch) > Alter table concatenate oparetor will cause duplicate data > -- > > Key: HIVE-10685 > URL: https://issues.apache.org/jira/browse/HIVE-10685 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 1.2.1 >Reporter: guoliming >Assignee: guoliming >Priority: Critical > Fix For: 1.2.0, 1.1.0 > > Attachments: HIVE-10685.patch > > > "Orders" table has 15 rows and stored as ORC. > {noformat} > hive> select count(*) from orders; > OK > 15 > Time taken: 37.692 seconds, Fetched: 1 row(s) > {noformat} > The table contain 14 files,the size of each file is about 2.1 ~ 3.2 GB. > After executing command : ALTER TABLE orders CONCATENATE; > The table is already 1530115000 rows. > My hive version is 1.1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
[ https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587018#comment-14587018 ] Sergey Shelukhin commented on HIVE-10940: - text representation is preserved for backward compat (if you mean the original one we used to serialize). Will add logging > HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader > call > - > > Key: HIVE-10940 > URL: https://issues.apache.org/jira/browse/HIVE-10940 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.2.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-10940.patch > > > {code} > String filterText = filterExpr.getExprString(); > String filterExprSerialized = Utilities.serializeExpression(filterExpr); > {code} > the serializeExpression initializes Kryo and produces a new packed object for > every split. > HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters. > And Kryo is very slow to do this for a large filter clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-10685) Alter table concatenate oparetor will cause duplicate data
[ https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reopened HIVE-10685: -- I am gonna revert the committed patch and apply the original patch. The committed patch will not work because the stripe index increment is outside the continue, so the index is not advanced when a stripe is skipped. > Alter table concatenate oparetor will cause duplicate data > -- > > Key: HIVE-10685 > URL: https://issues.apache.org/jira/browse/HIVE-10685 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 1.2.1 >Reporter: guoliming >Assignee: guoliming >Priority: Critical > Fix For: 1.2.0, 1.1.0 > > Attachments: HIVE-10685.1.patch, HIVE-10685.patch > > > "Orders" table has 15 rows and stored as ORC. > {noformat} > hive> select count(*) from orders; > OK > 15 > Time taken: 37.692 seconds, Fetched: 1 row(s) > {noformat} > The table contain 14 files,the size of each file is about 2.1 ~ 3.2 GB. > After executing command : ALTER TABLE orders CONCATENATE; > The table is already 1530115000 rows. > My hive version is 1.1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
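The reopen comment above points at a classic control-flow bug: a manually maintained stripe index whose increment sits after a `continue`, so the index stops advancing on the skip path and later iterations re-read an already-processed element, which is how duplicated rows appear. The sketch below is a generic, self-contained illustration of that bug class with made-up stripe names; it is not the ORC merge code.

```java
import java.util.ArrayList;
import java.util.List;

// Generic illustration of the "increment outside the continue" bug: the
// skip path jumps past idx++, so idx goes stale and the same element is
// emitted repeatedly (duplicate data).
public class StripeIndexSketch {

    // Buggy shape: copy-as-is stripes take the fast path and `continue`,
    // skipping the increment at the bottom of the loop.
    public static List<String> mergeBuggy(List<String> stripes, List<Boolean> copyAsIs) {
        List<String> out = new ArrayList<>();
        int idx = 0;
        for (String stripe : stripes) {
            if (copyAsIs.get(idx)) {
                out.add(stripes.get(idx) + "(raw)");
                continue;              // BUG: idx is not advanced on this path
            }
            out.add(stripes.get(idx) + "(reencoded)");
            idx++;
        }
        return out;
    }

    // Fixed shape: advance the index before taking the skip path.
    public static List<String> mergeFixed(List<String> stripes, List<Boolean> copyAsIs) {
        List<String> out = new ArrayList<>();
        int idx = 0;
        for (String stripe : stripes) {
            if (copyAsIs.get(idx)) {
                out.add(stripes.get(idx) + "(raw)");
                idx++;                 // FIX: increment inside the skip path too
                continue;
            }
            out.add(stripes.get(idx) + "(reencoded)");
            idx++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> stripes = List.of("s0", "s1", "s2");
        List<Boolean> copyAsIs = List.of(true, false, false);
        System.out.println("buggy: " + mergeBuggy(stripes, copyAsIs));
        System.out.println("fixed: " + mergeFixed(stripes, copyAsIs));
    }
}
```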
[jira] [Commented] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
[ https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586996#comment-14586996 ] Prasanth Jayachandran commented on HIVE-10940: -- Patch mostly looks good, although it would be good to add some debug logging after each null check. Also, from a simple reference lookup we don't seem to be using the textual representation of the filter expression anywhere, so I don't think we need to set it. If we need a text representation, we have methods in PlanUtils to do so. [~ashutoshc]/[~gopalv] Any idea why we set the filter expression in text form in the job conf? > HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader > call > - > > Key: HIVE-10940 > URL: https://issues.apache.org/jira/browse/HIVE-10940 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.2.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-10940.patch > > > {code} > String filterText = filterExpr.getExprString(); > String filterExprSerialized = Utilities.serializeExpression(filterExpr); > {code} > the serializeExpression initializes Kryo and produces a new packed object for > every split. > HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters. > And Kryo is very slow to do this for a large filter clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586978#comment-14586978 ] Laljo John Pullokkaran commented on HIVE-10996: --- [~jcamachorodriguez] Could you take a look? seems like related to DT removal HIVE-8435. > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0 >Reporter: Gautam Kowshik >Priority: Minor > Attachments: explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product 
string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10996: -- Priority: Critical (was: Minor) > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values 
('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10996: -- Assignee: Jesus Camacho Rodriguez > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Attachments: explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values 
('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586753#comment-14586753 ] Alan Gates commented on HIVE-10972: --- Yes, I'll take a look. > DummyTxnManager always locks the current database in shared mode, which is > incorrect. > - > > Key: HIVE-10972 > URL: https://issues.apache.org/jira/browse/HIVE-10972 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-10972.2.patch, HIVE-10972.patch > > > In DummyTxnManager [line 163 | > http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], > it always locks the current database. > That is not correct since the current database can be "db1", and the query > can be "select * from db2.tb1", which will lock db1 unnecessarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
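HIVE-10972's point is that the lock scope should be derived from what the query actually touches. A minimal sketch of that direction, with illustrative table-name parsing rather than DummyTxnManager code: references written as `db.table` name their database explicitly, and only bare table names fall back to the session's current database.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch: compute shared-lock targets from the query's input tables instead
// of unconditionally locking the current database.
public class LockScopeSketch {
    public static Set<String> databasesToLock(String currentDb, List<String> inputTables) {
        Set<String> dbs = new LinkedHashSet<>();
        for (String table : inputTables) {
            int dot = table.indexOf('.');
            // Qualified names carry their own database; bare names resolve
            // against the session's current database.
            dbs.add(dot >= 0 ? table.substring(0, dot) : currentDb);
        }
        return dbs;
    }

    public static void main(String[] args) {
        // "select * from db2.tb1" issued while db1 is current: only db2
        // needs the shared lock; locking db1 is the unnecessary behavior.
        System.out.println(databasesToLock("db1", List.of("db2.tb1")));
    }
}
```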
[jira] [Updated] (HIVE-11007) CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's mapInputToDP should depends on the last SEL
[ https://issues.apache.org/jira/browse/HIVE-11007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11007: --- Attachment: HIVE-11007.01.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's > mapInputToDP should depends on the last SEL > - > > Key: HIVE-11007 > URL: https://issues.apache.org/jira/browse/HIVE-11007 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11007.01.patch > > > In dynamic partitioning case, for example, we are going to have > TS0-SEL1-SEL2-FS3. The dpCtx's mapInputToDP is populated by SEL1 rather than > SEL2, which causes error in return path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
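HIVE-11007's description gives the chain TS0-SEL1-SEL2-FS3 and says the dynamic-partitioning context should be populated from the SEL nearest the FileSink. The selection rule can be sketched generically; the string-tagged operator chain below is an illustration, not Hive's operator tree API.

```java
import java.util.List;

// Sketch: walk an operator chain and return the last SEL seen before the
// FileSink (FS). For TS0-SEL1-SEL2-FS3 that is SEL2, the operator whose
// output columns actually feed the sink.
public class LastSelSketch {
    public static String lastSelectBeforeSink(List<String> operatorChain) {
        String lastSel = null;
        for (String op : operatorChain) {
            if (op.startsWith("SEL")) {
                lastSel = op;      // keep overwriting; the survivor is the last SEL
            } else if (op.startsWith("FS")) {
                break;             // stop at the FileSink
            }
        }
        return lastSel;
    }

    public static void main(String[] args) {
        System.out.println(lastSelectBeforeSink(List.of("TS0", "SEL1", "SEL2", "FS3")));
    }
}
```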
[jira] [Updated] (HIVE-11007) CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's mapInputToDP should depends on the last SEL
[ https://issues.apache.org/jira/browse/HIVE-11007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11007: --- Attachment: (was: HIVE-11007.01.patch) > CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's > mapInputToDP should depends on the last SEL > - > > Key: HIVE-11007 > URL: https://issues.apache.org/jira/browse/HIVE-11007 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11007.01.patch > > > In dynamic partitioning case, for example, we are going to have > TS0-SEL1-SEL2-FS3. The dpCtx's mapInputToDP is populated by SEL1 rather than > SEL2, which causes error in return path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10986) Check of fs.trash.interval in HiveMetaStore should be consistent with Trash.moveToAppropriateTrash()
[ https://issues.apache.org/jira/browse/HIVE-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10986: -- Attachment: HIVE-10986.patch > Check of fs.trash.interval in HiveMetaStore should be consistent with > Trash.moveToAppropriateTrash() > > > Key: HIVE-10986 > URL: https://issues.apache.org/jira/browse/HIVE-10986 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 1.2.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-10986.patch > > > This is a followup to HIVE-10629. > Trash.moveToAppropriateTrash() takes core-site.xml but HiveMetaStore checks > "hiveConf" which is a problem when they disagree. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
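The inconsistency described in HIVE-10986 is two components answering "is trash enabled?" from different configuration sources. The sketch below models the disagreement with plain maps standing in for core-site.xml and hiveConf; the property name fs.trash.interval is real Hadoop configuration, everything else is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a pre-check and the actual trash move must read fs.trash.interval
// from the same source, or one side will think trash is enabled while the
// other does not.
public class TrashIntervalSketch {
    public static boolean trashEnabled(Map<String, String> conf) {
        // Hadoop semantics: trash is enabled when fs.trash.interval > 0.
        return Integer.parseInt(conf.getOrDefault("fs.trash.interval", "0")) > 0;
    }

    public static void main(String[] args) {
        Map<String, String> coreSite = new HashMap<>();  // what Trash reads
        coreSite.put("fs.trash.interval", "360");
        Map<String, String> hiveConf = new HashMap<>();  // missed the override
        System.out.println("trash per core-site: " + trashEnabled(coreSite)
                + ", per hiveConf: " + trashEnabled(hiveConf));
    }
}
```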
[jira] [Issue Comment Deleted] (HIVE-10884) Enable some beeline tests and turn on HIVE-4239 by default
[ https://issues.apache.org/jira/browse/HIVE-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10884: Comment: was deleted (was: It looks like the instrumentation needs to be updated to run beeline tests... ) > Enable some beeline tests and turn on HIVE-4239 by default > -- > > Key: HIVE-10884 > URL: https://issues.apache.org/jira/browse/HIVE-10884 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-10884.01.patch, HIVE-10884.02.patch, > HIVE-10884.03.patch, HIVE-10884.04.patch, HIVE-10884.05.patch, > HIVE-10884.patch > > > See comments in HIVE-4239. > Beeline tests with parallelism need to be enabled to turn compilation > parallelism on by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10884) Enable some beeline tests and turn on HIVE-4239 by default
[ https://issues.apache.org/jira/browse/HIVE-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10884: Attachment: HIVE-10884.05.patch Beeline tests weren't attempted. Attempting to remove the exclude from hivetest... > Enable some beeline tests and turn on HIVE-4239 by default > -- > > Key: HIVE-10884 > URL: https://issues.apache.org/jira/browse/HIVE-10884 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-10884.01.patch, HIVE-10884.02.patch, > HIVE-10884.03.patch, HIVE-10884.04.patch, HIVE-10884.05.patch, > HIVE-10884.patch > > > See comments in HIVE-4239. > Beeline tests with parallelism need to be enabled to turn compilation > parallelism on by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10884) Enable some beeline tests and turn on HIVE-4239 by default
[ https://issues.apache.org/jira/browse/HIVE-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586674#comment-14586674 ] Sergey Shelukhin commented on HIVE-10884: - It looks like the instrumentation needs to be updated to run beeline tests... > Enable some beeline tests and turn on HIVE-4239 by default > -- > > Key: HIVE-10884 > URL: https://issues.apache.org/jira/browse/HIVE-10884 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-10884.01.patch, HIVE-10884.02.patch, > HIVE-10884.03.patch, HIVE-10884.04.patch, HIVE-10884.05.patch, > HIVE-10884.patch > > > See comments in HIVE-4239. > Beeline tests with parallelism need to be enabled to turn compilation > parallelism on by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10991) CBO: Calcite Operator To Hive Operator (Calcite Return Path): NonBlockingOpDeDupProc did not kick in rcfile_merge2.q
[ https://issues.apache.org/jira/browse/HIVE-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586613#comment-14586613 ] Hive QA commented on HIVE-10991: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12739659/HIVE-10991.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9008 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4269/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4269/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4269/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12739659 - PreCommit-HIVE-TRUNK-Build > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > NonBlockingOpDeDupProc did not kick in rcfile_merge2.q > > > Key: HIVE-10991 > URL: https://issues.apache.org/jira/browse/HIVE-10991 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10991.patch > > > NonBlockingOpDeDupProc did not kick in rcfile_merge2.q in return path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11008) webhcat GET /jobs retries on getting job details from history server is too aggressive
[ https://issues.apache.org/jira/browse/HIVE-11008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-11008: - Attachment: HIVE-11008.1.patch Patch from [~cwelch] > webhcat GET /jobs retries on getting job details from history server is too > aggressive > - > > Key: HIVE-11008 > URL: https://issues.apache.org/jira/browse/HIVE-11008 > Project: Hive > Issue Type: Bug > Components: WebHCat >Affects Versions: 1.2.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-11008.1.patch > > > Webhcat "jobs" api gets the list of jobs from RM and then gets details from > history server. > RM has a policy of retaining a fixed number of jobs to accommodate the > memory it has, while HistoryServer retains jobs based on their age. As a > result, jobs that RM returns might not be present in HistoryServer and can > result in a failure. HistoryServer also ends up retrying on failures even if > they happen because the job actually does not exist. > The retries to get details from HistoryServer in such cases are too aggressive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10984) "Lock table" explicit lock command doesn't lock the database object.
[ https://issues.apache.org/jira/browse/HIVE-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10984: Description: There is an issue in ZooKeeperHiveLockManager.java, in which when locking exclusively on a table, it doesn't lock the database object (which it does when the lock comes from a query). The current implementation of ZooKeeperHiveLockManager will lock the object and its parents, but won't check the children when it tries to acquire a lock on a certain object. This allows the following scenario, which should not be permitted but currently goes through. {noformat} use default; lock table db1.tbl1 shared; lock database db1 exclusive; {noformat} Also check the test case lockneg_try_lock_db_in_use.q to add more reasonable failure cases. was: There is an issue in ZooKeeperHiveLockManager.java, in which when locking exclusively on an object we didn't check if the children are locked. So the following should not be allowed. {noformat} use default; lock table lockneg2.tstsrcpart shared; lock database lockneg2 exclusive; {noformat} Also check the test case lockneg_try_lock_db_in_use.q to add more reasonable failure cases. > "Lock table" explicit lock command doesn't lock the database object. > > > Key: HIVE-10984 > URL: https://issues.apache.org/jira/browse/HIVE-10984 > Project: Hive > Issue Type: Bug > Components: Locking >Reporter: Aihua Xu >Assignee: Aihua Xu > > There is an issue in ZooKeeperHiveLockManager.java, in which when locking > exclusively on a table, it doesn't lock the database object (which it does > when the lock comes from a query). > The current implementation of ZooKeeperHiveLockManager will lock the > object and its parents, but won't check the children when it tries to acquire > a lock on a certain object. This allows the following scenario, which > should not be permitted but currently goes through. 
> {noformat} > use default; > lock table db1.tbl1 shared; > lock database db1 exclusive; > {noformat} > Also check the test case lockneg_try_lock_db_in_use.q to add more reasonable > failure cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
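The hierarchical locking behavior HIVE-10984 asks for can be sketched in a few lines. This is an illustrative toy model, not Hive's actual ZooKeeperHiveLockManager (which is Java and backed by ZooKeeper paths): acquiring any lock also takes implicit shared locks on the parent objects, so a later exclusive request on the database is rejected while one of its tables is locked.

```python
# Toy model of hierarchical lock acquisition; class and method names are
# assumptions for illustration, not Hive's API.
SHARED, EXCLUSIVE = "SHARED", "EXCLUSIVE"

class ToyLockManager:
    def __init__(self):
        self.locks = {}  # object path, e.g. "db1/tbl1" -> set of held modes

    def acquire(self, path, mode):
        parts = path.split("/")
        ancestors = ["/".join(parts[:i]) for i in range(1, len(parts))]
        # An exclusively locked ancestor blocks any request on a child.
        if any(EXCLUSIVE in self.locks.get(p, set()) for p in ancestors):
            return False
        held = self.locks.get(path, set())
        # Exclusive conflicts with anything held; shared conflicts with exclusive.
        if (mode == EXCLUSIVE and held) or (mode == SHARED and EXCLUSIVE in held):
            return False
        # An exclusive request must also check locks held by child objects --
        # the check the issue says is missing.
        if mode == EXCLUSIVE and any(
                other.startswith(path + "/") and modes
                for other, modes in self.locks.items()):
            return False
        # Grant the lock, plus implicit shared locks on the parents -- the
        # step the issue says an explicit LOCK TABLE skips.
        self.locks.setdefault(path, set()).add(mode)
        for p in ancestors:
            self.locks.setdefault(p, set()).add(SHARED)
        return True

mgr = ToyLockManager()
assert mgr.acquire("db1/tbl1", SHARED)    # lock table db1.tbl1 shared
assert not mgr.acquire("db1", EXCLUSIVE)  # lock database db1 exclusive: denied
```

With the parent lock in place, the `lock database db1 exclusive` request from the scenario above is correctly refused instead of going through.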
[jira] [Updated] (HIVE-10984) "Lock table" explicit lock command doesn't lock the database object.
[ https://issues.apache.org/jira/browse/HIVE-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10984: Summary: "Lock table" explicit lock command doesn't lock the database object. (was: When ZooKeeperHiveLockManager locks an object exclusively, it doesn't check the lock on the children.) > "Lock table" explicit lock command doesn't lock the database object. > > > Key: HIVE-10984 > URL: https://issues.apache.org/jira/browse/HIVE-10984 > Project: Hive > Issue Type: Bug > Components: Locking >Reporter: Aihua Xu >Assignee: Aihua Xu > > There is an issue in ZooKeeperHiveLockManager.java, in which when locking > exclusively on an object we didn't check if the children are locked. > So the following should not be allowed. > {noformat} > use default; > lock table lockneg2.tstsrcpart shared; > lock database lockneg2 exclusive; > {noformat} > Also check the test case lockneg_try_lock_db_in_use.q to add more reasonable > failure cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11006) improve logging wrt ACID module
[ https://issues.apache.org/jira/browse/HIVE-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11006: -- Attachment: HIVE-11006.patch [~alangates] could you review please > improve logging wrt ACID module > --- > > Key: HIVE-11006 > URL: https://issues.apache.org/jira/browse/HIVE-11006 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11006.patch > > > especially around metastore DB operations (TxnHandler) which are retried or > fail for some reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11007) CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's mapInputToDP should depend on the last SEL
[ https://issues.apache.org/jira/browse/HIVE-11007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11007: --- Attachment: HIVE-11007.01.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's > mapInputToDP should depend on the last SEL > - > > Key: HIVE-11007 > URL: https://issues.apache.org/jira/browse/HIVE-11007 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11007.01.patch > > > In the dynamic partitioning case, for example, we are going to have > TS0-SEL1-SEL2-FS3. The dpCtx's mapInputToDP is populated by SEL1 rather than > SEL2, which causes an error in the return path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
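The fix HIVE-11007 describes — populating the dynamic-partition context from the last SELECT before the FileSink rather than the first — amounts to walking the operator chain and keeping the final SEL. A toy sketch, with operators modeled as plain strings rather than Hive's actual operator tree API:

```python
def last_select(operator_chain):
    """Return the last SELECT operator preceding the file sink; in the issue's
    terms, the one whose schema should populate dpCtx's mapInputToDP."""
    sels = [op for op in operator_chain if op.startswith("SEL")]
    if not sels:
        raise ValueError("no SELECT operator in chain")
    return sels[-1]

# For the TS0-SEL1-SEL2-FS3 plan from the issue, SEL2 (not SEL1) is chosen.
assert last_select(["TS0", "SEL1", "SEL2", "FS3"]) == "SEL2"
```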
[jira] [Commented] (HIVE-11004) PermGen OOM error in Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586438#comment-14586438 ] Mostafa Mokhtar commented on HIVE-11004: [~martinbenson] Try setting hive.orc.cache.stripe.details.size=-1 and restart HS2. > PermGen OOM error in Hiveserver2 > > > Key: HIVE-11004 > URL: https://issues.apache.org/jira/browse/HIVE-11004 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 > Environment: cdh 5.4 >Reporter: Martin Benson >Priority: Critical > > Periodically Hiveserver2 will become unresponsive and looking in the logs > there is the following error: > 2:28:22.965 PMERROR org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > Unexpected Exception > java.lang.OutOfMemoryError: PermGen space > 2:28:22.969 PMWARN > org.apache.hive.service.cli.thrift.ThriftCLIService > Error fetching results: > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > java.lang.RuntimeException: serious problem > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:343) > at > org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:250) > at > org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:656) > at > org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:672) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) > at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.RuntimeException: serious problem > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655) > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:338) > ... 13 more > Caused by: java.lang.RuntimeException: serious problem > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.waitForTasks(OrcInputFormat.java:478) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:944) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:969) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:362) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:294) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445) > ... 17 more > Caused by: java.lang.OutOfMemoryError: PermGen space > There does not appear to be an obvious trigger for this (other than the fact > that the error mentions ORC). If further details would be helpful in > diagnosing the issue please let me know and I'll supply them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins
[ https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10533: --- Attachment: HIVE-10533.03.patch > CBO (Calcite Return Path): Join to MultiJoin support for outer joins > > > Key: HIVE-10533 > URL: https://issues.apache.org/jira/browse/HIVE-10533 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, > HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.patch > > > CBO return path: auto_join7.q can be used to reproduce the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10991) CBO: Calcite Operator To Hive Operator (Calcite Return Path): NonBlockingOpDeDupProc did not kick in rcfile_merge2.q
[ https://issues.apache.org/jira/browse/HIVE-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10991: --- Attachment: HIVE-10991.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > NonBlockingOpDeDupProc did not kick in rcfile_merge2.q > > > Key: HIVE-10991 > URL: https://issues.apache.org/jira/browse/HIVE-10991 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10991.patch > > > NonBlockingOpDeDupProc did not kick in rcfile_merge2.q in return path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10991) CBO: Calcite Operator To Hive Operator (Calcite Return Path): NonBlockingOpDeDupProc did not kick in rcfile_merge2.q
[ https://issues.apache.org/jira/browse/HIVE-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-10991: -- Assignee: Jesus Camacho Rodriguez (was: Pengcheng Xiong) > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > NonBlockingOpDeDupProc did not kick in rcfile_merge2.q > > > Key: HIVE-10991 > URL: https://issues.apache.org/jira/browse/HIVE-10991 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10991.patch > > > NonBlockingOpDeDupProc did not kick in rcfile_merge2.q in return path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11005) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : Regression on the latest master
[ https://issues.apache.org/jira/browse/HIVE-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11005: --- Assignee: Jesus Camacho Rodriguez > CBO: Calcite Operator To Hive Operator (Calcite Return Path) : Regression on > the latest master > -- > > Key: HIVE-11005 > URL: https://issues.apache.org/jira/browse/HIVE-11005 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > > Test cbo_join.q and cbo_views.q on return path failed. Part of the stack > trace is > {code} > 2015-06-15 09:51:53,377 ERROR [main]: parse.CalcitePlanner > (CalcitePlanner.java:genOPTree(282)) - CBO failed, skipping CBO. > java.lang.IndexOutOfBoundsException: index (0) must be less than size (0) > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305) > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284) > at > com.google.common.collect.EmptyImmutableList.get(EmptyImmutableList.java:80) > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveInsertExchange4JoinRule.onMatch(HiveInsertExchange4JoinRule.java:101) > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:326) > at > org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:515) > at > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392) > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:255) > at > org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125) > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207) > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:888) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:771) > at 
org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109) > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:876) > at > org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-10233: - Attachment: HIVE-10233-WIP-8.patch Upload WIP-8 patch for join only MM. > Hive on LLAP: Memory manager > > > Key: HIVE-10233 > URL: https://issues.apache.org/jira/browse/HIVE-10233 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: llap >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, > HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, > HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch > > > We need a memory manager in llap/tez to manage the usage of memory across > threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586234#comment-14586234 ] Hive QA commented on HIVE-10165: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12739615/HIVE-10165.7.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9085 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4268/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4268/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4268/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12739615 - PreCommit-HIVE-TRUNK-Build > Improve hive-hcatalog-streaming extensibility and support updates and deletes. > -- > > Key: HIVE-10165 > URL: https://issues.apache.org/jira/browse/HIVE-10165 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Elliot West >Assignee: Elliot West > Labels: streaming_api > Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, > HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, > mutate-system-overview.png > > > h3. 
Overview > I'd like to extend the > [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] > API so that it also supports the writing of record updates and deletes in > addition to the already supported inserts. > h3. Motivation > We have many Hadoop processes outside of Hive that merge changed facts into > existing datasets. Traditionally we achieve this by reading in a > ground-truth dataset and a modified dataset, grouping by a key, sorting by a > sequence and then applying a function to determine inserted, updated, and > deleted rows. However, in our current scheme we must rewrite all partitions > that may potentially contain changes. In practice the number of mutated > records is very small when compared with the records contained in a > partition. This approach results in a number of operational issues: > * Excessive amount of write activity required for small data changes. > * Downstream applications cannot robustly read these datasets while they are > being updated. > * Due to the scale of the updates (hundreds of partitions) the scope for > contention is high. > I believe we can address this problem by instead writing only the changed > records to a Hive transactional table. This should drastically reduce the > amount of data that we need to write and also provide a means for managing > concurrent access to the data. Our existing merge processes can read and > retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to > an updated form of the hive-hcatalog-streaming API which will then have the > required data to perform an update or insert in a transactional manner. > h3. Benefits > * Enables the creation of large-scale dataset merge processes > * Opens up Hive transactional functionality in an accessible manner to > processes that operate outside of Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
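The merge process sketched in the HIVE-10165 motivation — group by key, compare, classify each row as an insert, update, or delete — can be illustrated with a minimal example. This is a language-agnostic sketch of the classification step only (the function name is invented; the real processes run on Hadoop over partitioned datasets):

```python
def classify_changes(ground_truth, modified):
    """Compare a ground-truth dataset against a modified one, both given as
    dicts keyed by record id, and classify the differences. Only the
    returned records would need to be written back."""
    inserts = {k: v for k, v in modified.items() if k not in ground_truth}
    deletes = {k: v for k, v in ground_truth.items() if k not in modified}
    updates = {k: v for k, v in modified.items()
               if k in ground_truth and ground_truth[k] != v}
    return inserts, updates, deletes

base = {1: "a", 2: "b", 3: "c"}
new = {2: "b", 3: "x", 4: "d"}
ins, upd, dels = classify_changes(base, new)
assert ins == {4: "d"} and upd == {3: "x"} and dels == {1: "a"}
```

The point of the proposal is that only `ins`, `upd`, and `dels` get written to the transactional table, instead of rewriting every partition that might contain a change.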
[jira] [Updated] (HIVE-11004) PermGen OOM error in Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martin Benson updated HIVE-11004: - Summary: PermGen OOM error in Hiveserver2 (was: PermGen) > PermGen OOM error in Hiveserver2 > > > Key: HIVE-11004 > URL: https://issues.apache.org/jira/browse/HIVE-11004 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 > Environment: cdh 5.4 >Reporter: Martin Benson >Priority: Critical > > Periodically Hiveserver2 will become unresponsive and looking in the logs > there is the following error: > 2:28:22.965 PMERROR org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > Unexpected Exception > java.lang.OutOfMemoryError: PermGen space > 2:28:22.969 PMWARN > org.apache.hive.service.cli.thrift.ThriftCLIService > Error fetching results: > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > java.lang.RuntimeException: serious problem > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:343) > at > org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:250) > at > org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:656) > at > org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:672) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) > at > 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.RuntimeException: serious problem > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655) > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:338) > ... 13 more > Caused by: java.lang.RuntimeException: serious problem > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.waitForTasks(OrcInputFormat.java:478) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:944) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:969) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:362) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:294) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445) > ... 17 more > Caused by: java.lang.OutOfMemoryError: PermGen space > There does not appear to be an obvious trigger for this (other than the fact > that the error mentions ORC). If further details would be helpful in > diagnosing the issue please let me know and I'll supply them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586162#comment-14586162 ] Aihua Xu commented on HIVE-10972: - [~alangates] Seems you worked on the initial version? Can you also take a look at the change to see if it will cause any issue? > DummyTxnManager always locks the current database in shared mode, which is > incorrect. > - > > Key: HIVE-10972 > URL: https://issues.apache.org/jira/browse/HIVE-10972 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-10972.2.patch, HIVE-10972.patch > > > In DummyTxnManager [line 163 | > http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], > it always locks the current database. > That is not correct since the current database can be "db1", and the query > can be "select * from db2.tb1", which will lock db1 unnecessarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
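The fix HIVE-10972 calls for is to derive the database to lock from the table reference in the query rather than from the session's current database. A one-function sketch of that resolution (the function name is invented for illustration; the real change lives in DummyTxnManager's Java code):

```python
def database_to_lock(current_db, table_ref):
    """Return the database whose shared lock a read of table_ref needs.
    A qualified name like "db2.tb1" names its own database; an unqualified
    name falls back to the session's current database."""
    if "." in table_ref:
        return table_ref.split(".", 1)[0]
    return current_db

# "use db1; select * from db2.tb1" should lock db2, not the current db1.
assert database_to_lock("db1", "db2.tb1") == "db2"
assert database_to_lock("db1", "tb1") == "db1"
```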
[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586157#comment-14586157 ] Aihua Xu commented on HIVE-10972: - The tests are not related. > DummyTxnManager always locks the current database in shared mode, which is > incorrect. > - > > Key: HIVE-10972 > URL: https://issues.apache.org/jira/browse/HIVE-10972 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-10972.2.patch, HIVE-10972.patch > > > In DummyTxnManager [line 163 | > http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], > it always locks the current database. > That is not correct since the current database can be "db1", and the query > can be "select * from db2.tb1", which will lock db1 unnecessarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others
[ https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586098#comment-14586098 ] Chaoyu Tang commented on HIVE-7018: --- [~ychena] Looks like the HMS upgrade test failed, do you know the reason? > Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but > not others > - > > Key: HIVE-7018 > URL: https://issues.apache.org/jira/browse/HIVE-7018 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Assignee: Yongzhi Chen > Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch, HIVE-7018.3.patch, > HIVE-7018.4.patch > > > It appears that at least postgres and oracle do not have the LINK_TARGET_ID > column while mysql does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586087#comment-14586087 ] Aihua Xu commented on HIVE-10754: - [~mithun] Sorry for the late reply. Busy with something else. Seems it's hadoop version related issue. Would it be fair to update all the calls in HCatalog to use the new getInstance() since it's deprecated anyway? If you agree, I will use this jira to do that and I will update the title to reflect it. > Pig+Hcatalog doesn't work properly since we need to clone the Job instance in > HCatLoader > > > Key: HIVE-10754 > URL: https://issues.apache.org/jira/browse/HIVE-10754 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-10754.patch > > > {noformat} > Create table tbl1 (key string, value string) stored as rcfile; > Create table tbl2 (key string, value string); > insert into tbl1 values( '1', '111'); > insert into tbl2 values('1', '2'); > {noformat} > Pig script: > {noformat} > src_tbl1 = FILTER tbl1 BY (key == '1'); > prj_tbl1 = FOREACH src_tbl1 GENERATE >key as tbl1_key, >value as tbl1_value, >'333' as tbl1_v1; > > src_tbl2 = FILTER tbl2 BY (key == '1'); > prj_tbl2 = FOREACH src_tbl2 GENERATE >key as tbl2_key, >value as tbl2_value; > > dump prj_tbl1; > dump prj_tbl2; > result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key); > prj_result = FOREACH result > GENERATE prj_tbl1::tbl1_key AS key1, > prj_tbl1::tbl1_value AS value1, > prj_tbl1::tbl1_v1 AS v1, > prj_tbl2::tbl2_key AS key2, > prj_tbl2::tbl2_value AS value2; > > dump prj_result; > {noformat} > The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2). We > need to clone the job instance in HCatLoader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
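The HIVE-10754 bug above is a classic shared-mutable-state problem: two HCatLoader instances share one Job/Configuration, so the second load's projection clobbers the first. A minimal Python sketch of the failure mode and the fix — cloning per loader, the analogue of creating a fresh Job instance. The dict-based "configuration" is an assumption for illustration, not HCatLoader's actual API:

```python
import copy

shared_conf = {"projection": None}

# Buggy: both loads alias the same configuration object, so setting tbl2's
# projection overwrites tbl1's -- the (1,2,333,...) instead of (1,111,333,...)
# symptom from the issue description.
load_tbl1 = shared_conf
load_tbl2 = shared_conf
load_tbl1["projection"] = ["tbl1_key", "tbl1_value", "tbl1_v1"]
load_tbl2["projection"] = ["tbl2_key", "tbl2_value"]
assert load_tbl1["projection"] == ["tbl2_key", "tbl2_value"]  # tbl1 corrupted

# Fixed: each load clones the configuration first (the analogue of cloning
# the Job instance in HCatLoader), so the projections stay independent.
fixed_tbl1 = copy.deepcopy(shared_conf)
fixed_tbl2 = copy.deepcopy(shared_conf)
fixed_tbl1["projection"] = ["tbl1_key", "tbl1_value", "tbl1_v1"]
fixed_tbl2["projection"] = ["tbl2_key", "tbl2_value"]
assert fixed_tbl1["projection"] == ["tbl1_key", "tbl1_value", "tbl1_v1"]
```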
[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliot West updated HIVE-10165: --- Attachment: HIVE-10165.7.patch > Improve hive-hcatalog-streaming extensibility and support updates and deletes. > -- > > Key: HIVE-10165 > URL: https://issues.apache.org/jira/browse/HIVE-10165 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Elliot West >Assignee: Elliot West > Labels: streaming_api > Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, > HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, > mutate-system-overview.png > > > h3. Overview > I'd like to extend the > [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] > API so that it also supports the writing of record updates and deletes in > addition to the already supported inserts. > h3. Motivation > We have many Hadoop processes outside of Hive that merge changed facts into > existing datasets. Traditionally we achieve this by: reading in a > ground-truth dataset and a modified dataset, grouping by a key, sorting by a > sequence and then applying a function to determine inserted, updated, and > deleted rows. However, in our current scheme we must rewrite all partitions > that may potentially contain changes. In practice the number of mutated > records is very small when compared with the records contained in a > partition. This approach results in a number of operational issues: > * Excessive amount of write activity required for small data changes. > * Downstream applications cannot robustly read these datasets while they are > being updated. > * Due to the scale of the updates (hundreds of partitions) the scope for > contention is high. > I believe we can address this problem by instead writing only the changed > records to a Hive transactional table. 
This should drastically reduce the > amount of data that we need to write and also provide a means for managing > concurrent access to the data. Our existing merge processes can read and > retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to > an updated form of the hive-hcatalog-streaming API which will then have the > required data to perform an update or insert in a transactional manner. > h3. Benefits > * Enables the creation of large-scale dataset merge processes > * Opens up Hive transactional functionality in an accessible manner to > processes that operate outside of Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
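The merge step described above (group by key, then classify each row as inserted, updated, or deleted) can be sketched without any Hive dependency. The sketch below is a hedged illustration of the classification rule only, ignoring the sequence-ordering step and keying each snapshot by a single id; all names are hypothetical, and this is not the proposed streaming API:

```java
import java.util.Map;
import java.util.TreeMap;

public class MutationClassifier {
    enum Op { INSERT, UPDATE, DELETE }

    // Compare a ground-truth snapshot with a modified snapshot, both keyed
    // by record id, and emit an operation for every changed key only.
    static Map<String, Op> classify(Map<String, String> groundTruth,
                                    Map<String, String> modified) {
        Map<String, Op> ops = new TreeMap<>();
        for (Map.Entry<String, String> e : modified.entrySet()) {
            String old = groundTruth.get(e.getKey());
            if (old == null) {
                ops.put(e.getKey(), Op.INSERT);   // key absent before
            } else if (!old.equals(e.getValue())) {
                ops.put(e.getKey(), Op.UPDATE);   // key present, value changed
            }                                      // unchanged rows emit nothing
        }
        for (String key : groundTruth.keySet()) {
            if (!modified.containsKey(key)) {
                ops.put(key, Op.DELETE);          // key vanished
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        Map<String, String> truth = Map.of("1", "a", "2", "b");
        Map<String, String> changed = Map.of("2", "b2", "3", "c");
        System.out.println(classify(truth, changed)); // {1=DELETE, 2=UPDATE, 3=INSERT}
    }
}
```

In the proposal, only these classified rows (each carrying its retained {{ROW_ID}}/{{RecordIdentifier}}) would be handed to the extended streaming API, rather than rewriting every partition that might contain a change.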
[jira] [Commented] (HIVE-10989) HoS can't control number of map tasks for runtime skew join [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585889#comment-14585889 ] Xuefu Zhang commented on HIVE-10989: Makes sense. +1 > HoS can't control number of map tasks for runtime skew join [Spark Branch] > -- > > Key: HIVE-10989 > URL: https://issues.apache.org/jira/browse/HIVE-10989 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-10989.1-spark.patch > > > Flags {{hive.skewjoin.mapjoin.map.tasks}} and > {{hive.skewjoin.mapjoin.min.split}} are used to control the number of map > tasks for the map join of runtime skew join. They work well for MR but have > no effect for spark. > This makes runtime skew join less useful, i.e. we just end up with slow > mappers instead of reducers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585874#comment-14585874 ] Damien Carol commented on HIVE-6500: Please ignore my last comment > Stats collection via filesystem > --- > > Key: HIVE-6500 > URL: https://issues.apache.org/jira/browse/HIVE-6500 > Project: Hive > Issue Type: New Feature > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 0.13.0 > > Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch > > > Recently, support for stats gathering via counters was [added | > https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the > following issues: > * [Length of counter group name is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] > * [Length of counter name is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] > * [Number of distinct counter groups is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] > * [Number of distinct counters is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] > Although these limits are configurable, setting them to higher values > implies increased memory load on the AM and the job history server. 
> Now, whether these limits make sense is [debatable | > https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that > Hive not make use of the framework's counter features, so that we can evolve > this feature without relying on framework support. Filesystem-based counter > collection is a step in that direction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10542) Full outer joins in tez produce incorrect results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585725#comment-14585725 ] Goun Na commented on HIVE-10542: Is there no patch available for Hive 1.1? > Full outer joins in tez produce incorrect results in certain cases > -- > > Key: HIVE-10542 > URL: https://issues.apache.org/jira/browse/HIVE-10542 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K >Priority: Blocker > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10542.1.patch, HIVE-10542.2.patch, > HIVE-10542.3.patch, HIVE-10542.4.patch, HIVE-10542.5.patch, > HIVE-10542.6.patch, HIVE-10542.7.patch, HIVE-10542.8.patch, HIVE-10542.9.patch > > > If there are no records for one of the tables in the full outer join, we do > not read the other input and end up not producing rows that we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
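As a reminder of the semantics this bug violates: a full outer join must emit every unmatched row from both sides, so even when one input is empty, all rows of the other side still appear (null-padded on the missing side). A tiny, engine-free Java sketch of that contract; the helper is hypothetical and is not Hive or Tez code:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class FullOuterJoinDemo {
    // Minimal full outer join of two key->value maps; a null slot in the
    // result pair marks "no matching row on that side".
    static Map<String, String[]> fullOuterJoin(Map<String, String> left,
                                               Map<String, String> right) {
        Set<String> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());            // union of keys from BOTH inputs
        Map<String, String[]> out = new TreeMap<>();
        for (String k : keys) {
            out.put(k, new String[] { left.get(k), right.get(k) });
        }
        return out;
    }

    public static void main(String[] args) {
        // One side empty: a correct full outer join still yields every row of
        // the non-empty side -- exactly the rows the Tez path was dropping.
        Map<String, String> left = Map.of("1", "x", "2", "y");
        Map<String, String> right = Map.of();
        System.out.println(fullOuterJoin(left, right).size()); // 2, not 0
    }
}
```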
[jira] [Commented] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585683#comment-14585683 ] Damien Carol commented on HIVE-6500: [~ashutoshc] Did you miss the property *hive.stats.tmp.loc* in _common/src/java/org/apache/hadoop/hive/conf/HiveConf.java_ ? > Stats collection via filesystem > --- > > Key: HIVE-6500 > URL: https://issues.apache.org/jira/browse/HIVE-6500 > Project: Hive > Issue Type: New Feature > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 0.13.0 > > Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch > > > Recently, support for stats gathering via counters was [added | > https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the > following issues: > * [Length of counter group name is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] > * [Length of counter name is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] > * [Number of distinct counter groups is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] > * [Number of distinct counters is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] > Although these limits are configurable, setting them to higher values > implies increased memory load on the AM and the job history server. 
> Now, whether these limits make sense is [debatable | > https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that > Hive not make use of the framework's counter features, so that we can evolve > this feature without relying on framework support. Filesystem-based counter > collection is a step in that direction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585678#comment-14585678 ] Damien Carol commented on HIVE-6500: [~leftylev] This JIRA added a new property, *hive.stats.tmp.loc*, that is NOT documented. Also, this property is not added to "hive-default.xml" either. > Stats collection via filesystem > --- > > Key: HIVE-6500 > URL: https://issues.apache.org/jira/browse/HIVE-6500 > Project: Hive > Issue Type: New Feature > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 0.13.0 > > Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch > > > Recently, support for stats gathering via counters was [added | > https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the > following issues: > * [Length of counter group name is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] > * [Length of counter name is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] > * [Number of distinct counter groups is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] > * [Number of distinct counters is limited | > https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] > Although these limits are configurable, setting them to higher values > implies increased memory load on the AM and the job history server. > Now, whether these limits make sense is [debatable | > https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that > Hive not make use of the framework's counter features, so that we can evolve > this feature without relying on framework support. Filesystem-based counter > collection is a step in that direction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)