[jira] [Updated] (HIVE-2340) optimize orderby followed by a groupby
[ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2340:
------------------------------
    Attachment: HIVE-2340.D1209.8.patch

navis updated the revision "HIVE-2340 [jira] optimize orderby followed by a groupby".
Reviewers: JIRA

  1. Fixed a bug in JOIN-RS cases and prevented a deduped join from being converted to a skew-join
  2. Do not try merging an RS with a defined number of reducers (ORDER BY, etc.)
  3. Fixed test results (rolled back some tests)

Running test

REVISION DETAIL
  https://reviews.facebook.net/D1209

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java
  ql/src/test/queries/clientpositive/auto_join26.q
  ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
  ql/src/test/results/clientpositive/auto_join26.q.out
  ql/src/test/results/clientpositive/cluster.q.out
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out
  ql/src/test/results/clientpositive/groupby_cube1.q.out
  ql/src/test/results/clientpositive/groupby_rollup1.q.out
  ql/src/test/results/clientpositive/index_bitmap3.q.out
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out
  ql/src/test/results/clientpositive/ppd2.q.out
  ql/src/test/results/clientpositive/ppd_gby_join.q.out
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out
  ql/src/test/results/clientpositive/semijoin.q.out
  ql/src/test/results/clientpositive/union24.q.out
  ql/src/test/results/compiler/plan/join2.q.xml

To: JIRA, navis

optimize orderby followed by a groupby
--------------------------------------
    Key: HIVE-2340
    URL: https://issues.apache.org/jira/browse/HIVE-2340
    Project: Hive
    Issue Type: Sub-task
    Components: Query Processor
    Reporter: Navis
    Assignee: Navis
    Priority: Minor
    Labels: perfomance
    Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch

Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY optimizer (cluster-by following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-3935) Extra new line character in output when sequence file is used for storage of a table
Abhinav Chawade created HIVE-3935:
-------------------------------------
    Summary: Extra new line character in output when sequence file is used for storage of a table
    Key: HIVE-3935
    URL: https://issues.apache.org/jira/browse/HIVE-3935
    Project: Hive
    Issue Type: Bug
    Components: Query Processor
    Affects Versions: 0.9.0, 0.10.0
    Environment: Centos 6.3
    Reporter: Abhinav Chawade

When a select distinct command is issued on an empty table that uses a sequence file for storage, an extra new line (0x0a) is present in the result set even though the table has no data. This output is not consistent with the result of the same command on Hive 0.7.1 and can cause workflows to fail due to a wrong record count.

Execution on Hive 0.9 and 0.10

    hive> create table hoge2(col1 string,col2 string) partitioned by (p_part string) stored as sequencefile;
    hive> describe hoge2;
    OK
    col1    string
    col2    string
    p_part  string
    Time taken: 0.24 seconds
    hive> select distinct p_part from hoge2;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=number
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=number
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=number
    Starting Job = job_201301230112_0001, Tracking URL = http://testcluster2-1:50030/jobdetails.jsp?jobid=job_201301230112_0001
    Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=maprfs:/// -kill job_201301230112_0001
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
    2013-01-23 02:50:16,843 Stage-1 map = 0%, reduce = 0%
    2013-01-23 02:50:26,897 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:27,905 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:28,911 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:29,919 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:30,925 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:31,933 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:32,939 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
    2013-01-23 02:50:33,945 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.8 sec
    MapReduce Total cumulative CPU time: 1 seconds 800 msec
    Ended Job = job_201301230112_0001
    MapReduce Jobs Launched:
    Job 0: Map: 1 Reduce: 1 Cumulative CPU: 1.8 sec MAPRFS Read: 327 MAPRFS Write: 71 SUCCESS
    Total MapReduce CPU Time Spent: 1 seconds 800 msec
    OK
    Time taken: 21.94 seconds

Result on Hive 0.7.1

    hive> select count(distinct p_part) from hoge3;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=number
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=number
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=number
    Starting Job = job_201210261659_0019, Tracking URL = http://testcluster1-1:50030/jobdetails.jsp?jobid=job_201210261659_0019
    Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=maprfs:/// -kill job_201210261659_0019
    2013-01-23 21:42:01,787 Stage-1 map = 0%, reduce = 0%
    2013-01-23 21:42:07,815 Stage-1 map = 100%, reduce = 0%
    2013-01-23 21:42:12,835 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201210261659_0019
    OK
    0
    Time taken: 16.637 seconds

Underlying Hadoop version for Hive 0.9 is Hadoop 1.0.3 and for Hive 0.7 it is 0.20.203
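The report says a stray 0x0a in the result set can throw off a workflow's record count. As a rough illustration of that failure mode on the consuming side, here is a minimal Python sketch. The function names and the idea of counting lines of captured CLI output are assumptions made for this example; they are not part of Hive or of the reporter's workflow.

```python
def naive_count(cli_output: str) -> int:
    """Count every output line as one record (what a simple line count does)."""
    return len(cli_output.splitlines())


def count_records(cli_output: str) -> int:
    """Count result rows, ignoring lines that are empty or whitespace-only."""
    return sum(1 for line in cli_output.splitlines() if line.strip())


# Hypothetical captured result sets: an empty table on an affected
# version (a lone newline) versus a truly empty result.
affected_output = "\n"
expected_output = ""
```

With these inputs the naive count reports one record for the affected output and zero for the expected one, while the filtered count reports zero for both. Note that skipping blank lines would also drop a legitimate row whose only column is an empty string, so this is a workaround to weigh, not a fix for the underlying issue.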
[jira] [Updated] (HIVE-3935) New line character in output when sequence file is used for storage and table is empty
[ https://issues.apache.org/jira/browse/HIVE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhinav Chawade updated HIVE-3935:
----------------------------------
    Summary: New line character in output when sequence file is used for storage and table is empty  (was: Extra new line character in output when sequence file is used for storage of a table)
[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561532#comment-13561532 ]

Namit Jain commented on HIVE-3784:
----------------------------------

I got the above query working. The latest patch has the changes. Will start cleaning up, and fixing the patch.

de-emphasize mapjoin hint
-------------------------
    Key: HIVE-3784
    URL: https://issues.apache.org/jira/browse/HIVE-3784
    Project: Hive
    Issue Type: Improvement
    Components: Query Processor
    Reporter: Namit Jain
    Assignee: Namit Jain
    Attachments: hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch

hive.auto.convert.join has been around for a long time, and is pretty stable. When mapjoin hint was created, the above parameter did not exist. The only reason for the user to specify a mapjoin currently is if they want it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin. Eventually, that should also go away, but that may take some time to stabilize.

There are many rules in SemanticAnalyzer to handle the following trees:

    ReduceSink -> MapJoin
    Union      -> MapJoin
    MapJoin    -> MapJoin

This should not be supported anymore. In any of the above scenarios, the user can get the mapjoin behavior by setting hive.auto.convert.join to true and not specifying the hint. This will simplify the code a lot.

What does everyone think?
[jira] [Commented] (HIVE-3672) Support altering partition column type in Hive
[ https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561538#comment-13561538 ]

Namit Jain commented on HIVE-3672:
----------------------------------

[~jingweilu], is it ready for review? Please refresh, and mark 'Submit Patch' if it is ready for review.

Support altering partition column type in Hive
----------------------------------------------
    Key: HIVE-3672
    URL: https://issues.apache.org/jira/browse/HIVE-3672
    Project: Hive
    Issue Type: Improvement
    Components: CLI, SQL
    Affects Versions: 0.10.0
    Reporter: Jingwei Lu
    Assignee: Jingwei Lu
    Attachments: HIVE-3672.1.patch.txt, HIVE-3672.2.patch.txt
    Original Estimate: 72h
    Remaining Estimate: 72h

Currently, Hive does not allow altering partition column types. As we've discouraged users from using non-string partition column types, this presents a problem for users who want to change their partition columns to be strings: they have to rename their table, create a new table, and copy all the data over.

To support this via the CLI, add a command like

    ALTER TABLE table_name PARTITION COLUMN (column_name new type);
[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3784:
-----------------------------
    Attachment: hive.3784.10.patch
[jira] [Created] (HIVE-3936) Remote debug failed with hadoop 0.23X, hadoop 2.X
Xie Long created HIVE-3936:
------------------------------
    Summary: Remote debug failed with hadoop 0.23X, hadoop 2.X
    Key: HIVE-3936
    URL: https://issues.apache.org/jira/browse/HIVE-3936
    Project: Hive
    Issue Type: Bug
    Affects Versions: 0.9.0, 0.8.1, 0.8.0
    Reporter: Xie Long
    Priority: Minor

In both $HIVE_HOME/bin/hive and $HADOOP_HOME/bin/hadoop, $HADOOP_CLIENT_OPTS is appended to $HADOOP_OPTS, so the debug options end up on the java command line twice, which leads to the problem.

    hive --debug
    ERROR: Cannot load this JVM TI agent twice, check your java command line for duplicate jdwp options.
    Error occurred during initialization of VM
    agent library failed to init: jdwp
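To make the double-append concrete, the following Python sketch models the two launcher scripts as string concatenation and counts how many jdwp agent arguments result. It is only a model under stated assumptions: the helper names are made up for this example, and the DEBUG_OPTS value is a hypothetical stand-in, since the report does not show the exact options that `hive --debug` sets.

```python
import shlex


def append_client_opts(hadoop_opts: str, client_opts: str) -> str:
    """Mimic a launcher script doing HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"."""
    return f"{hadoop_opts} {client_opts}".strip()


def count_jdwp_agents(opts: str) -> int:
    """Count JVM arguments that would load the jdwp agent."""
    return sum(
        1
        for arg in shlex.split(opts)
        if arg.startswith("-agentlib:jdwp") or arg.startswith("-Xrunjdwp")
    )


def dedupe_opts(opts: str) -> str:
    """Drop repeated arguments, keeping the first occurrence of each."""
    seen = set()
    kept = []
    for arg in shlex.split(opts):
        if arg not in seen:
            seen.add(arg)
            kept.append(arg)
    return shlex.join(kept)


# Hypothetical debug options; the real value may differ.
DEBUG_OPTS = "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"

# First append happens in the hive script, the second in the hadoop script.
after_hive_script = append_client_opts("", DEBUG_OPTS)
after_hadoop_script = append_client_opts(after_hive_script, DEBUG_OPTS)
```

After the second append the options string carries two jdwp arguments, which matches the "Cannot load this JVM TI agent twice" error. Deduplicating, as dedupe_opts does, is one possible direction; appending in only one of the two scripts would be another. Which approach an actual fix takes is not stated in the report.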
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #271
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/

------------------------------------------
[...truncated 5482 lines...]
     [echo] Project: hbase-handler

create-dirs:
     [echo] Project: pdk
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/pdk
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/pdk/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/pdk/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/pdk/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/pdk/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/pdk/test/resources
     [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/pdk/src/test/resources does not exist.

init:
     [echo] Project: pdk

create-dirs:
     [echo] Project: builtins
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/builtins
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/builtins/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/builtins/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/builtins/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/builtins/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/builtins/test/resources
     [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/builtins/src/test/resources does not exist.

init:
     [echo] Project: builtins

jar:
     [echo] Project: hive

create-dirs:
     [echo] Project: shims
     [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/shims/src/test/resources does not exist.

init:
     [echo] Project: shims

ivy-init-settings:
     [echo] Project: shims

ivy-resolve:
     [echo] Project: shims
[ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/ivy/ivysettings.xml
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/zookeeper/zookeeper/3.4.3/zookeeper-3.4.3.jar ...
[ivy:resolve] .. (749kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] [SUCCESSFUL ] org.apache.zookeeper#zookeeper;3.4.3!zookeeper.jar (38ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.7.0/libthrift-0.7.0.jar ...
[ivy:resolve] ... (294kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] [SUCCESSFUL ] org.apache.thrift#libthrift;0.7.0!libthrift.jar (29ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/commons-logging/commons-logging/1.0.4/commons-logging-1.0.4.jar ...
[ivy:resolve] ... (37kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] [SUCCESSFUL ] commons-logging#commons-logging;1.0.4!commons-logging.jar (13ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/commons-logging/commons-logging-api/1.0.4/commons-logging-api-1.0.4.jar ...
[ivy:resolve] .. (25kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] [SUCCESSFUL ] commons-logging#commons-logging-api;1.0.4!commons-logging-api.jar (15ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/codehaus/jackson/jackson-core-asl/1.8.8/jackson-core-asl-1.8.8.jar ...
[ivy:resolve] .. (222kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] [SUCCESSFUL ] org.codehaus.jackson#jackson-core-asl;1.8.8!jackson-core-asl.jar (24ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/codehaus/jackson/jackson-mapper-asl/1.8.8/jackson-mapper-asl-1.8.8.jar ...
[ivy:resolve] (652kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] [SUCCESSFUL ] org.codehaus.jackson#jackson-mapper-asl;1.8.8!jackson-mapper-asl.jar (36ms)
[ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/271/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
     [echo] Project: shims

compile:
     [echo] Project: shims
     [echo] Building shims 0.20

build_shims:
     [echo] Project: shims
     [echo] Compiling
[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3784:
-----------------------------
    Attachment: hive.3784.11.patch
[jira] [Commented] (HIVE-3918) Normalize more CRLF line endings
[ https://issues.apache.org/jira/browse/HIVE-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561767#comment-13561767 ]

Brock Noland commented on HIVE-3918:
------------------------------------

+1, this causes problems with git.

Normalize more CRLF line endings
--------------------------------
    Key: HIVE-3918
    URL: https://issues.apache.org/jira/browse/HIVE-3918
    Project: Hive
    Issue Type: Bug
    Reporter: Mark Grover
    Assignee: Mark Grover
    Attachments: HIVE-3918.1.patch

While I tried to fix files that had incompatible line endings in HIVE-3858, I missed some more files. This was most likely because I tested HIVE-3858 on Mac OS X; when I started using the GitHub repository on an Ubuntu box, I saw that some more files in the Hive repo have incorrect line endings. These files get added to the index automatically when a change is made to the local git repo. Once these cosmetic changes are committed, we will have gotten rid of all Windows/old Mac OS style line endings in our Hive repository.
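For readers unfamiliar with what normalizing line endings involves, here is a small Python sketch of the transformation itself. It is not the method used in the HIVE-3918 patch, which the message does not describe; the function names are made up for illustration.

```python
def has_non_lf_endings(data: bytes) -> bool:
    """Report whether the content contains CRLF or lone CR line endings."""
    return b"\r" in data


def normalize_line_endings(data: bytes) -> bytes:
    """Convert CRLF (Windows) and lone CR (old Mac OS) line endings to LF."""
    # Replace CRLF first so the CR in each pair is not converted separately.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical file content mixing all three line-ending styles.
mixed = b"first\r\nsecond\rthird\n"
normalized = normalize_line_endings(mixed)
```

The order of the two replacements matters: converting lone CR first would turn each CRLF into two LF characters and introduce blank lines.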
[jira] [Updated] (HIVE-3833) object inspectors should be initialized based on partition metadata
[ https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-3833:
-----------------------------------
    Resolution: Fixed
    Fix Version/s: 0.11.0
    Release Note: Rows in partitions are now read using the partition schema and then made to comply with the table schema, instead of being read directly using the table schema.
    Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Namit!

object inspectors should be initialized based on partition metadata
-------------------------------------------------------------------
    Key: HIVE-3833
    URL: https://issues.apache.org/jira/browse/HIVE-3833
    Project: Hive
    Issue Type: Improvement
    Components: Query Processor
    Reporter: Namit Jain
    Assignee: Namit Jain
    Fix For: 0.11.0
    Attachments: hive.3833.10.patch, hive.3833.11.patch, hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch

Currently, different partitions can be picked up for the same input split based on the serdes, etc., and we don't allow changing the schema for LazyColumnarBinarySerDe. Instead, different partitions should be part of the same split only if the partition schemas exactly match. The operator tree object inspectors should be based on the partition schema. That would give greater flexibility and also help with using the binary serde with rcfile.
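The release note describes a two-step read: deserialize each row with the partition's own schema, then make it comply with the table schema. The Python sketch below is a toy model of that idea only. It does not use or mirror Hive's ObjectInspector or SerDe classes; the schema representation, the cast table, and the rule that columns missing from the partition become None are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical schema representation: ordered (column name, type name) pairs.
Schema = List[Tuple[str, str]]

# Hypothetical casts for three primitive types.
CASTS: Dict[str, Callable[[Any], Any]] = {"string": str, "int": int, "double": float}


def read_with_partition_schema(raw_fields: List[str], partition_schema: Schema) -> Dict[str, Any]:
    """Deserialize one stored row using the schema the partition was written with."""
    return {
        name: CASTS[type_name](value)
        for (name, type_name), value in zip(partition_schema, raw_fields)
    }


def conform_to_table_schema(row: Dict[str, Any], table_schema: Schema) -> List[Optional[Any]]:
    """Reorder and cast a partition-schema row so it matches the table schema."""
    result: List[Optional[Any]] = []
    for name, type_name in table_schema:
        value = row.get(name)
        # Columns absent from the partition schema are filled with None.
        result.append(None if value is None else CASTS[type_name](value))
    return result


# An old partition written before the table gained a column and changed a type.
partition_schema: Schema = [("id", "string"), ("name", "string")]
table_schema: Schema = [("id", "int"), ("name", "string"), ("score", "double")]

stored_row = read_with_partition_schema(["7", "ann"], partition_schema)
table_row = conform_to_table_schema(stored_row, table_schema)
```

Here the stored row is first interpreted as two strings, and only afterwards is `id` cast to an integer and the missing `score` column filled in, which is the ordering the release note calls out.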
[jira] [Resolved] (HIVE-3913) Possible deadlock in ZK lock manager
[ https://issues.apache.org/jira/browse/HIVE-3913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan resolved HIVE-3913.
------------------------------------
    Resolution: Fixed
    Fix Version/s: 0.11.0

Committed to trunk. Thanks, Mikhail!

Possible deadlock in ZK lock manager
------------------------------------
    Key: HIVE-3913
    URL: https://issues.apache.org/jira/browse/HIVE-3913
    Project: Hive
    Issue Type: Bug
    Reporter: Mikhail Bautin
    Assignee: Mikhail Bautin
    Priority: Critical
    Fix For: 0.11.0
    Attachments: D8097.1.patch

ZK Hive lock manager can get into a state when the connection is closed, but no reconnection is attempted.
[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata
[ https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562009#comment-13562009 ]

Hudson commented on HIVE-3833:
------------------------------

Integrated in hive-trunk-hadoop1 #41 (See [https://builds.apache.org/job/hive-trunk-hadoop1/41/])
    HIVE-3833 : object inspectors should be initialized based on partition metadata (Namit Jain via Ashutosh Chauhan) (Revision 1438111)

     Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1438111
Files :
* /hive/trunk/common/src/java/org/apache/hadoop/hive/common/ObjectPair.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/util/ObjectPair.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestPartition.java
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat10.q
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat11.q
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat12.q
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat13.q
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat14.q
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat8.q
* /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat9.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin10.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin11.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin13.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/columnstats_partlvl.q.out
* /hive/trunk/ql/src/test/results/clientpositive/combine2_hadoop20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/filter_join_breaktask.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_map_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out
* /hive/trunk/ql/src/test/results/clientpositive/groupby_sort_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/input23.q.out
* /hive/trunk/ql/src/test/results/clientpositive/input42.q.out
* /hive/trunk/ql/src/test/results/clientpositive/input_part1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/input_part2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/input_part7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/input_part9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/join26.q.out
* /hive/trunk/ql/src/test/results/clientpositive/join33.q.out
* /hive/trunk/ql/src/test/results/clientpositive/join9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/join_map_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/load_dyn_part8.q.out
*
[jira] [Commented] (HIVE-3913) Possible deadlock in ZK lock manager
[ https://issues.apache.org/jira/browse/HIVE-3913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562010#comment-13562010 ]

Hudson commented on HIVE-3913:
------------------------------

Integrated in hive-trunk-hadoop1 #41 (See [https://builds.apache.org/job/hive-trunk-hadoop1/41/])
    HIVE-3913 : Possible deadlock in ZK lock manager (Mikhail Bautin via Ashutosh Chauhan) (Revision 1438116)

     Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1438116
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
[jira] [Commented] (HIVE-3918) Normalize more CRLF line endings
[ https://issues.apache.org/jira/browse/HIVE-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562026#comment-13562026 ] Carl Steinbach commented on HIVE-3918: -- +1. Will commit if tests pass. Normalize more CRLF line endings Key: HIVE-3918 URL: https://issues.apache.org/jira/browse/HIVE-3918 Project: Hive Issue Type: Bug Reporter: Mark Grover Assignee: Mark Grover Attachments: HIVE-3918.1.patch While I tried to fix files that had incompatible line endings in HIVE-3858, I missed some more files. It was most likely because HIVE-3858 was tested by me on Mac OS X, but when I started using the GitHub repository on an Ubuntu box, I saw there are some more files in the Hive repo that have incorrect line endings. These files get added to the index automatically when a change is made to the local git repo. Once these cosmetic changes get committed, we will have gotten rid of all Windows/old Mac OS style line endings in our Hive repository.
[jira] [Commented] (HIVE-3931) Add Oracle metastore upgrade script for 0.9 to 10.0
[ https://issues.apache.org/jira/browse/HIVE-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562032#comment-13562032 ] Hudson commented on HIVE-3931: -- Integrated in Hive-trunk-h0.21 #1934 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1934/]) HIVE-3931. Add Oracle metastore upgrade script for 0.9 to 10.0 (Prasad Mujumdar via cws) (Revision 1437778) Result = ABORTED cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437778 Files : * /hive/trunk/metastore/scripts/upgrade/oracle/upgrade-0.9.0-to-0.10.0.oracle.sql Add Oracle metastore upgrade script for 0.9 to 10.0 --- Key: HIVE-3931 URL: https://issues.apache.org/jira/browse/HIVE-3931 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.11.0 Attachments: HIVE-3931-1.patch The top level Oracle metastore upgrade script for 0.9 to 0.10 is missing.
[jira] [Created] (HIVE-3937) Hive Profiler
Pamela Vagata created HIVE-3937: --- Summary: Hive Profiler Key: HIVE-3937 URL: https://issues.apache.org/jira/browse/HIVE-3937 Project: Hive Issue Type: New Feature Reporter: Pamela Vagata Priority: Minor Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
[jira] [Assigned] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pamela Vagata reassigned HIVE-3937: --- Assignee: Pamela Vagata
[jira] [Updated] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pamela Vagata updated HIVE-3937: Status: Patch Available (was: Open) https://reviews.facebook.net/D8157
[jira] [Updated] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pamela Vagata updated HIVE-3937: Attachment: HIVE-3937.1.patch.txt
[jira] [Commented] (HIVE-3903) Allow updating bucketing/sorting metadata of a partition through the CLI
[ https://issues.apache.org/jira/browse/HIVE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562168#comment-13562168 ] Kevin Wilfong commented on HIVE-3903: - +1 Allow updating bucketing/sorting metadata of a partition through the CLI Key: HIVE-3903 URL: https://issues.apache.org/jira/browse/HIVE-3903 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Samuel Yuan Attachments: HIVE-3903.1.patch.txt, HIVE-3903.2.patch.txt Right now users can update the bucketing/sorting metadata of a table through the CLI, but not a partition. Use case: Need to merge a partition's files, but it's bucketed/sorted, so want to mark the partition as unbucketed/unsorted.
[jira] [Created] (HIVE-3938) Hive MetaStore sends 1 AddPartitionEvent each for all partitions added with add_partitions().
Mithun Radhakrishnan created HIVE-3938: -- Summary: Hive MetaStore sends 1 AddPartitionEvent each for all partitions added with add_partitions(). Key: HIVE-3938 URL: https://issues.apache.org/jira/browse/HIVE-3938 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan HiveMetaStore::add_partitions() currently adds all partitions specified in one call using a single meta-store transaction. This acts correctly. However, there's one AddPartitionEvent created per partition specified. Ideally, the set of partitions added atomically can be communicated using a single AddPartitionEvent, such that they are consumed together. I'll post a patch that does this.
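The difference between the reported behaviour and the proposed one can be sketched roughly as below. This is a minimal Python illustration and not Hive code: the `AddPartitionEvent` dataclass and both functions are hypothetical stand-ins, and the metastore transaction itself is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AddPartitionEvent:
    """Hypothetical stand-in for a metastore listener event."""
    table: str
    partitions: Tuple[str, ...]


def events_per_partition(table: str, parts: List[str]) -> List[AddPartitionEvent]:
    # Behaviour described in the report: one event for each partition,
    # even though all partitions were added in one transaction.
    return [AddPartitionEvent(table, (p,)) for p in parts]


def single_event(table: str, parts: List[str]) -> List[AddPartitionEvent]:
    # Proposed behaviour: one event carrying the whole atomically added set,
    # so a listener can consume the partitions together.
    return [AddPartitionEvent(table, tuple(parts))] if parts else []
```

With three partitions, `events_per_partition` yields three events while `single_event` yields one event listing all three.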
[jira] [Created] (HIVE-3939) INSERT INTO behaves like INSERT OVERWRITE if the table name referred is not all lowercase
mohan dharmarajan created HIVE-3939: --- Summary: INSERT INTO behaves like INSERT OVERWRITE if the table name referred is not all lowercase Key: HIVE-3939 URL: https://issues.apache.org/jira/browse/HIVE-3939 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.9.0 Environment: Windows 2012, HDInsight Reporter: mohan dharmarajan

If the table name referred to in an INSERT INTO command is not all lowercase, the data is not appended but overwritten.

  set hive.exec.dynamic.partition.mode=nonstrict;
  set hive.exec.dynamic.partition=true;
  CREATE TABLE test (key int, value string) PARTITIONED BY (ds string);
  SELECT * FROM test;
  INSERT INTO TABLE test PARTITION (ds) SELECT key, value, value FROM src;
  SELECT * FROM test;

The following statement works as expected. The data from src is appended to test.

  SELECT * FROM test;
  INSERT INTO TABLE test PARTITION (ds) SELECT key, value, value FROM src;
  SELECT * FROM test;

The following is copied from the processing log:

  Loading data to table default.test partition (ds=null)
  Loading partition {ds=1}
  Loading partition {ds=2}

The following statement does not work. Note that the table name is referred to as Test (not test). INSERT INTO behaves like INSERT OVERWRITE.

  SELECT * FROM test;
  INSERT INTO TABLE Test PARTITION (ds) SELECT key, value, value FROM src;
  SELECT * FROM test;

The following is copied from the processing log:

  Loading data to table default.test partition (ds=null)
  Moved to trash: hdfs://localhost:8020/hive/warehouse/test/ds=1
  Moved to trash: hdfs://localhost:8020/hive/warehouse/test/ds=2
  Loading partition {ds=1}
  Loading partition {ds=2}
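The report does not say what causes this. One way a symptom like it could arise, shown here purely as an assumption and not taken from the Hive source, is a case-sensitive lookup: the name is remembered as typed in the query (`Test`) but checked later under the resolved lowercase name (`test`). The function `write_mode` and its parameters are hypothetical.

```python
def write_mode(name_in_query: str, resolved_name: str, normalize: bool) -> str:
    """Return 'append' or 'overwrite' for an INSERT INTO statement.

    Hypothetical model: the planner remembers which tables were named in
    INSERT INTO, then later asks whether the resolved table is in that set.
    """
    remembered = name_in_query.lower() if normalize else name_in_query
    insert_into_tables = {remembered}

    key = resolved_name.lower() if normalize else resolved_name
    # A miss here means the write is treated like INSERT OVERWRITE.
    return "append" if key in insert_into_tables else "overwrite"
```

Under this model, `write_mode("Test", "test", normalize=False)` gives `"overwrite"`, matching the symptom, while normalizing both sides to lowercase gives `"append"`.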
[jira] [Updated] (HIVE-3938) Hive MetaStore sends 1 AddPartitionEvent each for all partitions added with add_partitions().
[ https://issues.apache.org/jira/browse/HIVE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-3938: --- Attachment: HIVE-3938.patch This patch refactors add_partition_core_notxn() such that a single event is sent for an atomic partition-set.
[jira] [Updated] (HIVE-3938) Hive MetaStore sends 1 AddPartitionEvent each for all partitions added with add_partitions().
[ https://issues.apache.org/jira/browse/HIVE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-3938: --- Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-3938) Hive MetaStore should send a single AddPartitionEvent for atomically added partition-set.
[ https://issues.apache.org/jira/browse/HIVE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-3938: --- Summary: Hive MetaStore should send a single AddPartitionEvent for atomically added partition-set. (was: Hive MetaStore sends 1 AddPartitionEvent each for all partitions added with add_partitions().)
[jira] [Created] (HIVE-3940) Track columns accessed in each table in a query
Samuel Yuan created HIVE-3940: - Summary: Track columns accessed in each table in a query Key: HIVE-3940 URL: https://issues.apache.org/jira/browse/HIVE-3940 Project: Hive Issue Type: Task Components: Query Processor Reporter: Samuel Yuan Assignee: Samuel Yuan Priority: Minor Similar to partition access logs, we need column access logs, so that we can later build tools/reports to inform users if a table has unused columns that could be trimmed.
[jira] [Updated] (HIVE-3932) Hive release tarballs don't contain PostGreSQL metastore scripts
[ https://issues.apache.org/jira/browse/HIVE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-3932: -- Attachment: HIVE-3932.1.patch Removes the includes.postgres property so PostGreSQL metastore scripts are included by default (like mysql, derby and oracle). This is essentially a revert of HIVE-2552. Testing done: 1. Verified PostGreSQL scripts were in fact being included under {{build/dist/scripts/metastore/upgrade}} 2. Verified that the {{vcs.excludes}} property is still being honored. Hive release tarballs don't contain PostGreSQL metastore scripts Key: HIVE-3932 URL: https://issues.apache.org/jira/browse/HIVE-3932 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-3932.1.patch By means of HIVE-2552, we decided to not include Hive metastore upgrade scripts for PostGreSQL in Hive release tarballs. This was done primarily because the scripts were incomplete at that time. However, since then, starting at least 0.10.0 (maybe 0.9?) PostGreSQL scripts are now being maintained and any changes to the metastore are reflected to them (HIVE-2529). Consequently, there is no reason to not release PostGreSQL scripts. This JIRA plans to disable/remove/ignore the include.postgres property in build.xml that enables the exclusion of PostGreSQL scripts.
[jira] [Updated] (HIVE-3932) Hive release tarballs don't contain PostgreSQL metastore scripts
[ https://issues.apache.org/jira/browse/HIVE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-3932: -- Summary: Hive release tarballs don't contain PostgreSQL metastore scripts (was: Hive release tarballs don't contain PostGreSQL metastore scripts)
[jira] [Updated] (HIVE-3932) Hive release tarballs don't contain PostgreSQL metastore scripts
[ https://issues.apache.org/jira/browse/HIVE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-3932: -- Description: By means of HIVE-2552, we decided to not include Hive metastore upgrade scripts for PostgreSQL in Hive release tarballs. This was done primarily because the scripts were incomplete at that time. However, since then, starting at least 0.10.0 (maybe 0.9?) PostgreSQL scripts are now being maintained and any changes to the metastore are reflected to them (HIVE-2529). Consequently, there is no reason to not release PostgreSQL scripts. This JIRA plans to disable/remove/ignore the include.postgres property in build.xml that enables the exclusion of PostgreSQL scripts. (was: By means of HIVE-2552, we decided to not include Hive metastore upgrade scripts for PostGreSQL in Hive release tarballs. This was done primarily because the scripts were incomplete at that time. However, since then, starting at least 0.10.0 (maybe 0.9?) PostGreSQL scripts are now being maintained and any changes to the metastore are reflected to them (HIVE-2529). Consequently, there is no reason to not release PostGreSQL scripts. This JIRA plans to disable/remove/ ignore the include.postgres property in build.xml that enables the exclusion of PostGreSQL scripts.)
[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema
[ https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562323#comment-13562323 ] Ashutosh Chauhan commented on HIVE-3528: +1 running tests. Avro SerDe doesn't handle serializing Nullable types that require access to a Schema Key: HIVE-3528 URL: https://issues.apache.org/jira/browse/HIVE-3528 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Sean Busbey Assignee: Sean Busbey Labels: avro Attachments: HIVE-3528.1.patch.txt, HIVE-3528.2.patch.txt Deserialization properly handles hiding Nullable Avro types, including complex types like record, map, array, etc. However, when Serialization attempts to write out these types it erroneously makes use of the UNION schema that contains NULL and the other type. This results in Schema mis-match errors for Record, Array, Enum, Fixed, and Bytes. Here's a [review board of unit tests that express the problem|https://reviews.apache.org/r/7431/], as well as one that supports the case that it's only when the schema is needed.
[jira] [Updated] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema
[ https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3528: --- Assignee: Sean Busbey
[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3784: - Attachment: hive.3784.12.patch de-emphasize mapjoin hint - Key: HIVE-3784 URL: https://issues.apache.org/jira/browse/HIVE-3784 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3784.10.patch, hive.3784.11.patch, hive.3784.12.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch hive.auto.convert.join has been around for a long time, and is pretty stable. When mapjoin hint was created, the above parameter did not exist. The only reason for the user to specify a mapjoin currently is if they want it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin. Eventually, that should also go away, but that may take some time to stabilize. There are many rules in SemanticAnalyzer to handle the following trees: ReduceSink -> MapJoin, Union -> MapJoin, MapJoin -> MapJoin. This should not be supported anymore. In any of the above scenarios, the user can get the mapjoin behavior by setting hive.auto.convert.join to true and not specifying the hint. This will simplify the code a lot. What does everyone think?
[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562374#comment-13562374 ] Namit Jain commented on HIVE-3784: -- Recently, in https://issues.apache.org/jira/browse/HIVE-3633, support was added for sub-query sort-merge joins, where joins could be performed across sub-queries, and each sub-query was transformed into a sort-merge join. This support is being removed; it will be added automatically as part of https://issues.apache.org/jira/browse/HIVE-3403
[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3784: - Attachment: hive.3784.13.patch
[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3784: - Attachment: hive.3784.14.patch
[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562443#comment-13562443 ] Namit Jain commented on HIVE-3784: -- Running tests after refreshing. [~vinodkv], does it look OK for your use case?
[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.
[ https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562444#comment-13562444 ] Sho Shimauchi commented on HIVE-3620: - Hi Arup, the current version of Hive fetches information for only 1000 partitions at a time. Increasing hive.metastore.batch.retrieve.table.partition.max may solve this issue. This change can cause an OOME on your client. If you hit a heap error, please increase the heap size of the client and try again. See HIVE-2907 for more details. Drop table using hive CLI throws error when the total number of partition in the table is around 50K. - Key: HIVE-3620 URL: https://issues.apache.org/jira/browse/HIVE-3620 Project: Hive Issue Type: Bug Reporter: Arup Malakar hive> drop table load_test_table_2_0; FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timedout FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask The DB used is Oracle and hive had only one table: select COUNT(*) from PARTITIONS; 54839 I can try and play around with the parameter hive.metastore.client.socket.timeout if that is what is being used. But it is 200 seconds as of now, and 200 seconds for a drop table call seems high already. Thanks, Arup
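The batching the comment refers to can be pictured with a generic sketch. This is not the Hive client code: `fetch_in_batches` and the `fetch` callback are hypothetical, and only the batch size of 1000 comes from the comment above.

```python
from typing import Callable, Iterator, List, Sequence


def fetch_in_batches(
    names: Sequence[str],
    fetch: Callable[[Sequence[str]], List[dict]],
    batch_size: int = 1000,
) -> Iterator[dict]:
    """Yield partition records, asking the server for `batch_size` names per call."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(names), batch_size):
        # One round trip per slice; a larger batch means fewer calls
        # but more records held in client memory at once.
        yield from fetch(names[start:start + batch_size])
```

With the 54839 partitions from the report and a batch size of 1000, a loop like this would make 55 calls. Raising the batch size reduces the number of round trips at the cost of client heap, which is the trade-off the comment warns about.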
[jira] [Created] (HIVE-3941) Implement The Schema Search Path Feature
caofangkun created HIVE-3941:

Summary: Implement The Schema Search Path Feature
Key: HIVE-3941
URL: https://issues.apache.org/jira/browse/HIVE-3941
Project: Hive
Issue Type: New Feature
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

The Schema Search Path
http://www.postgresql.org/docs/current/static/ddl-schemas.html

hive (myschema)> SET search_path TO myschema,default; -- set the schema search path
hive (myschema)> SHOW search_path;
myschema
default
hive (default)> show tables;
de_src;
hive (myschema)> show tables; -- the myschema database has no table named de_src
src;
src1;
hive> select * from de_src; -- this query is equivalent to: select * from default.de_src
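Editorial illustration of the lookup rule the proposal describes: an unqualified table name is tried against each schema in the search path, in order, and the first match wins. This is a minimal Python sketch under that assumption, not code from any Hive patch. `CATALOG` and `resolve_table` are hypothetical names, and the table layout mirrors the example session above.

```python
# Hypothetical catalog mirroring the example session: schema -> table names.
CATALOG = {
    "default": {"de_src"},
    "myschema": {"src", "src1"},
}


def resolve_table(name, search_path, catalog=CATALOG):
    """Return the schema-qualified name for a table reference.

    A qualified name (schema.table) is checked directly; an unqualified
    name is looked up in each schema of search_path, in order.
    """
    if "." in name:
        schema, table = name.split(".", 1)
        if table in catalog.get(schema, ()):
            return name
        raise LookupError("table not found: " + name)
    for schema in search_path:
        if name in catalog.get(schema, ()):
            return schema + "." + name
    raise LookupError("table not found on search path: " + name)


if __name__ == "__main__":
    path = ["myschema", "default"]
    print(resolve_table("de_src", path))
    print(resolve_table("src", path))
```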
[jira] [Updated] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pamela Vagata updated HIVE-3937:

Attachment: (was: HIVE-3937.1.patch.txt)

Hive Profiler

Key: HIVE-3937
URL: https://issues.apache.org/jira/browse/HIVE-3937
Project: Hive
Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor

Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
[jira] [Updated] (HIVE-3937) Hive Profiler
[ https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pamela Vagata updated HIVE-3937:

Attachment: HIVE-3937.1.patch.txt

Hive Profiler

Key: HIVE-3937
URL: https://issues.apache.org/jira/browse/HIVE-3937
Project: Hive
Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
Attachments: HIVE-3937.1.patch.txt

Adding a Hive Profiler implementation which tracks inclusive wall times and call counts of the operators
[jira] [Updated] (HIVE-3784) de-emphasize mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3784:

Attachment: hive.3784.15.patch

de-emphasize mapjoin hint

Key: HIVE-3784
URL: https://issues.apache.org/jira/browse/HIVE-3784
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
Attachments: hive.3784.10.patch, hive.3784.11.patch, hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, hive.3784.15.patch, hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch

hive.auto.convert.join has been around for a long time and is pretty stable. When the mapjoin hint was created, that parameter did not exist. The only reason for a user to specify a mapjoin hint currently is to have the join converted to a bucketed mapjoin or a sort-merge bucketed mapjoin. Eventually that should also go away, but it may take some time to stabilize.

There are many rules in SemanticAnalyzer to handle the following trees:

ReduceSink - MapJoin
Union - MapJoin
MapJoin - MapJoin

This should not be supported anymore. In any of the above scenarios, the user can get the mapjoin behavior by setting hive.auto.convert.join to true and not specifying the hint. This will simplify the code a lot.

What does everyone think?
[jira] [Created] (HIVE-3942) Add UDF month_add and month_sub
caofangkun created HIVE-3942:

Summary: Add UDF month_add and month_sub
Key: HIVE-3942
URL: https://issues.apache.org/jira/browse/HIVE-3942
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-3942-1.patch

hive (default)> desc function extended month_add;
month_add(start_date, num_months) - Returns the date that is num_months after start_date.
Synonyms: month_sub
start_date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. num_months is a number. The time part of start_date is ignored.
Example:
SELECT month_add('2012-04-12', 1) FROM src LIMIT 1; --Return 2012-05-12
SELECT month_add('2012-04-12 11:22:31', 1) FROM src LIMIT 1; --Return 2012-05-12
SELECT month_add(cast('2012-04-12 11:22:31' as timestamp), 1) FROM src LIMIT 1; --Return 2012-05-12

hive (default)> desc function extended month_sub;
month_sub(start_date, num_months) - Returns the date that is num_months before start_date.
Synonyms: month_add
start_date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. num_months is a number. The time part of start_date is ignored.
Example:
SELECT month_sub('2012-04-12', 1) FROM src LIMIT 1; --Return 2012-03-12
SELECT month_sub('2012-04-12 11:22:31', 1) FROM src LIMIT 1; --Return 2012-03-12
SELECT month_sub(cast('2012-04-12 11:22:31' as timestamp), 1) FROM src LIMIT 1; --Return 2012-03-12
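Editorial illustration of the month arithmetic described above. This is a minimal Python sketch, not the code in HIVE-3942-1.patch. It assumes that month_sub moves backwards in time and that a day which does not exist in the target month is clamped to that month's last day; the issue text does not say how the patch handles that case.

```python
import calendar
from datetime import date, datetime


def month_add(start_date, num_months):
    """Return the ISO date num_months after start_date; the time part is ignored."""
    if isinstance(start_date, str):
        # Accepts 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'; only the date part is used.
        day_part = datetime.strptime(start_date[:10], "%Y-%m-%d").date()
    elif isinstance(start_date, datetime):
        day_part = start_date.date()
    else:
        day_part = start_date
    # Count months from year 0 so that divmod handles year rollover in both directions.
    total = day_part.year * 12 + (day_part.month - 1) + num_months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: clamp to the last day of the target month (e.g. Jan 31 + 1 month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day_part.day, last_day)).isoformat()


def month_sub(start_date, num_months):
    """Return the ISO date num_months before start_date."""
    return month_add(start_date, -num_months)


if __name__ == "__main__":
    print(month_add("2012-04-12 11:22:31", 1))
    print(month_sub("2012-04-12", 1))
```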
[jira] [Updated] (HIVE-3942) Add UDF month_add and month_sub
[ https://issues.apache.org/jira/browse/HIVE-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3942:

Attachment: HIVE-3942-1.patch

Add UDF month_add and month_sub

Key: HIVE-3942
URL: https://issues.apache.org/jira/browse/HIVE-3942
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-3942-1.patch

hive (default)> desc function extended month_add;
month_add(start_date, num_months) - Returns the date that is num_months after start_date.
Synonyms: month_sub
start_date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. num_months is a number. The time part of start_date is ignored.
Example:
SELECT month_add('2012-04-12', 1) FROM src LIMIT 1; --Return 2012-05-12
SELECT month_add('2012-04-12 11:22:31', 1) FROM src LIMIT 1; --Return 2012-05-12
SELECT month_add(cast('2012-04-12 11:22:31' as timestamp), 1) FROM src LIMIT 1; --Return 2012-05-12

hive (default)> desc function extended month_sub;
month_sub(start_date, num_months) - Returns the date that is num_months before start_date.
Synonyms: month_add
start_date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. num_months is a number. The time part of start_date is ignored.
Example:
SELECT month_sub('2012-04-12', 1) FROM src LIMIT 1; --Return 2012-03-12
SELECT month_sub('2012-04-12 11:22:31', 1) FROM src LIMIT 1; --Return 2012-03-12
SELECT month_sub(cast('2012-04-12 11:22:31' as timestamp), 1) FROM src LIMIT 1; --Return 2012-03-12
[jira] [Work started] (HIVE-3943) Skewed query fails if hdfs path has special characters
[ https://issues.apache.org/jira/browse/HIVE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3943 started by Gang Tim Liu.

Skewed query fails if hdfs path has special characters

Key: HIVE-3943
URL: https://issues.apache.org/jira/browse/HIVE-3943
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

If a partition name has a special character such as ':', the query fails with:

FAILED: IllegalArgumentException Pathname /... from ... is not a valid DFS filename.

The root cause is that the skewed map in the partition metastore has an unescaped path.
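Editorial illustration of the kind of escaping the root cause refers to: characters that are not valid in a DFS file name, such as ':', are percent-encoded before the value is used as a path component. This is a minimal Python sketch, not the fix in the HIVE-3943 patch. `escape_path_name`, `unescape_path_name`, and the `UNSAFE_CHARS` set are hypothetical and illustrative; the set is not Hive's actual list.

```python
import string

# Illustrative set of characters to escape; an assumption, not Hive's list.
UNSAFE_CHARS = set('"#%\'*/:=?\\{}[]^')


def escape_path_name(name):
    """Percent-encode unsafe and control characters in one path component."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in UNSAFE_CHARS or ord(ch) < 0x20 else ch
        for ch in name
    )


def unescape_path_name(path):
    """Reverse escape_path_name; a '%' not followed by two hex digits is kept as-is."""
    out = []
    i = 0
    while i < len(path):
        pair = path[i + 1:i + 3]
        if path[i] == "%" and len(pair) == 2 and all(c in string.hexdigits for c in pair):
            out.append(chr(int(pair, 16)))
            i += 3
        else:
            out.append(path[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    value = "2013-01-25 10:30:00"
    escaped = escape_path_name(value)
    print(escaped)
    print(unescape_path_name(escaped) == value)
```

The point of the sketch is only that every place a partition path is stored, including a skewed-value map, has to apply the same escaping as the place that created the directory.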
[jira] [Updated] (HIVE-2340) optimize orderby followed by a groupby
[ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-2340:

Status: Patch Available (was: Open)

optimize orderby followed by a groupby

Key: HIVE-2340
URL: https://issues.apache.org/jira/browse/HIVE-2340
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Labels: perfomance
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch

Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY optimizer (cluster-by following group-by).
[jira] [Updated] (HIVE-2340) optimize orderby followed by a groupby
[ https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2340:

Attachment: HIVE-2340.D1209.9.patch

navis updated the revision "HIVE-2340 [jira] optimize orderby followed by a groupby".
Reviewers: JIRA

1. Fixed test result (miniMR)
2. Added a config setting for the minimum number of reducers for a merged RS

REVISION DETAIL
https://reviews.facebook.net/D1209

AFFECTED FILES
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
conf/hive-default.xml.template
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java
ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java
ql/src/test/queries/clientpositive/auto_join26.q
ql/src/test/queries/clientpositive/reduce_deduplicate.q
ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
ql/src/test/results/clientpositive/auto_join26.q.out
ql/src/test/results/clientpositive/cluster.q.out
ql/src/test/results/clientpositive/groupby2.q.out
ql/src/test/results/clientpositive/groupby2_map_skew.q.out
ql/src/test/results/clientpositive/groupby_cube1.q.out
ql/src/test/results/clientpositive/groupby_rollup1.q.out
ql/src/test/results/clientpositive/index_bitmap3.q.out
ql/src/test/results/clientpositive/index_bitmap_auto.q.out
ql/src/test/results/clientpositive/ppd2.q.out
ql/src/test/results/clientpositive/ppd_gby_join.q.out
ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out
ql/src/test/results/clientpositive/semijoin.q.out
ql/src/test/results/clientpositive/union24.q.out
ql/src/test/results/compiler/plan/join2.q.xml

To: JIRA, navis

optimize orderby followed by a groupby

Key: HIVE-2340
URL: https://issues.apache.org/jira/browse/HIVE-2340
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Labels: perfomance
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch

Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY optimizer (cluster-by following group-by).
[jira] [Commented] (HIVE-3943) Skewed query fails if hdfs path has special characters
[ https://issues.apache.org/jira/browse/HIVE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562476#comment-13562476 ]

Gang Tim Liu commented on HIVE-3943:

The patch is in https://reviews.facebook.net/D8169

Skewed query fails if hdfs path has special characters

Key: HIVE-3943
URL: https://issues.apache.org/jira/browse/HIVE-3943
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

If a partition name has a special character such as ':', the query fails with:

FAILED: IllegalArgumentException Pathname /... from ... is not a valid DFS filename.

The root cause is that the skewed map in the partition metastore has an unescaped path.
[jira] [Updated] (HIVE-3933) StatsWork of QueryPlan can't be serialized correctly
[ https://issues.apache.org/jira/browse/HIVE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3933:

Status: Open (was: Patch Available)

StatsWork of QueryPlan can't be serialized correctly

Key: HIVE-3933
URL: https://issues.apache.org/jira/browse/HIVE-3933
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Reporter: yi
Priority: Minor
Attachments: HIVE-3933.patch

QueryPlan is serialized using java.beans.XMLEncoder, but StatsWork of QueryPlan does not follow the Java bean conventions, so it can't be serialized correctly.

java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:240)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1137)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:948)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
[jira] [Updated] (HIVE-3943) Skewed query fails if hdfs path has special characters
[ https://issues.apache.org/jira/browse/HIVE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3943:

Attachment: HIVE-3943.patch

Skewed query fails if hdfs path has special characters

Key: HIVE-3943
URL: https://issues.apache.org/jira/browse/HIVE-3943
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
Attachments: HIVE-3943.patch

If a partition name has a special character such as ':', the query fails with:

FAILED: IllegalArgumentException Pathname /... from ... is not a valid DFS filename.

The root cause is that the skewed map in the partition metastore has an unescaped path.
[jira] [Updated] (HIVE-3943) Skewed query fails if hdfs path has special characters
[ https://issues.apache.org/jira/browse/HIVE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3943:

Status: Patch Available (was: In Progress)

The patch is available for review.

Skewed query fails if hdfs path has special characters

Key: HIVE-3943
URL: https://issues.apache.org/jira/browse/HIVE-3943
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
Attachments: HIVE-3943.patch

If a partition name has a special character such as ':', the query fails with:

FAILED: IllegalArgumentException Pathname /... from ... is not a valid DFS filename.

The root cause is that the skewed map in the partition metastore has an unescaped path.
[jira] [Created] (HIVE-3944) Make Accept qfile argument for miniMR tests
Navis created HIVE-3944:

Summary: Make Accept qfile argument for miniMR tests
Key: HIVE-3944
URL: https://issues.apache.org/jira/browse/HIVE-3944
Project: Hive
Issue Type: Test
Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial

Currently, the miniMR test run executes all tests regardless of the qfile argument.
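Editorial illustration of the behavior the issue asks for: when a qfile argument is given, only the named query files run, and when it is absent, every test runs. This is a minimal Python sketch, not the change made to QTestGenTask.java. `select_qfiles` is a hypothetical name, and treating the argument as a comma-separated list of file names is an assumption.

```python
def select_qfiles(all_qfiles, qfile_arg):
    """Return the query files to run.

    An empty or missing qfile_arg selects everything; otherwise only the
    files named in the comma-separated argument are kept, in their
    original order.
    """
    if not qfile_arg or not qfile_arg.strip():
        return list(all_qfiles)
    wanted = {name.strip() for name in qfile_arg.split(",") if name.strip()}
    return [qfile for qfile in all_qfiles if qfile in wanted]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    available = ["a.q", "b.q", "c.q"]
    print(select_qfiles(available, "b.q"))
    print(select_qfiles(available, ""))
```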
[jira] [Updated] (HIVE-3944) Make accept qfile argument for miniMR tests
[ https://issues.apache.org/jira/browse/HIVE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3944:

Summary: Make accept qfile argument for miniMR tests (was: Make Accept qfile argument for miniMR tests)

Make accept qfile argument for miniMR tests

Key: HIVE-3944
URL: https://issues.apache.org/jira/browse/HIVE-3944
Project: Hive
Issue Type: Test
Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
Attachments: HIVE-3944.D8175.1.patch

Currently, the miniMR test run executes all tests regardless of the qfile argument.
[jira] [Updated] (HIVE-3944) Make accept qfile argument for miniMR tests
[ https://issues.apache.org/jira/browse/HIVE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3944:

Status: Patch Available (was: Open)

Make accept qfile argument for miniMR tests

Key: HIVE-3944
URL: https://issues.apache.org/jira/browse/HIVE-3944
Project: Hive
Issue Type: Test
Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
Attachments: HIVE-3944.D8175.1.patch

Currently, the miniMR test run executes all tests regardless of the qfile argument.
[jira] [Updated] (HIVE-3944) Make accept qfile argument for miniMR tests
[ https://issues.apache.org/jira/browse/HIVE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3944:

Attachment: HIVE-3944.D8175.1.patch

navis requested code review of "HIVE-3944 [jira] Make accept qfile argument for miniMR tests".
Reviewers: JIRA

DPAL-1970 Make accept qfile argument for miniMR tests

Currently, the miniMR test run executes all tests regardless of the qfile argument.

TEST PLAN
EMPTY

REVISION DETAIL
https://reviews.facebook.net/D8175

AFFECTED FILES
ant/src/org/apache/hadoop/hive/ant/QTestGenTask.java
build-common.xml
ql/build.xml

To: JIRA, navis

Make accept qfile argument for miniMR tests

Key: HIVE-3944
URL: https://issues.apache.org/jira/browse/HIVE-3944
Project: Hive
Issue Type: Test
Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
Attachments: HIVE-3944.D8175.1.patch

Currently, the miniMR test run executes all tests regardless of the qfile argument.