[jira] Updated: (HIVE-1212) Explicitly say "Hive Internal Error" to ease debugging
[ https://issues.apache.org/jira/browse/HIVE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1212: - Attachment: HIVE-1212.1.patch > Explicitly say "Hive Internal Error" to ease debugging > -- > > Key: HIVE-1212 > URL: https://issues.apache.org/jira/browse/HIVE-1212 > Project: Hadoop Hive > Issue Type: Improvement > Affects Versions: 0.6.0 > Reporter: Zheng Shao > Assignee: Zheng Shao > Attachments: HIVE-1212.1.patch > > > Our users complain that Hive fails with error messages like "FAILED: Unknown exception: null". We should explicitly say that this is an internal error in Hive, and provide more information (stacktrace) on the screen to ease bug reporting and debugging. In other cases, we will still put the detailed information (stacktrace) in the log, since users should be able to figure out what's wrong from a single-line message. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
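To make the proposal above concrete, here is a minimal sketch in plain Java of the distinction being asked for: errors the user can fix get a single line, anything else is labelled as an internal error and printed with its stack trace. This is not the attached patch. ErrorReportSketch, UserFacingException and the exact message wording are hypothetical names chosen for illustration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class ErrorReportSketch {
    /** Hypothetical marker type for errors the user can fix (bad query, bad UDF argument). */
    static class UserFacingException extends Exception {
        UserFacingException(String message) {
            super(message);
        }
    }

    /** Builds the text shown on screen for a failed query. */
    static String format(Throwable t) {
        if (t instanceof UserFacingException) {
            // A single line is enough; the full trace would go to the log only.
            return "FAILED: " + t.getMessage();
        }
        // Anything unexpected is called out as an internal error, with the
        // stack trace included so it can be pasted into a bug report.
        StringWriter trace = new StringWriter();
        t.printStackTrace(new PrintWriter(trace, true));
        return "FAILED: Hive Internal Error: " + t.getClass().getName()
                + "(" + t.getMessage() + ")\n" + trace;
    }

    public static void main(String[] args) {
        System.out.println(format(new UserFacingException("Table not found: pokes")));
        System.out.println(format(new IllegalStateException()));
    }
}
```

With this shape, the "Unknown exception: null" case from the description would at least show the exception class and where it was thrown.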
[jira] Updated: (HIVE-1212) Explicitly say "Hive Internal Error" to ease debugging
[ https://issues.apache.org/jira/browse/HIVE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1212: - Attachment: (was: HIVE-1212.1.patch) > Explicitly say "Hive Internal Error" to ease debugging > -- > > Key: HIVE-1212 > URL: https://issues.apache.org/jira/browse/HIVE-1212 > Project: Hadoop Hive > Issue Type: Improvement > Affects Versions: 0.6.0 > Reporter: Zheng Shao > Assignee: Zheng Shao > > Our users complain that Hive fails with error messages like "FAILED: Unknown exception: null". We should explicitly say that this is an internal error in Hive, and provide more information (stacktrace) on the screen to ease bug reporting and debugging. In other cases, we will still put the detailed information (stacktrace) in the log, since users should be able to figure out what's wrong from a single-line message.
[jira] Updated: (HIVE-1212) Explicitly say "Hive Internal Error" to ease debugging
[ https://issues.apache.org/jira/browse/HIVE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1212: - Attachment: HIVE-1212.1.patch This also fixes UDFArgumentException reporting. > Explicitly say "Hive Internal Error" to ease debugging > -- > > Key: HIVE-1212 > URL: https://issues.apache.org/jira/browse/HIVE-1212 > Project: Hadoop Hive > Issue Type: Improvement > Affects Versions: 0.6.0 > Reporter: Zheng Shao > Assignee: Zheng Shao > Attachments: HIVE-1212.1.patch > > > Our users complain that Hive fails with error messages like "FAILED: Unknown exception: null". We should explicitly say that this is an internal error in Hive, and provide more information (stacktrace) on the screen to ease bug reporting and debugging. In other cases, we will still put the detailed information (stacktrace) in the log, since users should be able to figure out what's wrong from a single-line message.
[jira] Updated: (HIVE-1212) Explicitly say "Hive Internal Error" to ease debugging
[ https://issues.apache.org/jira/browse/HIVE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1212: - Affects Version/s: 0.6.0 Status: Patch Available (was: Open) > Explicitly say "Hive Internal Error" to ease debugging > -- > > Key: HIVE-1212 > URL: https://issues.apache.org/jira/browse/HIVE-1212 > Project: Hadoop Hive > Issue Type: Improvement > Affects Versions: 0.6.0 > Reporter: Zheng Shao > Assignee: Zheng Shao > Attachments: HIVE-1212.1.patch > > > Our users complain that Hive fails with error messages like "FAILED: Unknown exception: null". We should explicitly say that this is an internal error in Hive, and provide more information (stacktrace) on the screen to ease bug reporting and debugging. In other cases, we will still put the detailed information (stacktrace) in the log, since users should be able to figure out what's wrong from a single-line message.
[jira] Commented: (HIVE-224) implement lfu based flushing policy for map side aggregates
[ https://issues.apache.org/jira/browse/HIVE-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841714#action_12841714 ] Zheng Shao commented on HIVE-224: - Hi James, currently we don't have the bandwidth to do this, but I guess it won't be too hard - we just need to use http://java.sun.com/j2se/1.4.2/docs/api/java/util/LinkedHashMap.html (search for LRU). Are you interested in joining forces on this? > implement lfu based flushing policy for map side aggregates > --- > > Key: HIVE-224 > URL: https://issues.apache.org/jira/browse/HIVE-224 > Project: Hadoop Hive > Issue Type: Improvement > Reporter: Joydeep Sen Sarma > > Currently we flush some random set of rows when the map side hash table approaches memory limits. We have discussed a strategy of flushing the hash table entries that have been seen the least number of times (effectively an LFU flushing strategy). This will be very effective at reducing the amount of data sent from the map to the reduce step, as well as reducing the chances of any skew.
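The LinkedHashMap suggestion above can be sketched as follows. Note that this is LRU (least recently used) eviction, as in the linked Javadoc, not the LFU policy named in the issue title; it is only a rough illustration of the idea. LruAggregationBuffer and the flusher callback are hypothetical names, not Hive classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Bounded map that hands its least recently used entry to a callback when full. */
final class LruAggregationBuffer<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;
    private final BiConsumer<? super K, ? super V> flusher;

    LruAggregationBuffer(int maxEntries, BiConsumer<? super K, ? super V> flusher) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
        this.flusher = flusher;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxEntries) {
            // Stand-in for forwarding the partial aggregate to the reducer.
            flusher.accept(eldest.getKey(), eldest.getValue());
            return true; // LinkedHashMap then removes the eldest entry.
        }
        return false;
    }

    public static void main(String[] args) {
        LruAggregationBuffer<String, Integer> counts =
                new LruAggregationBuffer<>(2, (k, v) -> System.out.println("flush " + k + "=" + v));
        counts.put("a", 1);
        counts.put("b", 1);
        counts.get("a");    // touching "a" makes "b" the eldest entry
        counts.put("c", 1); // exceeds the limit, so "b" is flushed
    }
}
```

A true LFU policy would additionally need a per-entry hit count and a way to find the minimum, which LinkedHashMap does not provide on its own.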
[jira] Commented: (HIVE-224) implement lfu based flushing policy for map side aggregates
[ https://issues.apache.org/jira/browse/HIVE-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841692#action_12841692 ] James Warren commented on HIVE-224: --- I think I bumped up against this or a related issue today - are there any plans to incorporate this into a future release? Thanks, -James > implement lfu based flushing policy for map side aggregates > --- > > Key: HIVE-224 > URL: https://issues.apache.org/jira/browse/HIVE-224 > Project: Hadoop Hive > Issue Type: Improvement > Reporter: Joydeep Sen Sarma > > Currently we flush some random set of rows when the map side hash table approaches memory limits. We have discussed a strategy of flushing the hash table entries that have been seen the least number of times (effectively an LFU flushing strategy). This will be very effective at reducing the amount of data sent from the map to the reduce step, as well as reducing the chances of any skew.
storage handlers and HBase integration
Hey folks, In case you're not following the action over at HIVE-705, we're getting close to having HBase integration committed into Hive. I've written up docs here: http://wiki.apache.org/hadoop/Hive/HBaseIntegration http://wiki.apache.org/hadoop/Hive/StorageHandlers (If you happened to read the first draft of the HBaseIntegration doc a few days ago, I've made a lot of updates today to fill out the details on column mapping.) As part of commit, we'll be doing some code reviews within Facebook next week and logging a bunch of followup tasks; if you have any comments on the approach or implementation, please pile on in JIRA. I'll be giving this a quick mention at the Hive user's group later this month, and then a more detailed presentation at the HBase User Group meeting in April. JVS
[jira] Updated: (HIVE-705) Let Hive can analyse hbase's tables
[ https://issues.apache.org/jira/browse/HIVE-705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-705: Attachment: HIVE-705.3.patch HIVE-705.3.patch resolves a conflict with trunk, fixes some serde bugs, and adds more tests. > Let Hive can analyse hbase's tables > --- > > Key: HIVE-705 > URL: https://issues.apache.org/jira/browse/HIVE-705 > Project: Hadoop Hive > Issue Type: New Feature >Affects Versions: 0.6.0 >Reporter: Samuel Guo >Assignee: John Sichi > Fix For: 0.6.0 > > Attachments: hbase-0.19.3-test.jar, hbase-0.19.3.jar, > hbase-0.20.3-test.jar, hbase-0.20.3.jar, HIVE-705.1.patch, HIVE-705.2.patch, > HIVE-705.3.patch, HIVE-705_draft.patch, HIVE-705_revision806905.patch, > HIVE-705_revision883033.patch, zookeeper-3.2.2.jar > > > Add a serde over the hbase's tables, so that hive can analyse the data stored > in hbase easily. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1194) sorted merge join
[ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain resolved HIVE-1194. -- Resolution: Fixed Hadoop Flags: [Reviewed] Committed. Thanks Yongqiang > sorted merge join > - > > Key: HIVE-1194 > URL: https://issues.apache.org/jira/browse/HIVE-1194 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0 > > Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, > hive-1194-2010-3-2.patch, hive-1194-2010-3-3-2.patch, > hive-1194-2010-3-3.patch, hive-1194-2010-3-4.patch > > > If the input tables are sorted on the join key, and a mapjoin is being > performed, it is useful to exploit the sorted properties of the table. > This can lead to substantial cpu savings - this needs to work across bucketed > map joins also. > Since, sorted properties of a table are not enforced currently, a new > parameter can be added to specify to use the sort-merge join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
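For readers new to the idea in the description above, this is a minimal sketch of the merge step that sorted inputs make possible: both sides are scanned once, in key order, with no hash table. It is plain Java and not Hive's operator code. Row and join are hypothetical names, and both inputs are assumed to be in memory and sorted ascending by key.

```java
import java.util.ArrayList;
import java.util.List;

final class SortMergeJoinSketch {
    /** Hypothetical row type: an integer join key plus one value column. */
    static final class Row {
        final int key;
        final String value;

        Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Inner join of two inputs that are both sorted ascending by key. */
    static List<String> join(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = Integer.compare(left.get(i).key, right.get(j).key);
            if (cmp < 0) {
                i++; // left key is smaller, so it has no match on the right
            } else if (cmp > 0) {
                j++; // right key is smaller, so it has no match on the left
            } else {
                int key = left.get(i).key;
                // Find the end of the run of equal keys on the right side.
                int jEnd = j;
                while (jEnd < right.size() && right.get(jEnd).key == key) {
                    jEnd++;
                }
                // Emit the cross product of the two runs of equal keys.
                while (i < left.size() && left.get(i).key == key) {
                    for (int k = j; k < jEnd; k++) {
                        out.add(key + " " + left.get(i).value + " " + right.get(k).value);
                    }
                    i++;
                }
                j = jEnd;
            }
        }
        return out;
    }
}
```

Duplicate keys produce a cross product, which is why a key that occurs three times on each side yields nine joined rows.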
[jira] Commented: (HIVE-1215) bogus assertion in GroupByOperator.initializeOp
[ https://issues.apache.org/jira/browse/HIVE-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841607#action_12841607 ] Edward Capriolo commented on HIVE-1215: --- Yes, you can reproduce this with a simple edit in build common and a q file that does select count(1) from src; {noformat} {noformat} This should be easy to add. > bogus assertion in GroupByOperator.initializeOp > --- > > Key: HIVE-1215 > URL: https://issues.apache.org/jira/browse/HIVE-1215 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor > Affects Versions: 0.5.0 > Reporter: John Sichi > Assignee: Edward Capriolo > Fix For: 0.6.0 > > > export HADOOP_OPTS="-ea" and then run the following query in Hive: select count(1) from pokes; This causes an assertion failure: Caused by: java.lang.AssertionError at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:161) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:344) at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:143) ... 10 more
[jira] Assigned: (HIVE-1215) bogus assertion in GroupByOperator.initializeOp
[ https://issues.apache.org/jira/browse/HIVE-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo reassigned HIVE-1215: - Assignee: Edward Capriolo > bogus assertion in GroupByOperator.initializeOp > --- > > Key: HIVE-1215 > URL: https://issues.apache.org/jira/browse/HIVE-1215 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: John Sichi >Assignee: Edward Capriolo > Fix For: 0.6.0 > > > export HADOOP_OPTS="-ea" > and then run the following query in Hive: > select count(1) from pokes; > This causes an assertion failure: > Caused by: java.lang.AssertionError > at > org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:161) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:344) > at > org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:143) > ... 10 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1095) Hive in Maven
[ https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841600#action_12841600 ] Gerrit Jansen van Vuuren commented on HIVE-1095: I've realized that what I said was a bit confusing :) What I meant was that there are two parts: -> Committing (if reviewed and accepted) HIVE-1095-trunk.patch to trunk. This patch is generated against trunk. -> Using HIVE-1095-0.4.1.patch against version 0.4.1 of Hive to generate the Maven artifacts for Hive 0.4.1. Are commits allowed for already versioned releases? If so, it would be better to have it committed, because any change to build.xml, ivy.xml or build-common.xml would mean that the patch needs to be regenerated. So the broad scope and idea would be to publish the already released Hive versions to the Maven repo: 0.3.0 0.4.0 0.4.1 0.5.0 and then have the build in trunk, so that when another release is made the Maven publishing code is already in the build and one only needs to run ant maven-publish. By writing this I've realized that I probably need to generate patches for the builds of the other Hive versions as well. Should I do this and attach them to this task? > Hive in Maven > - > > Key: HIVE-1095 > URL: https://issues.apache.org/jira/browse/HIVE-1095 > Project: Hadoop Hive > Issue Type: Task > Components: Build Infrastructure > Affects Versions: 0.6.0 > Reporter: Gerrit Jansen van Vuuren > Priority: Minor > Fix For: 0.6.0 > > Attachments: HIVE-1095-0.4.1.patch, HIVE-1095-Sample.patch, HIVE-1095-trunk.patch > > > Getting Hive into the main Maven repositories. Documentation on how to do this is at: http://maven.apache.org/guides/mini/guide-central-repository-upload.html
[jira] Commented: (HIVE-1215) bogus assertion in GroupByOperator.initializeOp
[ https://issues.apache.org/jira/browse/HIVE-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841591#action_12841591 ] John Sichi commented on HIVE-1215: -- We have tests for such queries, and we enable assertions in the junit ant task, but we do not seem to be enabling assertions for the forked JVMs which execute the plan. We should find a way to address this in order to get more coverage for assertions (and also to expose any more bogus ones). > bogus assertion in GroupByOperator.initializeOp > --- > > Key: HIVE-1215 > URL: https://issues.apache.org/jira/browse/HIVE-1215 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor > Affects Versions: 0.5.0 > Reporter: John Sichi > Fix For: 0.6.0 > > > export HADOOP_OPTS="-ea" and then run the following query in Hive: select count(1) from pokes; This causes an assertion failure: Caused by: java.lang.AssertionError at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:161) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:344) at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:143) ... 10 more
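As a rough illustration of the point about forked JVMs, the sketch below shows the usual Java idiom for detecting whether assertions are enabled in the current JVM, and a helper that passes -ea on to a child command line. It does not reflect how Hive or the test harness actually launches the plan executor. AssertionFlagSketch, childJvmCommand and the main class name used in main are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class AssertionFlagSketch {
    /** Returns true only when this JVM was started with assertions enabled (-ea). */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert is evaluated only when assertions are on.
        assert enabled = true;
        return enabled;
    }

    /** Builds a child JVM command line, adding -ea when requested. */
    static List<String> childJvmCommand(String mainClass, boolean enableAssertions) {
        List<String> command = new ArrayList<>();
        command.add("java");
        if (enableAssertions) {
            command.add("-ea");
        }
        command.add(mainClass);
        return command;
    }

    public static void main(String[] args) {
        // Propagate the parent's setting so the child sees the same assertion coverage.
        System.out.println(childJvmCommand("example.PlanRunner", assertionsEnabled()));
    }
}
```

The command list could be handed to java.lang.ProcessBuilder, whose constructor accepts a List<String>.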
[jira] Created: (HIVE-1215) bogus assertion in GroupByOperator.initializeOp
bogus assertion in GroupByOperator.initializeOp --- Key: HIVE-1215 URL: https://issues.apache.org/jira/browse/HIVE-1215 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0 Reporter: John Sichi Fix For: 0.6.0 export HADOOP_OPTS="-ea" and then run the following query in Hive: select count(1) from pokes; This causes an assertion failure: Caused by: java.lang.AssertionError at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:161) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:344) at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:143) ... 10 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bc Wong updated HIVE-1211: -- Status: Patch Available (was: Open) > Tapping logs from child processes > - > > Key: HIVE-1211 > URL: https://issues.apache.org/jira/browse/HIVE-1211 > Project: Hadoop Hive > Issue Type: Improvement > Components: Logging > Reporter: bc Wong > Attachments: HIVE-1211.1.patch > > > Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
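One way to picture the problem described above: if the parent reads the child's output itself instead of letting it inherit stdout/stderr, each line can be tagged before it is written out. The sketch below is a guess at the general shape only and is not the attached patch. ChildLogTapSketch, tap and the query id format are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ChildLogTapSketch {
    /** Copies a child's output to the sink, prefixing every line with the query id. */
    static void tap(InputStream childOutput, String queryId, PrintStream sink) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(childOutput, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.println("[" + queryId + "] " + line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real launcher the input stream would come from Process.getInputStream() (and getErrorStream() for stderr), typically drained on a separate thread so the child does not block on a full pipe.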
[jira] Commented: (HIVE-1095) Hive in Maven
[ https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841573#action_12841573 ] He Yongqiang commented on HIVE-1095: >>Would somebody be able to apply this patch after review (not commit) Does this mean the patch needs to be regenerated after any conflicting changes? > Hive in Maven > - > > Key: HIVE-1095 > URL: https://issues.apache.org/jira/browse/HIVE-1095 > Project: Hadoop Hive > Issue Type: Task > Components: Build Infrastructure > Affects Versions: 0.6.0 > Reporter: Gerrit Jansen van Vuuren > Priority: Minor > Fix For: 0.6.0 > > Attachments: HIVE-1095-0.4.1.patch, HIVE-1095-Sample.patch, HIVE-1095-trunk.patch > > > Getting Hive into the main Maven repositories. Documentation on how to do this is at: http://maven.apache.org/guides/mini/guide-central-repository-upload.html
[jira] Updated: (HIVE-1200) Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
[ https://issues.apache.org/jira/browse/HIVE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1200: - Resolution: Fixed Fix Version/s: 0.6.0 0.5.1 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to both trunk and 0.5. Thanks Zheng > Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition > -- > > Key: HIVE-1200 > URL: https://issues.apache.org/jira/browse/HIVE-1200 > Project: Hadoop Hive > Issue Type: Bug > Affects Versions: 0.5.1, 0.6.0 > Reporter: Zheng Shao > Assignee: Zheng Shao > Fix For: 0.5.1, 0.6.0 > > Attachments: HIVE-1200.1.branch-0.5.patch, HIVE-1200.1.patch > > > CombineHiveInputFormat does not work with multiple levels of directories in a single table/partition, because it uses exact-match logic instead of the relativize logic used in MapOperator: {code} MapOperator.java: if (!onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri())) { {code}
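The difference between the two checks can be seen with java.net.URI alone. The sketch below mirrors the relativize condition quoted above; the "exact match" variant is only an assumption about what a parent-directory comparison might look like, not the actual CombineHiveInputFormat code. PathMatchSketch, the method names and the example URIs are hypothetical.

```java
import java.net.URI;

final class PathMatchSketch {
    /** Assumed exact-match check: the file's immediate parent must equal the directory. */
    static boolean exactParentMatch(URI dir, URI file) {
        String path = file.getPath();
        String parent = path.substring(0, path.lastIndexOf('/'));
        return dir.getPath().equals(parent);
    }

    /**
     * Relativize check, as in the quoted MapOperator condition: URI.relativize
     * returns the file URI unchanged when it is not located under dir.
     */
    static boolean isUnder(URI dir, URI file) {
        return !dir.relativize(file).equals(file);
    }

    public static void main(String[] args) {
        URI partition = URI.create("hdfs://nn/warehouse/t/ds=2010-03-04");
        URI nested = URI.create("hdfs://nn/warehouse/t/ds=2010-03-04/sub/part-00000");
        System.out.println("exact match:  " + exactParentMatch(partition, nested));
        System.out.println("relativize:   " + isUnder(partition, nested));
    }
}
```

A file one directory level deeper fails the exact comparison but still passes the relativize check, which is the case the issue describes.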
[jira] Commented: (HIVE-1194) sorted merge join
[ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841550#action_12841550 ] Namit Jain commented on HIVE-1194: -- +1 looks good - will commit if the tests pass > sorted merge join > - > > Key: HIVE-1194 > URL: https://issues.apache.org/jira/browse/HIVE-1194 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0 > > Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, > hive-1194-2010-3-2.patch, hive-1194-2010-3-3-2.patch, > hive-1194-2010-3-3.patch, hive-1194-2010-3-4.patch > > > If the input tables are sorted on the join key, and a mapjoin is being > performed, it is useful to exploit the sorted properties of the table. > This can lead to substantial cpu savings - this needs to work across bucketed > map joins also. > Since, sorted properties of a table are not enforced currently, a new > parameter can be added to specify to use the sort-merge join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1194) sorted merge join
[ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang updated HIVE-1194: --- Attachment: hive-1194-2010-3-4.patch attached a new patch > sorted merge join > - > > Key: HIVE-1194 > URL: https://issues.apache.org/jira/browse/HIVE-1194 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0 > > Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, > hive-1194-2010-3-2.patch, hive-1194-2010-3-3-2.patch, > hive-1194-2010-3-3.patch, hive-1194-2010-3-4.patch > > > If the input tables are sorted on the join key, and a mapjoin is being > performed, it is useful to exploit the sorted properties of the table. > This can lead to substantial cpu savings - this needs to work across bucketed > map joins also. > Since, sorted properties of a table are not enforced currently, a new > parameter can be added to specify to use the sort-merge join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1194) sorted merge join
[ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841442#action_12841442 ] Namit Jain commented on HIVE-1194: -- I know - the log file is correct, but when I run the tests, I get a diff. > sorted merge join > - > > Key: HIVE-1194 > URL: https://issues.apache.org/jira/browse/HIVE-1194 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0 > > Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, > hive-1194-2010-3-2.patch, hive-1194-2010-3-3-2.patch, hive-1194-2010-3-3.patch > > > If the input tables are sorted on the join key, and a mapjoin is being > performed, it is useful to exploit the sorted properties of the table. > This can lead to substantial cpu savings - this needs to work across bucketed > map joins also. > Since, sorted properties of a table are not enforced currently, a new > parameter can be added to specify to use the sort-merge join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-936) dynamic partitions creation based on values
[ https://issues.apache.org/jira/browse/HIVE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-936: Attachment: (was: dp_design.txt) > dynamic partitions creation based on values > --- > > Key: HIVE-936 > URL: https://issues.apache.org/jira/browse/HIVE-936 > Project: Hadoop Hive > Issue Type: New Feature > Reporter: Ning Zhang > Assignee: Ning Zhang > Attachments: dp_design.txt > > > If a Hive table is created as partitioned, DML can only insert into one partition per query. Ideally, partitions should be created on the fly based on the values of the partition columns. As an example: {{{ create table T (a int, b string) partitioned by (ds string); insert overwrite table T select a, b, ds from S where ds >= '2009-11-01' and ds <= '2009-11-16'; }}} should be able to execute in one DML statement rather than possibly 16 DML statements, one for each distinct ds value. CTAS and alter table should be able to do the same thing: {{{ create table T partitioned by (ds string) as select * from S where ds >= '2009-11-01' and ds <= '2009-11-16'; }}} and {{{ create table T(a int, b string, ds string); insert overwrite table T select * from S where ds >= '2009-11-01' and ds <= '2009-11-16'; alter table T partitioned by (ds); }}} should all return the same results.
[jira] Updated: (HIVE-936) dynamic partitions creation based on values
[ https://issues.apache.org/jira/browse/HIVE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-936: Attachment: dp_design.txt Updated design notes after a group discussion. > dynamic partitions creation based on values > --- > > Key: HIVE-936 > URL: https://issues.apache.org/jira/browse/HIVE-936 > Project: Hadoop Hive > Issue Type: New Feature > Reporter: Ning Zhang > Assignee: Ning Zhang > Attachments: dp_design.txt, dp_design.txt > > > If a Hive table is created as partitioned, DML can only insert into one partition per query. Ideally, partitions should be created on the fly based on the values of the partition columns. As an example: {{{ create table T (a int, b string) partitioned by (ds string); insert overwrite table T select a, b, ds from S where ds >= '2009-11-01' and ds <= '2009-11-16'; }}} should be able to execute in one DML statement rather than possibly 16 DML statements, one for each distinct ds value. CTAS and alter table should be able to do the same thing: {{{ create table T partitioned by (ds string) as select * from S where ds >= '2009-11-01' and ds <= '2009-11-16'; }}} and {{{ create table T(a int, b string, ds string); insert overwrite table T select * from S where ds >= '2009-11-01' and ds <= '2009-11-16'; alter table T partitioned by (ds); }}} should all return the same results.
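A toy model of "partitions created on the fly" may help when reading the description above: each row is routed by the value of its partition column, and a partition is created the first time its value is seen. This is plain Java over in-memory rows and says nothing about the design in dp_design.txt. DynamicPartitionSketch, route and the ds=<value> naming are hypothetical illustration choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DynamicPartitionSketch {
    /**
     * Routes rows of the form {a, b, ds} into partitions keyed by the ds value.
     * The partition column is dropped from the stored row, since it is implied
     * by the partition the row lands in.
     */
    static Map<String, List<String[]>> route(List<String[]> rows) {
        Map<String, List<String[]>> partitions = new TreeMap<>();
        for (String[] row : rows) {
            String partition = "ds=" + row[2];
            // computeIfAbsent creates the partition the first time its value appears.
            partitions.computeIfAbsent(partition, p -> new ArrayList<>())
                    .add(new String[] {row[0], row[1]});
        }
        return partitions;
    }
}
```

With one statement covering 16 distinct ds values, a single pass over the input would populate 16 entries in the returned map.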
[jira] Commented: (HIVE-1194) sorted merge join
[ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841414#action_12841414 ] He Yongqiang commented on HIVE-1194: @namit, the join results for 498 are in the results: 496 val_496 496 val_496 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 498 val_498 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 5 val_5 9 val_9 9 val_9 I will add an automatic check query to the test and upload a new patch. > sorted merge join > - > > Key: HIVE-1194 > URL: https://issues.apache.org/jira/browse/HIVE-1194 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor > Reporter: Namit Jain > Assignee: He Yongqiang > Fix For: 0.6.0 > > Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, hive-1194-2010-3-2.patch, hive-1194-2010-3-3-2.patch, hive-1194-2010-3-3.patch > > > If the input tables are sorted on the join key, and a mapjoin is being performed, it is useful to exploit the sorted properties of the table. This can lead to substantial cpu savings - this needs to work across bucketed map joins also. Since, sorted properties of a table are not enforced currently, a new parameter can be added to specify to use the sort-merge join.