[jira] Updated: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar updated HIVE-1198:

Status: Patch Available (was: Open)

This patch addresses the issue as follows:

1. The .checkstyle file's match pattern has been changed to exclude the top-level ant directory from the checkstyle application path. Since this is not a valid source directory, it causes checkstyle to report errors for all matching files within it.
2. The build file, build.xml, has been modified to exclude the {{ant}} directory from processing under the checkstyle target, for the same reason as above.
3. The {{eclipse-templates/.project}} file has been modified to include the {{CheckstyleBuilder}} and {{CheckstyleNature}}. This automatically enables checkstyle when the project is imported into Eclipse. I tested importing the project into Eclipse without the checkstyle plugin installed; it did not seem to cause any noticeable problem.
4. The section on {{Developing Hive using Eclipse}} in README.txt has been updated to state that the user must install the Checkstyle plugin in Eclipse, if not already present, before importing the project.

When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.

Key: HIVE-1198
URL: https://issues.apache.org/jira/browse/HIVE-1198
Project: Hadoop Hive
Issue Type: Improvement
Components: Build Infrastructure
Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010)
Reporter: Arvind Prabhakar
Priority: Minor
Attachments: HIVE-1198.patch

As of now, the checkstyle plugin reports all problems as errors. This produces an overwhelming number of errors (3000+), which masks any real errors that might be there. Since the checkstyle violations are not all going to be fixed in one shot, it is desirable to lower their severity to warnings so that the plugin can be kept enabled. This will encourage developers to spot checkstyle violations in the files they touch and potentially fix them as they go along, and it will point out new violations as they code.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838508#action_12838508 ]

Arvind Prabhakar commented on HIVE-1198:

Missed out a change description in my previous comment:

5. Modified the {{checkstyle/checkstyle.xml}} to set the default value of {{severity}} as {{warning}}.
[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function
[ https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838512#action_12838512 ]

Jerome Boulon commented on HIVE-259:

Can someone explain how I can create/populate a new table to be used by the ant test target?

Add PERCENTILE aggregate function

Key: HIVE-259
URL: https://issues.apache.org/jira/browse/HIVE-259
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Venky Iyer
Assignee: Jerome Boulon
Attachments: HIVE-259-2.patch, HIVE-259.1.patch, HIVE-259.patch, jb2.txt, Percentile.xlsx

Compute at least the 25th, 50th, and 75th percentiles.
Hive unit-test table
Can someone explain how I can create/populate a new table to be used by the ant test target?

Thanks in advance,
/Jerome.
Hudson build is back to normal : Hive-trunk-h0.20 #199
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/199/changes
[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function
[ https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838516#action_12838516 ]

Carl Steinbach commented on HIVE-259:

@Jerome: take a look at ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java
Re: Hive unit-test table
Take a look at ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java

Carl

On Thu, Feb 25, 2010 at 12:03 PM, Jerome Boulon jbou...@netflix.com wrote:
> Can someone explain how can I create/populate a new table to be used by the ant test target?
> Thanks in advance,
> /Jerome.
[jira] Updated: (HIVE-1137) build references IVY_HOME incorrectly
[ https://issues.apache.org/jira/browse/HIVE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1137:

Resolution: Fixed
Release Note: HIVE-1137. Fix build.xml for references to IVY_HOME. (Carl Steinbach via zshao)
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Committed. Thanks Carl!

build references IVY_HOME incorrectly

Key: HIVE-1137
URL: https://issues.apache.org/jira/browse/HIVE-1137
Project: Hadoop Hive
Issue Type: Bug
Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Carl Steinbach
Fix For: 0.6.0
Attachments: HIVE-1137.patch

The build references env.IVY_HOME, but doesn't actually import env as it should (via <property environment="env"/>). It's not clear what the IVY_HOME reference is for, since the build doesn't even use ivy.home (instead, it installs under the build/ivy directory). It looks like someone copied bits and pieces from the "Automatically" section here: http://ant.apache.org/ivy/history/latest-milestone/install.html
[jira] Assigned: (HIVE-1197) create a new input format where a mapper spans a file
[ https://issues.apache.org/jira/browse/HIVE-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-1197:

Assignee: Siying Dong (was: Namit Jain)

create a new input format where a mapper spans a file

Key: HIVE-1197
URL: https://issues.apache.org/jira/browse/HIVE-1197
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
Fix For: 0.6.0

This will be needed for Sort merge joins.
[jira] Assigned: (HIVE-1193) ensure sorting properties for a table
[ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-1193:

Assignee: Namit Jain

ensure sorting properties for a table

Key: HIVE-1193
URL: https://issues.apache.org/jira/browse/HIVE-1193
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
Fix For: 0.6.0

If a table is sorted and data is being inserted into it, we currently don't make sure that the data is sorted. Enforcing this might be useful for some downstream operations. It cannot be made the default due to backward compatibility, but an option can be added for it.
[jira] Updated: (HIVE-1193) ensure sorting properties for a table
[ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1193:

Status: Patch Available (was: Open)
[jira] Updated: (HIVE-1193) ensure sorting properties for a table
[ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1193:

Attachment: hive.1193.1.patch
[jira] Updated: (HIVE-1032) Better Error Messages for Execution Errors
[ https://issues.apache.org/jira/browse/HIVE-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-1032:

Attachment: HIVE-1032.6.patch

* Fixed checkstyle issues

Better Error Messages for Execution Errors

Key: HIVE-1032
URL: https://issues.apache.org/jira/browse/HIVE-1032
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Affects Versions: 0.6.0
Reporter: Paul Yang
Assignee: Paul Yang
Attachments: HIVE-1032.1.patch, HIVE-1032.2.patch, HIVE-1032.3.patch, HIVE-1032.4.patch, HIVE-1032.5.patch, HIVE-1032.6.patch

Three common errors that occur during execution are:

1. Map-side group-by causing an out of memory exception due to large aggregation hash tables
2. ScriptOperator failing due to the user's script throwing an exception or otherwise returning a non-zero error code
3. Incorrectly specifying the join order of small and large tables, causing the large table to be loaded into memory and producing an out of memory exception.

These errors are typically discovered by manually examining the error log files of the failed task. This task proposes to create a feature that would automatically read the error logs and output a probable cause and solution to the command line.
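The idea of reading the error logs and printing a probable cause can be sketched roughly as follows. This is only an illustration of the approach described in the issue, not the code in the attached patches: the class name `ErrorHeuristics`, the method `diagnose`, the rule table, and the advice strings are all hypothetical, and the only log text matched is the standard JVM `java.lang.OutOfMemoryError` class name and the `ScriptOperator` name mentioned above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch: maps patterns found in task error logs to a probable cause.
class ErrorHeuristics {
    // Rule table (assumed for illustration); insertion order decides which rule wins.
    private static final Map<Pattern, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put(Pattern.compile("java\\.lang\\.OutOfMemoryError"),
                "Possible cause: a task ran out of memory, for example because of a large "
                + "map-side aggregation hash table or a large table loaded into memory for a join.");
        RULES.put(Pattern.compile("ScriptOperator"),
                "Possible cause: the user script threw an exception or returned a non-zero exit code.");
    }

    // Returns the advice for the first log line that matches any rule, if there is one.
    static Optional<String> diagnose(List<String> logLines) {
        for (String line : logLines) {
            for (Map.Entry<Pattern, String> rule : RULES.entrySet()) {
                if (rule.getKey().matcher(line).find()) {
                    return Optional.of(rule.getValue());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "task started",
                "java.lang.OutOfMemoryError: Java heap space");
        System.out.println(diagnose(log).orElse("No known cause matched."));
    }
}
```

A table of patterns keeps each heuristic independent, so new error signatures can be added without touching the scanning loop.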
[jira] Updated: (HIVE-1032) Better Error Messages for Execution Errors
[ https://issues.apache.org/jira/browse/HIVE-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-1032:

Status: Patch Available (was: Open)
[jira] Created: (HIVE-1200) Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition

Key: HIVE-1200
URL: https://issues.apache.org/jira/browse/HIVE-1200
Project: Hadoop Hive
Issue Type: Bug
Affects Versions: 0.5.1, 0.6.0
Reporter: Zheng Shao
Assignee: Zheng Shao

The CombineHiveInputFormat does not work with multi-level of directories in a single table/partition, because it uses an exact match logic, instead of the relativize logic as in MapOperator

{code}
MapOperator.java:
if (!onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri())) {
{code}
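The difference between the two matching strategies can be illustrated with plain `java.net.URI`, whose `relativize` method returns the given URI unchanged when it is not located under the base. This is a minimal sketch and not the patch itself: the class `RelativizeMatch`, its method names, and the example paths are assumptions made for illustration.

```java
import java.net.URI;

// Hypothetical sketch comparing exact-match and relativize-based path matching.
class RelativizeMatch {
    // Exact match: true only when the file's parent directory is the configured directory.
    static boolean exactMatch(URI dir, URI fileParent) {
        return dir.equals(fileParent);
    }

    // Same condition as the quoted MapOperator line: if relativize changes the URI,
    // the file lies somewhere under dir, at any depth.
    static boolean isUnder(URI dir, URI file) {
        return !dir.relativize(file).equals(file);
    }

    public static void main(String[] args) {
        URI partition = URI.create("hdfs://nn/warehouse/t/ds=1");      // example partition directory
        URI nestedParent = URI.create("hdfs://nn/warehouse/t/ds=1/sub");
        URI nestedFile = URI.create("hdfs://nn/warehouse/t/ds=1/sub/part-0");
        URI otherFile = URI.create("hdfs://nn/warehouse/other/part-0");

        System.out.println(exactMatch(partition, nestedParent)); // nested directory is missed
        System.out.println(isUnder(partition, nestedFile));      // nested file is found
        System.out.println(isUnder(partition, otherFile));       // unrelated file is rejected
    }
}
```

With an exact comparison, a file one level below the partition directory is not associated with the partition, which matches the failure described in the issue.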
[jira] Updated: (HIVE-1032) Better Error Messages for Execution Errors
[ https://issues.apache.org/jira/browse/HIVE-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1032:

Resolution: Fixed
Fix Version/s: 0.6.0
Release Note: HIVE-1032. Better Error Messages for Execution Errors. (Paul Yang via zshao)
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Committed. Thanks Paul!
[jira] Commented: (HIVE-1193) ensure sorting properties for a table
[ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838628#action_12838628 ]

He Yongqiang commented on HIVE-1193:

Looks good. Will test.
[jira] Created: (HIVE-1201) Add a python command-line interface for Hive
Add a python command-line interface for Hive

Key: HIVE-1201
URL: https://issues.apache.org/jira/browse/HIVE-1201
Project: Hadoop Hive
Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Venky Iyer

Venky has a nice Python command-line interface for Hive. It uses the Thrift API to talk to the metastore, and the hadoop command line to submit jobs.
[jira] Commented: (HIVE-474) Support for distinct selection on two or more columns
[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838683#action_12838683 ]

Liu commented on HIVE-474:

We have implemented this feature using a union type, which Zheng mentioned as A2.

Support for distinct selection on two or more columns

Key: HIVE-474
URL: https://issues.apache.org/jira/browse/HIVE-474
Project: Hadoop Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Alexis Rondeau

The ability to apply distinct to several individual columns, for example:

    select count(distinct user), count(distinct session) from actions;

Currently this returns the following failure:

    FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns not Supported user
[jira] Updated: (HIVE-1193) ensure sorting properties for a table
[ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1193:

Resolution: Fixed
Release Note: HIVE-1193. ensure sorting properties for a table.
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Committed! Thanks Namit!
Re: [jira] Commented: (HIVE-474) Support for distinct selection on two or more columns
Hi Liu,

How did you implement support for distinct selection on two or more columns?

Regards,
Jian

2010/2/26 Liu (JIRA) j...@apache.org

> [ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838683#action_12838683]
>
> Liu commented on HIVE-474:
>
> We have implemented this feature using union type, as mentioned as A2 by Zheng.

-- Hadoop Forum: http://bbs.hadoopor.com
[jira] Updated: (HIVE-1200) Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
[ https://issues.apache.org/jira/browse/HIVE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1200:

Attachment: HIVE-1200.1.branch-0.5.patch
            HIVE-1200.1.patch
[jira] Updated: (HIVE-1200) Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
[ https://issues.apache.org/jira/browse/HIVE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1200:

Status: Patch Available (was: Open)
[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function
[ https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838718#action_12838718 ]

Zheng Shao commented on HIVE-259:

Hi Jerome, using ArrayList<Integer> won't cause unnecessary Object creation. We will just create a single ArrayList<Integer> and use it forever. Does that make sense?
[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function
[ https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838735#action_12838735 ]

Todd Lipcon commented on HIVE-259:

Doesn't the autoboxing of Integer types actually allocate objects? I think JVM only flyweights integers for very small ones (iirc only from -127 to 128)
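The boxing question can be checked with a small experiment. The sketch below is not taken from the HIVE-259 patches; `BoxingDemo` and its methods are hypothetical names. It relies on one fact from the Java Language Specification: boxing an int in the range -128 to 127 always yields a cached object, while values outside that range may allocate a new `Integer` on each boxing. It also shows that reusing a single `ArrayList<Integer>` saves the list allocation but still boxes every element added.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of Integer autoboxing and list reuse.
class BoxingDemo {
    // One list allocated once and reused across calls.
    private static final List<Integer> REUSED = new ArrayList<>();

    // True when boxing the same int twice yields the same object.
    // The reference comparison (==) is intentional here.
    static boolean boxesToSameObject(int value) {
        Integer first = value;  // autoboxing, equivalent to Integer.valueOf(value)
        Integer second = value;
        return first == second;
    }

    // Reusing the list avoids allocating a new list per call,
    // but each add(int) still autoboxes its argument.
    static int sumWithReusedList(int[] values) {
        REUSED.clear();
        for (int v : values) {
            REUSED.add(v);
        }
        int sum = 0;
        for (int v : REUSED) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("127 shared:    " + boxesToSameObject(127));
        System.out.println("-128 shared:   " + boxesToSameObject(-128));
        // Outside the guaranteed range the result depends on the JVM and its settings.
        System.out.println("100000 shared: " + boxesToSameObject(100000));
        System.out.println("sum: " + sumWithReusedList(new int[] {1, 2, 3}));
    }
}
```

If per-element allocation matters, a primitive `int[]` buffer avoids boxing entirely, at the cost of managing capacity by hand.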
[jira] Commented: (HIVE-1193) ensure sorting properties for a table
[ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838737#action_12838737 ]

Zheng Shao commented on HIVE-1193:

Can we have some more description on the JIRA? The patch contains 2 properties: enforceBucketing and enforceSorting. But I don't see it from the JIRA.

1. How do we make sure that the data is bucketed / sorted? By adding an additional map-reduce job?
2. What if the user already specified CLUSTER BY key in his query?
3. Do we disable merging of small files when we do this?
[jira] Created: (HIVE-1202) Unknown exception : null while join
Unknown exception : null while join

Key: HIVE-1202
URL: https://issues.apache.org/jira/browse/HIVE-1202
Project: Hadoop Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.4.1
Environment: hive-0.4.1 hadoop 0.19.1
Reporter: Mafish
Fix For: 0.4.1

Hive throws "Unknown exception : null" with the query:

    select * from ( select name from classes ) a join classes b where a.name b.number

After tracing the code, I found that this bug occurs under the following conditions:

1. It is a join operation.
2. At least one of the sources of the join is a physical table (the right side in the above case).
3. There is a where clause, and its condition(s) include columns from both sides of the join (a.name and b.number in this case).
[jira] Commented: (HIVE-1202) Unknown exception : null while join
[ https://issues.apache.org/jira/browse/HIVE-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838743#action_12838743 ]

Mafish commented on HIVE-1202:

The call stack is:

org.apache.hadoop.hive.ql.session.SessionState$LogHelper.printError(SessionState.java:279) - FAILED: Unknown exception : null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.parse.QBMetaData.getTableForAlias(QBMetaData.java:76)
    at org.apache.hadoop.hive.ql.parse.ASTPartitionPruner.getTableColumnDesc(ASTPartitionPruner.java:298)
    at org.apache.hadoop.hive.ql.parse.ASTPartitionPruner.genExprNodeDesc(ASTPartitionPruner.java:220)
    at org.apache.hadoop.hive.ql.parse.ASTPartitionPruner.genExprNodeDesc(ASTPartitionPruner.java:234)
    at org.apache.hadoop.hive.ql.parse.ASTPartitionPruner.genExprNodeDesc(ASTPartitionPruner.java:234)
    at org.apache.hadoop.hive.ql.parse.ASTPartitionPruner.addExpression(ASTPartitionPruner.java:397)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPartitionPruners(SemanticAnalyzer.java:624)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:4440)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:76)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:249)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:281)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

This bug occurs while Hive tries to prune table b: it takes the columns from the where clause, but these also include columns of table a. Hive therefore fails to find the column name in table b.