[jira] Updated: (HIVE-1835) Better auto-complete for Hive
[ https://issues.apache.org/jira/browse/HIVE-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1835:

    Attachment: HIVE-1835.3.patch

> Better auto-complete for Hive
>
> Key: HIVE-1835
> URL: https://issues.apache.org/jira/browse/HIVE-1835
> Project: Hive
> Issue Type: New Feature
> Components: CLI
> Reporter: Paul Butler
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-1835.2.patch, HIVE-1835.3.patch, HIVE-1835.patch
>
> - Add functions and keywords to auto-complete list
> - Make Hive auto-complete aware of Hive delimiters (e.g. whitespace, parentheses)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969878#action_12969878 ]

Paul Butler commented on HIVE-1648:

Added an SVN patch (5) that applies to latest.

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, HIVE-1648.5.patch, HIVE-1648.patch, hive-1648.svn.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: HIVE-1648.5.patch

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, HIVE-1648.5.patch, HIVE-1648.patch, hive-1648.svn.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1835) Better auto-complete for Hive
[ https://issues.apache.org/jira/browse/HIVE-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1835:

    Attachment: HIVE-1835.2.patch

Fixed missing file

> Better auto-complete for Hive
>
> Key: HIVE-1835
> URL: https://issues.apache.org/jira/browse/HIVE-1835
> Project: Hive
> Issue Type: New Feature
> Components: CLI
> Reporter: Paul Butler
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-1835.2.patch, HIVE-1835.patch
>
> - Add functions and keywords to auto-complete list
> - Make Hive auto-complete aware of Hive delimiters (e.g. whitespace, parentheses)
[jira] Updated: (HIVE-1835) Better auto-complete for Hive
[ https://issues.apache.org/jira/browse/HIVE-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1835:

    Status: Patch Available (was: Open)

> Better auto-complete for Hive
>
> Key: HIVE-1835
> URL: https://issues.apache.org/jira/browse/HIVE-1835
> Project: Hive
> Issue Type: New Feature
> Components: CLI
> Reporter: Paul Butler
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-1835.patch
>
> - Add functions and keywords to auto-complete list
> - Make Hive auto-complete aware of Hive delimiters (e.g. whitespace, parentheses)
[jira] Updated: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1763:

    Status: Patch Available (was: Open)

> drop table (or view) should issue warning if table doesn't exist
>
> Key: HIVE-1763
> URL: https://issues.apache.org/jira/browse/HIVE-1763
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: dan f
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-1763.patch
>
> drop table reports "OK" even if the table doesn't exist. Better to report
> something like MySQL's "Unknown table 'foo'" so that, e.g., unwanted tables
> (especially ones with names prone to typos) don't persist.
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Status: Patch Available (was: Open)

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, HIVE-1648.patch, hive-1648.svn.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1835) Better auto-complete for Hive
[ https://issues.apache.org/jira/browse/HIVE-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1835:

    Attachment: HIVE-1835.patch

> Better auto-complete for Hive
>
> Key: HIVE-1835
> URL: https://issues.apache.org/jira/browse/HIVE-1835
> Project: Hive
> Issue Type: New Feature
> Components: CLI
> Reporter: Paul Butler
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-1835.patch
>
> - Add functions and keywords to auto-complete list
> - Make Hive auto-complete aware of Hive delimiters (e.g. whitespace, parentheses)
[jira] Created: (HIVE-1835) Better auto-complete for Hive
Better auto-complete for Hive

Key: HIVE-1835
URL: https://issues.apache.org/jira/browse/HIVE-1835
Project: Hive
Issue Type: New Feature
Components: CLI
Reporter: Paul Butler
Assignee: Paul Butler
Priority: Minor

- Add functions and keywords to auto-complete list
- Make Hive auto-complete aware of Hive delimiters (e.g. whitespace, parentheses)
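The second bullet above can be illustrated with a small sketch. This is not the code from the attached patches; the class name, the delimiter set, and the candidate list are all assumptions made up for illustration. The idea shown is only that the word being completed should start after the last delimiter before the cursor, not after the last whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the HIVE-1835 patch.
class CompletionSketch {
    // Assumed delimiter set: whitespace plus some punctuation, including parentheses.
    static final String DELIMITERS = " \t\n()[],;";

    // Index where the word under the cursor starts, scanning back to the nearest delimiter.
    static int wordStart(String buffer, int cursor) {
        int i = cursor;
        while (i > 0 && DELIMITERS.indexOf(buffer.charAt(i - 1)) < 0) {
            i--;
        }
        return i;
    }

    // Case-insensitive prefix match of the current word against the candidates.
    static List<String> complete(String buffer, int cursor, List<String> candidates) {
        String prefix = buffer.substring(wordStart(buffer, cursor), cursor).toLowerCase();
        List<String> matches = new ArrayList<>();
        for (String candidate : candidates) {
            if (candidate.toLowerCase().startsWith(prefix)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up candidate list mixing keywords and function names.
        List<String> words = Arrays.asList("select", "substr", "sum", "from");
        String line = "select concat(su";
        // Because "(" is a delimiter, the word being completed is "su", not "concat(su".
        System.out.println(complete(line, line.length(), words));
    }
}
```

With whitespace as the only delimiter, the word under the cursor in this example would be `concat(su` and no candidate would match.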
[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966690#action_12966690 ]

Paul Butler commented on HIVE-1648:

The three-way join test was added to piggyback_join.q.out. No other new tests were added.
I also changed the tests to create new tables and use SHOW TABLE EXTENDED, which I modified to print numRows if available.

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1763:

    Assignee: Paul Butler

> drop table (or view) should issue warning if table doesn't exist
>
> Key: HIVE-1763
> URL: https://issues.apache.org/jira/browse/HIVE-1763
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: dan f
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-1763.patch
>
> drop table reports "OK" even if the table doesn't exist. Better to report
> something like MySQL's "Unknown table 'foo'" so that, e.g., unwanted tables
> (especially ones with names prone to typos) don't persist.
[jira] Commented: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966367#action_12966367 ]

Paul Butler commented on HIVE-1763:

I was concerned about breaking DROP TABLE's idempotence, so rather than throwing an exception I just print the error to the console. If someone can suggest a better approach I'll do it.

> drop table (or view) should issue warning if table doesn't exist
>
> Key: HIVE-1763
> URL: https://issues.apache.org/jira/browse/HIVE-1763
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: dan f
> Priority: Minor
> Attachments: HIVE-1763.patch
>
> drop table reports "OK" even if the table doesn't exist. Better to report
> something like MySQL's "Unknown table 'foo'" so that, e.g., unwanted tables
> (especially ones with names prone to typos) don't persist.
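The trade-off in the comment above (warn, but keep the drop idempotent) might look roughly like the following. This is a hypothetical in-memory stand-in, not the metastore code or the attached patch; the class and method names are invented, and the message text is borrowed from the MySQL example in the issue description.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory catalog used only to illustrate the behaviour.
class DropTableSketch {
    private final Set<String> tables = new HashSet<>();

    void createTable(String name) {
        tables.add(name);
    }

    // Drops the table if present. For a missing table it prints a warning
    // and returns false instead of throwing, so repeating the call is harmless.
    boolean dropTable(String name) {
        if (!tables.remove(name)) {
            System.err.println("Unknown table '" + name + "'");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        DropTableSketch catalog = new DropTableSketch();
        catalog.createTable("foo");
        System.out.println(catalog.dropTable("foo")); // table existed and was dropped
        System.out.println(catalog.dropTable("foo")); // warning on stderr, no exception
    }
}
```

Scripts that drop a table defensively before recreating it keep working under this approach, while an interactive user still sees that the name did not match anything.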
[jira] Updated: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1763:

    Attachment: HIVE-1763.patch

> drop table (or view) should issue warning if table doesn't exist
>
> Key: HIVE-1763
> URL: https://issues.apache.org/jira/browse/HIVE-1763
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Reporter: dan f
> Priority: Minor
> Attachments: HIVE-1763.patch
>
> drop table reports "OK" even if the table doesn't exist. Better to report
> something like MySQL's "Unknown table 'foo'" so that, e.g., unwanted tables
> (especially ones with names prone to typos) don't persist.
[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966354#action_12966354 ]

Paul Butler commented on HIVE-1648:

Changes made. Note that subqueries are not piggybacked, but tests are there to make sure they still run when hive.stats.autogather=true.

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
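The piggy-backing idea in the quoted description, and the reason a LIMIT rules it out, can be sketched independently of Hive. Everything below is hypothetical: it is not TableScanOperator or any Hive API, only a loop that counts rows while a query consumes them and declines to report a count when the scan stopped early.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of counting rows as a side effect of a scan.
class PiggybackStatsSketch {
    // Scans the rows, stopping after `limit` rows when limit >= 0.
    // Returns the row count if the whole input was consumed, or -1 when the
    // scan stopped early and the count would understate the table size.
    static long scanAndCount(List<String> table, int limit) {
        Iterator<String> rows = table.iterator();
        long numRows = 0;
        while (rows.hasNext() && (limit < 0 || numRows < limit)) {
            rows.next(); // the query itself would process the row here
            numRows++;   // the statistic is gathered on the same pass
        }
        return rows.hasNext() ? -1 : numRows;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("a", "b", "c");
        System.out.println(scanAndCount(table, -1)); // full scan: count is usable
        System.out.println(scanAndCount(table, 2));  // stopped early: no stats
    }
}
```

The design point is that the count costs nothing extra on a full scan, which is what makes a separate ANALYZE pass avoidable in that case.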
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: HIVE-1648.4.patch

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935551#action_12935551 ]

Paul Butler commented on HIVE-1648:

Namit, it looks like show table extended like ``; doesn't print the number of rows. Unless there's a way to make it do that, I'll have to stick with desc extended.
I sent you an email for clarification on the ConditionalTasks also.

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1809) Hive comparison operators are broken for NaN values
[ https://issues.apache.org/jira/browse/HIVE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1809:

    Attachment: HIVE-1809.patch

> Hive comparison operators are broken for NaN values
>
> Key: HIVE-1809
> URL: https://issues.apache.org/jira/browse/HIVE-1809
> Project: Hive
> Issue Type: Bug
> Reporter: Paul Butler
> Assignee: Paul Butler
> Attachments: HIVE-1809.patch
>
> Comparisons between NaN values and doubles do not work as expected:
>
> hive> select 'NaN' = 4.3 from data_one limit 1;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Execution log at: /tmp/pbutler/pbutler_20101123145656_d23f9b77-8907-4ed3-aef9-8b99a1cc3138.log
> Job running in-process (local Hadoop)
> 2010-11-23 14:56:40,488 null map = 100%, reduce = 0%
> Ended Job = job_local_0001
> OK
> true
> Time taken: 9.47 seconds
>
> hive> select 4 <> 'NaN' from data_one limit 1;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Execution log at: /tmp/pbutler/pbutler_20101123145858_0d243ac2-f745-4e25-9a38-509bef3bb370.log
> Job running in-process (local Hadoop)
> 2010-11-23 14:58:45,689 null map = 100%, reduce = 0%
> Ended Job = job_local_0001
> OK
> false
> Time taken: 3.938 seconds
[jira] Updated: (HIVE-1809) Hive comparison operators are broken for NaN values
[ https://issues.apache.org/jira/browse/HIVE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1809:

    Status: Patch Available (was: Open)

> Hive comparison operators are broken for NaN values
>
> Key: HIVE-1809
> URL: https://issues.apache.org/jira/browse/HIVE-1809
> Project: Hive
> Issue Type: Bug
> Reporter: Paul Butler
> Assignee: Paul Butler
> Attachments: HIVE-1809.patch
>
> Comparisons between NaN values and doubles do not work as expected:
>
> hive> select 'NaN' = 4.3 from data_one limit 1;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Execution log at: /tmp/pbutler/pbutler_20101123145656_d23f9b77-8907-4ed3-aef9-8b99a1cc3138.log
> Job running in-process (local Hadoop)
> 2010-11-23 14:56:40,488 null map = 100%, reduce = 0%
> Ended Job = job_local_0001
> OK
> true
> Time taken: 9.47 seconds
>
> hive> select 4 <> 'NaN' from data_one limit 1;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Execution log at: /tmp/pbutler/pbutler_20101123145858_0d243ac2-f745-4e25-9a38-509bef3bb370.log
> Job running in-process (local Hadoop)
> 2010-11-23 14:58:45,689 null map = 100%, reduce = 0%
> Ended Job = job_local_0001
> OK
> false
> Time taken: 3.938 seconds
[jira] Created: (HIVE-1809) Hive comparison operators are broken for NaN values
Hive comparison operators are broken for NaN values

Key: HIVE-1809
URL: https://issues.apache.org/jira/browse/HIVE-1809
Project: Hive
Issue Type: Bug
Reporter: Paul Butler
Assignee: Paul Butler

Comparisons between NaN values and doubles do not work as expected:

hive> select 'NaN' = 4.3 from data_one limit 1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/pbutler/pbutler_20101123145656_d23f9b77-8907-4ed3-aef9-8b99a1cc3138.log
Job running in-process (local Hadoop)
2010-11-23 14:56:40,488 null map = 100%, reduce = 0%
Ended Job = job_local_0001
OK
true
Time taken: 9.47 seconds

hive> select 4 <> 'NaN' from data_one limit 1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/pbutler/pbutler_20101123145858_0d243ac2-f745-4e25-9a38-509bef3bb370.log
Job running in-process (local Hadoop)
2010-11-23 14:58:45,689 null map = 100%, reduce = 0%
Ended Job = job_local_0001
OK
false
Time taken: 3.938 seconds
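For reference, one reading of "as expected" is the IEEE 754 behaviour that Java's primitive double comparisons follow, where NaN is equal to nothing (not even itself) and therefore unequal to everything. The sketch below only demonstrates that Java behaviour; it assumes the string 'NaN' is converted to a double NaN before the comparison, and it says nothing about where the fault lies in Hive. The class and method names are invented.

```java
// Hypothetical sketch showing Java's double comparison semantics for NaN.
class NanComparisonSketch {
    // Equality and inequality exactly as the primitive operators define them.
    static boolean equalsOp(double a, double b) {
        return a == b;
    }

    static boolean notEqualsOp(double a, double b) {
        return a != b;
    }

    public static void main(String[] args) {
        double nan = Double.parseDouble("NaN");
        System.out.println(equalsOp(nan, 4.3));  // false
        System.out.println(notEqualsOp(4, nan)); // true
        System.out.println(equalsOp(nan, nan));  // false
    }
}
```

Under these semantics the two queries in the report would return false and true, the opposite of the output shown.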
[jira] Updated: (HIVE-138) Provide option to export a HEADER
[ https://issues.apache.org/jira/browse/HIVE-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-138:

    Attachment: HIVE-138.patch

> Provide option to export a HEADER
>
> Key: HIVE-138
> URL: https://issues.apache.org/jira/browse/HIVE-138
> Project: Hive
> Issue Type: Improvement
> Components: Clients, Query Processor
> Reporter: Adam Kramer
> Priority: Minor
> Attachments: HIVE-138.patch
>
> When writing data to directories or files for later analysis, or when
> exploring data in the hive CLI with raw SELECT statements, it'd be great if
> we could get a "header" or something so we know which columns our output
> comes from. Any chance this is easy to add? Just print the column names (or
> formula used to generate them) in the first row?
>
> SELECT foo.* WITH HEADER FROM some_table foo limit 3;
> col1    col2    col3
> 1       9       6
> 7       5       0
> 7       5       3
>
> SELECT f.col1-f.col2, col3 WITH HEADER FROM some_table f limit 3;
> f.col1-f.col2   col3
> -8      6
> 2       0
> 2       3
> ...etc
[jira] Updated: (HIVE-138) Provide option to export a HEADER
[ https://issues.apache.org/jira/browse/HIVE-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-138:

    Assignee: Paul Butler
    Status: Patch Available (was: Open)

> Provide option to export a HEADER
>
> Key: HIVE-138
> URL: https://issues.apache.org/jira/browse/HIVE-138
> Project: Hive
> Issue Type: Improvement
> Components: Clients, Query Processor
> Reporter: Adam Kramer
> Assignee: Paul Butler
> Priority: Minor
> Attachments: HIVE-138.patch
>
> When writing data to directories or files for later analysis, or when
> exploring data in the hive CLI with raw SELECT statements, it'd be great if
> we could get a "header" or something so we know which columns our output
> comes from. Any chance this is easy to add? Just print the column names (or
> formula used to generate them) in the first row?
>
> SELECT foo.* WITH HEADER FROM some_table foo limit 3;
> col1    col2    col3
> 1       9       6
> 7       5       0
> 7       5       3
>
> SELECT f.col1-f.col2, col3 WITH HEADER FROM some_table f limit 3;
> f.col1-f.col2   col3
> -8      6
> 2       0
> 2       3
> ...etc
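The request above amounts to emitting the column names as an optional first row of tab-separated output. A minimal sketch of that output shape follows; it is not the attached patch, and the class, the method, and the boolean switch are assumptions standing in for whatever option the patch actually adds.

```java
// Hypothetical sketch of tab-separated output with an optional header row.
class HeaderSketch {
    // Renders rows as tab-separated lines, preceded by the column names when requested.
    static String render(String[] columns, int[][] rows, boolean withHeader) {
        StringBuilder out = new StringBuilder();
        if (withHeader) {
            out.append(String.join("\t", columns)).append('\n');
        }
        for (int[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    out.append('\t');
                }
                out.append(row[i]);
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] columns = {"col1", "col2", "col3"};
        int[][] rows = {{1, 9, 6}, {7, 5, 0}, {7, 5, 3}};
        System.out.print(render(columns, rows, true));
    }
}
```

Keeping the header behind a switch leaves the default output unchanged for scripts that already parse it.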
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: HIVE-1648.3.patch

Added unit tests and fixed some issues with partitions.

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930326#action_12930326 ]

Paul Butler commented on HIVE-1648:

I get a bunch of tests failing when I build the latest trunk, even without applying my patch. I'm trying to figure out what's wrong with those first.

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928836#action_12928836 ]

Paul Butler commented on HIVE-1648:

Attached a new patch which applies on top of HIVE-1750. Also fixed the limits as suggested. Working on unit tests now.

-- Paul

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: HIVE-1648.2.patch

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: hadoop-6974.2.patch

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: (was: hadoop-6974.2.patch)

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Assignee: Paul Butler
> Attachments: HIVE-1648.2.patch, HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition
[ https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1648:

    Attachment: HIVE-1648.patch

> Automatically gathering stats when reading a table/partition
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ning Zhang
> Attachments: HIVE-1648.patch
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to
> gather stats. This requires an additional scan of the data. Stats gathering
> can be piggy-backed on TableScanOperator whenever a table/partition is
> scanned (given no LIMIT operator).
[jira] Commented: (HIVE-1748) Statistics broken for tables with size in excess of Integer.MAX_VALUE
[ https://issues.apache.org/jira/browse/HIVE-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924819#action_12924819 ]

Paul Butler commented on HIVE-1748:

@Namit added. It took a while as I had to check out Hive from Apache's repo and apply my patch to that; is there a better way to create patches against the Apache repos from the internal one?

> Statistics broken for tables with size in excess of Integer.MAX_VALUE
>
> Key: HIVE-1748
> URL: https://issues.apache.org/jira/browse/HIVE-1748
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Paul Butler
> Attachments: HIVE-1748.patch
>
> ANALYZE TABLE x COMPUTE STATISTICS would fail to update the table size if it
> exceeded Integer.MAX_VALUE because it used parseInt instead of parseLong.
[jira] Updated: (HIVE-1748) Statistics broken for tables with size in excess of Integer.MAX_VALUE
[ https://issues.apache.org/jira/browse/HIVE-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Butler updated HIVE-1748:

    Attachment: HIVE-1748.patch

> Statistics broken for tables with size in excess of Integer.MAX_VALUE
>
> Key: HIVE-1748
> URL: https://issues.apache.org/jira/browse/HIVE-1748
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Paul Butler
> Attachments: HIVE-1748.patch
>
> ANALYZE TABLE x COMPUTE STATISTICS would fail to update the table size if it
> exceeded Integer.MAX_VALUE because it used parseInt instead of parseLong.
[jira] Created: (HIVE-1748) Statistics broken for tables with size in excess of Integer.MAX_VALUE
Statistics broken for tables with size in excess of Integer.MAX_VALUE

Key: HIVE-1748
URL: https://issues.apache.org/jira/browse/HIVE-1748
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Paul Butler

ANALYZE TABLE x COMPUTE STATISTICS would fail to update the table size if it exceeded Integer.MAX_VALUE because it used parseInt instead of parseLong.
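The failure mode described above is easy to reproduce in isolation. The sketch below is not Hive's statistics code; the class and helper names are made up, and the sample value is simply a size above Integer.MAX_VALUE (2147483647). It shows that `Integer.parseInt` rejects such a string with a `NumberFormatException`, while `Long.parseLong` accepts it.

```java
// Hypothetical sketch of the parseInt versus parseLong difference.
class StatsParseSketch {
    // Parses a size statistic stored as a decimal string, using a 64-bit type.
    static long parseTableSize(String raw) {
        return Long.parseLong(raw);
    }

    public static void main(String[] args) {
        String size = "3000000000"; // larger than Integer.MAX_VALUE
        try {
            Integer.parseInt(size);
        } catch (NumberFormatException e) {
            System.out.println("parseInt rejected " + size);
        }
        System.out.println("parseLong accepted " + parseTableSize(size));
    }
}
```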