[jira] Commented: (HIVE-417) Implement Indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886634#action_12886634 ]

Jeff Hammerbacher commented on HIVE-417:

Hey,

Any chance you guys could post a more detailed design document for "full-fledged index support"? I'm quite curious to read up on it.

Thanks,
Jeff

> Implement Indexing in Hive
> --------------------------
>
> Key: HIVE-417
> URL: https://issues.apache.org/jira/browse/HIVE-417
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Metastore, Query Processor
> Affects Versions: 0.3.0, 0.3.1, 0.4.0, 0.6.0
> Reporter: Prasad Chakka
> Assignee: He Yongqiang
> Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch, hive-indexing.3.patch, hive-indexing.5.thrift.patch, indexing_with_ql_rewrites_trunk_953221.patch
>
> Implement indexing on Hive so that lookup and range queries are efficient.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1305) add progress in join and groupby
[ https://issues.apache.org/jira/browse/HIVE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-1305:
------------------------------

    Attachment: hive.1305.3.patch

> add progress in join and groupby
> --------------------------------
>
> Key: HIVE-1305
> URL: https://issues.apache.org/jira/browse/HIVE-1305
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Siying Dong
> Attachments: hive.1305.1.patch, hive.1305.2.patch, hive.1305.3.patch
>
> The operators join and groupby can consume a lot of rows before producing any output.
> All operators which do not have an output for every input should report progress periodically.
> Currently, it is only being done for ScriptOperator and FilterOperator.
[jira] Updated: (HIVE-1455) lateral view does not work with column pruning
[ https://issues.apache.org/jira/browse/HIVE-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1455:
-------------------------------

    Attachment: hive.1455.1.patch

running tests now.

> lateral view does not work with column pruning
> -----------------------------------------------
>
> Key: HIVE-1455
> URL: https://issues.apache.org/jira/browse/HIVE-1455
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Fix For: 0.6.0, 0.7.0
>
> Attachments: hive.1455.1.patch
[jira] Created: (HIVE-1455) lateral view does not work with column pruning
lateral view does not work with column pruning
-----------------------------------------------

    Key: HIVE-1455
    URL: https://issues.apache.org/jira/browse/HIVE-1455
    Project: Hadoop Hive
    Issue Type: Bug
    Reporter: He Yongqiang
    Assignee: He Yongqiang
    Fix For: 0.6.0, 0.7.0
[jira] Commented: (HIVE-1305) add progress in join and groupby
[ https://issues.apache.org/jira/browse/HIVE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886608#action_12886608 ]

He Yongqiang commented on HIVE-1305:

Overall looks good to me. Minor comments:

1. In GroupByOp's flush, should countAfterReport = 0; be moved to the beginning of the function?
2. In AbstractMapjoin, heartbeatInterval = HiveConf.getIntVar(hconf, HiveConf.ConfVars.HIVESENDHEARTBEAT); is not needed, because the parent common join op already has that.
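The review points above concern when the per-operator heartbeat counter is reset. The pattern being discussed can be sketched roughly as follows (the class and method names here are illustrative, not Hive's actual Java operators):

```python
class ProgressingOperator:
    """Sketch of an operator that reports progress every
    `heartbeat_interval` rows, so the task tracker does not kill a
    long-running job that consumes many rows before emitting output."""

    def __init__(self, reporter, heartbeat_interval=1000):
        self.reporter = reporter
        self.heartbeat_interval = heartbeat_interval
        self.count_after_report = 0

    def process(self, row):
        self.count_after_report += 1
        if self.count_after_report >= self.heartbeat_interval:
            self.reporter.progress()      # heartbeat without emitting a row
            self.count_after_report = 0   # reset only after reporting

    def flush(self):
        # Per the review comment: reset the counter at the start of flush,
        # since flush emits output and therefore counts as progress.
        self.count_after_report = 0
```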
[jira] Updated: (HIVE-1096) Hive Variables
[ https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-1096:
----------------------------------

    Attachment: hive-1096-12.patch.txt

Changed interpolate to substitute. Added the substitution logic to file, dfs, set, and query processor.

> Hive Variables
> --------------
>
> Key: HIVE-1096
> URL: https://issues.apache.org/jira/browse/HIVE-1096
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
> Fix For: 0.6.0, 0.7.0
>
> Attachments: 1096-9.diff, hive-1096-10-patch.txt, hive-1096-11-patch.txt, hive-1096-12.patch.txt, hive-1096-2.diff, hive-1096-7.diff, hive-1096-8.diff, hive-1096.diff
>
> From mailing list:
> --Amazon Elastic MapReduce version of Hive seems to have a nice feature called "Variables." Basically you can define a variable via command-line while invoking hive with -d DT=2009-12-09 and then refer to the variable via ${DT} within the hive queries. This could be extremely useful. I can't seem to find this feature even on trunk. Is this feature currently anywhere in the roadmap?--
>
> This could be implemented in many places.
> A simple place to put this is in Driver.compile or Driver.run: we can do string substitutions at that level, and further downstream need not be affected.
> There could be some benefits to doing this further downstream (parser, plan), but based on the simple needs we may not need to overthink this.
> I will get started on implementing in compile unless someone wants to discuss this more.
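The substitution described above (replacing ${DT} in the query string before compilation) can be sketched like this; the function name and error handling are illustrative, not the patch's actual API:

```python
import re

def substitute(query, variables):
    """Replace ${NAME} references in a query string with values
    defined on the command line (e.g. -d DT=2009-12-09)."""
    def repl(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("undefined variable: " + name)
        return variables[name]
    return re.sub(r"\$\{([^}]+)\}", repl, query)
```

For example, substituting {"DT": "2009-12-09"} into "SELECT * FROM logs WHERE ds = '${DT}'" yields "SELECT * FROM logs WHERE ds = '2009-12-09'". Doing this at the Driver level keeps the parser and plan untouched, as the description suggests.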
[jira] Updated: (HIVE-1096) Hive Variables
[ https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-1096:
----------------------------------

    Status: Patch Available  (was: Open)
[jira] Commented: (HIVE-1305) add progress in join and groupby
[ https://issues.apache.org/jira/browse/HIVE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886593#action_12886593 ]

He Yongqiang commented on HIVE-1305:

will take a look.
[jira] Commented: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886592#action_12886592 ]

He Yongqiang commented on HIVE-1454:

no, i only committed it to trunk. Do you need me to commit this to 0.6 as well?

> insert overwrite and CTAS fail in hive local mode
> --------------------------------------------------
>
> Key: HIVE-1454
> URL: https://issues.apache.org/jira/browse/HIVE-1454
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Joydeep Sen Sarma
> Assignee: Joydeep Sen Sarma
> Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: hive-1454.1.patch
>
> this is because of the changes in HIVE-543. We switched to using local storage for intermediate data for local mode queries. However there are code paths that are incorrectly allocating intermediate storage where they should be allocating external file system storage (based on table/directory uri). This is causing regressions in running queries in local mode.
[jira] Commented: (HIVE-417) Implement Indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886587#action_12886587 ]

John Sichi commented on HIVE-417:

Based on discussion with Yongqiang, we've decided to go for "Full-fledged index support".
[jira] Updated: (HIVE-1305) add progress in join and groupby
[ https://issues.apache.org/jira/browse/HIVE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-1305:
------------------------------

    Attachment: hive.1305.2.patch
[jira] Updated: (HIVE-1305) add progress in join and groupby
[ https://issues.apache.org/jira/browse/HIVE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-1305:
------------------------------

    Attachment: hive.1305.1.patch
[jira] Updated: (HIVE-1428) ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
[ https://issues.apache.org/jira/browse/HIVE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-1428:
----------------------------

    Status: Open  (was: Patch Available)

> ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
> ---------------------------------------------------------------
>
> Key: HIVE-1428
> URL: https://issues.apache.org/jira/browse/HIVE-1428
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.6.0, 0.7.0
> Reporter: Paul Yang
> Assignee: Pradeep Kamath
> Attachments: HIVE-1428-2.patch, HIVE-1428.patch, TestHiveMetaStoreRemote.java
>
> If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD PARTITION commands will fail with an error similar to the following:
>
> [prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add partition(datestamp = '20091101', srcid = '10',action) location '/user/pradeepk/mytable/20091101/10';"
> 10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
> FAILED: Error in metadata: org.apache.thrift.TApplicationException: get_partition failed: unknown result
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> [prade...@chargesize:~/dev/howl]
>
> This is due to a check that tries to retrieve the partition to see if it exists. If it does not, an attempt is made to pass a null value from the metastore. Since thrift does not support null return values, an exception is thrown.
[jira] Commented: (HIVE-1428) ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
[ https://issues.apache.org/jira/browse/HIVE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886559#action_12886559 ]

Paul Yang commented on HIVE-1428:

Fix looks pretty good - two things though:

1. The field identifiers for NoSuchObjectException and MetaException in hive_metastore.thrift should be swapped - this is because thrift uses those identifiers for versioning and we want to be consistent.
2. TestHiveMetaStoreRemote and TestHiveMetaStore share quite a bit of code. Can you extract this out to a separate class? Or maybe roll the remote metastore client into TestHiveMetaStore.
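Since Thrift cannot return a null result, the usual fix for the failure described in this issue is for the server to signal absence with an exception and for the caller to translate that exception back into "not found". A minimal sketch of that pattern (the client and exception names here are illustrative, not the actual generated Thrift API):

```python
class NoSuchObjectException(Exception):
    """Stands in for the metastore's 'object does not exist' exception,
    thrown instead of returning null over Thrift."""

def get_partition_or_none(client, db, table, part_values):
    # Thrift has no null results: absence must travel as an exception,
    # which the caller converts back into None.
    try:
        return client.get_partition(db, table, part_values)
    except NoSuchObjectException:
        return None
```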
[jira] Commented: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886558#action_12886558 ]

Joydeep Sen Sarma commented on HIVE-1454:

Yongqiang - did u commit this to 0.6 as well?
[jira] Commented: (HIVE-1245) allow access to values stored as non-strings in HBase
[ https://issues.apache.org/jira/browse/HIVE-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886541#action_12886541 ]

John Sichi commented on HIVE-1245:

For atomic types, we could extend the column-level mapping directive to allow for three options:

* string
* binary
* use table-level default

So where we currently have a:b, we would support a:b:string and a:b:binary. The table-level default would be set in a separate serde property hbase.storedtype.atomic, with a default value of string for backwards compatibility.

Then something similar for compound types, but with json and delimited as options?

I haven't thought about all the combinations, and what to do with column families.

> allow access to values stored as non-strings in HBase
> ------------------------------------------------------
>
> Key: HIVE-1245
> URL: https://issues.apache.org/jira/browse/HIVE-1245
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: HBase Handler
> Affects Versions: 0.6.0
> Reporter: John Sichi
> Assignee: John Sichi
>
> See test case in http://mail-archives.apache.org/mod_mbox/hadoop-hive-user/201003.mbox/browser
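The proposed three-way directive could be parsed roughly as follows. This is only a sketch of the proposal in the comment above; the property name hbase.storedtype.atomic comes from that comment, and everything else here is illustrative:

```python
def parse_column_mapping(spec, table_default="string"):
    """Parse a proposed HBase column mapping like 'a:b', 'a:b:string',
    or 'a:b:binary' into (family, qualifier, storage_type).

    A two-part spec falls back to the table-level default, which would
    come from the serde property hbase.storedtype.atomic."""
    parts = spec.split(":")
    if len(parts) == 2:
        family, qualifier = parts
        storage = table_default        # use table-level default
    elif len(parts) == 3:
        family, qualifier, storage = parts
        if storage not in ("string", "binary"):
            raise ValueError("unknown storage type: " + storage)
    else:
        raise ValueError("bad column mapping: " + spec)
    return family, qualifier, storage
```

Keeping the two-part form valid preserves backwards compatibility: existing a:b mappings behave exactly as before because the default is string.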
[jira] Commented: (HIVE-1229) replace dependencies on HBase deprecated API
[ https://issues.apache.org/jira/browse/HIVE-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886523#action_12886523 ]

John Sichi commented on HIVE-1229:

Instead of cachedColumnNameBytes, is it possible to instead keep a separate array of the names in byte form? If we're doing all access positionally, that would allow us to skip the hash map lookups.

> replace dependencies on HBase deprecated API
> --------------------------------------------
>
> Key: HIVE-1229
> URL: https://issues.apache.org/jira/browse/HIVE-1229
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: HBase Handler
> Affects Versions: 0.6.0
> Reporter: John Sichi
> Assignee: Basab Maulik
> Fix For: 0.7.0
>
> Attachments: HIVE-1229.1.patch, HIVE-1229.2.patch, HIVE-1229.3.patch
>
> Some of these dependencies are on the old Hadoop mapred packages; others are HBase-specific. The former have to wait until the rest of Hive moves over to the new Hadoop mapreduce package, but the HBase-specific ones don't have to wait.
[jira] Updated: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1454:
-------------------------------

        Status: Resolved  (was: Patch Available)
    Fix Version/s: 0.7.0
                   (was: 0.6.0)
       Resolution: Fixed

I just committed. Thanks Joydeep!
[jira] Updated: (HIVE-1229) replace dependencies on HBase deprecated API
[ https://issues.apache.org/jira/browse/HIVE-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1229:
-----------------------------

    Status: Patch Available  (was: Open)
[jira] Assigned: (HIVE-1428) ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
[ https://issues.apache.org/jira/browse/HIVE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi reassigned HIVE-1428:
--------------------------------

    Assignee: Pradeep Kamath

Setting assignee to Pradeep. (Pradeep, I just added you as a contributor on Hive, so you should be able to assign issues to yourself going forward.)
[jira] Commented: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886442#action_12886442 ]

Joydeep Sen Sarma commented on HIVE-1454:

it's difficult to add queries that can test this. the reason is that we need two valid and different Hadoop FileSystems to replicate this problem. this is not trivial.

medium term - as part of HIVE-1408 - i will add a new dummy Hadoop filesystem (that will just use local storage underneath). Then we will be able to systematically test this using our regression queries.
[jira] Commented: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886445#action_12886445 ]

He Yongqiang commented on HIVE-1454:

+1. will commit after tests pass.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886428#action_12886428 ]

John Sichi commented on HIVE-287:

Regarding DISTINCT: I agree with Arvind; this information should be provided to the UDAF so that it can reject invocations that don't make sense. Once this validation is passed, the distinct elimination is still implemented generically inside of Hive (upstream of the UDAF).

Regarding F(*): let's discriminate three cases.

COUNT(*): this really means COUNT(), not COUNT(x,y,z). This is a very important distinction to make from an optimizer perspective, because we want to be able to push down projection to avoid I/O and other processing for columns whose values we will never look at.

SUM(*) and similar ones: these we should disallow.

MY_UDAF(*), or MY_UDAF(t.*): this is similar to Pradeep's case that came up recently on the mailing list, and it needs to expand to MY_UDAF(x,y,z), not MY_UDAF(). I think the patch is currently doing MY_UDAF(), which isn't what he wants.

My recommendation is that we commit Arvind's patch as is, then create a followup JIRA issue to do what Pradeep is looking for (the expansion of * in the semantic analyzer) for both UDF and UDAF, but with a special case for COUNT. UDAF authors will be able to decide whether or not to reject the star syntax, since in the common case of a UDAF expecting a limited number of parameters, the star won't make sense.

> count distinct on multiple columns does not work
> -------------------------------------------------
>
> Key: HIVE-287
> URL: https://issues.apache.org/jira/browse/HIVE-287
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Arvind Prabhakar
>
> Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl
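The three F(*) cases discussed in the comment could be handled during semantic analysis roughly as sketched below. This is a hypothetical illustration in Python (Hive's actual analyzer is Java), using SUM/AVG/MIN/MAX as stand-ins for the "SUM(*) and similar" group:

```python
def expand_star(func_name, args, table_columns):
    """Expand '*' in an aggregate call per the three cases:
    COUNT(*) means COUNT() with no column references; disallowed
    aggregates reject the star; any other UDAF gets '*' expanded
    to the full column list."""
    if args != ["*"]:
        return func_name, args            # nothing to expand
    name = func_name.upper()
    if name == "COUNT":
        return func_name, []              # COUNT(*) == COUNT(): read no columns
    if name in ("SUM", "AVG", "MIN", "MAX"):
        raise ValueError(name + "(*) is not allowed")
    return func_name, list(table_columns) # MY_UDAF(*) -> MY_UDAF(x, y, z)
```

Treating COUNT(*) as COUNT() is what lets the optimizer push down projection: no column list means no column needs to be read at all.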
[jira] Updated: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-1454:
------------------------------------

    Status: Patch Available  (was: Open)
[jira] Updated: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
[ https://issues.apache.org/jira/browse/HIVE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-1454:
------------------------------------

    Attachment: hive-1454.1.patch
[jira] Created: (HIVE-1454) insert overwrite and CTAS fail in hive local mode
insert overwrite and CTAS fail in hive local mode - Key: HIVE-1454 URL: https://issues.apache.org/jira/browse/HIVE-1454 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Reporter: Joydeep Sen Sarma Assignee: Joydeep Sen Sarma Priority: Blocker Fix For: 0.6.0 this is because of the changes in HIVE-543. We switched to using local storage for intermediate data for local mode queries. However there are code paths that are incorrectly allocating intermediate storage where they should be allocating external file system storage (based on table/directory uri). This is causing regressions in running queries in local mode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1408) add option to let hive automatically run in local mode based on tunable heuristics
[ https://issues.apache.org/jira/browse/HIVE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joydeep Sen Sarma updated HIVE-1408: Status: Patch Available (was: Open) > add option to let hive automatically run in local mode based on tunable > heuristics > -- > > Key: HIVE-1408 > URL: https://issues.apache.org/jira/browse/HIVE-1408 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Joydeep Sen Sarma >Assignee: Joydeep Sen Sarma > Attachments: 1408.1.patch > > > as a followup to HIVE-543 - we should have a simple option (enabled by > default) to let hive run in local mode if possible. > two levels of options are desirable: > 1. hive.exec.mode.local.auto=true/false // control whether local mode is > automatically chosen > 2. Options to control different heuristics, some naive examples: > hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode > if data > 1G > hive.exec.mode.local.auto.script.enable=true/false // choose if local > mode is enabled for queries with user scripts > this can be implemented as a pre/post execution hook. It makes sense to > provide this as a standard hook in the hive codebase since it's likely to > improve response time for many users (especially for test queries). > the initial proposal is to choose this at a query level and not at per > hive-task (i.e. hadoop job) level. per job-level requires more changes to > compilation (to not pre-commit to hdfs or local scratch directories at > compile time). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-417) Implement Indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886380#action_12886380 ] He Yongqiang commented on HIVE-417: --- I think the SUMMARY index's mapper code is commented out in the uploaded patch. > Implement Indexing in Hive > -- > > Key: HIVE-417 > URL: https://issues.apache.org/jira/browse/HIVE-417 > Project: Hadoop Hive > Issue Type: New Feature > Components: Metastore, Query Processor >Affects Versions: 0.3.0, 0.3.1, 0.4.0, 0.6.0 >Reporter: Prasad Chakka >Assignee: He Yongqiang > Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch, > hive-indexing.3.patch, hive-indexing.5.thrift.patch, > indexing_with_ql_rewrites_trunk_953221.patch > > > Implement indexing on Hive so that lookup and range queries are efficient. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1428) ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
[ https://issues.apache.org/jira/browse/HIVE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated HIVE-1428: - Attachment: HIVE-1428-2.patch New patch with unit tests included. Thanks for the suggestion of using threads, Paul; I have used that in the unit test. I have made a change in HiveConf to enable the unit test. The remaining changes in the patch are as in the first version - to throw NoSuchObjectException in getPartition() when no partition exists. This mainly changes the generated thrift code (to add throws in the method signature) and other code which interacts with it, to catch the exception and set the partition object to null. > ALTER TABLE ADD PARTITION fails with a remote Thrift metastore > -- > > Key: HIVE-1428 > URL: https://issues.apache.org/jira/browse/HIVE-1428 > Project: Hadoop Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.6.0, 0.7.0 >Reporter: Paul Yang > Attachments: HIVE-1428-2.patch, HIVE-1428.patch, > TestHiveMetaStoreRemote.java > > > If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD > PARTITION commands will fail with an error similar to the following: > [prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE > mytable add partition(datestamp = '20091101', srcid = '10',action) location > '/user/pradeepk/mytable/20091101/10';" > 10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found > in the classpath. Usage of hadoop-site.xml is deprecated. 
Instead use > core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of > core-default.xml, mapred-default.xml and hdfs-default.xml respectively > Hive history > file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt > FAILED: Error in metadata: org.apache.thrift.TApplicationException: > get_partition failed: unknown result > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > [prade...@chargesize:~/dev/howl] > This is due to a check that tries to retrieve the partition to see if it > exists. If it does not, an attempt is made to pass a null value from the > metastore. Since thrift does not support null return values, an exception is > thrown. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1428) ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
[ https://issues.apache.org/jira/browse/HIVE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated HIVE-1428: - Status: Patch Available (was: Open) HIVE-1428-2.patch is ready for review. > ALTER TABLE ADD PARTITION fails with a remote Thrift metastore > -- > > Key: HIVE-1428 > URL: https://issues.apache.org/jira/browse/HIVE-1428 > Project: Hadoop Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.6.0, 0.7.0 >Reporter: Paul Yang > Attachments: HIVE-1428-2.patch, HIVE-1428.patch, > TestHiveMetaStoreRemote.java > > > If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD > PARTITION commands will fail with an error similar to the following: > [prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE > mytable add partition(datestamp = '20091101', srcid = '10',action) location > '/user/pradeepk/mytable/20091101/10';" > 10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found > in the classpath. Usage of hadoop-site.xml is deprecated. Instead use > core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of > core-default.xml, mapred-default.xml and hdfs-default.xml respectively > Hive history > file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt > FAILED: Error in metadata: org.apache.thrift.TApplicationException: > get_partition failed: unknown result > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > [prade...@chargesize:~/dev/howl] > This is due to a check that tries to retrieve the partition to see if it > exists. If it does not, an attempt is made to pass a null value from the > metastore. Since thrift does not support null return values, an exception is > thrown. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
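The failure mode behind this patch - Thrift cannot serialize a null return value, so the server must raise a declared exception and the client must catch it - can be sketched outside of Hive. A minimal Python illustration, with hypothetical names standing in for the generated Thrift code (this is not Hive's actual implementation):

```python
# Hypothetical sketch of the HIVE-1428 fix: instead of returning null for a
# missing partition (which Thrift cannot encode, yielding "unknown result"),
# the server raises a declared exception and the client maps it back to null.

class NoSuchObjectException(Exception):
    """Stands in for the exception added to get_partition's thrift signature."""

# Toy "metastore" contents; table/spec names are illustrative only.
PARTITIONS = {("mytable", "datestamp=20091101/srcid=10"): {"location": "/data/p1"}}

def get_partition(table, spec):
    """Server side: never return None over the wire; raise instead."""
    part = PARTITIONS.get((table, spec))
    if part is None:
        raise NoSuchObjectException(f"no partition {spec} in {table}")
    return part

def partition_exists(table, spec):
    """Client side: catch the declared exception and treat it as 'absent'."""
    try:
        get_partition(table, spec)
        return True
    except NoSuchObjectException:
        return False
```

This mirrors the patch description: the `throws` clause goes into the generated method signature, and callers translate the exception into a null/absent partition object.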
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886339#action_12886339 ] Arvind Prabhakar commented on HIVE-287: --- @Zheng: Welcome to the party. bq. Why do we put the DISTINCT in the information? DISTINCT is currently done by the framework, instead of individual UDAFs. This is good because the logic of removing duplicates is common to all UDAFs. We do support SUM(DISTINCT val). Providing the information in the parameter specification is not the same as enforcing its interpretation. This is provided primarily to ensure that UDAFs that rely on this information can make appropriate decisions. For example, we wanted to disallow the invocation {{COUNT( EXPR1, EXPR2 ...)}} in favor of {{COUNT(*DISTINCT* EXPR1, EXPR2 ...)}}. Without this information, the count UDAF will not be able to enforce the latter syntax. bq. Why do we special-case ""? It seems to me that "" is just a short-cut. Hive already supports regex-based multi-column specification, so that we can say `abc.*` for all columns with name starting with abc. The compiler should just expand * and give all the columns to the UDAF. If you wish to use \* as a regular expression, you would have to quote it as a string - {{COUNT('\*')}}. This is different from the invocation as specified in SQL, which treats \* as a terminal symbol. So if it is OK to deviate from the standard representation, the user can easily use the quoted string representation to achieve an effect similar to {{COUNT(col1, col2 ..)}}. The semantics of this should be more like {{COUNT(DISTINCT EXPR1, EXPR2 ...)}} as opposed to {{COUNT(\*)}}. bq. Since COUNT(\*) is a special-case in the SQL standard (COUNT(\*) is different from COUNT(col) even if the table has a single column col), I think we should just special-case that and replace that with count(1) at some place. 
Are you suggesting that we allow the grammar to express {{COUNT(\*)}} syntax, but in the lexical analysis stage turn it into a {{COUNT(1)}}? I can see how that may work - but personally I am not a fan of such an approach. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
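The multi-column DISTINCT semantics under discussion can be pinned down with a small sketch (Python used purely for illustration; this is not Hive code): per standard SQL, COUNT(DISTINCT col1, col2, ...) counts unique argument tuples and skips any row in which an argument is NULL.

```python
def count_distinct(rows):
    # COUNT(DISTINCT col1, col2, ...): count unique argument tuples,
    # ignoring rows in which any argument is NULL (None here).
    return len({row for row in rows if None not in row})

# (1, "a") appears twice and the NULL row is skipped, so the answer is 2.
rows = [(1, "a"), (1, "a"), (1, "b"), (None, "c")]
print(count_distinct(rows))  # → 2
```

This is the behavior the query in the issue description - select count(distinct col1, col2) from Tbl - is expected to produce.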
[jira] Commented: (HIVE-417) Implement Indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886308#action_12886308 ] Prafulla Tekawade commented on HIVE-417: Hi Yongqiang, I am facing a problem creating SUMMARY indexes. This index is not built with the update index command. A COMPACT SUMMARY index works fine. Is there any problem with the creation of the SUMMARY index table? > Implement Indexing in Hive > -- > > Key: HIVE-417 > URL: https://issues.apache.org/jira/browse/HIVE-417 > Project: Hadoop Hive > Issue Type: New Feature > Components: Metastore, Query Processor >Affects Versions: 0.3.0, 0.3.1, 0.4.0, 0.6.0 >Reporter: Prasad Chakka >Assignee: He Yongqiang > Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch, > hive-indexing.3.patch, hive-indexing.5.thrift.patch, > indexing_with_ql_rewrites_trunk_953221.patch > > > Implement indexing on Hive so that lookup and range queries are efficient. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1408) add option to let hive automatically run in local mode based on tunable heuristics
[ https://issues.apache.org/jira/browse/HIVE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joydeep Sen Sarma updated HIVE-1408: Attachment: 1408.1.patch v1 - i will update with tests. couple of main objectives: 1. decide whether each mr job can be run locally 2. decide whether local disk can be used for intermediate data (if all jobs are going to run locally) right now - both #1 and #2 are code complete - but only #1 has been enabled in the code (#2 needs more testing) the general strategy is: - after compilation/optimization - look at input size of each mr job. - if all the jobs are small - then we can use local disk for intermediate data (#2) - else - we use hdfs for intermediate input and before launching each job - we (re)test whether the input data set is such that we can execute locally. had to do substantial restructuring to make this happen: a. MapRedTask is now a wrapper around ExecDriver. This allows us to have a single task implementation for running mr jobs. mapredtask decides at execute time whether it should run locally or not. b. Context.java is pretty much rewritten - the path management code was somewhat buggy (in particular isMRTmpFileURI was incorrect). the code was rewritten to make it easy to swizzle tmp paths to be directed to local disk after plan generation c. added a small cache for caching DFS file metadata (sizes). this is because we look up file metadata many times over now (for determining local mode as well as for estimating reducer count) and this cuts the overhead of repeated DFS rpcs d. most test output changes are because of altered temporary path naming convention due to (b) e. bug fixes: CTAS and RCFileOutputFormat were broken for local mode execution. some cleanup (debug log statements should be wrapped in isDebugEnabled()). 
> add option to let hive automatically run in local mode based on tunable > heuristics > -- > > Key: HIVE-1408 > URL: https://issues.apache.org/jira/browse/HIVE-1408 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Joydeep Sen Sarma >Assignee: Joydeep Sen Sarma > Attachments: 1408.1.patch > > > as a followup to HIVE-543 - we should have a simple option (enabled by > default) to let hive run in local mode if possible. > two levels of options are desirable: > 1. hive.exec.mode.local.auto=true/false // control whether local mode is > automatically chosen > 2. Options to control different heuristics, some naive examples: > hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode > if data > 1G > hive.exec.mode.local.auto.script.enable=true/false // choose if local > mode is enabled for queries with user scripts > this can be implemented as a pre/post execution hook. It makes sense to > provide this as a standard hook in the hive codebase since it's likely to > improve response time for many users (especially for test queries). > the initial proposal is to choose this at a query level and not at per > hive-task (i.e. hadoop job) level. per job-level requires more changes to > compilation (to not pre-commit to hdfs or local scratch directories at > compile time). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
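The per-job decision described in objective #1 above amounts to a simple predicate over the configured options and the job's input size. A minimal sketch, assuming the configuration keys proposed in the issue (function and default names here are illustrative, not taken from the patch):

```python
# Hypothetical sketch of the HIVE-1408 heuristic: run an MR job locally only
# when automatic local mode is enabled and the job's total input size stays
# under the configured cap (the issue proposes a 1G default).

def choose_local_mode(conf, input_sizes_bytes):
    """Return True if the job may run in local mode under the proposed options."""
    if not conf.get("hive.exec.mode.local.auto", False):
        return False  # automatic local mode selection is disabled
    max_bytes = conf.get("hive.exec.mode.local.auto.input.size.max", 1 << 30)
    return sum(input_sizes_bytes) <= max_bytes
```

Per the attachment notes, the real patch evaluates this kind of check at execute time inside MapRedTask, re-testing each job's inputs just before launch.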
[jira] Commented: (HIVE-1452) Mapside join on non partitioned table with partitioned table causes error
[ https://issues.apache.org/jira/browse/HIVE-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886272#action_12886272 ] He Yongqiang commented on HIVE-1452: Not sure what's happening here. It will be great if you can provide a testcase to reproduce this. The parameter "hive.mapjoin.cache.numrows" (default 25K) is used to control when to flush the in-memory hashmap (whose value object is MapJoinObjectValue). You may want to use a small number for this parameter in your testcase. A guess for this issue is that maybe we should do a {noformat} out.flush(); {noformat} in MapJoinObjectValue's writeExternal method. (MapJoinObjectValue line 131) > Mapside join on non partitioned table with partitioned table causes error > - > > Key: HIVE-1452 > URL: https://issues.apache.org/jira/browse/HIVE-1452 > Project: Hadoop Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.6.0 >Reporter: Viraj Bhat > > I am running a script which contains two tables, one of which is dynamically partitioned > and stored as RCFormat and the other is stored as a TXT file. > The TXT file is around 397MB in size and has around 24 million rows. 
> {code} > drop table joinquery; > create external table joinquery ( > id string, > type string, > sec string, > num string, > url string, > cost string, > listinfo array > > ) > STORED AS TEXTFILE > LOCATION '/projects/joinquery'; > CREATE EXTERNAL TABLE idtable20mil( > id string > ) > STORED AS TEXTFILE > LOCATION '/projects/idtable20mil'; > insert overwrite table joinquery >select > /*+ MAPJOIN(idtable20mil) */ > rctable.id, > rctable.type, > rctable.map['sec'], > rctable.map['num'], > rctable.map['url'], > rctable.map['cost'], > rctable.listinfo > from rctable > JOIN idtable20mil on (rctable.id = idtable20mil.id) > where > rctable.id is not null and > rctable.part='value' and > rctable.subpart='value' and > rctable.pty='100' and > rctable.uniqid='1000' > order by id; > {code} > Result: > Possible error: > Data file split:string,part:string,subpart:string,subsubpart:string> is > corrupted. > Solution: > Replace file. i.e. by re-running the query that produced the source table / > partition. > - > If I look at mapper logs. 
> {verbatim} > Caused by: java.io.IOException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:109) > at > java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1792) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1751) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) > at > org.apache.hadoop.hive.ql.util.jdbm.htree.HashBucket.readExternal(HashBucket.java:284) > at > java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1792) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1751) > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) > at > org.apache.hadoop.hive.ql.util.jdbm.helper.Serialization.deserialize(Serialization.java:106) > at > org.apache.hadoop.hive.ql.util.jdbm.helper.DefaultSerializer.deserialize(DefaultSerializer.java:106) > at > org.apache.hadoop.hive.ql.util.jdbm.recman.BaseRecordManager.fetch(BaseRecordManager.java:360) > at > org.apache.hadoop.hive.ql.util.jdbm.recman.BaseRecordManager.fetch(BaseRecordManager.java:332) > at > org.apache.hadoop.hive.ql.util.jdbm.htree.HashDirectory.get(HashDirectory.java:195) > at org.apache.hadoop.hive.ql.util.jdbm.htree.HTree.get(HTree.java:155) > at > org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.get(HashMapWrapper.java:114) > ... 
11 more > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:2776) > at java.io.ObjectInputStream.readInt(ObjectInputStream.java:950) > at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:153) > at > org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:98) > {verbatim} > I am trying to create a testcase, which can demonstrate this error. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
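The suspected missing out.flush() is consistent with the EOFException in the trace: bytes written into a buffered output stream but never flushed never reach the reader, which then hits end-of-stream mid-record. A language-neutral illustration in Python, with io.BufferedWriter standing in for Java's buffered ObjectOutput (this only demonstrates the buffering effect, not Hive's serialization code):

```python
import io
import struct

# Write a 4-byte big-endian int through a buffered writer, as writeExternal
# might write a length field, but "forget" to flush.
raw = io.BytesIO()
buf = io.BufferedWriter(raw, buffer_size=1024)
buf.write(struct.pack(">i", 42))

# The bytes are still sitting in the buffer: a reader of `raw` at this
# point sees an empty stream and raises EOF, as in the stack trace above.
assert raw.getvalue() == b""

# The proposed fix: flush so the buffered bytes reach the underlying stream.
buf.flush()
assert struct.unpack(">i", raw.getvalue())[0] == 42
```

Whether this is the actual root cause of HIVE-1452 is, as the comment says, still a guess pending a reproducing testcase.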
ReviewBoard Tips
I'm really excited to see more people using ReviewBoard for their Hive JIRAs. I want to remind everyone that when creating a review request it is really important to set the "Bugs" and "Groups" fields. The "Bugs" field should be set to the ID of the Hive JIRA, e.g. "HIVE-756". ReviewBoard needs this information in order to automatically post review comments back to the JIRA ticket. The "Groups" field should be set to "hive". This ensures that ReviewBoard will send the review request and comments to the hive-dev mailing list. Thanks. Carl