[jira] Created: (HIVE-1515) archive is not working when multiple partitions inside one table are archived.
archive is not working when multiple partitions inside one table are archived.
------------------------------------------------------------------------------

                 Key: HIVE-1515
                 URL: https://issues.apache.org/jira/browse/HIVE-1515
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: He Yongqiang

The following script will fail:

set hive.exec.compress.output = true;
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.min.split.size=256;
set mapred.min.split.size.per.node=256;
set mapred.min.split.size.per.rack=256;
set mapred.max.split.size=256;
set hive.archive.enabled = true;
drop table combine_3_srcpart_seq_rc;
create table combine_3_srcpart_seq_rc (key int , value string) partitioned by (ds string, hr string) stored as sequencefile;
insert overwrite table combine_3_srcpart_seq_rc partition (ds="2010-08-03", hr="00") select * from src;
insert overwrite table combine_3_srcpart_seq_rc partition (ds="2010-08-03", hr="001") select * from src;
ALTER TABLE combine_3_srcpart_seq_rc ARCHIVE PARTITION (ds="2010-08-03", hr="00");
ALTER TABLE combine_3_srcpart_seq_rc ARCHIVE PARTITION (ds="2010-08-03", hr="001");
select key, value, ds, hr from combine_3_srcpart_seq_rc where ds="2010-08-03" order by key, hr limit 30;
drop table combine_3_srcpart_seq_rc;

It fails with:

java.io.IOException: Invalid file name: har:/data/users/heyongqiang/hive-trunk-clean/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=001/data.har/data/users/heyongqiang/hive-trunk-clean/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=001 in har:/data/users/heyongqiang/hive-trunk-clean/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=00/data.har

The reason it fails is that there are 2 input paths (one for each partition) for the above query:

1): har:/Users/heyongqiang/Documents/workspace/Hive-Index/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=00/data.har/Users/heyongqiang/Documents/workspace/Hive-Index/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=00
2): har:/Users/heyongqiang/Documents/workspace/Hive-Index/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=001/data.har/Users/heyongqiang/Documents/workspace/Hive-Index/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=001

But when path.getFileSystem() is called for these 2 input paths, both calls return the same file system instance, which points to the first caller, in this case:

har:/Users/heyongqiang/Documents/workspace/Hive-Index/build/ql/test/data/warehouse/combine_3_srcpart_seq_rc/ds=2010-08-03/hr=00/data.har

The reason is that Hadoop's FileSystem has a global cache, and when loading a FileSystem instance for a given path, it only takes the path's scheme and username to look up the cache. So when we call Path.getFileSystem for the second har path, it actually returns the file system handle for the first path.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
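The cache collision described in HIVE-1515 can be illustrated with a small Python sketch. This is a toy model, not Hadoop's code: `ToyFileSystem`, `ToyFileSystemCache`, the user name `hive`, and the shortened `/warehouse/...` paths are all hypothetical. The issue says the lookup uses only the scheme and username; including the URI authority in the key is an additional assumption (it is empty for these paths, so it does not change the outcome).

```python
from urllib.parse import urlsplit


class ToyFileSystem:
    """Hypothetical stand-in for a filesystem handle bound to one archive URI."""

    def __init__(self, uri):
        self.uri = uri  # the archive this handle was initialized for


class ToyFileSystemCache:
    """Toy global cache keyed by (scheme, authority, user), never by path."""

    def __init__(self):
        self._cache = {}

    def get(self, uri, user):
        parts = urlsplit(uri)
        # The path component is deliberately NOT part of the key.
        key = (parts.scheme, parts.netloc, user)
        if key not in self._cache:
            self._cache[key] = ToyFileSystem(uri)
        return self._cache[key]


cache = ToyFileSystemCache()
fs1 = cache.get("har:/warehouse/t/ds=2010-08-03/hr=00/data.har", "hive")
fs2 = cache.get("har:/warehouse/t/ds=2010-08-03/hr=001/data.har", "hive")

print(fs1 is fs2)  # True: the second lookup hits the first entry
print(fs2.uri)     # har:/warehouse/t/ds=2010-08-03/hr=00/data.har
```

Because both archive paths share the `har` scheme, an empty authority, and the same user, the second lookup returns the handle created for the `hr=00` archive, which matches the "Invalid file name ... in ... hr=00/data.har" error reported above.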
[jira] Commented: (HIVE-1513) hive starter scripts should load admin/user supplied script for configurability
    [ https://issues.apache.org/jira/browse/HIVE-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895959#action_12895959 ]

Joydeep Sen Sarma commented on HIVE-1513:
-----------------------------------------

yes - it's possible. however, a lot of variables etc. are initialized by the time we get to loading ext/*.sh. for example, we allow HADOOP_HEAPSIZE to be specified via an env var, but aside from doing an export before launching the hive script, there's no way to configure this externally. the ext/* trick wouldn't work because it comes too late.

i think this is simple enough - we can just source a conf/hive-env.sh or something of the sort so that admins can provide the right values for all these vars, based on their requirements, via config files.

> hive starter scripts should load admin/user supplied script for configurability
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-1513
>                 URL: https://issues.apache.org/jira/browse/HIVE-1513
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: CLI
>            Reporter: Joydeep Sen Sarma
>
> it's difficult to add environment variables to Hive starter scripts except by
> modifying the scripts directly. this is undesirable (since they are source
> code). Hive starter scripts should load an admin supplied shell script for
> configurability. This would be similar to what hadoop does with hadoop-env.sh
Hive Contributors Meeting August 9th @ Facebook
Hi,

This is a reminder that the next Hive Contributors Meeting will convene Monday August 9th at 4pm at Facebook HQ. Space is limited, so if you plan to attend you *must* RSVP here:

http://www.meetup.com/Hive-Contributors-Group/

The following is a preliminary agenda for the meeting. Please email me if you want to add something to the list.

* Update on the 0.6.0 release
* Documentation policies
* Automated patch testing
* Moving to the 0.20 Hadoop API and removing support for pre-0.20 versions
* Updates on recent/continuing work:
  * Howl
  * Indexes
  * Filter pushdown

Thanks.

Carl
[jira] Commented: (HIVE-1512) Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
    [ https://issues.apache.org/jira/browse/HIVE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895829#action_12895829 ]

John Sichi commented on HIVE-1512:
----------------------------------

This patch can't be applied until we actually upgrade the Hbase libs, since it is incompatible with 0.20.3. I'll link it to HIVE-1235.

Also, when supplying patches, please base them off of hive trunk (not off of a subdirectory). Thanks!

> Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1512
>                 URL: https://issues.apache.org/jira/browse/HIVE-1512
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>    Affects Versions: 0.7.0
>            Reporter: Jimmy Hu
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1512.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> the current trunk hive_hbase-handler only works with hbase 0.20.3, we need
> to get it to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
[jira] Assigned: (HIVE-1512) Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
    [ https://issues.apache.org/jira/browse/HIVE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi reassigned HIVE-1512:
-------------------------------

    Assignee: John Sichi

> Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1512
>                 URL: https://issues.apache.org/jira/browse/HIVE-1512
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>    Affects Versions: 0.7.0
>            Reporter: Jimmy Hu
>            Assignee: John Sichi
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1512.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> the current trunk hive_hbase-handler only works with hbase 0.20.3, we need
> to get it to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
[jira] Updated: (HIVE-1514) Be able to modify a partition's fileformat and file location information.
    [ https://issues.apache.org/jira/browse/HIVE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1514:
-------------------------------

    Status: Patch Available  (was: Open)

> Be able to modify a partition's fileformat and file location information.
> -------------------------------------------------------------------------
>
>                 Key: HIVE-1514
>                 URL: https://issues.apache.org/jira/browse/HIVE-1514
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-1514.1.patch
[jira] Updated: (HIVE-1514) Be able to modify a partition's fileformat and file location information.
    [ https://issues.apache.org/jira/browse/HIVE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1514:
-------------------------------

    Attachment: hive-1514.1.patch

> Be able to modify a partition's fileformat and file location information.
> -------------------------------------------------------------------------
>
>                 Key: HIVE-1514
>                 URL: https://issues.apache.org/jira/browse/HIVE-1514
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-1514.1.patch
[jira] Created: (HIVE-1514) Be able to modify a partition's fileformat and file location information.
Be able to modify a partition's fileformat and file location information.
-------------------------------------------------------------------------

                 Key: HIVE-1514
                 URL: https://issues.apache.org/jira/browse/HIVE-1514
             Project: Hadoop Hive
          Issue Type: New Feature
            Reporter: He Yongqiang
            Assignee: He Yongqiang
[jira] Commented: (HIVE-1374) Query compile-only option
    [ https://issues.apache.org/jira/browse/HIVE-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895787#action_12895787 ]

Siying Dong commented on HIVE-1374:
-----------------------------------

Communicated with people who asked for this feature. Basically, what they want is a mode that checks syntax and doesn't fail when a table or partition doesn't exist, or more simply, it doesn't go to metastore to check objects at all.

> Query compile-only option
> -------------------------
>
>                 Key: HIVE-1374
>                 URL: https://issues.apache.org/jira/browse/HIVE-1374
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.6.0
>            Reporter: Paul Yang
>            Assignee: Siying Dong
>
> A compile-only option might be useful for helping users quickly prototype
> queries, fix errors, and do test runs. The proposed change would be adding a
> -c switch that behaves like -e but only compiles the specified query.
[jira] Updated: (HIVE-1434) Cassandra Storage Handler
    [ https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-1434:
----------------------------------

    Status: Patch Available  (was: Open)

This patch has full read/write functionality. I am going to do another patch later today with xdocs, but do not expect any code changes.

> Cassandra Storage Handler
> -------------------------
>
>                 Key: HIVE-1434
>                 URL: https://issues.apache.org/jira/browse/HIVE-1434
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>         Attachments: cas-handle.tar.gz, hive-1434-1.txt,
> hive-1434-2-patch.txt, hive-1434-3-patch.txt
>
> Add a cassandra storage handler.
[jira] Updated: (HIVE-1434) Cassandra Storage Handler
    [ https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-1434:
----------------------------------

    Attachment: hive-1434-3-patch.txt

> Cassandra Storage Handler
> -------------------------
>
>                 Key: HIVE-1434
>                 URL: https://issues.apache.org/jira/browse/HIVE-1434
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>         Attachments: cas-handle.tar.gz, hive-1434-1.txt,
> hive-1434-2-patch.txt, hive-1434-3-patch.txt
>
> Add a cassandra storage handler.
[jira] Commented: (HIVE-1513) hive starter scripts should load admin/user supplied script for configurability
    [ https://issues.apache.org/jira/browse/HIVE-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895698#action_12895698 ]

Edward Capriolo commented on HIVE-1513:
---------------------------------------

Anything you put in the bin/ext is sourced as part of the bootstrap process. Could you do something like bin/ext/mystuff.sh?

> hive starter scripts should load admin/user supplied script for configurability
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-1513
>                 URL: https://issues.apache.org/jira/browse/HIVE-1513
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: CLI
>            Reporter: Joydeep Sen Sarma
>
> it's difficult to add environment variables to Hive starter scripts except by
> modifying the scripts directly. this is undesirable (since they are source
> code). Hive starter scripts should load an admin supplied shell script for
> configurability. This would be similar to what hadoop does with hadoop-env.sh
[jira] Created: (HIVE-1513) hive starter scripts should load admin/user supplied script for configurability
hive starter scripts should load admin/user supplied script for configurability
-------------------------------------------------------------------------------

                 Key: HIVE-1513
                 URL: https://issues.apache.org/jira/browse/HIVE-1513
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: CLI
            Reporter: Joydeep Sen Sarma

it's difficult to add environment variables to Hive starter scripts except by modifying the scripts directly. this is undesirable (since they are source code). Hive starter scripts should load an admin supplied shell script for configurability. This would be similar to what hadoop does with hadoop-env.sh
[jira] Updated: (HIVE-1509) Monitor the working set of the number of files
    [ https://issues.apache.org/jira/browse/HIVE-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-1509:
------------------------------------

    Affects Version/s: 0.6.0
                           (was: 0.7.0)

> Monitor the working set of the number of files
> ----------------------------------------------
>
>                 Key: HIVE-1509
>                 URL: https://issues.apache.org/jira/browse/HIVE-1509
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Namit Jain
>            Assignee: Ning Zhang
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1509.2.patch, HIVE-1509.3.patch, HIVE-1509.4.patch,
> HIVE-1509.patch
[jira] Updated: (HIVE-1509) Monitor the working set of the number of files
    [ https://issues.apache.org/jira/browse/HIVE-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-1509:
------------------------------------

           Status: Resolved  (was: Patch Available)
    Fix Version/s: 0.7.0
       Resolution: Fixed

committed - thanks Ning. it seems that the test problems were likely because there was a problem applying the patch.

> Monitor the working set of the number of files
> ----------------------------------------------
>
>                 Key: HIVE-1509
>                 URL: https://issues.apache.org/jira/browse/HIVE-1509
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Namit Jain
>            Assignee: Ning Zhang
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1509.2.patch, HIVE-1509.3.patch, HIVE-1509.4.patch,
> HIVE-1509.patch