[jira] [Commented] (HIVE-4934) Improve documentation of OVER clause
[ https://issues.apache.org/jira/browse/HIVE-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080569#comment-14080569 ] Lefty Leverenz commented on HIVE-4934: -- Thanks [~lars_francke]. Was there a reason for the line break that put "SELECT" and "a," on separate lines? (I removed it to match all the other examples, but you can restore it if it has a purpose.) {code} SELECT a, COUNT(b) OVER (PARTITION BY c), SUM(b) OVER (PARTITION BY c) FROM T; {code} Also thanks for changing the formatting of code samples. * [PARTITION BY with partitioning, ORDER BY, and window specification | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-PARTITIONBYwithpartitioning,ORDERBY,andwindowspecification] > Improve documentation of OVER clause > > > Key: HIVE-4934 > URL: https://issues.apache.org/jira/browse/HIVE-4934 > Project: Hive > Issue Type: Bug >Reporter: Lars Francke >Assignee: Lars Francke >Priority: Minor > > {code} > CREATE TABLE test (foo INT); > SELECT ntile(10), foo OVER (PARTITION BY foo) FROM test; > FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: > Only COMPLETE mode supported for NTile function > SELECT foo, ntile(10) OVER (PARTITION BY foo) FROM test; > ...works... > {code} > I'm not sure if that is a bug or necessary. Either way the error message is > not helpful as it's not documented anywhere what {{COMPLETE}} mode is. A > cursory glance at the code didn't help me either. > Edit: It is not a bug, it wasn't clear to me that the OVER clause only > applies to the directly preceding function. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4933) Document how aliases work with the OVER clause
[ https://issues.apache.org/jira/browse/HIVE-4933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080570#comment-14080570 ] Lefty Leverenz commented on HIVE-4933: -- Thanks [~lars_francke], this is helpful. (I added a missing "AS" for the b_sum alias.) * [PARTITION BY with partitioning, ORDER BY, and window specification | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-PARTITIONBYwithpartitioning,ORDERBY,andwindowspecification] > Document how aliases work with the OVER clause > -- > > Key: HIVE-4933 > URL: https://issues.apache.org/jira/browse/HIVE-4933 > Project: Hive > Issue Type: Bug >Affects Versions: 0.11.0 >Reporter: Lars Francke >Assignee: Lars Francke >Priority: Minor > > {code} > CREATE TABLE test (foo INT); > hive> SELECT SUM(foo) AS bar OVER (PARTITION BY foo) FROM test; > MismatchedTokenException(175!=110) > at > org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617) > at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1424) > at > org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:35998) > at > org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:33974) > at > org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:33882) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:33389) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:33169) > at > org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1284) > at > org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:983) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:434) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:352) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:995) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1038) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > FAILED: ParseException line 1:20 mismatched input 'OVER' expecting FROM near > 'bar' in from clause{code} > The same happens without the {{AS}} but it works when leaving out the alias > entirely. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7438) Counters, statistics, and metrics
[ https://issues.apache.org/jira/browse/HIVE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-7438: Attachment: (was: hive on spark job statistic design.docx) > Counters, statistics, and metrics > - > > Key: HIVE-7438 > URL: https://issues.apache.org/jira/browse/HIVE-7438 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Chengxiang Li > Attachments: hive on spark job statistic design.docx > > > Hive makes use of MapReduce counters for statistics and possibly for other > purposes. For Hive on Spark, we should achieve the same functionality using > Spark's accumulators. > Hive also traditionally collects metrics from MapReduce jobs. A Spark job very > likely publishes a different set of metrics, which, if made available, would > help users get insight into their Spark jobs. Thus, we should obtain the > metrics and make them available as we do for MapReduce. > This task therefore includes: 1. identifying Hive's existing functionality w.r.t. > counters, statistics, and metrics; 2. designing and implementing the same > functionality in Spark. > Please refer to the design document for more information. > https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7438) Counters, statistics, and metrics
[ https://issues.apache.org/jira/browse/HIVE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-7438: Attachment: hive on spark job statistic design.docx > Counters, statistics, and metrics > - > > Key: HIVE-7438 > URL: https://issues.apache.org/jira/browse/HIVE-7438 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Chengxiang Li > Attachments: hive on spark job statistic design.docx > > > Hive makes use of MapReduce counters for statistics and possibly for other > purposes. For Hive on Spark, we should achieve the same functionality using > Spark's accumulators. > Hive also traditionally collects metrics from MapReduce jobs. A Spark job very > likely publishes a different set of metrics, which, if made available, would > help users get insight into their Spark jobs. Thus, we should obtain the > metrics and make them available as we do for MapReduce. > This task therefore includes: 1. identifying Hive's existing functionality w.r.t. > counters, statistics, and metrics; 2. designing and implementing the same > functionality in Spark. > Please refer to the design document for more information. > https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7554) Parquet Hive should resolve column names in case insensitive manner
[ https://issues.apache.org/jira/browse/HIVE-7554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080558#comment-14080558 ] Hive QA commented on HIVE-7554: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658658/HIVE-7554.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5857 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/116/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/116/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-116/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12658658 > Parquet Hive should resolve column names in case insensitive manner > --- > > Key: HIVE-7554 > URL: https://issues.apache.org/jira/browse/HIVE-7554 > Project: Hive > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Brock Noland > Attachments: HIVE-7554.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-7567) support automatic calculating reduce task number
[ https://issues.apache.org/jira/browse/HIVE-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li reassigned HIVE-7567: --- Assignee: Chengxiang Li > support automatic calculating reduce task number > > > Key: HIVE-7567 > URL: https://issues.apache.org/jira/browse/HIVE-7567 > Project: Hive > Issue Type: Task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li > > Hive has its own mechanism to calculate the reduce task number; we need to > implement it for Spark jobs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7438) Counters, statistics, and metrics
[ https://issues.apache.org/jira/browse/HIVE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-7438: Attachment: hive on spark job statistic design.docx Added a design doc for Hive on Spark job statistic collection. > Counters, statistics, and metrics > - > > Key: HIVE-7438 > URL: https://issues.apache.org/jira/browse/HIVE-7438 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Chengxiang Li > Attachments: hive on spark job statistic design.docx > > > Hive makes use of MapReduce counters for statistics and possibly for other > purposes. For Hive on Spark, we should achieve the same functionality using > Spark's accumulators. > Hive also traditionally collects metrics from MapReduce jobs. A Spark job very > likely publishes a different set of metrics, which, if made available, would > help users get insight into their Spark jobs. Thus, we should obtain the > metrics and make them available as we do for MapReduce. > This task therefore includes: 1. identifying Hive's existing functionality w.r.t. > counters, statistics, and metrics; 2. designing and implementing the same > functionality in Spark. > Please refer to the design document for more information. > https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics -- This message was sent by Atlassian JIRA (v6.2#6252)
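A minimal sketch of the accumulator idea mentioned above, assuming the Spark 1.x Java API; this is not Hive's actual implementation, and the RECORDS_OUT counter name and input path are made up:
{code}
import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class AccumulatorCounterSketch {
  public static void main(String[] args) {
    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("counter-sketch").setMaster("local"));
    // One accumulator per counter: tasks only add, the driver reads the total.
    final Accumulator<Integer> recordsOut = sc.accumulator(0);
    JavaRDD<String> rows = sc.textFile("/tmp/input"); // hypothetical input path
    rows.foreach(row -> recordsOut.add(1));           // count records as a side effect
    System.out.println("RECORDS_OUT = " + recordsOut.value());
    sc.stop();
  }
}
{code}
Like a MapReduce counter, the accumulator is write-only on the executors and readable only on the driver, which is what makes it a natural substitute.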
[jira] [Commented] (HIVE-7223) Support generic PartitionSpecs in Metastore partition-functions
[ https://issues.apache.org/jira/browse/HIVE-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080527#comment-14080527 ] Mithun Radhakrishnan commented on HIVE-7223: I should have the patch up for this change tomorrow. I'll only deal with the Thrift/Hive changes in this JIRA. The corresponding changes to HCatClient will go up on a separate JIRA, so as not to clash with HIVE-7341. > Support generic PartitionSpecs in Metastore partition-functions > --- > > Key: HIVE-7223 > URL: https://issues.apache.org/jira/browse/HIVE-7223 > Project: Hive > Issue Type: Improvement > Components: HCatalog, Metastore >Affects Versions: 0.12.0, 0.13.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > > Currently, the functions in the HiveMetaStore API that handle multiple > partitions do so using List<Partition>. E.g. > {code} > public List<Partition> listPartitions(String db_name, String tbl_name, short > max_parts); > public List<Partition> listPartitionsByFilter(String db_name, String > tbl_name, String filter, short max_parts); > public int add_partitions(List<Partition> new_parts); > {code} > Partition objects are fairly heavyweight, since each Partition carries its > own copy of a StorageDescriptor, partition-values, etc. Tables with tens of > thousands of partitions take so long to have their partitions listed that the > client times out with the default hive.metastore.client.socket.timeout. There is > the additional expense of serializing and deserializing metadata for large > sets of partitions, w.r.t. time and heap-space. Reducing the Thrift traffic > should help in this regard. > In a date-partitioned table, all sub-partitions for a particular date are > *likely* (but not guaranteed) to have: > # The same base directory (e.g. {{/feeds/search/20140601/}}) > # Similar directory structure (e.g. {{/feeds/search/20140601/[US,UK,IN]}}) > # The same SerDe/StorageHandler/IOFormat classes > # Sorting/Bucketing/SkewInfo settings > In this “most likely” scenario (henceforth termed “normal”), it’s possible to > represent the partition-list (for a date) in a more condensed form: a list of > LighterPartition instances, all sharing a common StorageDescriptor whose > location points to the root directory. > We can go one better for the {{add_partitions()}} case: When adding all > partitions for a given date, the “normal” case affords us the ability to > specify the top-level date-directory, where sub-partitions can be inferred > from the HDFS directory-path. > These extensions are hard to introduce at the metastore-level, since > partition-functions explicitly specify {{List<Partition>}} arguments. I > wonder if a {{PartitionSpec}} interface might help: > {code} > public PartitionSpec listPartitions(db_name, tbl_name, max_parts) throws ... > ; > public int add_partitions( PartitionSpec new_parts ) throws … ; > {code} > where the PartitionSpec looks like: > {code} > public interface PartitionSpec { > public List<Partition> getPartitions(); > public List<String> getPartNames(); > public Iterator<Partition> getPartitionIter(); > public Iterator<String> getPartNameIter(); > } > {code} > For addPartitions(), an {{HDFSDirBasedPartitionSpec}} class could implement > {{PartitionSpec}}, store a top-level directory, and return Partition > instances from sub-directory names, while storing a single StorageDescriptor > for all of them. > Similarly, list_partitions() could return a List<PartitionSpec>, where each > PartitionSpec corresponds to a set of partitions that can share a > StorageDescriptor.
> By exposing iterator semantics, neither the client nor the metastore need > instantiate all partitions at once. That should help with memory requirements. > In case no smart grouping is possible, we could just fall back on a > {{DefaultPartitionSpec}} which composes {{List<Partition>}}, and is no worse > than the status quo. > PartitionSpec abstracts away how a set of partitions may be represented. A > tighter representation allows us to communicate metadata for a larger number > of Partitions, with less Thrift traffic. > Given that Thrift doesn’t support polymorphism, we’d have to implement the > PartitionSpec as a Thrift Union of supported implementations. (We could > convert from the Thrift PartitionSpec to the appropriate Java PartitionSpec > sub-class.) > Thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
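For concreteness, a minimal sketch of the {{DefaultPartitionSpec}} fallback described above, implementing the interface proposed in the description by composing a List<Partition>. The name rendering is a simplification (a real implementation would build key=value names from the table's partition columns, e.g. via a metastore helper):
{code}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.hadoop.hive.metastore.api.Partition;

public class DefaultPartitionSpec implements PartitionSpec {
  private final List<Partition> partitions;

  public DefaultPartitionSpec(List<Partition> partitions) {
    this.partitions = partitions;
  }

  public List<Partition> getPartitions() { return partitions; }

  public List<String> getPartNames() {
    // Simplified: joins raw partition values with '/'; a real implementation
    // would render key=value segments from the table's partition keys.
    List<String> names = new ArrayList<String>(partitions.size());
    for (Partition p : partitions) {
      StringBuilder name = new StringBuilder();
      for (String value : p.getValues()) {
        if (name.length() > 0) { name.append('/'); }
        name.append(value);
      }
      names.add(name.toString());
    }
    return names;
  }

  public Iterator<Partition> getPartitionIter() { return partitions.iterator(); }

  public Iterator<String> getPartNameIter() { return getPartNames().iterator(); }
}
{code}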
[jira] [Updated] (HIVE-7330) Create SparkTask
[ https://issues.apache.org/jira/browse/HIVE-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-7330: --- Attachment: HIVE-7330-spark.patch > Create SparkTask > > > Key: HIVE-7330 > URL: https://issues.apache.org/jira/browse/HIVE-7330 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Chinna Rao Lalam > Attachments: HIVE-7330-spark.patch > > > SparkTask handles the execution of SparkWork. It will execute a graph of map > and reduce work using a SparkClient instance. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7341) Support for Table replication across HCatalog instances
[ https://issues.apache.org/jira/browse/HIVE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-7341: --- Status: Patch Available (was: Open) > Support for Table replication across HCatalog instances > --- > > Key: HIVE-7341 > URL: https://issues.apache.org/jira/browse/HIVE-7341 > Project: Hive > Issue Type: New Feature > Components: HCatalog >Affects Versions: 0.13.1 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 0.14.0 > > Attachments: HIVE-7341.1.patch, HIVE-7341.2.patch > > > The HCatClient currently doesn't provide very much support for replicating > HCatTable definitions between 2 HCatalog Server (i.e. Hive metastore) > instances. > Systems similar to Apache Falcon might find the need to replicate partition > data between 2 clusters, and keep the HCatalog metadata in sync between the > two. This poses a couple of problems: > # The definition of the source table might change (in column schema, I/O > formats, record-formats, serde-parameters, etc.) The system will need a way > to diff 2 tables and update the target-metastore with the changes. E.g. > {code} > targetTable.resolve( sourceTable, targetTable.diff(sourceTable) ); > hcatClient.updateTableSchema(dbName, tableName, targetTable); > {code} > # The current {{HCatClient.addPartitions()}} API requires that the > partition's schema be derived from the table's schema, thereby requiring that > the table-schema be resolved *before* partitions with the new schema are > added to the table. This is problematic, because it introduces race > conditions when 2 partitions with differing column-schemas (e.g. right after > a schema change) are copied in parallel. This can be avoided if each > HCatAddPartitionDesc kept track of the partition's schema, in flight. > # The source and target metastores might be running different/incompatible > versions of Hive. > The impending patch attempts to address these concerns (with some caveats). > # {{HCatTable}} now has > ## a {{diff()}} method, to compare against another HCatTable instance > ## a {{resolve(diff)}} method to copy over specified table-attributes from > another HCatTable > ## a serialize/deserialize mechanism (via {{HCatClient.serializeTable()}} and > {{HCatClient.deserializeTable()}}), so that HCatTable instances constructed > in other class-loaders may be used for comparison > # {{HCatPartition}} now provides finer-grained control over a Partition's > column-schema, StorageDescriptor settings, etc. This allows partitions to be > copied completely from source, with the ability to override specific > properties if required (e.g. location). > # {{HCatClient.updateTableSchema()}} can now update the entire > table-definition, not just the column schema. > # I've cleaned up and removed most of the redundancy between the HCatTable, > HCatCreateTableDesc and HCatCreateTableDesc.Builder. The prior API failed to > separate the table-attributes from the add-table-operation's attributes. By > providing fluent-interfaces in HCatTable, and composing an HCatTable instance > in HCatCreateTableDesc, the interfaces are cleaner(ish). The old setters are > deprecated, in favour of those in HCatTable. Likewise, HCatPartition and > HCatAddPartitionDesc. > I'll post a patch for trunk shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7390) Make quote character optional and configurable in BeeLine CSV/TSV output
[ https://issues.apache.org/jira/browse/HIVE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-7390: --- Attachment: HIVE-7390.5.patch > Make quote character optional and configurable in BeeLine CSV/TSV output > > > Key: HIVE-7390 > URL: https://issues.apache.org/jira/browse/HIVE-7390 > Project: Hive > Issue Type: New Feature > Components: Clients >Affects Versions: 0.13.1 >Reporter: Jim Halfpenny >Assignee: Ferdinand Xu > Attachments: HIVE-7390.1.patch, HIVE-7390.2.patch, HIVE-7390.3.patch, > HIVE-7390.4.patch, HIVE-7390.5.patch, HIVE-7390.patch > > > Currently when either the CSV or TSV output formats are used in beeline each > column is wrapped in single quotes. Quote wrapping of columns should be > optional and the user should be able to choose the character used to wrap the > columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7341) Support for Table replication across HCatalog instances
[ https://issues.apache.org/jira/browse/HIVE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-7341: --- Attachment: HIVE-7341.2.patch Improved patch, to ensure deprecated APIs still function. {{HCatAddPartitionDesc.create(db, table, location, partKeyValMap)}} doesn't throw an UnsupportedException now. > Support for Table replication across HCatalog instances > --- > > Key: HIVE-7341 > URL: https://issues.apache.org/jira/browse/HIVE-7341 > Project: Hive > Issue Type: New Feature > Components: HCatalog >Affects Versions: 0.13.1 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 0.14.0 > > Attachments: HIVE-7341.1.patch, HIVE-7341.2.patch > > > The HCatClient currently doesn't provide very much support for replicating > HCatTable definitions between 2 HCatalog Server (i.e. Hive metastore) > instances. > Systems similar to Apache Falcon might find the need to replicate partition > data between 2 clusters, and keep the HCatalog metadata in sync between the > two. This poses a couple of problems: > # The definition of the source table might change (in column schema, I/O > formats, record-formats, serde-parameters, etc.) The system will need a way > to diff 2 tables and update the target-metastore with the changes. E.g. > {code} > targetTable.resolve( sourceTable, targetTable.diff(sourceTable) ); > hcatClient.updateTableSchema(dbName, tableName, targetTable); > {code} > # The current {{HCatClient.addPartitions()}} API requires that the > partition's schema be derived from the table's schema, thereby requiring that > the table-schema be resolved *before* partitions with the new schema are > added to the table. This is problematic, because it introduces race > conditions when 2 partitions with differing column-schemas (e.g. right after > a schema change) are copied in parallel. This can be avoided if each > HCatAddPartitionDesc kept track of the partition's schema, in flight. > # The source and target metastores might be running different/incompatible > versions of Hive. > The impending patch attempts to address these concerns (with some caveats). > # {{HCatTable}} now has > ## a {{diff()}} method, to compare against another HCatTable instance > ## a {{resolve(diff)}} method to copy over specified table-attributes from > another HCatTable > ## a serialize/deserialize mechanism (via {{HCatClient.serializeTable()}} and > {{HCatClient.deserializeTable()}}), so that HCatTable instances constructed > in other class-loaders may be used for comparison > # {{HCatPartition}} now provides finer-grained control over a Partition's > column-schema, StorageDescriptor settings, etc. This allows partitions to be > copied completely from source, with the ability to override specific > properties if required (e.g. location). > # {{HCatClient.updateTableSchema()}} can now update the entire > table-definition, not just the column schema. > # I've cleaned up and removed most of the redundancy between the HCatTable, > HCatCreateTableDesc and HCatCreateTableDesc.Builder. The prior API failed to > separate the table-attributes from the add-table-operation's attributes. By > providing fluent-interfaces in HCatTable, and composing an HCatTable instance > in HCatCreateTableDesc, the interfaces are cleaner(ish). The old setters are > deprecated, in favour of those in HCatTable. Likewise, HCatPartition and > HCatAddPartitionDesc. > I'll post a patch for trunk shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
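Pulling the pieces of the description together, a hedged sketch of how the proposed APIs might compose in a replication flow; database/table names are made up, the configurations are assumed to point at the two metastores, and the serialize/diff/resolve signatures are taken from the description above rather than verified against the final patch:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hive.hcatalog.api.HCatClient;
import org.apache.hive.hcatalog.api.HCatTable;

public class TableReplicationSketch {
  public static void main(String[] args) throws Exception {
    Configuration sourceConf = new Configuration(); // assumed: points at source metastore
    Configuration targetConf = new Configuration(); // assumed: points at target metastore
    HCatClient source = HCatClient.create(sourceConf);
    HCatClient target = HCatClient.create(targetConf);

    // Serialize on the source side and deserialize on the target side, so the
    // two clusters never share an HCatTable instance across class-loaders.
    String wireFormat = source.serializeTable(source.getTable("mydb", "mytable"));
    HCatTable sourceTable = target.deserializeTable(wireFormat);
    HCatTable targetTable = target.getTable("mydb", "mytable");

    // Copy over only the attributes that differ, then push the new definition.
    targetTable.resolve(sourceTable, targetTable.diff(sourceTable));
    target.updateTableSchema("mydb", "mytable", targetTable);
  }
}
{code}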
Re: Review Request 23799: HIVE-7390: refactor csv output format within RFC mode and add one more option to support formatting as the csv format in hive cli
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23799/ --- (Updated July 31, 2014, 5:34 a.m.) Review request for hive. Changes --- (1) fix code style issues (2) add option to specify the delimiter for DSV format (3) add delimiter-separated values format support Bugs: HIVE-7390 https://issues.apache.org/jira/browse/HIVE-7390 Repository: hive-git Description --- HIVE-7390: refactor csv output format within RFC mode and add one more option to support formatting as the csv format in hive cli Diffs (updated) - beeline/pom.xml 6ec1d1aff3f35c097aa6054aae84faf2d63854f1 beeline/src/java/org/apache/hive/beeline/BeeLine.java 528a98e29c23421f9352bdf7c5edd3a9fae0e3ea beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 75f7d38cb97fb753a8f39c19488b9ce0a8d77590 beeline/src/java/org/apache/hive/beeline/SeparatedValuesOutputFormat.java 7853c3f38f3c3fb9ae0b9939c714f1dc940ba053 beeline/src/main/resources/BeeLine.properties 390d062b8dc52dfa790c7351f3db44c1e0dd7e37 itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java bd97aff5959fd9040fc0f0a1f6b782f2aa6f pom.xml b5a5697e6a3b689c2b244ba0338be541261eaa3d Diff: https://reviews.apache.org/r/23799/diff/ Testing --- Thanks, cheng xu
[jira] [Commented] (HIVE-7348) Beeline could not parse ; separated queries provided with -e option
[ https://issues.apache.org/jira/browse/HIVE-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080508#comment-14080508 ] Hive QA commented on HIVE-7348: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658787/HIVE-7348.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5857 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hive.beeline.TestBeelineArgParsing.testQueryScripts {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/115/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/115/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-115/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12658787 > Beeline could not parse ; separated queries provided with -e option > --- > > Key: HIVE-7348 > URL: https://issues.apache.org/jira/browse/HIVE-7348 > Project: Hive > Issue Type: Bug >Reporter: Ashish Kumar Singh >Assignee: Ashish Kumar Singh > Attachments: HIVE-7348.1.patch, HIVE-7348.patch > > > Beeline could not parse ;-separated queries provided with the -e option. This > works fine in the Hive CLI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7526) Research to use groupby transformation to replace Hive existing partitionByKey and SparkCollector combination
[ https://issues.apache.org/jira/browse/HIVE-7526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-7526: --- Attachment: HIVE-7526.4-spark.patch Hi [~xuefuz], I took your suggestions and proposed another patch. Please take a look. Thanks. > Research to use groupby transformation to replace Hive existing > partitionByKey and SparkCollector combination > - > > Key: HIVE-7526 > URL: https://issues.apache.org/jira/browse/HIVE-7526 > Project: Hive > Issue Type: Task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Chao > Attachments: HIVE-7526.2.patch, HIVE-7526.3.patch, > HIVE-7526.4-spark.patch, HIVE-7526.patch > > > Currently SparkClient shuffles data by calling partitionByKey(). This > transformation outputs <key, value> tuples. However, Hive's ExecMapper > expects <key, iterator<value>> tuples, and Spark's groupByKey() seems > to output this directly. Thus, using groupByKey, we may be able to avoid its > own key clustering mechanism (in HiveReduceFunction). This research is to > give it a try. -- This message was sent by Atlassian JIRA (v6.2#6252)
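For reference, a toy example of the groupByKey() shape being discussed, using the plain Spark 1.x Java API; this is generic Spark usage, not Hive's SparkClient code:
{code}
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class GroupByKeySketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("groupByKey-sketch").setMaster("local"));
    JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(Arrays.asList(
        new Tuple2<String, Integer>("a", 1),
        new Tuple2<String, Integer>("a", 2),
        new Tuple2<String, Integer>("b", 3)));
    // groupByKey() yields <key, iterable-of-values> directly, e.g. (a, [1, 2])
    // and (b, [3]), with no manual key clustering on the consumer side.
    JavaPairRDD<String, Iterable<Integer>> grouped = pairs.groupByKey();
    for (Tuple2<String, Iterable<Integer>> entry : grouped.collect()) {
      System.out.println(entry._1() + " -> " + entry._2());
    }
    sc.stop();
  }
}
{code}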
[jira] [Commented] (HIVE-7532) allow disabling direct sql per query with external metastore
[ https://issues.apache.org/jira/browse/HIVE-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080468#comment-14080468 ] Navis commented on HIVE-7532: - It's applied per session. The HiveConf in ObjectStore is the one held in the thread-local of HMSHandler. > allow disabling direct sql per query with external metastore > > > Key: HIVE-7532 > URL: https://issues.apache.org/jira/browse/HIVE-7532 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Navis > Attachments: HIVE-7532.1.patch.txt, HIVE-7532.2.nogen, > HIVE-7532.2.patch.txt > > > Currently with external metastore, direct sql can only be disabled via > metastore config globally. Perhaps it makes sense to have the ability to > propagate the setting per query from client to override the metastore > setting, e.g. if one particular query causes it to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7562) Cleanup ExecReducer
[ https://issues.apache.org/jira/browse/HIVE-7562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080466#comment-14080466 ] Hive QA commented on HIVE-7562: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658783/HIVE-7562.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5857 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/114/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/114/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-114/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12658783 > Cleanup ExecReducer > --- > > Key: HIVE-7562 > URL: https://issues.apache.org/jira/browse/HIVE-7562 > Project: Hive > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Brock Noland > Attachments: HIVE-7562.patch > > > ExecReducer places member variables at random with random visibility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7567) support automatic calculating reduce task number
Chengxiang Li created HIVE-7567: --- Summary: support automatic calculating reduce task number Key: HIVE-7567 URL: https://issues.apache.org/jira/browse/HIVE-7567 Project: Hive Issue Type: Task Components: Spark Reporter: Chengxiang Li Hive has its own mechanism to calculate the reduce task number; we need to implement it for Spark jobs. -- This message was sent by Atlassian JIRA (v6.2#6252)
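For context, a sketch of the MapReduce-side heuristic being referred to: Hive estimates the reducer count from total input size, bounded by a configured maximum. The parameters below stand in for hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max; the exact logic lives elsewhere in Hive and may differ:
{code}
public class ReducerEstimateSketch {
  // One reducer per bytesPerReducer of input, at least 1, at most maxReducers.
  static int estimateReducers(long totalInputBytes, long bytesPerReducer, int maxReducers) {
    long estimated = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer;
    return (int) Math.max(1, Math.min(maxReducers, estimated));
  }

  public static void main(String[] args) {
    // e.g. 10 GB of input at 1 GB per reducer => 10 reducers
    System.out.println(estimateReducers(10L << 30, 1L << 30, 999));
  }
}
{code}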
[jira] [Updated] (HIVE-7432) Remove deprecated Avro's Schema.parse usages
[ https://issues.apache.org/jira/browse/HIVE-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Kumar Singh updated HIVE-7432: - Attachment: HIVE-7432.1.patch Schema.Parser maintains state and cannot be reused. Added a util method to take care of creating an Avro schema from a string, file, or input stream. This should take care of the test failures. > Remove deprecated Avro's Schema.parse usages > > > Key: HIVE-7432 > URL: https://issues.apache.org/jira/browse/HIVE-7432 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Kumar Singh >Assignee: Ashish Kumar Singh > Attachments: HIVE-7432.1.patch, HIVE-7432.patch > > > Schema.parse has been deprecated by Avro; however, it is still used in > multiple places in Hive. -- This message was sent by Atlassian JIRA (v6.2#6252)
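The replacement API Avro recommends is {{new Schema.Parser().parse(...)}}. A sketch of the kind of utility described above; the class and method names are illustrative, not necessarily those in the patch. Because a Schema.Parser remembers the named types it has already seen, a fresh parser is created per call:
{code}
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import org.apache.avro.Schema;

public final class AvroSchemaUtilsSketch {
  private AvroSchemaUtilsSketch() {}

  // A fresh Schema.Parser per call, since a parser instance keeps state
  // (previously seen named types) and cannot safely be reused.
  public static Schema getSchemaFor(String schemaJson) {
    return new Schema.Parser().parse(schemaJson);
  }

  public static Schema getSchemaFor(File schemaFile) throws IOException {
    return new Schema.Parser().parse(schemaFile);
  }

  public static Schema getSchemaFor(InputStream in) throws IOException {
    return new Schema.Parser().parse(in);
  }
}
{code}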
Re: Review Request 24081: HIVE-7432: Remove deprecated Avro's Schema.parse usages
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24081/ --- (Updated July 31, 2014, 3:49 a.m.) Review request for hive. Changes --- Parser.parse maintains state and can not be reused. Add a util method to take care of creating avro schema from string, file or inputstream. Bugs: HIVE-7432 https://issues.apache.org/jira/browse/HIVE-7432 Repository: hive-git Description --- HIVE-7432: Remove deprecated Avro's Schema.parse usages Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java 60b43888b957fe315720c4ee5562b9b67a07d0e2 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java b55474331736ecbdeb5958dad9342e132642d889 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 8c5cf3e87078fd87d0dc9b41d9545486d76903f3 serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaResolutionProblem.java 3dceb6384000e255e87df832f6189c80c636531b serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java 915f01679183904d0d93b9b8a88dc1a64ac2af78 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 198bd24dcb1c2552fd45b919ecb39ef7a29ed321 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java 76c1940fb05a0c8c6b74d570d6d788829e17de01 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerde.java 072225dcc80bfdb84f0a31f67693616393c264df serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 67d557082eec88eefdde76cb1fead6d51f7784a4 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da44312c2ad9b4dd2bab2aa242692a42d5a serde/src/test/org/apache/hadoop/hive/serde2/avro/TestGenericAvroRecordWritable.java cf3b16ce65d07c0a714530ef4a26adef7188ea2e serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 8dd61097433ab0c2b1c3e326978bf06337f815e6 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestThatEvolvedSchemasActAsWeWant.java 4b8cc98bfc75ea01f25944d7833f00da1b6911f0 Diff: https://reviews.apache.org/r/24081/diff/ Testing --- qTests Thanks, Ashish Singh
[jira] [Updated] (HIVE-7565) Fix exception in Greedy Join reordering Algo
[ https://issues.apache.org/jira/browse/HIVE-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7565: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch. Thanks [~jpullokkaran]! > Fix exception in Greedy Join reordering Algo > > > Key: HIVE-7565 > URL: https://issues.apache.org/jira/browse/HIVE-7565 > Project: Hive > Issue Type: Sub-task >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > Attachments: HIVE-7565.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7436) Load Spark configuration into Hive driver
[ https://issues.apache.org/jira/browse/HIVE-7436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080441#comment-14080441 ] Chengxiang Li commented on HIVE-7436: - In Hive on Tez mode, the Hive driver loads tez-site.xml via TezConfiguration, based on Configuration.java::addDefaultResource(). > Load Spark configuration into Hive driver > - > > Key: HIVE-7436 > URL: https://issues.apache.org/jira/browse/HIVE-7436 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li > Fix For: spark-branch > > Attachments: HIVE-7436-Spark.1.patch, HIVE-7436-Spark.2.patch, > HIVE-7436-Spark.3.patch > > > Load Spark configuration into the Hive driver. There are 3 ways to set up Spark > configurations: > # Java property. > # Configure properties in the Spark configuration file (spark-defaults.conf). > # Hive configuration file (hive-site.xml). > A configuration lower in this list has higher priority and overwrites any previous > configuration with the same property name. > Please refer to [http://spark.apache.org/docs/latest/configuration.html] for > all configurable properties of Spark; you can configure Spark > in Hive in the following ways: > # Configure through the Spark configuration file. > #* Create spark-defaults.conf, and place it in the /etc/spark/conf > configuration directory. Configure properties in spark-defaults.conf in Java > properties format. > #* Create the $SPARK_CONF_DIR environment variable and set it to the location > of spark-defaults.conf. > export SPARK_CONF_DIR=/etc/spark/conf > #* Add $SPARK_CONF_DIR to the $HADOOP_CLASSPATH environment variable. > export HADOOP_CLASSPATH=$SPARK_CONF_DIR:$HADOOP_CLASSPATH > # Configure through the Hive configuration file. > #* Edit hive-site.xml in the Hive conf directory; configure the same properties in > XML format. > Hive driver default Spark properties: > ||name||default value||description|| > |spark.master|local|Spark master URL.| > |spark.app.name|Hive on Spark|Default Spark application name.| > NO PRECOMMIT TESTS. This is for spark-branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
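A sketch of the addDefaultResource() pattern referred to in the comment, mirroring what TezConfiguration does for tez-site.xml. The class and resource name below are hypothetical, and note that spark-defaults.conf itself is a Java-properties file, so it could not be loaded this way without conversion to XML:
{code}
import org.apache.hadoop.conf.Configuration;

public class SparkSiteConfiguration extends Configuration {
  static {
    // Every Configuration created after this call also loads the named
    // classpath resource, the same way TezConfiguration registers tez-site.xml.
    Configuration.addDefaultResource("spark-site.xml"); // hypothetical resource name
  }
}
{code}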
[jira] [Commented] (HIVE-7532) allow disabling direct sql per query with external metastore
[ https://issues.apache.org/jira/browse/HIVE-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080426#comment-14080426 ] Sergey Shelukhin commented on HIVE-7532: Will this change the configuration for the query/session, or for the entire metastore or thread, including other users? I thought it should be possible to send the setting with individual calls to the metastore, which is less clean... but it seems like this patch will reconfigure the metastore > allow disabling direct sql per query with external metastore > > > Key: HIVE-7532 > URL: https://issues.apache.org/jira/browse/HIVE-7532 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Navis > Attachments: HIVE-7532.1.patch.txt, HIVE-7532.2.nogen, > HIVE-7532.2.patch.txt > > > Currently with external metastore, direct sql can only be disabled via > metastore config globally. Perhaps it makes sense to have the ability to > propagate the setting per query from client to override the metastore > setting, e.g. if one particular query causes it to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7547) Add ipAddress and userName to ExecHook
[ https://issues.apache.org/jira/browse/HIVE-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080424#comment-14080424 ] Hive QA commented on HIVE-7547: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658784/HIVE-7547.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5859 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/113/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/113/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-113/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12658784 > Add ipAddress and userName to ExecHook > -- > > Key: HIVE-7547 > URL: https://issues.apache.org/jira/browse/HIVE-7547 > Project: Hive > Issue Type: New Feature > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-7547.2.patch, HIVE-7547.3.patch, HIVE-7547.4.patch, > HIVE-7547.patch > > > Auditing tools should be able to know about the ipAddress and userName of the > user executing operations. > These could be made available through the Hive execution-hooks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7390) Make quote character optional and configurable in BeeLine CSV/TSV output
[ https://issues.apache.org/jira/browse/HIVE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080407#comment-14080407 ] Ferdinand Xu commented on HIVE-7390: Thanks to Lars Francke and Szehon Ho for your comments. For the current CSV and TSV formats, we just make them work in the right way (quoted at the correct time); for customized delimiter support, I think we can add a new output format called DSV (short for delimiter-separated values) and one BeeLine option to let the user specify the delimiter. > Make quote character optional and configurable in BeeLine CSV/TSV output > > > Key: HIVE-7390 > URL: https://issues.apache.org/jira/browse/HIVE-7390 > Project: Hive > Issue Type: New Feature > Components: Clients >Affects Versions: 0.13.1 >Reporter: Jim Halfpenny >Assignee: Ferdinand Xu > Attachments: HIVE-7390.1.patch, HIVE-7390.2.patch, HIVE-7390.3.patch, > HIVE-7390.4.patch, HIVE-7390.patch > > > Currently, when either the CSV or TSV output formats are used in beeline, each > column is wrapped in single quotes. Quote wrapping of columns should be > optional and the user should be able to choose the character used to wrap the > columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7565) Fix exception in Greedy Join reordering Algo
[ https://issues.apache.org/jira/browse/HIVE-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-7565: - Status: Patch Available (was: Open) > Fix exception in Greedy Join reordering Algo > > > Key: HIVE-7565 > URL: https://issues.apache.org/jira/browse/HIVE-7565 > Project: Hive > Issue Type: Sub-task >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > Attachments: HIVE-7565.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7565) Fix exception in Greedy Join reordering Algo
[ https://issues.apache.org/jira/browse/HIVE-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-7565: - Attachment: HIVE-7565.patch > Fix exception in Greedy Join reordering Algo > > > Key: HIVE-7565 > URL: https://issues.apache.org/jira/browse/HIVE-7565 > Project: Hive > Issue Type: Sub-task >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > Attachments: HIVE-7565.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7566) HIVE can't count hbase NULL column value properly
Kent Kong created HIVE-7566: --- Summary: HIVE can't count hbase NULL column value properly Key: HIVE-7566 URL: https://issues.apache.org/jira/browse/HIVE-7566 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.13.0 Environment: HIVE version 0.13.0 HBase version 0.98.0 Reporter: Kent Kong The HBase table structure is like this: table name : 'testtable' column family : 'data' column 1 : 'name' column 2 : 'color' The Hive mapping table structure is like this: table name : 'hb_testtable' column 1 : 'name' column 2 : 'color' In HBase, put two rows: 'James' with color 'blue', and 'May' with no color value. Then do a select in Hive: select * from hb_testtable where color is null. The result is: May, NULL. Then try a count: select count(*) from hb_testtable where color is null. The result is 0, which should be 1. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7565) Fix exception in Greedy Join reordering Algo
Laljo John Pullokkaran created HIVE-7565: Summary: Fix exception in Greedy Join reordering Algo Key: HIVE-7565 URL: https://issues.apache.org/jira/browse/HIVE-7565 Project: Hive Issue Type: Sub-task Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6437) DefaultHiveAuthorizationProvider should not initialize a new HiveConf
[ https://issues.apache.org/jira/browse/HIVE-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080395#comment-14080395 ] Navis commented on HIVE-6437: - [~thejas] Updated the patch, thanks. > DefaultHiveAuthorizationProvider should not initialize a new HiveConf > - > > Key: HIVE-6437 > URL: https://issues.apache.org/jira/browse/HIVE-6437 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 0.13.0 >Reporter: Harsh J >Assignee: Navis >Priority: Trivial > Attachments: HIVE-6437.1.patch.txt, HIVE-6437.2.patch.txt, > HIVE-6437.3.patch.txt, HIVE-6437.4.patch.txt, HIVE-6437.5.patch.txt, > HIVE-6437.6.patch.txt, HIVE-6437.7.patch.txt > > > During an HS2 connection, every SessionState initializes a new > DefaultHiveAuthorizationProvider object (on stock configs). > In turn, DefaultHiveAuthorizationProvider carries a {{new HiveConf(…)}} that > may prove too expensive and unnecessary, since SessionState itself > sends in a fully applied HiveConf in the first place. -- This message was sent by Atlassian JIRA (v6.2#6252)
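A before/after sketch of the change under discussion, with simplified class and method shapes (illustrative only, not the actual patch): the fix is to reuse the fully applied configuration the SessionState already hands in, rather than constructing a fresh HiveConf.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;

public class AuthorizationProviderSketch {
  private Configuration conf;

  // Before: expensive; re-reads hive-site.xml and re-applies every default.
  public void initExpensive() {
    this.conf = new HiveConf();
  }

  // After: cheap; reuse the HiveConf the caller (SessionState) already built.
  public void initCheap(Configuration sessionConf) {
    this.conf = sessionConf;
  }
}
{code}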
[jira] [Updated] (HIVE-6437) DefaultHiveAuthorizationProvider should not initialize a new HiveConf
[ https://issues.apache.org/jira/browse/HIVE-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6437: Attachment: HIVE-6437.7.patch.txt > DefaultHiveAuthorizationProvider should not initialize a new HiveConf > - > > Key: HIVE-6437 > URL: https://issues.apache.org/jira/browse/HIVE-6437 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 0.13.0 >Reporter: Harsh J >Assignee: Navis >Priority: Trivial > Attachments: HIVE-6437.1.patch.txt, HIVE-6437.2.patch.txt, > HIVE-6437.3.patch.txt, HIVE-6437.4.patch.txt, HIVE-6437.5.patch.txt, > HIVE-6437.6.patch.txt, HIVE-6437.7.patch.txt > > > During an HS2 connection, every SessionState initializes a new > DefaultHiveAuthorizationProvider object (on stock configs). > In turn, DefaultHiveAuthorizationProvider carries a {{new HiveConf(…)}} that > may prove too expensive and unnecessary, since SessionState itself > sends in a fully applied HiveConf in the first place. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24043: DefaultHiveAuthorizationProvider should not initialize a new HiveConf
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24043/ --- (Updated July 31, 2014, 2:05 a.m.) Review request for hive. Changes --- Addressed comments Bugs: HIVE-6437 https://issues.apache.org/jira/browse/HIVE-6437 Repository: hive-git Description --- During a HS2 connection, every SessionState got initializes a new DefaultHiveAuthorizationProvider object (on stock configs). In turn, DefaultHiveAuthorizationProvider carries a {{new HiveConf(…)}} that may prove too expensive, and unnecessary to do, since SessionState itself sends in a fully applied HiveConf to it in the first place. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 contrib/src/java/org/apache/hadoop/hive/contrib/metastore/hooks/TestURLHook.java 39562ea contrib/src/test/queries/clientnegative/url_hook.q c346432 contrib/src/test/queries/clientpositive/url_hook.q PRE-CREATION contrib/src/test/results/clientnegative/url_hook.q.out 601fd93 contrib/src/test/results/clientpositive/url_hook.q.out PRE-CREATION data/conf/hive-site.xml fe8080a itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java e8d405d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreVersion.java 0bb022e itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2fefa06 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 5cc1cd8 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java d26183b metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 5add436 metastore/src/java/org/apache/hadoop/hive/metastore/RawStoreProxy.java 1cf09d4 ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 81323f6 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/DefaultHiveAuthorizationProvider.java 2fa512c ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java 0dfd997 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveRoleGrant.java ce07f32 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java ce12edb ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java d218271 ql/src/test/queries/clientnegative/authorization_cannot_create_all_role.q de91e91 ql/src/test/queries/clientnegative/authorization_cannot_create_default_role.q 42a42f6 ql/src/test/queries/clientnegative/authorization_cannot_create_none_role.q 0d14cde ql/src/test/queries/clientnegative/authorization_caseinsensitivity.q d5ea284 ql/src/test/queries/clientnegative/authorization_drop_db_cascade.q edeae9b ql/src/test/queries/clientnegative/authorization_drop_db_empty.q 46d4d0f ql/src/test/queries/clientnegative/authorization_drop_role_no_admin.q a7aa17f ql/src/test/queries/clientnegative/authorization_priv_current_role_neg.q 463358a ql/src/test/queries/clientnegative/authorization_role_cycles1.q a819d20 ql/src/test/queries/clientnegative/authorization_role_cycles2.q 423f030 ql/src/test/queries/clientnegative/authorization_role_grant.q c5c500a ql/src/test/queries/clientnegative/authorization_role_grant2.q 7fdf157 ql/src/test/queries/clientnegative/authorization_role_grant_nosuchrole.q f456165 ql/src/test/queries/clientnegative/authorization_role_grant_otherrole.q f91abdb ql/src/test/queries/clientnegative/authorization_role_grant_otheruser.q a530043 ql/src/test/queries/clientnegative/authorization_rolehierarchy_privs.q d9f4c7c ql/src/test/queries/clientnegative/authorization_set_role_neg2.q 03f748f 
ql/src/test/queries/clientnegative/authorization_show_grant_otherrole.q a709d16 ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_all.q 2073cda ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_alltabs.q 672b81b ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_wtab.q 7d95a9d ql/src/test/queries/clientpositive/authorization_1_sql_std.q 381937c ql/src/test/queries/clientpositive/authorization_admin_almighty1.q 45c4a7d ql/src/test/queries/clientpositive/authorization_admin_almighty2.q ce99670 ql/src/test/queries/clientpositive/authorization_create_func1.q 65a7b33 ql/src/test/queries/clientpositive/authorization_create_macro1.q fb60500 ql/src/test/queries/clientpositive/authorization_insert.q 6cce469 ql/src/test/queries/clientpositive/authorization_owner_actions_db.q 36ab260 ql/src/test/queries/clientpositive/authorization_role_grant1.q c062ef2 ql/src/test/queries/clientpositive/authorization_role_grant2.q 34e19a2 ql/src/test/queries/clientpositive/authorization_set_show_current_role.q 6b5af6e ql/src/test/queries/clientpositive/authorization_show_g
[jira] [Commented] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080394#comment-14080394 ] Hive QA commented on HIVE-7096: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658805/HIVE-7096.5.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/112/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/112/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-112/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-112/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'itests/qtest/testconfiguration.properties' Reverted 'ql/src/test/results/clientpositive/vectorization_9.q.out' Reverted 'ql/src/test/results/clientpositive/vectorization_14.q.out' Reverted 'ql/src/test/results/clientpositive/vectorization_16.q.out' Reverted 'ql/src/test/results/clientpositive/tez/vectorization_15.q.out' Reverted 'ql/src/test/results/clientpositive/vectorization_15.q.out' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/optimizer/physical/TestVectorizer.java' Reverted 'ql/src/test/queries/clientpositive/vectorization_15.q' Reverted 'ql/src/test/queries/clientpositive/vectorization_9.q' Reverted 'ql/src/test/queries/clientpositive/vectorization_14.q' Reverted 'ql/src/test/queries/clientpositive/vectorization_16.q' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientpositive/tez/vectorized_shufflejoin.q.out ql/src/test/results/clientpositive/tez/vectorization_9.q.out ql/src/test/results/clientpositive/tez/vectorized_timestamp_funcs.q.out ql/src/test/results/clientpositive/tez/vectorization_13.q.out ql/src/test/results/clientpositive/tez/vectorization_part_project.q.out ql/src/test/results/clientpositive/tez/vectorized_nested_mapjoin.q.out ql/src/test/results/clientpositive/tez/vectorization_short_regress.q.out ql/src/test/results/clientpositive/tez/vectorization_12.q.out ql
[jira] [Commented] (HIVE-7029) Vectorize ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080392#comment-14080392 ] Hive QA commented on HIVE-7029: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658713/HIVE-7029.8.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5834 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/111/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/111/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-111/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12658713 > Vectorize ReduceWork > > > Key: HIVE-7029 > URL: https://issues.apache.org/jira/browse/HIVE-7029 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7029.1.patch, HIVE-7029.2.patch, HIVE-7029.3.patch, > HIVE-7029.4.patch, HIVE-7029.5.patch, HIVE-7029.6.patch, HIVE-7029.7.patch, > HIVE-7029.8.patch > > > This will enable the vectorization team to independently work on vectorization on > the reduce side even before vectorized shuffle is ready. > NOTE: Tez only (i.e. TezTask only) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats
[ https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-7420: - Attachment: (was: HIVE-7420.3.patch) > Parameterize tests for HCatalog Pig interfaces for testing against all > storage formats > -- > > Key: HIVE-7420 > URL: https://issues.apache.org/jira/browse/HIVE-7420 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Reporter: David Chen >Assignee: David Chen > Attachments: HIVE-7420-without-HIVE-7457.2.patch, > HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, > HIVE-7420.3.patch > > > Currently, HCatalog tests only test against RCFile with a few testing against > ORC. The tests should be covering other Hive storage formats as well. > HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with > all Hive storage formats and with that patch, all test suites built on > HCatMapReduceTest are running and passing against Sequence File, Text, and > ORC in addition to RCFile. > Similar changes should be made to make the tests for HCatLoader and > HCatStorer generic so that they can be run against all Hive storage formats. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7532) allow disabling direct sql per query with external metastore
[ https://issues.apache.org/jira/browse/HIVE-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080384#comment-14080384 ] Navis commented on HIVE-7532: - Attached the patch without generated sources, and posted an RB link. > allow disabling direct sql per query with external metastore > > > Key: HIVE-7532 > URL: https://issues.apache.org/jira/browse/HIVE-7532 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Navis > Attachments: HIVE-7532.1.patch.txt, HIVE-7532.2.nogen, > HIVE-7532.2.patch.txt > > > Currently with external metastore, direct sql can only be disabled via > metastore config globally. Perhaps it makes sense to have the ability to > propagate the setting per query from client to override the metastore > setting, e.g. if one particular query causes it to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats
[ https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-7420: - Attachment: HIVE-7420.3.patch HIVE-7420-without-HIVE-7457.3.patch > Parameterize tests for HCatalog Pig interfaces for testing against all > storage formats > -- > > Key: HIVE-7420 > URL: https://issues.apache.org/jira/browse/HIVE-7420 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Reporter: David Chen >Assignee: David Chen > Attachments: HIVE-7420-without-HIVE-7457.2.patch, > HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, > HIVE-7420.3.patch > > > Currently, HCatalog tests only test against RCFile with a few testing against > ORC. The tests should be covering other Hive storage formats as well. > HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with > all Hive storage formats and with that patch, all test suites built on > HCatMapReduceTest are running and passing against Sequence File, Text, and > ORC in addition to RCFile. > Similar changes should be made to make the tests for HCatLoader and > HCatStorer generic so that they can be run against all Hive storage formats. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats
[ https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-7420: - Attachment: (was: HIVE-7420-without-HIVE-7457.3.patch) > Parameterize tests for HCatalog Pig interfaces for testing against all > storage formats > -- > > Key: HIVE-7420 > URL: https://issues.apache.org/jira/browse/HIVE-7420 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Reporter: David Chen >Assignee: David Chen > Attachments: HIVE-7420-without-HIVE-7457.2.patch, > HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, > HIVE-7420.3.patch > > > Currently, HCatalog tests only test against RCFile with a few testing against > ORC. The tests should be covering other Hive storage formats as well. > HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with > all Hive storage formats and with that patch, all test suites built on > HCatMapReduceTest are running and passing against Sequence File, Text, and > ORC in addition to RCFile. > Similar changes should be made to make the tests for HCatLoader and > HCatStorer generic so that they can be run against all Hive storage formats. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 23797: HIVE-7420: Parameterize tests for HCatalog Pig interfaces for testing against all storage formats.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23797/ --- (Updated July 31, 2014, 1:38 a.m.) Review request for hive. Bugs: HIVE-7420 https://issues.apache.org/jira/browse/HIVE-7420 Repository: hive-git Description --- HIVE-7420: Parameterize tests for HCatalog Pig interfaces for testing against all storage formats. HIVE-7457: Minor HCatalog Pig Adapter test clean up. Diffs (updated) - hcatalog/hcatalog-pig-adapter/pom.xml 4d2ca519d413b7de0a6a8b50f9a099c3539fc432 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/MockLoader.java c87b95a00af03d2531eb8bbdda4e307c3aac1fe2 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestE2EScenarios.java a4b55c8463b3563f1e602ae2d0809dd318bcfa7f hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoader.java 82fc8a9391667138780be8796931793661f61ebb hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderComplexSchema.java eadbf20afc525dd9f33e9e7fb2a5d5cb89907d7e hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorer.java fcfc6428e7db80b8bfe0ce10e37d7b0ee6e58e20 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerMulti.java 76080f7635548ed9af114c823180d8da9ea8f6c2 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerWrapper.java 7f0bca763eb07db3822c6d6028357e81278803c9 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestOrcHCatLoader.java 82eb0d72b4f885184c094113f775415c06bdce98 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestOrcHCatLoaderComplexSchema.java 05387711289279cab743f51aee791069609b904a hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestOrcHCatPigStorer.java a9b452101c15fb7a3f0d8d0339f7d0ad97383441 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestOrcHCatStorer.java 1084092828a9ac5e37f5b50b9c6bbd03f70b48fd hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestPigHCatUtil.java a8ce61aaad42b03e4de346530d0724f3d69776b9 ql/src/test/org/apache/hadoop/hive/ql/io/StorageFormats.java 19fdeb5ed3dba7a3bcba71fb285d92d3f6aabea9 Diff: https://reviews.apache.org/r/23797/diff/ Testing --- Thanks, David Chen
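For readers unfamiliar with the pattern, the restructuring above follows standard JUnit 4 parameterization: each suite runs once per storage format instead of having per-format subclasses such as TestOrcHCatLoader. A minimal sketch under that assumption; the class and format list below are illustrative, while the real patch presumably sources its formats from the StorageFormats helper added in the diff:
{code}
import java.util.Arrays;
import java.util.Collection;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class StorageFormatTestSketch {
  private final String storageFormat;

  // Illustrative format list; the patch would enumerate these centrally.
  @Parameters
  public static Collection<Object[]> formats() {
    return Arrays.asList(new Object[][] {
        {"RCFILE"}, {"SEQUENCEFILE"}, {"TEXTFILE"}, {"ORC"}
    });
  }

  public StorageFormatTestSketch(String storageFormat) {
    this.storageFormat = storageFormat;
  }

  @Test
  public void testRoundTrip() {
    // CREATE TABLE ... STORED AS <storageFormat>, write with HCatStorer,
    // read back with HCatLoader, and compare the results.
  }
}
{code}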
[jira] [Updated] (HIVE-7532) allow disabling direct sql per query with external metastore
[ https://issues.apache.org/jira/browse/HIVE-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7532: Attachment: HIVE-7532.2.nogen > allow disabling direct sql per query with external metastore > > > Key: HIVE-7532 > URL: https://issues.apache.org/jira/browse/HIVE-7532 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Navis > Attachments: HIVE-7532.1.patch.txt, HIVE-7532.2.nogen, > HIVE-7532.2.patch.txt > > > Currently with external metastore, direct sql can only be disabled via > metastore config globally. Perhaps it makes sense to have the ability to > propagate the setting per query from client to override the metastore > setting, e.g. if one particular query causes it to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7457) Minor HCatalog Pig Adapter test clean up
[ https://issues.apache.org/jira/browse/HIVE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-7457: - Attachment: HIVE-7457.4.patch Attach a new patch rebased on trunk. > Minor HCatalog Pig Adapter test clean up > > > Key: HIVE-7457 > URL: https://issues.apache.org/jira/browse/HIVE-7457 > Project: Hive > Issue Type: Sub-task >Reporter: David Chen >Assignee: David Chen >Priority: Minor > Attachments: HIVE-7457.1.patch, HIVE-7457.2.patch, HIVE-7457.3.patch, > HIVE-7457.4.patch > > > Minor cleanup to the HCatalog Pig Adapter tests in preparation for HIVE-7420: > * Run through Hive Eclipse formatter. > * Convert JUnit 3-style tests to follow JUnit 4 conventions. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 24137: allow disabling direct sql per query with external metastore
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24137/ --- Review request for hive. Bugs: HIVE-7532 https://issues.apache.org/jira/browse/HIVE-7532 Repository: hive-git Description --- Currently with external metastore, direct sql can only be disabled via metastore config globally. Perhaps it makes sense to have the ability to propagate the setting per query from client to override the metastore setting, e.g. if one particular query causes it to fail. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 common/src/java/org/apache/hadoop/hive/conf/SystemVariables.java ee98d17 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java 9e416b5 metastore/if/hive_metastore.thrift 55f41db metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 5cc1cd8 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java d26183b metastore/src/java/org/apache/hadoop/hive/metastore/IHMSHandler.java 1675751 metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 5add436 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java c28c46a metastore/src/java/org/apache/hadoop/hive/metastore/RawStoreProxy.java 1cf09d4 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 86172b9 metastore/src/java/org/apache/hadoop/hive/metastore/events/ConfigChangeEvent.java PRE-CREATION Diff: https://reviews.apache.org/r/24137/diff/ Testing --- Thanks, Navis Ryu
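Reading the diff above (SystemVariables, HiveMetaStoreClient, and a new ConfigChangeEvent), the client-side override plausibly surfaces as a session-level set command. This is a hedged sketch only: the metaconf: prefix syntax is inferred from the touched files, and hive.metastore.try.direct.sql is trunk's existing global knob, neither confirmed by this thread:
{code}
-- Override the external metastore's global setting for this client session,
-- e.g. when one particular query fails on the direct-SQL path.
set metaconf:hive.metastore.try.direct.sql=false;

-- Subsequent metastore calls from this session use the ORM path;
-- other sessions and the metastore's own config are unaffected.
{code}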
[jira] [Updated] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats
[ https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-7420: - Attachment: HIVE-7420-without-HIVE-7457.3.patch HIVE-7420.3.patch Attached new patch rebased on trunk. > Parameterize tests for HCatalog Pig interfaces for testing against all > storage formats > -- > > Key: HIVE-7420 > URL: https://issues.apache.org/jira/browse/HIVE-7420 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Reporter: David Chen >Assignee: David Chen > Attachments: HIVE-7420-without-HIVE-7457.2.patch, > HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, > HIVE-7420.3.patch > > > Currently, HCatalog tests only test against RCFile with a few testing against > ORC. The tests should be covering other Hive storage formats as well. > HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with > all Hive storage formats and with that patch, all test suites built on > HCatMapReduceTest are running and passing against Sequence File, Text, and > ORC in addition to RCFile. > Similar changes should be made to make the tests for HCatLoader and > HCatStorer generic so that they can be run against all Hive storage formats. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7532) allow disabling direct sql per query with external metastore
[ https://issues.apache.org/jira/browse/HIVE-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080349#comment-14080349 ] Sergey Shelukhin commented on HIVE-7532: There are some unrelated generated code changes. I actually noticed I get similar changes, but I would usually reset unrelated files. I wonder if everyone gets those and, if so, whether we should update it in a separate jira. Is it possible to post an RB? The patch is quite big. > allow disabling direct sql per query with external metastore > > > Key: HIVE-7532 > URL: https://issues.apache.org/jira/browse/HIVE-7532 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Navis > Attachments: HIVE-7532.1.patch.txt, HIVE-7532.2.patch.txt > > > Currently with external metastore, direct sql can only be disabled via > metastore config globally. Perhaps it makes sense to have the ability to > propagate the setting per query from client to override the metastore > setting, e.g. if one particular query causes it to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7559) StarterProject: Move configuration from SparkClient to HiveConf
[ https://issues.apache.org/jira/browse/HIVE-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080341#comment-14080341 ] Xuefu Zhang commented on HIVE-7559: --- [~brocknoland] Thanks for noticing this, but I think we're not in a hurry to do this, as there has been ongoing discussion about the configuration business (HIVE-7436). We will need a followup discussion on the topic. It appears that this can wait until we make a final decision. > StarterProject: Move configuration from SparkClient to HiveConf > --- > > Key: HIVE-7559 > URL: https://issues.apache.org/jira/browse/HIVE-7559 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Brock Noland >Priority: Minor > Labels: StarterProject > > The SparkClient class has some configuration keys and defaults. These should > be moved to HiveConf. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7563) ClassLoader should be released from LogFactory
[ https://issues.apache.org/jira/browse/HIVE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080343#comment-14080343 ] Szehon Ho commented on HIVE-7563: - Looks reasonable, +1 > ClassLoader should be released from LogFactory > -- > > Key: HIVE-7563 > URL: https://issues.apache.org/jira/browse/HIVE-7563 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis > Attachments: HIVE-7563.1.patch.txt > > > NO PRECOMMIT TESTS > LogFactory uses ClassLoader as a key in map, which makes the classloader > impossible to be unloaded. LogFactory.release() should be called explicitly. -- This message was sent by Atlassian JIRA (v6.2#6252)
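For context on the leak itself: commons-logging's LogFactory keeps a static cache keyed by ClassLoader, so any loader that ever logged through it stays strongly referenced. A minimal sketch of the explicit release the description calls for, assuming commons-logging's public LogFactory.release(ClassLoader) API; where Hive's patch actually places the call is not shown in this thread:
{code}
import org.apache.commons.logging.LogFactory;

public class LoaderScope {
  /** Runs work under the given classloader, then releases its log cache entry. */
  public static void run(ClassLoader loader, Runnable work) {
    ClassLoader previous = Thread.currentThread().getContextClassLoader();
    Thread.currentThread().setContextClassLoader(loader);
    try {
      work.run();
    } finally {
      Thread.currentThread().setContextClassLoader(previous);
      // Without this, the LogFactory cache pins 'loader' and it can never be
      // garbage collected, even after all other references are gone.
      LogFactory.release(loader);
    }
  }
}
{code}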
[jira] [Updated] (HIVE-7564) Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7564: -- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Patch committed to spark branch. > Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch] > -- > > Key: HIVE-7564 > URL: https://issues.apache.org/jira/browse/HIVE-7564 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: spark-branch > > Attachments: HIVE-7564.patch > > > NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7564) Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7564: -- Status: Patch Available (was: Open) > Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch] > -- > > Key: HIVE-7564 > URL: https://issues.apache.org/jira/browse/HIVE-7564 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-7564.patch > > > NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7564) Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7564: -- Attachment: HIVE-7564.patch > Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch] > -- > > Key: HIVE-7564 > URL: https://issues.apache.org/jira/browse/HIVE-7564 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: spark-branch > > Attachments: HIVE-7564.patch > > > NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080338#comment-14080338 ] David Chen commented on HIVE-4329: -- Some notes about this patch: * {{\*OutputFormatContainer}} classes now wrap a {{HiveOutputFormat}} rather than a mapred {{OutputFormat}}. * {{\*RecordWriterContainer}} classes now wrap a {{FileSinkOperator.RecordWriter}} rather than a mapred {{RecordWriter}}. * {{InternalUtil.initializeOutputSerDe}} and {{InternalUtil.initializeDeserializer}} now take the properties from the {{TableDesc}} created from the table contained in {{HCatTableInfo}} rather than creating the properties manually. As a result, {{InternalUtil.setSerDeProperties}} has been removed. * Fixed a {{NullPointerException}} in {{AvroSerDe.initialize}} that occurs if {{columnCommentProperty}} is null. Test coverage: * Remove disabled Serde list from {{HCatMapReduceTest}} so that all {{HCatMapReduceTest}} suites are also run against {{AvroSerDe}} and {{ParquetHiveSerDe}}. To do: * Fix case where static partitioning is used. * Clean up if necessary * Remove diagnostic print statements. > HCatalog should use getHiveRecordWriter rather than getRecordWriter > --- > > Key: HIVE-4329 > URL: https://issues.apache.org/jira/browse/HIVE-4329 > Project: Hive > Issue Type: Bug > Components: HCatalog, Serializers/Deserializers >Affects Versions: 0.14.0 > Environment: discovered in Pig, but it looks like the root cause > impacts all non-Hive users >Reporter: Sean Busbey >Assignee: David Chen > Attachments: HIVE-4329.0.patch > > > Attempting to write to a HCatalog defined table backed by the AvroSerde fails > with the following stacktrace: > {code} > java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be > cast to org.apache.hadoop.io.LongWritable > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) > at > org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) > at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > {code} > The proximal cause of this failure is that the AvroContainerOutputFormat's > signature mandates a LongWritable key and HCat's FileRecordWriterContainer > forces a NullWritable. I'm not sure of a general fix, other than redefining > HiveOutputFormat to mandate a WritableComparable. > It looks like accepting WritableComparable is what's done in the other Hive > OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also > be changed, since it's ignoring the key. That way fixing things so > FileRecordWriterContainer can always use NullWritable could get spun into a > different issue? 
> The underlying cause for failure to write to AvroSerde tables is that > AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so > fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
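On the AvroSerDe NullPointerException listed in the notes above, the fix is presumably a null guard of roughly the following shape. This is a sketch, not the patch: the property key "columns.comments" and the comma-separated handling are assumptions based on Hive's usual serde conventions:
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class ColumnCommentsSketch {
  /** Tolerates tables with no column comments instead of NPE'ing on split(). */
  static List<String> columnComments(Properties tableProperties) {
    String columnCommentProperty = tableProperties.getProperty("columns.comments");
    if (columnCommentProperty == null || columnCommentProperty.isEmpty()) {
      return new ArrayList<String>();
    }
    return Arrays.asList(columnCommentProperty.split(","));
  }
}
{code}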
[jira] [Updated] (HIVE-7564) Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7564: -- Description: NO PRECOMMIT TESTS. This is for spark branch only. > Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch] > -- > > Key: HIVE-7564 > URL: https://issues.apache.org/jira/browse/HIVE-7564 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7564) Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch]
Xuefu Zhang created HIVE-7564: - Summary: Remove some redundant code plus a bit of cleanup in SparkClient [Spark Branch] Key: HIVE-7564 URL: https://issues.apache.org/jira/browse/HIVE-7564 Project: Hive Issue Type: Improvement Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: HIVE-7096.5.patch > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.5.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-4329: - Status: Patch Available (was: In Progress) > HCatalog should use getHiveRecordWriter rather than getRecordWriter > --- > > Key: HIVE-4329 > URL: https://issues.apache.org/jira/browse/HIVE-4329 > Project: Hive > Issue Type: Bug > Components: HCatalog, Serializers/Deserializers >Affects Versions: 0.10.0 > Environment: discovered in Pig, but it looks like the root cause > impacts all non-Hive users >Reporter: Sean Busbey >Assignee: David Chen > Attachments: HIVE-4329.0.patch > > > Attempting to write to a HCatalog defined table backed by the AvroSerde fails > with the following stacktrace: > {code} > java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be > cast to org.apache.hadoop.io.LongWritable > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) > at > org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) > at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > {code} > The proximal cause of this failure is that the AvroContainerOutputFormat's > signature mandates a LongWritable key and HCat's FileRecordWriterContainer > forces a NullWritable. I'm not sure of a general fix, other than redefining > HiveOutputFormat to mandate a WritableComparable. > It looks like accepting WritableComparable is what's done in the other Hive > OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also > be changed, since it's ignoring the key. That way fixing things so > FileRecordWriterContainer can always use NullWritable could get spun into a > different issue? > The underlying cause for failure to write to AvroSerde tables is that > AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so > fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-4329: - Attachment: HIVE-4329.0.patch Writing via HCatalog is now working for both Avro and Parquet Serdes for everything except static partitioning. For static partitioning, there is a mismatch between the expected schema and the schema set in the table properties due to the partition column not being present; I am looking into this problem right now. I am uploading a patch for initial review and to run through pre-commit tests. > HCatalog should use getHiveRecordWriter rather than getRecordWriter > --- > > Key: HIVE-4329 > URL: https://issues.apache.org/jira/browse/HIVE-4329 > Project: Hive > Issue Type: Bug > Components: HCatalog, Serializers/Deserializers >Affects Versions: 0.10.0 > Environment: discovered in Pig, but it looks like the root cause > impacts all non-Hive users >Reporter: Sean Busbey >Assignee: David Chen > Attachments: HIVE-4329.0.patch > > > Attempting to write to a HCatalog defined table backed by the AvroSerde fails > with the following stacktrace: > {code} > java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be > cast to org.apache.hadoop.io.LongWritable > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) > at > org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) > at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > {code} > The proximal cause of this failure is that the AvroContainerOutputFormat's > signature mandates a LongWritable key and HCat's FileRecordWriterContainer > forces a NullWritable. I'm not sure of a general fix, other than redefining > HiveOutputFormat to mandate a WritableComparable. > It looks like accepting WritableComparable is what's done in the other Hive > OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also > be changed, since it's ignoring the key. That way fixing things so > FileRecordWriterContainer can always use NullWritable could get spun into a > different issue? > The underlying cause for failure to write to AvroSerde tables is that > AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so > fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-4329: - Affects Version/s: (was: 0.10.0) 0.14.0 > HCatalog should use getHiveRecordWriter rather than getRecordWriter > --- > > Key: HIVE-4329 > URL: https://issues.apache.org/jira/browse/HIVE-4329 > Project: Hive > Issue Type: Bug > Components: HCatalog, Serializers/Deserializers >Affects Versions: 0.14.0 > Environment: discovered in Pig, but it looks like the root cause > impacts all non-Hive users >Reporter: Sean Busbey >Assignee: David Chen > Attachments: HIVE-4329.0.patch > > > Attempting to write to a HCatalog defined table backed by the AvroSerde fails > with the following stacktrace: > {code} > java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be > cast to org.apache.hadoop.io.LongWritable > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) > at > org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) > at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > {code} > The proximal cause of this failure is that the AvroContainerOutputFormat's > signature mandates a LongWritable key and HCat's FileRecordWriterContainer > forces a NullWritable. I'm not sure of a general fix, other than redefining > HiveOutputFormat to mandate a WritableComparable. > It looks like accepting WritableComparable is what's done in the other Hive > OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also > be changed, since it's ignoring the key. That way fixing things so > FileRecordWriterContainer can always use NullWritable could get spun into a > different issue? > The underlying cause for failure to write to AvroSerde tables is that > AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so > fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080326#comment-14080326 ] David Chen commented on HIVE-4329: -- RB: https://reviews.apache.org/r/24136 > HCatalog should use getHiveRecordWriter rather than getRecordWriter > --- > > Key: HIVE-4329 > URL: https://issues.apache.org/jira/browse/HIVE-4329 > Project: Hive > Issue Type: Bug > Components: HCatalog, Serializers/Deserializers >Affects Versions: 0.10.0 > Environment: discovered in Pig, but it looks like the root cause > impacts all non-Hive users >Reporter: Sean Busbey >Assignee: David Chen > Attachments: HIVE-4329.0.patch > > > Attempting to write to a HCatalog defined table backed by the AvroSerde fails > with the following stacktrace: > {code} > java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be > cast to org.apache.hadoop.io.LongWritable > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) > at > org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) > at > org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) > at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > {code} > The proximal cause of this failure is that the AvroContainerOutputFormat's > signature mandates a LongWritable key and HCat's FileRecordWriterContainer > forces a NullWritable. I'm not sure of a general fix, other than redefining > HiveOutputFormat to mandate a WritableComparable. > It looks like accepting WritableComparable is what's done in the other Hive > OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also > be changed, since it's ignoring the key. That way fixing things so > FileRecordWriterContainer can always use NullWritable could get spun into a > different issue? > The underlying cause for failure to write to AvroSerde tables is that > AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so > fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24136/ --- Review request for hive. Bugs: HIVE-4329 https://issues.apache.org/jira/browse/HIVE-4329 Repository: hive-git Description --- HIVE-4329: HCatalog should use getHiveRecordWriter. Diffs - hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 93a03adeab7ba3c3c91344955d303e4252005239 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java 3a07b0ca7c1956d45e611005cbc5ba2464596471 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java 4df912a935221e527c106c754ff233d212df9246 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java 1a7595fd6dd0a5ffbe529bc24015c482068233bf hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java 2a883d6517bfe732b6a6dffa647d9d44e4145b38 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java bfa8657cd1b16aec664aab3e22b430b304a3698d hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java b651cb323771843da43667016a7dd2c9d9a1ddac hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java 694739821a202780818924d54d10edb707cfbcfa hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java 1980ef50af42499e0fed8863b6ff7a45f926d9fc hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java 9b979395e47e54aac87487cb990824e3c3a2ee19 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java d83b003f9c16e78a39b3cc7ce810ff19f70848c2 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java 5905b46178b510b3a43311739fea2b95f47b4ed7 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java b3ea76e6a79f94e09972bc060c06105f60087b71 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java ee57f3fd126af2e36039f84686a4169ef6267593 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java 7c9003e86c61dc9e4f10e05b0c29e40ded73c793 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 69545b046db06fd56f35a0da09d3d6960832484d Diff: https://reviews.apache.org/r/24136/diff/ Testing --- Thanks, David Chen
[jira] [Updated] (HIVE-7563) ClassLoader should be released from LogFactory
[ https://issues.apache.org/jira/browse/HIVE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7563: Status: Patch Available (was: Open) > ClassLoader should be released from LogFactory > -- > > Key: HIVE-7563 > URL: https://issues.apache.org/jira/browse/HIVE-7563 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis > Attachments: HIVE-7563.1.patch.txt > > > NO PRECOMMIT TESTS > LogFactory uses ClassLoader as a key in map, which makes the classloader > impossible to be unloaded. LogFactory.release() should be called explicitly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7563) ClassLoader should be released from LogFactory
[ https://issues.apache.org/jira/browse/HIVE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7563: Description: NO PRECOMMIT TESTS LogFactory uses ClassLoader as a key in map, which makes the classloader impossible to be unloaded. LogFactory.release() should be called explicitly. was:LogFactory uses ClassLoader as a key in map, which makes the classloader impossible to be unloaded. LogFactory.release() should be called explicitly. > ClassLoader should be released from LogFactory > -- > > Key: HIVE-7563 > URL: https://issues.apache.org/jira/browse/HIVE-7563 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis > Attachments: HIVE-7563.1.patch.txt > > > NO PRECOMMIT TESTS > LogFactory uses ClassLoader as a key in map, which makes the classloader > impossible to be unloaded. LogFactory.release() should be called explicitly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7563) ClassLoader should be released from LogFactory
Navis created HIVE-7563: --- Summary: ClassLoader should be released from LogFactory Key: HIVE-7563 URL: https://issues.apache.org/jira/browse/HIVE-7563 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Attachments: HIVE-7563.1.patch.txt LogFactory uses ClassLoader as a key in map, which makes the classloader impossible to be unloaded. LogFactory.release() should be called explicitly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7563) ClassLoader should be released from LogFactory
[ https://issues.apache.org/jira/browse/HIVE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7563: Attachment: HIVE-7563.1.patch.txt > ClassLoader should be released from LogFactory > -- > > Key: HIVE-7563 > URL: https://issues.apache.org/jira/browse/HIVE-7563 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis > Attachments: HIVE-7563.1.patch.txt > > > NO PRECOMMIT TESTS > LogFactory uses ClassLoader as a key in map, which makes the classloader > impossible to be unloaded. LogFactory.release() should be called explicitly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7547) Add ipAddress and userName to ExecHook
[ https://issues.apache.org/jira/browse/HIVE-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080299#comment-14080299 ] Thejas M Nair commented on HIVE-7547: - +1 Thanks for fixing the kerberos mode! > Add ipAddress and userName to ExecHook > -- > > Key: HIVE-7547 > URL: https://issues.apache.org/jira/browse/HIVE-7547 > Project: Hive > Issue Type: New Feature > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-7547.2.patch, HIVE-7547.3.patch, HIVE-7547.4.patch, > HIVE-7547.patch > > > Auditing tools should be able to know about the ipAddress and userName of the > user executing operations. > These could be made available through the Hive execution-hooks. -- This message was sent by Atlassian JIRA (v6.2#6252)
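To make the feature concrete, here is what a consuming audit hook could look like. ExecuteWithHookContext is Hive's existing hook interface; the getUserName() and getIpAddress() accessors on HookContext are assumed from the description and the HookContext.java entry in the review's diff, not confirmed here:
{code}
import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
import org.apache.hadoop.hive.ql.hooks.HookContext;

// Would be registered via hive.exec.pre.hooks (or the post/failure hook lists).
public class AuditHookSketch implements ExecuteWithHookContext {
  @Override
  public void run(HookContext hookContext) throws Exception {
    System.out.println("user=" + hookContext.getUserName()
        + " ip=" + hookContext.getIpAddress()
        + " query=" + hookContext.getQueryPlan().getQueryString());
  }
}
{code}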
[jira] [Commented] (HIVE-7526) Research to use groupby transformation to replace Hive existing partitionByKey and SparkCollector combination
[ https://issues.apache.org/jira/browse/HIVE-7526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080283#comment-14080283 ] Xuefu Zhang commented on HIVE-7526: --- Chao, Based on our last conversation, I don't think your patch is final or ready to be reviewed. Please continue working on your patch and update when you think it's ready. Here is what I have emphasized: 1. Define a SparkShuffle interface that's similar to the existing ShuffleTran. 2. Have two implementations of this interface: sortBy and groupBy. 3. For sortBy, use a local key clustering mechanism. 4. Have ReduceTran contain a reference to the SparkShuffle and HiveReduceFunction instances. Let me know if you have additional questions. > Research to use groupby transformation to replace Hive existing > partitionByKey and SparkCollector combination > - > > Key: HIVE-7526 > URL: https://issues.apache.org/jira/browse/HIVE-7526 > Project: Hive > Issue Type: Task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Chao > Attachments: HIVE-7526.2.patch, HIVE-7526.3.patch, HIVE-7526.patch > > > Currently SparkClient shuffles data by calling partitionByKey(). This > transformation outputs <key, value> tuples. However, Hive's ExecMapper > expects <key, iterator<value>> tuples, and Spark's groupByKey() seems to > output this directly. Thus, using groupByKey, we may be able to avoid its > own key clustering mechanism (in HiveReduceFunction). This research is to > find out. -- This message was sent by Atlassian JIRA (v6.2#6252)
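A rough sketch of the shape being asked for in points 1 and 2. Only the names SparkShuffle, ShuffleTran, sortBy/groupBy, ReduceTran, and HiveReduceFunction come from the comment; the signature and generics below are assumptions:
{code}
import org.apache.hadoop.io.BytesWritable;
import org.apache.spark.api.java.JavaPairRDD;

// Pluggable shuffle strategy, analogous to the existing ShuffleTran.
interface SparkShuffle {
  JavaPairRDD<BytesWritable, Iterable<BytesWritable>> shuffle(
      JavaPairRDD<BytesWritable, BytesWritable> input, int numPartitions);
}

// groupBy variant: Spark clusters all values of a key for us.
class GroupByShuffler implements SparkShuffle {
  public JavaPairRDD<BytesWritable, Iterable<BytesWritable>> shuffle(
      JavaPairRDD<BytesWritable, BytesWritable> input, int numPartitions) {
    return input.groupByKey(numPartitions);
  }
}

// The sortBy variant would sortByKey() and then cluster adjacent equal keys
// locally (point 3 above); ReduceTran would hold one SparkShuffle plus a
// HiveReduceFunction (point 4).
{code}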
[jira] [Commented] (HIVE-7509) Fast stripe level merging for ORC
[ https://issues.apache.org/jira/browse/HIVE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080267#comment-14080267 ] Lefty Leverenz commented on HIVE-7509: -- Configuration parameter *hive.merge.orcfile.stripe.level* needs to be added to the wiki by the time 0.14.0 is released, but *hive.merge.input.format.stripe.level* is internal only so it doesn't belong in the wiki. Besides adding *hive.merge.orcfile.stripe.level* to the Configuration Properties doc, a new section could be added to the ORC Files doc listing all the ORC configs or pointing to an ORC section in Configuration Properties (which hasn't been created yet). * [Configuration Properties -- ORC parameters | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.orc.splits.include.file.footer] * [ORC Files | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC] > Fast stripe level merging for ORC > - > > Key: HIVE-7509 > URL: https://issues.apache.org/jira/browse/HIVE-7509 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Prasanth J >Assignee: Prasanth J > Labels: TODOC14, orcfile > Fix For: 0.14.0 > > Attachments: HIVE-7509.1.patch, HIVE-7509.2.patch, HIVE-7509.3.patch, > HIVE-7509.4.patch, HIVE-7509.5.patch > > > Similar to HIVE-1950, add support for fast stripe level merging of ORC files > through CONCATENATE command and conditional merge task. This fast merging is > ideal for merging many small ORC files to a larger file without decompressing > and decoding the data of small orc files. -- This message was sent by Atlassian JIRA (v6.2#6252)
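A short usage sketch that could accompany the wiki entry (the table name is hypothetical; the parameter name comes from the comment above, and the CONCATENATE command itself dates back to HIVE-1950):
{code}
-- Merge the many small ORC files of a table or partition at stripe level,
-- copying whole stripes instead of decompressing and re-encoding rows.
SET hive.merge.orcfile.stripe.level=true;
ALTER TABLE orc_events CONCATENATE;
{code}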
[jira] [Updated] (HIVE-7554) Parquet Hive should resolve column names in case insensitive manner
[ https://issues.apache.org/jira/browse/HIVE-7554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7554: --- Status: Patch Available (was: Open) > Parquet Hive should resolve column names in case insensitive manner > --- > > Key: HIVE-7554 > URL: https://issues.apache.org/jira/browse/HIVE-7554 > Project: Hive > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Brock Noland > Attachments: HIVE-7554.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7348) Beeline could not parse ; separated queries provided with -e option
[ https://issues.apache.org/jira/browse/HIVE-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080254#comment-14080254 ] Szehon Ho commented on HIVE-7348: - This looks good, can we add one test? > Beeline could not parse ; separated queries provided with -e option > --- > > Key: HIVE-7348 > URL: https://issues.apache.org/jira/browse/HIVE-7348 > Project: Hive > Issue Type: Bug >Reporter: Ashish Kumar Singh >Assignee: Ashish Kumar Singh > Attachments: HIVE-7348.1.patch, HIVE-7348.patch > > > Beeline could not parse ; separated queries provided with -e option. This > works fine on hive cli. -- This message was sent by Atlassian JIRA (v6.2#6252)
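For reference, the usage under discussion looks like the following, with a hypothetical connection string; the Hive CLI splits the -e argument on semicolons, while Beeline did not:
{code}
# Works in the Hive CLI; before this patch, Beeline failed to parse the second statement.
beeline -u jdbc:hive2://localhost:10000 -e "USE demo_db; SELECT COUNT(*) FROM t1;"
{code}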
Review Request 24127: Research to use groupby transformation to replace Hive existing partitionByKey and SparkCollector combination
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24127/ --- Review request for hive. Repository: hive-git Description --- An attempt to fix the last patch by moving the groupBy op to ShuffleTran. Also, since SparkTran::transform may now have input/output value types other than BytesWritable, we need to make it generic as well. Also added a CompTran class, which is basically a composition of transformations. It offers better type compatibility than ChainedTran. This is NOT the perfect solution, and may be subject to further change. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ChainedTran.java 4991568 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CompTran.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 01a70e9 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 841db87 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/IdentityTran.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 98d08e6 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java d1af86d ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java 33e7d45 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java cf85af1 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 440dd93 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTran.java 6aa732f Diff: https://reviews.apache.org/r/24127/diff/ Testing --- Thanks, Chao Sun
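The type-compatibility claim is easiest to see in miniature: composing a transformation from I to M with one from M to O yields one from I to O, which an untyped list of transforms (the ChainedTran approach) cannot verify at compile time. All names below are illustrative; the real CompTran in the diff is more involved:
{code}
// Generic one-step transformation.
interface Tran<I, O> {
  O transform(I input);
}

// Composition keeps input/output types aligned end to end at compile time.
class ComposedTran<I, M, O> implements Tran<I, O> {
  private final Tran<I, M> first;
  private final Tran<M, O> second;

  ComposedTran(Tran<I, M> first, Tran<M, O> second) {
    this.first = first;
    this.second = second;
  }

  public O transform(I input) {
    return second.transform(first.transform(input));
  }
}
{code}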
[jira] [Commented] (HIVE-7562) Cleanup ExecReducer
[ https://issues.apache.org/jira/browse/HIVE-7562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080248#comment-14080248 ] Szehon Ho commented on HIVE-7562: - +1, pending test > Cleanup ExecReducer > --- > > Key: HIVE-7562 > URL: https://issues.apache.org/jira/browse/HIVE-7562 > Project: Hive > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Brock Noland > Attachments: HIVE-7562.patch > > > ExecReducer places member variables at random with random visibility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7509) Fast stripe level merging for ORC
[ https://issues.apache.org/jira/browse/HIVE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7509: - Labels: TODOC14 orcfile (was: orcfile) > Fast stripe level merging for ORC > - > > Key: HIVE-7509 > URL: https://issues.apache.org/jira/browse/HIVE-7509 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Prasanth J >Assignee: Prasanth J > Labels: TODOC14, orcfile > Fix For: 0.14.0 > > Attachments: HIVE-7509.1.patch, HIVE-7509.2.patch, HIVE-7509.3.patch, > HIVE-7509.4.patch, HIVE-7509.5.patch > > > Similar to HIVE-1950, add support for fast stripe level merging of ORC files > through CONCATENATE command and conditional merge task. This fast merging is > ideal for merging many small ORC files to a larger file without decompressing > and decoding the data of small orc files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7446) Add support to ALTER TABLE .. ADD COLUMN to Avro backed tables
[ https://issues.apache.org/jira/browse/HIVE-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080245#comment-14080245 ] Ashish Kumar Singh commented on HIVE-7446: -- Test errors are not related to this patch. [~tomwhite] could you take a look at this trivial patch. > Add support to ALTER TABLE .. ADD COLUMN to Avro backed tables > -- > > Key: HIVE-7446 > URL: https://issues.apache.org/jira/browse/HIVE-7446 > Project: Hive > Issue Type: New Feature >Reporter: Ashish Kumar Singh >Assignee: Ashish Kumar Singh > Attachments: HIVE-7446.patch > > > HIVE-6806 adds native support for creating hive table stored as Avro. It > would be good to add support to ALTER TABLE .. ADD COLUMN to Avro backed > tables. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7509) Fast stripe level merging for ORC
[ https://issues.apache.org/jira/browse/HIVE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7509: - Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks [~hagleitn] and [~leftylev] for the reviews. > Fast stripe level merging for ORC > - > > Key: HIVE-7509 > URL: https://issues.apache.org/jira/browse/HIVE-7509 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Prasanth J >Assignee: Prasanth J > Labels: orcfile > Fix For: 0.14.0 > > Attachments: HIVE-7509.1.patch, HIVE-7509.2.patch, HIVE-7509.3.patch, > HIVE-7509.4.patch, HIVE-7509.5.patch > > > Similar to HIVE-1950, add support for fast stripe level merging of ORC files > through CONCATENATE command and conditional merge task. This fast merging is > ideal for merging many small ORC files to a larger file without decompressing > and decoding the data of small orc files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7509) Fast stripe level merging for ORC
[ https://issues.apache.org/jira/browse/HIVE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7509: - Fix Version/s: 0.14.0 > Fast stripe level merging for ORC > - > > Key: HIVE-7509 > URL: https://issues.apache.org/jira/browse/HIVE-7509 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Prasanth J >Assignee: Prasanth J > Labels: orcfile > Fix For: 0.14.0 > > Attachments: HIVE-7509.1.patch, HIVE-7509.2.patch, HIVE-7509.3.patch, > HIVE-7509.4.patch, HIVE-7509.5.patch > > > Similar to HIVE-1950, add support for fast stripe level merging of ORC files > through CONCATENATE command and conditional merge task. This fast merging is > ideal for merging many small ORC files to a larger file without decompressing > and decoding the data of small orc files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7562) Cleanup ExecReducer
[ https://issues.apache.org/jira/browse/HIVE-7562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7562: --- Assignee: Brock Noland Status: Patch Available (was: Open) > Cleanup ExecReducer > --- > > Key: HIVE-7562 > URL: https://issues.apache.org/jira/browse/HIVE-7562 > Project: Hive > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Brock Noland > Attachments: HIVE-7562.patch > > > ExecReducer places member variables at random with random visibility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7348) Beeline could not parse ; separated queries provided with -e option
[ https://issues.apache.org/jira/browse/HIVE-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Kumar Singh updated HIVE-7348: - Attachment: HIVE-7348.1.patch > Beeline could not parse ; separated queries provided with -e option > --- > > Key: HIVE-7348 > URL: https://issues.apache.org/jira/browse/HIVE-7348 > Project: Hive > Issue Type: Bug >Reporter: Ashish Kumar Singh >Assignee: Ashish Kumar Singh > Attachments: HIVE-7348.1.patch, HIVE-7348.patch > > > Beeline could not parse ; separated queries provided with -e option. This > works fine on hive cli. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24086: HIVE-7348: Beeline could not parse ; separated queries provided with -e option
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24086/ --- (Updated July 30, 2014, 11:48 p.m.) Review request for hive. Changes --- Move changes to only effect -e path. Bugs: HIVE-7348 https://issues.apache.org/jira/browse/HIVE-7348 Repository: hive-git Description --- HIVE-7348: Beeline could not parse ; separated queries provided with -e option Diffs (updated) - beeline/src/java/org/apache/hive/beeline/BeeLine.java 10fd2e2daac78ca43d45c74fcbad6b720a8d28ad Diff: https://reviews.apache.org/r/24086/diff/ Testing --- Tested manually. Thanks, Ashish Singh
[jira] [Updated] (HIVE-7562) Cleanup ExecReducer
[ https://issues.apache.org/jira/browse/HIVE-7562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7562: --- Attachment: HIVE-7562.patch > Cleanup ExecReducer > --- > > Key: HIVE-7562 > URL: https://issues.apache.org/jira/browse/HIVE-7562 > Project: Hive > Issue Type: Improvement >Reporter: Brock Noland > Attachments: HIVE-7562.patch > > > ExecReducer places member variables at random with random visibility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7562) Cleanup ExecReducer
Brock Noland created HIVE-7562: -- Summary: Cleanup ExecReducer Key: HIVE-7562 URL: https://issues.apache.org/jira/browse/HIVE-7562 Project: Hive Issue Type: Improvement Reporter: Brock Noland Attachments: HIVE-7562.patch ExecReducer places member variables at random with random visibility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7547) Add ipAddress and userName to ExecHook
[ https://issues.apache.org/jira/browse/HIVE-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7547: Attachment: HIVE-7547.4.patch > Add ipAddress and userName to ExecHook > -- > > Key: HIVE-7547 > URL: https://issues.apache.org/jira/browse/HIVE-7547 > Project: Hive > Issue Type: New Feature > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-7547.2.patch, HIVE-7547.3.patch, HIVE-7547.4.patch, > HIVE-7547.patch > > > Auditing tools should be able to know about the ipAddress and userName of the > user executing operations. > These could be made available through the Hive execution-hooks. -- This message was sent by Atlassian JIRA (v6.2#6252)
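For orientation, execution hooks are registered through configuration, so an auditing tool would plug in a hook class and read the new fields from the hook context at run time. A minimal sketch, assuming a hypothetical hook class name:
{code}
-- hive.exec.pre.hooks is the standard knob for pre-execution hooks;
-- com.example.AuditHook is a hypothetical class that would consume the
-- ipAddress and userName this patch exposes:
SET hive.exec.pre.hooks=com.example.AuditHook;
{code}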
Re: Review Request 24084: HIVE-7547 - Add ipAddress and userName to ExecHook
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24084/ --- (Updated July 30, 2014, 11:46 p.m.) Review request for hive. Bugs: HIVE-7547 https://issues.apache.org/jira/browse/HIVE-7547 Repository: hive-git Description --- Passing the ipAddress and userName (already calculated in ThriftCLIService for other purposes) through several layers down to the hooks. Diffs (updated) - itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestHs2HooksWithMiniKdc.java PRE-CREATION itests/hive-unit/src/test/java/org/apache/hadoop/hive/hooks/TestHs2Hooks.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/Driver.java e512199 ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java b11cb86 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb Diff: https://reviews.apache.org/r/24084/diff/ Testing --- Added tests in both kerberos and non-kerberos mode. Thanks, Szehon Ho
[jira] [Updated] (HIVE-7547) Add ipAddress and userName to ExecHook
[ https://issues.apache.org/jira/browse/HIVE-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7547: Attachment: HIVE-7547.3.patch Thanks Thejas for pointing that out. I refactored the code to use SessionState. The SessionState's ipAddress didn't seem to be set for Kerberos mode, so I'm also changing how it's being set to work for all modes. Let me know if it's not right. > Add ipAddress and userName to ExecHook > -- > > Key: HIVE-7547 > URL: https://issues.apache.org/jira/browse/HIVE-7547 > Project: Hive > Issue Type: New Feature > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-7547.2.patch, HIVE-7547.3.patch, HIVE-7547.patch > > > Auditing tools should be able to know about the ipAddress and userName of the > user executing operations. > These could be made available through the Hive execution-hooks. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24084: HIVE-7547 - Add ipAddress and userName to ExecHook
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24084/ --- (Updated July 30, 2014, 11:40 p.m.) Review request for hive. Changes --- Incorporating Brock's and Thejas's review comments. As Thejas pointed out, it turns out the ipAddress is already stored in SessionState, so using that makes the code a lot cleaner. However, the ipAddress calculated in TSetIpAddressProcessor doesn't work in Kerberos mode, so I'm fixing it so it's set in all modes. Bugs: HIVE-7547 https://issues.apache.org/jira/browse/HIVE-7547 Repository: hive-git Description --- Passing the ipAddress and userName (already calculated in ThriftCLIService for other purposes) through several layers down to the hooks. Diffs (updated) - itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestHs2HooksWithMiniKdc.java PRE-CREATION itests/hive-unit/src/test/java/org/apache/hadoop/hive/hooks/TestHs2Hooks.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/Driver.java e512199 ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java b11cb86 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb Diff: https://reviews.apache.org/r/24084/diff/ Testing --- Added tests in both kerberos and non-kerberos mode. Thanks, Szehon Ho
[jira] [Commented] (HIVE-7509) Fast stripe level merging for ORC
[ https://issues.apache.org/jira/browse/HIVE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080204#comment-14080204 ] Lefty Leverenz commented on HIVE-7509: -- Good doc fixes, thanks [~prasanth_j]. +1 for docs only. > Fast stripe level merging for ORC > - > > Key: HIVE-7509 > URL: https://issues.apache.org/jira/browse/HIVE-7509 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Prasanth J >Assignee: Prasanth J > Labels: orcfile > Attachments: HIVE-7509.1.patch, HIVE-7509.2.patch, HIVE-7509.3.patch, > HIVE-7509.4.patch, HIVE-7509.5.patch > > > Similar to HIVE-1950, add support for fast stripe level merging of ORC files > through the CONCATENATE command and conditional merge task. This fast merging is > ideal for merging many small ORC files into a larger file without decompressing > and decoding the data of the small ORC files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: HIVE-7096.4.patch > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: (was: HIVE-7096.4.patch) > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7509) Fast stripe level merging for ORC
[ https://issues.apache.org/jira/browse/HIVE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080178#comment-14080178 ] Hive QA commented on HIVE-7509: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12658680/HIVE-7509.5.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5842 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/110/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/110/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-110/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12658680 > Fast stripe level merging for ORC > - > > Key: HIVE-7509 > URL: https://issues.apache.org/jira/browse/HIVE-7509 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Prasanth J >Assignee: Prasanth J > Labels: orcfile > Attachments: HIVE-7509.1.patch, HIVE-7509.2.patch, HIVE-7509.3.patch, > HIVE-7509.4.patch, HIVE-7509.5.patch > > > Similar to HIVE-1950, add support for fast stripe level merging of ORC files > through the CONCATENATE command and conditional merge task. This fast merging is > ideal for merging many small ORC files into a larger file without decompressing > and decoding the data of the small ORC files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: HIVE-7096.4.patch > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: (was: HIVE-7096.4.patch) > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: HIVE-7096.4.patch > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: (was: HIVE-7096.4.patch) > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Component/s: Tez > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Attachment: HIVE-7096.4.patch This patch works with tez-0.5 only. Since only the tez branch has been upgraded to that version, this is only applicable to that hive branch. > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7096) Support grouped splits in Tez partitioned broadcast join
[ https://issues.apache.org/jira/browse/HIVE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7096: - Affects Version/s: tez-branch > Support grouped splits in Tez partitioned broadcast join > > > Key: HIVE-7096 > URL: https://issues.apache.org/jira/browse/HIVE-7096 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: tez-branch >Reporter: Gunther Hagleitner >Assignee: Vikram Dixit K > Attachments: HIVE-7096.1.patch, HIVE-7096.2.patch, HIVE-7096.3.patch, > HIVE-7096.4.patch, HIVE-7096.tez.branch.patch > > > Same checks for schema + deser + file format done in HiveSplitGenerator need > to be done in the CustomPartitionVertex. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7390) Make quote character optional and configurable in BeeLine CSV/TSV output
[ https://issues.apache.org/jira/browse/HIVE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080128#comment-14080128 ] Lars Francke commented on HIVE-7390: You summed it up nicely, thanks. The original intention of this issue was to make the quote character optional and configurable, so Jim must have had a use-case for that. I can't think of a good one at the moment. I can, however, think of a good reason for a configurable delimiter. Comma, semicolon or tab occur relatively frequently in data, but some other character (\001 or "|") might not occur in the data, and being able to pick such a character as the delimiter makes parsing much simpler (just split on the delimiter instead of looking for quoted strings etc.). This is especially interesting when you then want to mount another table on that data in Hive, or post-process it in any other simple way where you don't have access to a full-fledged CSV parsing library. So: picking the delimiter is often very helpful in avoiding a whole class of parsing issues because it allows consumers to simply split on the delimiter. I think that we can easily catch the most common issues with two changes: 1. Fix the current CSV and TSV formats. As you say: no debate on that. 2. Allow the delimiter to be specified and keep the "normal quoting" mode. That allows everyone who really understands their data to avoid quoting, and everyone else gets properly formatted CSVs for a full CSV parser. In the same vein I think that {{surroundingSpacesNeedQuotes}} should stay disabled. But as I said: this is kinda hijacking Jim's original issue... > Make quote character optional and configurable in BeeLine CSV/TSV output > > > Key: HIVE-7390 > URL: https://issues.apache.org/jira/browse/HIVE-7390 > Project: Hive > Issue Type: New Feature > Components: Clients >Affects Versions: 0.13.1 >Reporter: Jim Halfpenny >Assignee: Ferdinand Xu > Attachments: HIVE-7390.1.patch, HIVE-7390.2.patch, HIVE-7390.3.patch, > HIVE-7390.4.patch, HIVE-7390.patch > > > Currently when either the CSV or TSV output formats are used in beeline each > column is wrapped in single quotes. Quote wrapping of columns should be > optional and the user should be able to choose the character used to wrap the > columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
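As a concrete sketch of the mount-another-table use case described above (table name, columns, and location are hypothetical): output written with a delimiter that never occurs in the data can be read back with no CSV quoting logic at all:
{code}
-- Mount Beeline output that was written with '|' as the field
-- delimiter; parsing reduces to splitting on '|', with no quote
-- handling needed:
CREATE EXTERNAL TABLE beeline_export (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '/tmp/beeline_export';
{code}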
[jira] [Commented] (HIVE-7506) MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
[ https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080125#comment-14080125 ] Gunther Hagleitner commented on HIVE-7506: -- [~damien.carol] I think the use for this is different than analyze. The ability to update certain stats without scanning any data or without "hacking the backend db" is useful in a number of cases. It helps (especially for CBO work) to set up unit tests quickly and verify both the CBO and the stats subsystem. It also helps when experimenting with the system if you're just trying out hive/hadoop on a small cluster. Finally, it gives you a quick and clean way to fix things when something has gone wrong with stats in your environment. > MetadataUpdater: provide a mechanism to edit the statistics of a column in a > table (or a partition of a table) > -- > > Key: HIVE-7506 > URL: https://issues.apache.org/jira/browse/HIVE-7506 > Project: Hive > Issue Type: New Feature > Components: Database/Schema >Reporter: pengcheng xiong >Assignee: pengcheng xiong >Priority: Minor > Original Estimate: 252h > Remaining Estimate: 252h > > Two motivations: > (1) CBO depends heavily on the statistics of a column in a table (or a > partition of a table). If we would like to test whether CBO chooses the best > plan under different statistics, it would be time-consuming if we load the > whole table and create the statistics from the ground up. > (2) As the database runs, the statistics of a column in a table (or a partition > of a table) may change. We need a way or a mechanism to synchronize. > We propose the following command to achieve that: > ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE > STATISTICS col_statistics [COMMENT col_comment] -- This message was sent by Atlassian JIRA (v6.2#6252)
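To make the proposal concrete, here is one hedged reading of the command grammar quoted in the description; the exact form of col_statistics was still open at this point, so the table, partition, column, and stat keys below are illustrative assumptions, not finalized syntax:
{code}
-- Illustrative only: hand-set the distinct-value and null counts for
-- one column of one partition without scanning any data; all names
-- and keys here are assumptions:
ALTER TABLE orders PARTITION (ds = '2014-07-30') COLUMN amount
UPDATE STATISTICS SET ('numDVs' = '10000', 'numNulls' = '0');
{code}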
[jira] [Updated] (HIVE-7506) MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
[ https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7506: - Priority: Minor (was: Critical) > MetadataUpdater: provide a mechanism to edit the statistics of a column in a > table (or a partition of a table) > -- > > Key: HIVE-7506 > URL: https://issues.apache.org/jira/browse/HIVE-7506 > Project: Hive > Issue Type: New Feature > Components: Database/Schema >Reporter: pengcheng xiong >Assignee: pengcheng xiong >Priority: Minor > Original Estimate: 252h > Remaining Estimate: 252h > > Two motivations: > (1) CBO depends heavily on the statistics of a column in a table (or a > partition of a table). If we would like to test whether CBO chooses the best > plan under different statistics, it would be time-consuming if we load the > whole table and create the statistics from the ground up. > (2) As the database runs, the statistics of a column in a table (or a partition > of a table) may change. We need a way or a mechanism to synchronize. > We propose the following command to achieve that: > ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE > STATISTICS col_statistics [COMMENT col_comment] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7488) pass column names being used for inputs to authorization api
[ https://issues.apache.org/jira/browse/HIVE-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080118#comment-14080118 ] Jason Dere commented on HIVE-7488: -- +1. Test failures not related? > pass column names being used for inputs to authorization api > > > Key: HIVE-7488 > URL: https://issues.apache.org/jira/browse/HIVE-7488 > Project: Hive > Issue Type: Bug > Components: Authorization >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-7488.1.patch, HIVE-7488.2.patch, > HIVE-7488.3.patch.txt, HIVE-7488.4.patch, HIVE-7488.5.patch, HIVE-7488.6.patch > > > HivePrivilegeObject in the authorization api has support for columns, but the > columns being used are not being populated for non grant-revoke queries. > This is for enabling any implementation of the api to use this column > information for its authorization decisions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (HIVE-7506) MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
[ https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner reopened HIVE-7506: -- > MetadataUpdater: provide a mechanism to edit the statistics of a column in a > table (or a partition of a table) > -- > > Key: HIVE-7506 > URL: https://issues.apache.org/jira/browse/HIVE-7506 > Project: Hive > Issue Type: New Feature > Components: Database/Schema >Reporter: pengcheng xiong >Assignee: pengcheng xiong >Priority: Critical > Original Estimate: 252h > Remaining Estimate: 252h > > Two motivations: > (1) CBO depends heavily on the statistics of a column in a table (or a > partition of a table). If we would like to test whether CBO chooses the best > plan under different statistics, it would be time-consuming if we load the > whole table and create the statistics from the ground up. > (2) As the database runs, the statistics of a column in a table (or a partition > of a table) may change. We need a way or a mechanism to synchronize. > We propose the following command to achieve that: > ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE > STATISTICS col_statistics [COMMENT col_comment] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7503) Support Hive's multi-table insert query with Spark
[ https://issues.apache.org/jira/browse/HIVE-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080076#comment-14080076 ] Xuefu Zhang commented on HIVE-7503: --- Assigned to myself for initial research. > Support Hive's multi-table insert query with Spark > -- > > Key: HIVE-7503 > URL: https://issues.apache.org/jira/browse/HIVE-7503 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > For Hive's multi-insert query > (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there > may be an MR job for each insert. When we achieve this with Spark, it would > be nice if all the inserts could happen concurrently. > It seems that this functionality isn't available in Spark. To make things > worse, the source of the insert may be re-computed unless it's staged. Even > with this, the inserts will happen sequentially, making performance > suffer. > This task is to find out what it takes in Spark to enable this without requiring > staging of the source or sequential insertion. If this has to be solved in > Hive, find an optimal way to do this. -- This message was sent by Atlassian JIRA (v6.2#6252)
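For reference, the construct in question is Hive's multi-table insert, where a single scan of the source fans out into several inserts; today each INSERT branch can become its own MR job, which is what the Spark work would like to run concurrently. Table names below are illustrative:
{code}
-- One pass over src feeds two destination tables:
FROM src
INSERT OVERWRITE TABLE dest1 SELECT key, value WHERE key < 100
INSERT OVERWRITE TABLE dest2 SELECT key, value WHERE key >= 100;
{code}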
[jira] [Assigned] (HIVE-7503) Support Hive's multi-table insert query with Spark
[ https://issues.apache.org/jira/browse/HIVE-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-7503: - Assignee: Xuefu Zhang > Support Hive's multi-table insert query with Spark > -- > > Key: HIVE-7503 > URL: https://issues.apache.org/jira/browse/HIVE-7503 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > For Hive's multi-insert query > (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there > may be an MR job for each insert. When we achieve this with Spark, it would > be nice if all the inserts could happen concurrently. > It seems that this functionality isn't available in Spark. To make things > worse, the source of the insert may be re-computed unless it's staged. Even > with this, the inserts will happen sequentially, making performance > suffer. > This task is to find out what it takes in Spark to enable this without requiring > staging of the source or sequential insertion. If this has to be solved in > Hive, find an optimal way to do this. -- This message was sent by Atlassian JIRA (v6.2#6252)