[jira] [Commented] (HIVE-14907) Hive Metastore should use repeatable-read consistency level
[ https://issues.apache.org/jira/browse/HIVE-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1247#comment-1247 ] Lenni Kuff commented on HIVE-14907: --- +[~mohitsabharwal] - FYI > Hive Metastore should use repeatable-read consistency level > --- > > Key: HIVE-14907 > URL: https://issues.apache.org/jira/browse/HIVE-14907 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0 >Reporter: Lenni Kuff > > Currently HMS uses the "read-committed" consistency level, which is the > default for DataNucleus. This can cause problems because each transaction > can see updates committed by other transactions while it is running, > so it is very difficult to reason about any code that reads multiple pieces > of data. > Instead it should use "repeatable-read" consistency, which guarantees that a > transaction sees only the state as of the beginning of the transaction plus any > updates made within that transaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
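For readers less familiar with the two levels named in HIVE-14907, they correspond to standard JDBC isolation constants. The following is only a sketch, not HMS or DataNucleus code: the class `IsolationLevels` and its `toJdbc` helper are hypothetical, and the hyphenated level names are assumed to follow the spelling used in the issue text. The guarantees actually delivered still depend on the backing database.

```java
import java.sql.Connection;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class IsolationLevels {
    private IsolationLevels() {}

    /** Maps a hyphenated isolation-level name to the matching java.sql.Connection constant. */
    static int toJdbc(String name) {
        switch (name.trim().toLowerCase(Locale.ROOT)) {
            case "read-uncommitted": return Connection.TRANSACTION_READ_UNCOMMITTED;
            case "read-committed":   return Connection.TRANSACTION_READ_COMMITTED;
            case "repeatable-read":  return Connection.TRANSACTION_REPEATABLE_READ;
            case "serializable":     return Connection.TRANSACTION_SERIALIZABLE;
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + name);
        }
    }

    public static void main(String[] args) {
        // Prints the JDBC constants for the current and the proposed level.
        System.out.println(toJdbc("read-committed"));
        System.out.println(toJdbc("repeatable-read"));
    }
}
```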
[jira] [Commented] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15465232#comment-15465232 ] Lenni Kuff commented on HIVE-12983: --- Thanks [~cartershanklin] for adding the documentation and thanks [~leftylev] for following up. > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Fix For: 2.1.0 > > Attachments: HIVE-12983.1.patch, HIVE-12983.2.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property
[ https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453014#comment-15453014 ] Lenni Kuff commented on HIVE-12362: --- I don't have a test case available to confirm this; my concern is based only on reading the code. It seems that there is extra work happening for each column value in each row, so it could have a performance impact. > Hive's Parquet SerDe ignores 'serialization.null.format' property > - > > Key: HIVE-12362 > URL: https://issues.apache.org/jira/browse/HIVE-12362 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12362.2.patch, HIVE-12362.patch > > > {code} > create table src (a string); > insert into table src values (NULL), (''), (''); > 0: jdbc:hive2://localhost:1/default> select * from src; > +---+--+ > | src.a | > +---+--+ > | NULL | > || > || > +---+--+ > create table dest (a string) row format serde > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' stored as > INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'; > alter table dest set SERDEPROPERTIES ('serialization.null.format' = ''); > alter table dest set TBLPROPERTIES ('serialization.null.format' = ''); > insert overwrite table dest select * from src; > 0: jdbc:hive2://localhost:1/default> select * from test11; > +---+--+ > | test11.a | > +---+--+ > | NULL | > || > || > +---+--+ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property
[ https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452648#comment-15452648 ] Lenni Kuff commented on HIVE-12362: --- [~ngangam] - Looking at the patch it appears there may be some significant performance impact with this change. Have you done any performance testing with this patch? > Hive's Parquet SerDe ignores 'serialization.null.format' property > - > > Key: HIVE-12362 > URL: https://issues.apache.org/jira/browse/HIVE-12362 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12362.2.patch, HIVE-12362.patch > > > {code} > create table src (a string); > insert into table src values (NULL), (''), (''); > 0: jdbc:hive2://localhost:1/default> select * from src; > +---+--+ > | src.a | > +---+--+ > | NULL | > || > || > +---+--+ > create table dest (a string) row format serde > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' stored as > INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'; > alter table dest set SERDEPROPERTIES ('serialization.null.format' = ''); > alter table dest set TBLPROPERTIES ('serialization.null.format' = ''); > insert overwrite table dest select * from src; > 0: jdbc:hive2://localhost:1/default> select * from test11; > +---+--+ > | test11.a | > +---+--+ > | NULL | > || > || > +---+--+ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307289#comment-15307289 ] Lenni Kuff commented on HIVE-12983: --- Test failures don't look related. > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Attachments: HIVE-12983.1.patch, HIVE-12983.2.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305199#comment-15305199 ] Lenni Kuff edited comment on HIVE-12983 at 5/28/16 6:10 AM: Updated based on [~szehon]'s feedback. Now returns a shortened build info (version + githash). I do think including the githash is valuable. One of the use cases for this is to understand exactly what bits are deployed someplace, so it's useful to have the githash. was (Author: lskuff): Updated based on [~szehon]'s feedback. Now returns a shortened build info (version + githash). > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Attachments: HIVE-12983.1.patch, HIVE-12983.2.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
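A client that wants to check the deployed bits programmatically could split the short form quoted in HIVE-12983 into its two parts. The sketch below assumes the output is exactly `<version> r<githash>`, as in the example in the issue description; the class `HiveVersionInfo` and its `parse` method are hypothetical names used only for illustration and are not part of Hive.

```java
// Hypothetical client-side holder for the two fields returned by version().
final class HiveVersionInfo {
    final String version;
    final String gitHash;

    private HiveVersionInfo(String version, String gitHash) {
        this.version = version;
        this.gitHash = gitHash;
    }

    /** Parses a string of the assumed form "<version> r<githash>". */
    static HiveVersionInfo parse(String raw) {
        String[] parts = raw.trim().split("\\s+");
        if (parts.length != 2 || parts[1].length() < 2 || parts[1].charAt(0) != 'r') {
            throw new IllegalArgumentException("Unexpected version string: " + raw);
        }
        // Drop the leading 'r' that marks the revision.
        return new HiveVersionInfo(parts[0], parts[1].substring(1));
    }

    public static void main(String[] args) {
        HiveVersionInfo info =
            parse("2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d");
        System.out.println(info.version);
        System.out.println(info.gitHash);
    }
}
```

The input string would normally come from running `SELECT version()` over a JDBC connection; that part is omitted here because it needs a live cluster.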
[jira] [Updated] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12983: -- Description: It would be nice to have a builtin function that would return the Hive version. This would make it easier for users and tests to programmatically check the Hive version in a SQL script. It's also useful so a client can check the Hive version on a remote cluster. For example: {code} beeline> SELECT version(); 2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d {code} was: It would be nice to have a builtin function that would return the Hive version. This would make it easier for users and tests to programmatically check the Hive version in a SQL script. It's also useful so a client can check the Hive version on a remote cluster. For example: {code} beeline> SELECT version(); 2.1.0-SNAPSHOT from 208ab352311a6cbbcd1f7fcd40964da2dbc6703d by lskuff source checksum 8e971cda755f6b3fb528c233c40eb50a {code} > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Attachments: HIVE-12983.1.patch, HIVE-12983.2.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12983: -- Attachment: HIVE-12983.2.patch Updated based on [~szehon]'s feedback. Now returns a shortened build info (version + githash). > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Attachments: HIVE-12983.1.patch, HIVE-12983.2.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT from 208ab352311a6cbbcd1f7fcd40964da2dbc6703d by lskuff source > checksum 8e971cda755f6b3fb528c233c40eb50a > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13050) The row count is not correct after changing partition location to point to another partition location
[ https://issues.apache.org/jira/browse/HIVE-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212818#comment-15212818 ] Lenni Kuff commented on HIVE-13050: --- Do we even want to fix this? It seems much simpler and easier to understand the behavior if we do not de-dupe locations. There also may be existing users who expect the current behavior so it would be an incompatible change. > The row count is not correct after changing partition location to point to > another partition location > - > > Key: HIVE-13050 > URL: https://issues.apache.org/jira/browse/HIVE-13050 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > > {noformat} > CREATE TABLE test (s STRING) PARTITIONED BY (p SMALLINT) location > 'data/test'; > INSERT INTO test PARTITION (`p`=1) VALUES ("v1"); > INSERT INTO test PARTITION (`p`=2) VALUES ("v2"); > ALTER TABLE test PARTITION (`p`=2) SET LOCATION '/data/test/p=1'; > {noformat} > {{select * from test;}} shows 2 rows while {{SELECT count(*) FROM test;}} > shows 1. > That is inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190521#comment-15190521 ] Lenni Kuff commented on HIVE-12721: --- [~jbeard] - Curious if the following would work for you: {code} SELECT reflect("java.util.UUID", "randomUUID") {code} I do see the value of a first-class UUID built-in though, as reflect() may be restricted in some deployments due to security requirements. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
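The `reflect()` workaround suggested in HIVE-12721 amounts to a reflective call of a static Java method. A minimal plain-Java sketch of that idea follows; the class `ReflectUuidSketch` and its `invokeStatic` helper are hypothetical and are not Hive's implementation of `reflect()`.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of calling a static no-argument method by name.
final class ReflectUuidSketch {
    private ReflectUuidSketch() {}

    /** Invokes a public static no-argument method and returns its result as a string. */
    static String invokeStatic(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            Method method = cls.getMethod(methodName);
            // A null receiver is allowed because the target method is static.
            return String.valueOf(method.invoke(null));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Cannot invoke " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Prints a freshly generated random UUID in its canonical text form.
        System.out.println(invokeStatic("java.util.UUID", "randomUUID"));
    }
}
```

Because a helper like this can call arbitrary static methods, it also illustrates why such reflection is sometimes disabled for security reasons, which is the argument for a dedicated UUID built-in.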
[jira] [Commented] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129157#comment-15129157 ] Lenni Kuff commented on HIVE-12983: --- [~szehon] - I think the revision is nice to have so you can know exactly what commit this is built from. The other build info is not as important. What do you think about something like: {code} 2.1.0-SNAPSHOT r208ab352311a6cbbcd1f7fcd40964da2dbc6703d {code} > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Attachments: HIVE-12983.1.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT from 208ab352311a6cbbcd1f7fcd40964da2dbc6703d by lskuff source > checksum 8e971cda755f6b3fb528c233c40eb50a > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12983: -- Attachment: HIVE-12983.1.patch > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > Attachments: HIVE-12983.1.patch > > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT from 208ab352311a6cbbcd1f7fcd40964da2dbc6703d by lskuff source > checksum 8e971cda755f6b3fb528c233c40eb50a > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12983) Provide a builtin function to get Hive version
[ https://issues.apache.org/jira/browse/HIVE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff reassigned HIVE-12983: - Assignee: Lenni Kuff (was: Jason Dere) > Provide a builtin function to get Hive version > -- > > Key: HIVE-12983 > URL: https://issues.apache.org/jira/browse/HIVE-12983 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Lenni Kuff > > It would be nice to have a builtin function that would return the Hive > version. This would make it easier for users and tests to programmatically > check the Hive version in a SQL script. It's also useful so a client can > check the Hive version on a remote cluster. > For example: > {code} > beeline> SELECT version(); > 2.1.0-SNAPSHOT from 208ab352311a6cbbcd1f7fcd40964da2dbc6703d by lskuff source > checksum 8e971cda755f6b3fb528c233c40eb50a > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12971) Hive Support for Kudu
[ https://issues.apache.org/jira/browse/HIVE-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12971: -- Assignee: (was: Lenni Kuff) > Hive Support for Kudu > - > > Key: HIVE-12971 > URL: https://issues.apache.org/jira/browse/HIVE-12971 > Project: Hive > Issue Type: New Feature >Affects Versions: 2.0.0 > Reporter: Lenni Kuff > > JIRA for tracking work related to Hive/Kudu integration. > It would be useful to allow Kudu data to be accessible via Hive. This would > involve creating a Kudu SerDe/StorageHandler and implementing support for > QUERY and DML commands like SELECT, INSERT, UPDATE, and DELETE. Kudu > Input/OutputFormats classes already exist. The work can be staged to support > this functionality incrementally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115591#comment-15115591 ] Lenni Kuff commented on HIVE-12891: --- [~sircodesalot] - Would you mind posting a review board link since the patch has grown a bit? > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-12891.01.19.2016.01.patch, HIVE-12891.03.patch, > HIVE-12891.04.patch, HIVE-12981.01.22.2016.02.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108163#comment-15108163 ] Lenni Kuff commented on HIVE-12891: --- Comments: - Do you want to expand all of these paths to absolute? Some of them are HDFS scratch dirs, not sure if we want to support relative paths for those or just java.io.tmpdir - Update the config documentation to mention that relative or absolute paths are allowed. - Is it easy to add a test for this? > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-12891.01.19.2016.01.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
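The "Relative path in absolute URI" message in the HIVE-12891 stack trace can be reproduced with the JDK alone, which appears to be the same check the Hadoop `Path` class runs into. The sketch below is not the attached patch: the class `TmpDirSketch` and both of its helpers are hypothetical, and expanding the directory against the current working directory is just one possible way to make `java.io.tmpdir` absolute before it is turned into a `file:` URI.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Paths;

// Hypothetical illustration; not taken from the HIVE-12891 patch.
final class TmpDirSketch {
    private TmpDirSketch() {}

    /** Returns true if java.net.URI refuses to build a file: URI from the given path. */
    static boolean failsAsFileUri(String path) {
        try {
            // A scheme combined with a path that does not start with '/' is rejected.
            new URI("file", null, path, null, null);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    /** Resolves a possibly relative directory against the current working directory. */
    static String toAbsolute(String dir) {
        return Paths.get(dir).toAbsolutePath().normalize().toString();
    }

    public static void main(String[] args) {
        String relative = "./tmp";
        System.out.println(failsAsFileUri(relative));   // the relative form is rejected
        System.out.println(toAbsolute(relative));       // the expanded, absolute form
    }
}
```

This only covers local paths such as `java.io.tmpdir`; whether HDFS scratch directories should be expanded the same way is the open question raised in the review comment.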
[jira] [Resolved] (HIVE-12603) Add config to block queries that scan > N number of partitions
[ https://issues.apache.org/jira/browse/HIVE-12603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff resolved HIVE-12603. --- Resolution: Duplicate > Add config to block queries that scan > N number of partitions > --- > > Key: HIVE-12603 > URL: https://issues.apache.org/jira/browse/HIVE-12603 > Project: Hive > Issue Type: Bug > Components: Metastore, Query Planning >Affects Versions: 2.0.0 >Reporter: Lenni Kuff > > Strict mode is useful for blocking queries that load all partitions, but it's > still possible to put significant load on the HMS for queries that scan a > large number of partitions. It would be useful to add a config to provide a hard > limit on the number of partitions scanned by a query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12603) Add config to block queries that scan > N number of partitions
[ https://issues.apache.org/jira/browse/HIVE-12603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046565#comment-15046565 ] Lenni Kuff commented on HIVE-12603: --- Thanks [~sershe], missed that config. That's exactly what we want. > Add config to block queries that scan > N number of partitions > --- > > Key: HIVE-12603 > URL: https://issues.apache.org/jira/browse/HIVE-12603 > Project: Hive > Issue Type: Bug > Components: Metastore, Query Planning >Affects Versions: 2.0.0 >Reporter: Lenni Kuff > > Strict mode is useful for blocking queries that load all partitions, but it's > still possible to put significant load on the HMS for queries that scan a > large number of partitions. It would be useful to add a config to provide a hard > limit on the number of partitions scanned by a query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12550) Cache and display last N completed queries in HS2 WebUI
[ https://issues.apache.org/jira/browse/HIVE-12550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12550: -- Assignee: (was: Vaibhav Gumashta) > Cache and display last N completed queries in HS2 WebUI > > > Key: HIVE-12550 > URL: https://issues.apache.org/jira/browse/HIVE-12550 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Lenni Kuff > Fix For: 2.0.0 > > > Along with the in-flight queries, it would be nice to see the last N > (configurable?) completed queries since the last process restart (I don't > think this information needs to be persisted anywhere). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
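What HIVE-12550 asks for is essentially a bounded, in-memory history that is lost on restart. One way to sketch that idea is shown below; the class `CompletedQueryCache` is hypothetical, stores plain query strings for brevity, and is not the HS2 WebUI implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical bounded history of completed queries, kept only in memory.
final class CompletedQueryCache {
    private final int capacity;
    private final Deque<String> queries = new ArrayDeque<>();

    CompletedQueryCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a completed query, evicting the oldest entry once N entries are held. */
    synchronized void add(String query) {
        if (queries.size() == capacity) {
            queries.removeFirst();
        }
        queries.addLast(query);
    }

    /** Returns a copy of the retained queries, oldest first. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(queries);
    }

    public static void main(String[] args) {
        CompletedQueryCache cache = new CompletedQueryCache(2);
        cache.add("SELECT 1");
        cache.add("SHOW TABLES");
        cache.add("SELECT version()");
        // Only the two most recent queries remain.
        System.out.println(cache.snapshot());
    }
}
```

Making `capacity` a constructor argument mirrors the "(configurable?)" remark in the description; nothing is persisted, in line with the note that the information does not need to survive a restart.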
[jira] [Commented] (HIVE-12338) Add webui to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033263#comment-15033263 ] Lenni Kuff commented on HIVE-12338: --- [~jxiang] - It looks like this JIRA is resolved, but there are some remaining child tasks. Should we treat this JIRA as the "parent" for the full set of WebUI enhancements? > Add webui to HiveServer2 > > > Key: HIVE-12338 > URL: https://issues.apache.org/jira/browse/HIVE-12338 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12338.1.patch, HIVE-12338.2.patch, > HIVE-12338.3.patch, HIVE-12338.4.patch, hs2-conf.png, hs2-logs.png, > hs2-metrics.png, hs2-webui.png > > > A web ui for HiveServer2 can show some useful information such as: > > 1. Sessions, > 2. Queries that are executing on the HS2, their states, starting time, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12550) Cache and display last N completed queries in HS2 WebUI
[ https://issues.apache.org/jira/browse/HIVE-12550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12550: -- Summary: Cache and display last N completed queries in HS2 WebUI (was: Cache last N completed queries in HS2 WebUI) > Cache and display last N completed queries in HS2 WebUI > > > Key: HIVE-12550 > URL: https://issues.apache.org/jira/browse/HIVE-12550 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Lenni Kuff >Assignee: Vaibhav Gumashta > Fix For: 2.0.0 > > > Along with the in-flight queries, it would be nice to see the last N > (configurable?) completed queries since the last process restart (I don't > think this information needs to be persisted anywhere). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12549) Display execution engine in HS2 webui query view
[ https://issues.apache.org/jira/browse/HIVE-12549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12549: -- Summary: Display execution engine in HS2 webui query view (was: Disable execution engine in HS2 webui query view) > Display execution engine in HS2 webui query view > > > Key: HIVE-12549 > URL: https://issues.apache.org/jira/browse/HIVE-12549 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Lenni Kuff > Fix For: 2.0.0 > > > As part of the query info, it would be useful to show the execution engine > for the running query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI
[ https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022655#comment-15022655 ] Lenni Kuff commented on HIVE-12484: --- Yeah, those might better be tracked as metrics? Seems much lower priority to me than the SQL statements. > Show meta operations on HS2 web UI > -- > > Key: HIVE-12484 > URL: https://issues.apache.org/jira/browse/HIVE-12484 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Jimmy Xiang > > As Mohit pointed out in the review of HIVE-12338, it is nice to show meta > operations on HS2 web UI too, so that we can have an end-to-end picture of > the operations that access HMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI
[ https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022466#comment-15022466 ] Lenni Kuff commented on HIVE-12484: --- IMO we should treat all SQL commands the same - even if they are metadata-only operations. > Show meta operations on HS2 web UI > -- > > Key: HIVE-12484 > URL: https://issues.apache.org/jira/browse/HIVE-12484 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Jimmy Xiang > > As Mohit pointed out in the review of HIVE-12338, it is nice to show meta > operations on HS2 web UI too, so that we can have an end-to-end picture of > the operations that access HMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI
[ https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021761#comment-15021761 ] Lenni Kuff commented on HIVE-12484: --- What are meta operations? GetTables() and other calls? That would be nice, but I would assume something like SHOW TABLES would show up under the other SQL queries? > Show meta operations on HS2 web UI > -- > > Key: HIVE-12484 > URL: https://issues.apache.org/jira/browse/HIVE-12484 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Jimmy Xiang > > As Mohit pointed out in the review of HIVE-12338, it is nice to show meta > operations on HS2 web UI too, so that we can have an end-to-end picture of > the operations that access HMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12431) Cancel queries after configurable timeout waiting on compilation
[ https://issues.apache.org/jira/browse/HIVE-12431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12431: -- Assignee: (was: Vaibhav Gumashta) > Cancel queries after configurable timeout waiting on compilation > > > Key: HIVE-12431 > URL: https://issues.apache.org/jira/browse/HIVE-12431 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Query Processor >Affects Versions: 1.2.1 >Reporter: Lenni Kuff > > To help with HiveServer2 scalability, it would be useful to allow users to > configure a timeout value for queries waiting to be compiled. If the timeout > value is reached then the query would abort. One option to achieve this would > be to update the compile lock to use a try-lock with the timeout value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
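The try-lock option mentioned in HIVE-12431 can be sketched with `java.util.concurrent` as follows. This is only an illustration of the technique under stated assumptions: the class `CompileLockSketch`, the method `compileWithTimeout`, and the use of a `Runnable` as the compile step are hypothetical and do not reflect how HiveServer2 structures its compile lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a global compile lock with a bounded wait.
final class CompileLockSketch {
    private final ReentrantLock compileLock = new ReentrantLock();

    /**
     * Runs the compile task while holding the lock.
     * Returns false, without running the task, if the lock cannot be
     * acquired within the timeout or the waiting thread is interrupted.
     */
    boolean compileWithTimeout(Runnable compileTask, long timeout, TimeUnit unit) {
        boolean acquired;
        try {
            acquired = compileLock.tryLock(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        if (!acquired) {
            return false; // the caller would abort the query here
        }
        try {
            compileTask.run();
            return true;
        } finally {
            compileLock.unlock();
        }
    }

    public static void main(String[] args) {
        CompileLockSketch sketch = new CompileLockSketch();
        boolean compiled = sketch.compileWithTimeout(
            () -> System.out.println("compiling"), 100, TimeUnit.MILLISECONDS);
        System.out.println(compiled);
    }
}
```

A caller that receives `false` would fail the query with a timeout error rather than queue indefinitely, which is the scalability benefit the issue describes.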
[jira] [Updated] (HIVE-12414) ALTER TABLE UNSET SERDEPROPERTY does not work
[ https://issues.apache.org/jira/browse/HIVE-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff updated HIVE-12414: -- Labels: newbie (was: ) > ALTER TABLE UNSET SERDEPROPERTY does not work > - > > Key: HIVE-12414 > URL: https://issues.apache.org/jira/browse/HIVE-12414 > Project: Hive > Issue Type: Bug > Components: Metastore, SQL >Affects Versions: 1.1.1 >Reporter: Lenni Kuff >Assignee: Reuben Kuhnert > Labels: newbie > > alter table tablename set tblproperties ('key'='value') => works as expected > alter table tablename unset tblproperties ('key') => works as expected > alter table tablename set serdeproperties ('key'='value') => works as > expected > alter table tablename unset serdeproperties ('key') => not supported > FAILED: ParseException line 1:28 mismatched input 'serdeproperties' expecting > TBLPROPERTIES near 'unset' in alter properties statement -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12414) ALTER TABLE UNSET SERDEPROPERTY does not work
[ https://issues.apache.org/jira/browse/HIVE-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lenni Kuff reassigned HIVE-12414: - Assignee: Reuben Kuhnert (was: Alan Gates) > ALTER TABLE UNSET SERDEPROPERTY does not work > - > > Key: HIVE-12414 > URL: https://issues.apache.org/jira/browse/HIVE-12414 > Project: Hive > Issue Type: Bug > Components: Metastore, SQL >Affects Versions: 1.1.1 >Reporter: Lenni Kuff >Assignee: Reuben Kuhnert > > alter table tablename set tblproperties ('key'='value') => works as expected > alter table tablename unset tblproperties ('key') => works as expected > alter table tablename set serdeproperties ('key'='value') => works as > expected > alter table tablename unset serdeproperties ('key') => not supported > FAILED: ParseException line 1:28 mismatched input 'serdeproperties' expecting > TBLPROPERTIES near 'unset' in alter properties statement -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12338) Add webui to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995564#comment-14995564 ]

Lenni Kuff commented on HIVE-12338:
-----------------------------------

It would be great if the webUI could also provide another interface to view the new metrics that were added in HIVE-10761, either as raw JSON or in a nicely formatted table. It would also be good to hear how the webUI will be secured (Kerberos?).

> Add webui to HiveServer2
> ------------------------
>
> Key: HIVE-12338
> URL: https://issues.apache.org/jira/browse/HIVE-12338
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> A web ui for HiveServer2 can show some useful information such as:
>
> 1. Sessions,
> 2. Queries that are executing on the HS2, their states, starting time, etc.
[jira] [Updated] (HIVE-11279) Hive should emit lineage information in json compact format
[ https://issues.apache.org/jira/browse/HIVE-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lenni Kuff updated HIVE-11279:
------------------------------
    Attachment: HIVE-11279.1.patch

> Hive should emit lineage information in json compact format
> -----------------------------------------------------------
>
> Key: HIVE-11279
> URL: https://issues.apache.org/jira/browse/HIVE-11279
> Project: Hive
> Issue Type: Bug
> Components: Logging
> Affects Versions: 1.3.0
> Reporter: Lenni Kuff
> Assignee: Lenni Kuff
> Attachments: HIVE-11279.1.patch
>
> Hive should emit lineage information in json compact format. Currently, Hive prints this in human readable format which makes it harder to consume (identify record boundaries) and makes the output files very long.
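To illustrate why a compact layout makes record boundaries easy to find, here is a minimal Java sketch. It assumes, which the issue does not state, that each lineage record is written as a single-line JSON object. The class name `CompactLineageLog` and the sample field names are hypothetical and do not reflect Hive's actual lineage schema or logging code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: shows that when every record is a
// single-line JSON object, a consumer can find record boundaries by
// splitting on newlines, with no JSON-aware scanning.
class CompactLineageLog {
    static List<String> records(String log) {
        List<String> out = new ArrayList<>();
        for (String line : log.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed); // one line is assumed to be one record
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Field names below are made up for illustration.
        String log = "{\"queryId\":\"q1\",\"edges\":[]}\n"
                   + "{\"queryId\":\"q2\",\"edges\":[]}\n";
        for (String record : records(log)) {
            System.out.println(record);
        }
    }
}
```

With a pretty-printed (multi-line) format, the same consumer would instead need to track brace nesting to decide where one record ends and the next begins.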
[jira] [Commented] (HIVE-10048) JDBC - Support SSL encryption regardless of Authentication mechanism
[ https://issues.apache.org/jira/browse/HIVE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623030#comment-14623030 ]

Lenni Kuff commented on HIVE-10048:
-----------------------------------

[~prasadm] - Can you take a look?

> JDBC - Support SSL encryption regardless of Authentication mechanism
> --------------------------------------------------------------------
>
> Key: HIVE-10048
> URL: https://issues.apache.org/jira/browse/HIVE-10048
> Project: Hive
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 1.0.0
> Reporter: Mubashir Kazia
> Assignee: Mubashir Kazia
> Labels: newbie, patch
> Attachments: HIVE-10048.1.patch
>
> JDBC driver currently only supports SSL Transport if the Authentication mechanism is SASL Plain with username and password. SSL transport should be decoupled from Authentication mechanism. If the customer chooses to do Kerberos Authentication and SSL encryption over the wire it should be supported. The Server side already supports this but the driver does not.
[jira] [Updated] (HIVE-11174) Hive does not treat floating point signed zeros as equal (-0.0 should equal 0.0 according to IEEE floating point spec)
[ https://issues.apache.org/jira/browse/HIVE-11174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lenni Kuff updated HIVE-11174:
------------------------------
    Description:
Hive does not treat floating point signed zeros as equal (-0.0 should equal 0.0). This is because Hive uses Double.compareTo(), which states (http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#compareTo(java.lang.Double)):
bq. 0.0d is considered by this method to be greater than -0.0d

The IEEE 754 floating point spec specifies that signed -0.0 and 0.0 should be treated as equal. From the Wikipedia article (https://en.wikipedia.org/wiki/Signed_zero#Comparisons):
bq. negative zero and positive zero should compare as equal with the usual (numerical) comparison operators

Java's compareTo method is implemented to allow for ordering of object instances (in a hash table or similar), but Hive should abide by the IEEE spec.

How to reproduce:
{code}
select 1 where 0.0=-0.0;
Returns no results.
select 1 where -0.0<0.0;
Returns 1
{code}

  was:
Hive does not treat floating point signed zeros as equal (-0.0 should equal 0.0). This is because Hive uses Double.compareTo(), which states:
"0.0d is considered by this method to be greater than -0.0d"
http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#compareTo(java.lang.Double)

The IEEE 754 floating point spec specifies that signed -0.0 and 0.0 should be treated as equal. From the Wikipedia article (https://en.wikipedia.org/wiki/Signed_zero#Comparisons):
bq. negative zero and positive zero should compare as equal with the usual (numerical) comparison operators

How to reproduce:
{code}
select 1 where 0.0=-0.0;
Returns no results.
select 1 where -0.0<0.0;
Returns 1
{code}

> Hive does not treat floating point signed zeros as equal (-0.0 should equal 0.0 according to IEEE floating point spec)
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-11174
> URL: https://issues.apache.org/jira/browse/HIVE-11174
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 1.2.0
> Reporter: Lenni Kuff
> Priority: Critical
>
> Hive does not treat floating point signed zeros as equal (-0.0 should equal 0.0). This is because Hive uses Double.compareTo(), which states (http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#compareTo(java.lang.Double)):
> bq. 0.0d is considered by this method to be greater than -0.0d
> The IEEE 754 floating point spec specifies that signed -0.0 and 0.0 should be treated as equal. From the Wikipedia article (https://en.wikipedia.org/wiki/Signed_zero#Comparisons):
> bq. negative zero and positive zero should compare as equal with the usual (numerical) comparison operators
> Java's compareTo method is implemented to allow for ordering of object instances (in a hash table or similar), but Hive should abide by the IEEE spec.
> How to reproduce:
> {code}
> select 1 where 0.0=-0.0;
> Returns no results.
> select 1 where -0.0<0.0;
> Returns 1
> {code}
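The difference between the two comparison semantics described in HIVE-11174 can be reproduced in plain Java. The following is a minimal sketch, not Hive's actual comparison code: the class `SignedZeroDemo` and its helper methods are hypothetical names, and `compareTreatingZerosEqual` is only one possible way to follow IEEE 754 for signed zeros, not necessarily the fix adopted for this issue.

```java
// Illustrative only: SignedZeroDemo and its helpers are hypothetical names,
// not part of Hive.
class SignedZeroDemo {
    // Total ordering used by Double.compareTo / Double.compare:
    // -0.0 sorts strictly before 0.0.
    static int totalOrder(double a, double b) {
        return Double.compare(a, b);
    }

    // IEEE 754 numerical equality, as implemented by the primitive == operator.
    static boolean ieeeEquals(double a, double b) {
        return a == b;
    }

    // One possible comparator that treats signed zeros as equal and
    // delegates every other input (including NaN) to Double.compare.
    static int compareTreatingZerosEqual(double a, double b) {
        if (a == 0.0 && b == 0.0) { // true for any mix of 0.0 and -0.0
            return 0;
        }
        return Double.compare(a, b);
    }

    public static void main(String[] args) {
        System.out.println(totalOrder(0.0, -0.0) > 0);            // true
        System.out.println(ieeeEquals(0.0, -0.0));                // true
        System.out.println(compareTreatingZerosEqual(0.0, -0.0)); // 0
    }
}
```

The first two results show why a query such as `select 1 where 0.0=-0.0` returns nothing when equality is implemented via `Double.compareTo()`, even though the primitive `==` operator reports the two values as equal.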
[jira] [Commented] (HIVE-10593) Support creating table from a file schema: CREATE TABLE ... LIKE '/path/to/file'
[ https://issues.apache.org/jira/browse/HIVE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551491#comment-14551491 ]

Lenni Kuff commented on HIVE-10593:
-----------------------------------

Inferring the file type from the header type seems like a good idea. Do you think we should avoid the 'file_format' keyword completely, or should it be an optional "hint" which would fail the DDL op if the specified file does not match the target format?

> Support creating table from a file schema: CREATE TABLE ... LIKE '/path/to/file'
> --------------------------------------------------------------------------------
>
> Key: HIVE-10593
> URL: https://issues.apache.org/jira/browse/HIVE-10593
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Affects Versions: 1.2.0
> Reporter: Lenni Kuff
>
> It would be useful if Hive could infer the column definitions in a create table statement from the underlying data file. For example:
> CREATE TABLE new_tbl LIKE PARQUET '/path/to/file.parquet';
> If the targeted file is not the specified file format, the statement should fail analysis. In addition to PARQUET, it would be useful to support other formats such as AVRO, JSON, and ORC.
[jira] [Commented] (HIVE-10593) Support creating table from a file schema: CREATE TABLE ... LIKE '/path/to/file'
[ https://issues.apache.org/jira/browse/HIVE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550562#comment-14550562 ]

Lenni Kuff commented on HIVE-10593:
-----------------------------------

[~singhashish] / [~rdblue] - Agree that we should avoid a table property because it doesn't seem necessary. Allowing a directory to be passed in is a bit more complicated than a file because you might have multiple files with different schemas in the directory and you need to add logic to verify the schemas match. I would suggest initially supporting file-only.

Do you have feedback on the syntax described in this JIRA? It seems like it avoids the use of table properties and the syntax is fairly natural (similar to CREATE TABLE LIKE otherTable).

> Support creating table from a file schema: CREATE TABLE ... LIKE '/path/to/file'
> --------------------------------------------------------------------------------
>
> Key: HIVE-10593
> URL: https://issues.apache.org/jira/browse/HIVE-10593
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Affects Versions: 1.2.0
> Reporter: Lenni Kuff
>
> It would be useful if Hive could infer the column definitions in a create table statement from the underlying data file. For example:
> CREATE TABLE new_tbl LIKE PARQUET '/path/to/file.parquet';
> If the targeted file is not the specified file format, the statement should fail analysis. In addition to PARQUET, it would be useful to support other formats such as AVRO, JSON, and ORC.
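The extra logic a directory argument would need, as mentioned in the comment above, might look roughly like the following Java sketch. Everything here is an assumption for illustration: the class `SchemaCheck`, the method `commonSchema`, and the representation of a file schema as an ordered map from column name to type name are hypothetical, and the sketch does not read real Parquet, Avro, or ORC files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Hive code: each file schema is modelled as an
// insertion-ordered map (e.g. a LinkedHashMap) of column name -> type name.
class SchemaCheck {
    // Returns the shared schema if every file has the same columns, in the
    // same order, with the same types; otherwise fails the operation.
    static Map<String, String> commonSchema(List<Map<String, String>> fileSchemas) {
        if (fileSchemas.isEmpty()) {
            throw new IllegalArgumentException("no data files found");
        }
        Map<String, String> first = fileSchemas.get(0);
        // Compare entry lists rather than maps so that column order matters.
        List<Map.Entry<String, String>> expected = new ArrayList<>(first.entrySet());
        for (Map<String, String> other : fileSchemas) {
            if (!expected.equals(new ArrayList<>(other.entrySet()))) {
                throw new IllegalStateException(
                    "schema mismatch: expected " + first + " but found " + other);
            }
        }
        return first;
    }
}
```

Supporting a single file first, as suggested in the comment, avoids this check entirely because there is only one schema to consider.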
[jira] [Commented] (HIVE-10593) Support creating table from a file schema: CREATE TABLE ... LIKE '/path/to/file'
[ https://issues.apache.org/jira/browse/HIVE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526781#comment-14526781 ]

Lenni Kuff commented on HIVE-10593:
-----------------------------------

Thanks, I was not aware of HIVE-8950. I agree, a standard syntax for all formats is very desirable. One downside of the syntax proposed in HIVE-8950 is that it is overloading the LOCATION keyword. This means it would not be possible to create a table with the same schema as a data file, but point that table to a different HDFS location (perhaps an edge case, but I can think of some uses). Thoughts?

> Support creating table from a file schema: CREATE TABLE ... LIKE '/path/to/file'
> --------------------------------------------------------------------------------
>
> Key: HIVE-10593
> URL: https://issues.apache.org/jira/browse/HIVE-10593
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Affects Versions: 1.2.0
> Reporter: Lenni Kuff
>
> It would be useful if Hive could infer the column definitions in a create table statement from the underlying data file. For example:
> CREATE TABLE new_tbl LIKE PARQUET '/path/to/file.parquet';
> If the targeted file is not the specified file format, the statement should fail analysis. In addition to PARQUET, it would be useful to support other formats such as AVRO, JSON, and ORC.