[jira] [Updated] (HIVE-4144) Add select database() command to show the current database

2013-03-21 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4144:
--

Attachment: HIVE-4144.D9597.1.patch

navis requested code review of HIVE-4144 [jira] Add select database() 
command to show the current database.

Reviewers: JIRA

HIVE-4144 Add select database() command to show the current database

A recent hive-user mailing list conversation asked about having a command to 
show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:

select database();

http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D9597

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/UDFCurrentDB.java
  ql/src/test/queries/clientpositive/udf_current_database.q
  ql/src/test/results/clientpositive/udf_current_database.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/22923/

To: JIRA, navis


 Add select database() command to show the current database
 

 Key: HIVE-4144
 URL: https://issues.apache.org/jira/browse/HIVE-4144
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Mark Grover
Assignee: Navis
 Attachments: HIVE-4144.D9597.1.patch


 A recent hive-user mailing list conversation asked about having a command to 
 show the current database.
 http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E
 MySQL seems to have a command to do so:
 {code}
 select database();
 {code}
 http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database
 We should look into having something similar in Hive.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4144) Add select database() command to show the current database

2013-03-21 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4144:


Assignee: Navis
  Status: Patch Available  (was: Open)

'database' cannot be used because it is a keyword, so 'current_database' is used 
instead.
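
A minimal usage sketch based on the comment above, assuming the function is exposed 
as current_database() as described; whether a bare SELECT (without FROM) works 
depends on the Hive version:

{code}
-- Sketch only: per the comment above, 'database' is a reserved keyword in the
-- grammar, so the function is exposed as current_database() instead.
USE default;
SELECT current_database();                     -- expected to print: default
-- On versions that require a FROM clause, an equivalent form would be:
-- SELECT current_database() FROM src LIMIT 1;
{code}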

 Add select database() command to show the current database
 

 Key: HIVE-4144
 URL: https://issues.apache.org/jira/browse/HIVE-4144
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Mark Grover
Assignee: Navis
 Attachments: HIVE-4144.D9597.1.patch


 A recent hive-user mailing list conversation asked about having a command to 
 show the current database.
 http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E
 MySQL seems to have a command to do so:
 {code}
 select database();
 {code}
 http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database
 We should look into having something similar in Hive.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3381) Result of outer join is not valid

2013-03-21 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3381:
--

Attachment: HIVE-3381.D5565.5.patch

navis updated the revision HIVE-3381 [jira] Result of outer join is not valid.

  Rebased to trunk. Running tests.

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D5565

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D5565?vs=22947&id=30225#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
  ql/src/test/results/clientpositive/auto_join21.q.out
  ql/src/test/results/clientpositive/auto_join29.q.out
  ql/src/test/results/clientpositive/auto_join7.q.out
  ql/src/test/results/clientpositive/auto_join_filters.q.out
  ql/src/test/results/clientpositive/join21.q.out
  ql/src/test/results/clientpositive/join7.q.out
  ql/src/test/results/clientpositive/join_1to1.q.out
  ql/src/test/results/clientpositive/join_filters.q.out
  ql/src/test/results/clientpositive/join_filters_overlap.q.out

To: JIRA, navis
Cc: njain


 Result of outer join is not valid
 -

 Key: HIVE-3381
 URL: https://issues.apache.org/jira/browse/HIVE-3381
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, 
 HIVE-3381.D5565.5.patch


 Outer joins, especially full outer joins or outer joins with a filter in the 'ON' 
 clause, are not showing proper results. For example, the query in test join_1to1.q
 {code}
 SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 
 and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value 
 ASC, b.key1 ASC, b.key2 ASC, b.value ASC;
 {code}
 results
 {code}
 NULL  NULL   NULL  NULL  NULL   66
 NULL  NULL   NULL  NULL  10050  66
 NULL  NULL   NULL  10    10010  66
 NULL  NULL   NULL  30    10030  88
 NULL  NULL   NULL  35    10035  88
 NULL  NULL   NULL  40    10040  88
 NULL  NULL   NULL  40    10040  88
 NULL  NULL   NULL  50    10050  88
 NULL  NULL   NULL  50    10050  88
 NULL  NULL   NULL  50    10050  88
 NULL  NULL   NULL  70    10040  88
 NULL  NULL   NULL  70    10040  88
 NULL  NULL   NULL  70    10040  88
 NULL  NULL   NULL  70    10040  88
 NULL  NULL   66    NULL  NULL   NULL
 NULL  10050  66    NULL  NULL   NULL
 5     10005  66    5     10005  66
 15    10015  66    NULL  NULL   NULL
 20    10020  66    20    10020  66
 25    10025  88    NULL  NULL   NULL
 30    10030  66    NULL  NULL   NULL
 35    10035  88    NULL  NULL   NULL
 40    10040  66    NULL  NULL   NULL
 40    10040  66    40    10040  66
 40    10040  88    NULL  NULL   NULL
 40    10040  88    NULL  NULL   NULL
 50    10050  66    NULL  NULL   NULL
 50    10050  66    50    10050  66
 50    10050  66    50    10050  66
 50    10050  88    NULL  NULL   NULL
 50    10050  88    NULL  NULL   NULL
 50    10050  88    NULL  NULL   NULL
 50    10050  88    NULL  NULL   NULL
 50    10050  88    NULL  NULL   NULL
 50    10050  88    NULL  NULL   NULL
 60    10040  66    60    10040  66
 60    10040  66    60    10040  66
 60    10040  66    60    10040  66
 60    10040  66    60    10040  66
 70    10040  66    NULL  NULL   NULL
 70    10040  66    NULL  NULL   NULL
 70    10040  66    NULL  NULL   NULL
 70    10040  66    NULL  NULL   NULL
 80    10040  88    NULL  NULL   NULL
 80    10040  88    NULL  NULL   NULL
 80    10040  88    NULL  NULL   NULL
 80    10040  88    NULL  NULL   NULL
 {code} 
 but this does not seem right. It should be 
 {code}
 NULL  NULL   NULL  NULL  NULL   66
 NULL  NULL   NULL  NULL  10050  66
 NULL  NULL   NULL  10    10010  66
 NULL  NULL   NULL  25    10025  66
 NULL  NULL   NULL  30    10030  88
 NULL  NULL   NULL  35    10035  88
 NULL  NULL   NULL  40    10040  88
 NULL  NULL   NULL  50    10050  88
 NULL  NULL   NULL  70    10040  88
 NULL  NULL   NULL  70    10040  88
 NULL  NULL   NULL  80    10040  66
 NULL  NULL   NULL  80    10040  66
 NULL  NULL  
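
The expected output above follows from standard outer-join semantics: in a full 
outer join, the ON-clause predicates only decide which rows match; rows that fail 
them are still emitted, null-padded on the other side. A minimal sketch, assuming 
the join_1to1 test tables used above:

{code}
-- Sketch (assumes the join_1to1_1/join_1to1_2 tables from the test above).
-- The ON-clause predicates only control matching: rows whose value is not 66
-- must still appear in the output, null-padded on the non-matching side.
SELECT a.key1, a.value, b.key1, b.value
FROM join_1to1_1 a
FULL OUTER JOIN join_1to1_2 b
  ON a.key1 = b.key1 AND a.value = 66 AND b.value = 66;
{code}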

[jira] [Updated] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

Summary: Sort merge join does not work for outer joins for 7 inputs  (was: 
Sort merge join does not work for more than 7 inputs)

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608776#comment-13608776
 ] 

Namit Jain commented on HIVE-4206:
--

https://reviews.facebook.net/D9603

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

Attachment: hive.4206.1.patch

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Issue Comment Deleted] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

Comment: was deleted

(was: set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
set hive.exec.reducers.max = 1;
set hive.merge.mapfiles=false;
set hive.merge.mapredfiles=false;

-- Create bucketed and sorted tables
CREATE TABLE test_table1 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;
CREATE TABLE test_table2 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;
CREATE TABLE test_table3 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;
CREATE TABLE test_table4 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;
CREATE TABLE test_table5 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;
CREATE TABLE test_table6 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;
CREATE TABLE test_table7 (key INT, value STRING) CLUSTERED BY (key) SORTED BY 
(key) INTO 2 BUCKETS;

FROM src
INSERT OVERWRITE TABLE test_table1 SELECT *;

FROM src
INSERT OVERWRITE TABLE test_table2 SELECT *;

FROM src
INSERT OVERWRITE TABLE test_table3 SELECT *;

FROM src
INSERT OVERWRITE TABLE test_table4 SELECT *;

FROM src
INSERT OVERWRITE TABLE test_table5 SELECT *;

FROM src
INSERT OVERWRITE TABLE test_table6 SELECT *;

FROM src
INSERT OVERWRITE TABLE test_table7 SELECT *;


-- Mapjoin followed by an aggregation should be performed in a single MR job
EXPLAIN
SELECT /*+mapjoin(b)*/ count(*) FROM test_table1 a JOIN test_table2 b ON a.key 
= b.key
JOIN test_table3 c ON a.key = c.key
JOIN test_table4 d ON a.key = d.key
JOIN test_table5 e ON a.key = e.key
JOIN test_table6 f ON a.key = f.key
JOIN test_table6 g ON a.key = g.key;


The above query does not use sort-merge join.)

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

Attachment: hive.4206.2.patch

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch, hive.4206.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4196) Support for Streaming Partitions in Hive

2013-03-21 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608800#comment-13608800
 ] 

eric baldeschwieler commented on HIVE-4196:
---

Maybe we should just return both?





 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik

 Motivation: Allow Hive users to immediately query data streaming in through 
 clients such as Flume.
 Currently Hive partitions must be created after all the data for the 
 partition is available. Thereafter, data in the partitions is considered 
 immutable. 
 This proposal introduces the notion of a streaming partition into which new 
 files can be committed periodically and made available for queries before the 
 partition is closed and converted into a standard partition.
 The admin enables a streaming partition on a table using DDL and provides the 
 following pieces of information (an illustrative DDL sketch follows this 
 description):
 - Name of the partition in the table on which streaming is enabled
 - Frequency at which the streaming partition should be closed and converted 
 into a standard partition.
 Tables with streaming partition enabled will be partitioned by one and only 
 one column. It is assumed that this column will contain a timestamp.
 Closing the current streaming partition converts it into a standard 
 partition. Based on the specified frequency, the current streaming partition  
 is closed and a new one created for future writes. This is referred to as 
 'rolling the partition'.
 A streaming partition's life cycle is as follows:
  - A new streaming partition is instantiated for writes
  - Streaming clients request (via webhcat) an HDFS file name into which 
 they can write a chunk of records for a specific table.
  - Streaming clients write a chunk (via webhdfs) to that file and commit 
 it (via webhcat). Committing merely indicates that the chunk has been written 
 completely and is ready for serving queries.  
  - When the partition is rolled, all committed chunks are swept into a single 
 directory and a standard partition pointing to that directory is created. The 
 streaming partition is closed and a new streaming partition is created. Rolling 
 the partition is atomic, and streaming clients are agnostic of partition rolling. 
  
  - Hive queries will be able to query the partition that is currently open 
 for streaming. Only committed chunks will be visible. Read consistency will 
 be ensured so that repeated reads of the same partition will be idempotent 
 for the lifespan of the query.
 Partition rolling requires an active agent/thread running to check when it is 
 time to roll and trigger the roll. This could be achieved either by using 
 an external agent such as Oozie (preferably) or an internal agent.
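
A purely illustrative sketch of the enabling DDL described above; the proposal does 
not define concrete syntax, so the table property names below are hypothetical:

{code}
-- Hypothetical sketch only: the proposal names the information the admin supplies
-- (the partition column to stream on and the roll frequency) but no concrete DDL,
-- so these TBLPROPERTIES keys are invented for illustration.
CREATE TABLE web_events (msg STRING)
PARTITIONED BY (event_time STRING)
TBLPROPERTIES (
  'streaming.partition.column' = 'event_time',  -- hypothetical property
  'streaming.roll.frequency'   = '15m'          -- hypothetical property
);
{code}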

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4211) Common column and partition column are defined the same type and union them, it will hints Schema of both sides of union should match.

2013-03-21 Thread Daisy.Yuan (JIRA)
Daisy.Yuan created HIVE-4211:


 Summary: Common column and partition column are defined the same 
type and union them, it will hints Schema of both sides of union should match. 
 Key: HIVE-4211
 URL: https://issues.apache.org/jira/browse/HIVE-4211
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.9.0, 0.11.0
Reporter: Daisy.Yuan


create table UnionBoolA (id boolean, no boolean) row format delimited fields 
terminated by ' ';
load data local inpath '/opt/files/unionboola.txt' into table UnionBoolA;
create table UnionPartionBool (id int) partitioned by (no boolean) row format 
delimited fields terminated by ' ';
load data local inpath '/opt/files/unionpartint.txt' into table 
UnionPartionBool partition(no=true);

unionboola.txt:
true true
false true
true true
false true

unionpartint.txt:
111
444
1122
44

When I execute
select * from (select no from UnionBoolA union all select no from 
UnionPartionBool) unionResult, it fails. The exception info is as follows:
FAILED: Error in semantic analysis: 1:66 Schema of both sides of union should 
match: Column no is of type boolean on first table and type string on second 
table. Error encountered near token 'UnionPartionBool'
org.apache.hadoop.hive.ql.parse.SemanticException: 1:66 Schema of both sides of 
union should match: Column no is of type boolean on first table and type string 
on second table. Error encountered near token 'UnionPartionBool'
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:6295)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6733)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6748)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7556)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:621)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:525)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1153)
at 
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:226)
at 
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
at 
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:535)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:532)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:532)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)


So I executed explain select no from UnionPartionBool to inspect the partition 
column and found that the partition column type is string.
All partition column types are changed to TypeInfoFactory.stringTypeInfo in 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(), where this is 
marked as a TODO. Changing it to 
TypeInfoFactory.getPrimitiveTypeInfo(part_col.getType()) fixes this bug; you can 
see the modification in the attached patch.
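
Until the analyzer change lands, one query-level workaround consistent with the 
behavior described above (the partition column being surfaced as string) is to 
align the two branches explicitly; a hedged sketch:

{code}
-- Workaround sketch, assuming the partition column is surfaced as string as
-- described above: cast the non-partition side so both branches of the
-- UNION ALL agree on the column type.
SELECT * FROM (
  SELECT CAST(no AS STRING) AS no FROM UnionBoolA
  UNION ALL
  SELECT no FROM UnionPartionBool
) unionResult;
{code}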

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4211) Common column and partition column are defined the same type and union them, it will hints Schema of both sides of union should match.

2013-03-21 Thread Daisy.Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daisy.Yuan updated HIVE-4211:
-

Status: Patch Available  (was: Open)

 Common column and partition column are defined the same type and union them, 
 it will hints Schema of both sides of union should match. 
 ---

 Key: HIVE-4211
 URL: https://issues.apache.org/jira/browse/HIVE-4211
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.9.0, 0.11.0
Reporter: Daisy.Yuan
  Labels: patch
 Attachments: PartitionColumnTypInfo.patch


 create table UnionBoolA (id boolean, no boolean) row format delimited fields 
 terminated by ' ';
 load data local inpath '/opt/files/unionboola.txt' into table UnionBoolA;
 create table UnionPartionBool (id int) partitioned by (no boolean) row format 
 delimited fields terminated by ' ';
 load data local inpath '/opt/files/unionpartint.txt' into table 
 UnionPartionBool partition(no=true);
 unionboola.txt:
 true true
 false true
 true true
 false true
 unionpartint.txt:
 111
 444
 1122
 44
 When I execute
 select * from (select no from UnionBoolA union all select no from 
 UnionPartionBool) unionResult, it fails. The exception info is as 
 follows:
 FAILED: Error in semantic analysis: 1:66 Schema of both sides of union should 
 match: Column no is of type boolean on first table and type string on second 
 table. Error encountered near token 'UnionPartionBool'
 org.apache.hadoop.hive.ql.parse.SemanticException: 1:66 Schema of both sides 
 of union should match: Column no is of type boolean on first table and type 
 string on second table. Error encountered near token 'UnionPartionBool'
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:6295)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6733)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6748)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7556)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:621)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:525)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1153)
 at 
 org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:226)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:535)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:532)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:532)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 So I executed explain select no from UnionPartionBool to inspect the partition 
 column and found that the partition column type is string.
 All partition column types are changed to TypeInfoFactory.stringTypeInfo in 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(), where this is 
 marked as a TODO. Changing it to 
 TypeInfoFactory.getPrimitiveTypeInfo(part_col.getType()) fixes this bug; you can 
 see the modification in the attached patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4211) Common column and partition column are defined the same type and union them, it will hints Schema of both sides of union should match.

2013-03-21 Thread Daisy.Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daisy.Yuan updated HIVE-4211:
-

Attachment: PartitionColumnTypInfo.patch

 Common column and partition column are defined the same type and union them, 
 it will hints Schema of both sides of union should match. 
 ---

 Key: HIVE-4211
 URL: https://issues.apache.org/jira/browse/HIVE-4211
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.9.0, 0.11.0
Reporter: Daisy.Yuan
  Labels: patch
 Attachments: PartitionColumnTypInfo.patch


 create table UnionBoolA (id boolean, no boolean) row format delimited fields 
 terminated by ' ';
 load data local inpath '/opt/files/unionboola.txt' into table UnionBoolA;
 create table UnionPartionBool (id int) partitioned by (no boolean) row format 
 delimited fields terminated by ' ';
 load data local inpath '/opt/files/unionpartint.txt' into table 
 UnionPartionBool partition(no=true);
 unionboola.txt:
 true true
 false true
 true true
 false true
 unionpartint.txt:
 111
 444
 1122
 44
 When I execute
 select * from (select no from UnionBoolA union all select no from 
 UnionPartionBool) unionResult, it fails. The exception info is as 
 follows:
 FAILED: Error in semantic analysis: 1:66 Schema of both sides of union should 
 match: Column no is of type boolean on first table and type string on second 
 table. Error encountered near token 'UnionPartionBool'
 org.apache.hadoop.hive.ql.parse.SemanticException: 1:66 Schema of both sides 
 of union should match: Column no is of type boolean on first table and type 
 string on second table. Error encountered near token 'UnionPartionBool'
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:6295)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6733)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6748)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7556)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:621)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:525)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1153)
 at 
 org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:226)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:535)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:532)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:532)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 So I executed explain select no from UnionPartionBool to inspect the partition 
 column and found that the partition column type is string.
 All partition column types are changed to TypeInfoFactory.stringTypeInfo in 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(), where this is 
 marked as a TODO. Changing it to 
 TypeInfoFactory.getPrimitiveTypeInfo(part_col.getType()) fixes this bug; you can 
 see the modification in the attached patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

Status: Patch Available  (was: Open)

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch, hive.4206.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4212) sort merge join should work for outer joins for more than 8 inputs

2013-03-21 Thread Namit Jain (JIRA)
Namit Jain created HIVE-4212:


 Summary: sort merge join should work for outer joins for more than 
8 inputs
 Key: HIVE-4212
 URL: https://issues.apache.org/jira/browse/HIVE-4212
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4212) sort merge join should work for outer joins for more than 8 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4212:
-

Component/s: Query Processor

 sort merge join should work for outer joins for more than 8 inputs
 --

 Key: HIVE-4212
 URL: https://issues.apache.org/jira/browse/HIVE-4212
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4212) sort merge join should work for outer joins for more than 8 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4212:
-

Attachment: hive.4212.1.patch

 sort merge join should work for outer joins for more than 8 inputs
 --

 Key: HIVE-4212
 URL: https://issues.apache.org/jira/browse/HIVE-4212
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4212.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4148) Cleanup aisle ivy

2013-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608857#comment-13608857
 ] 

Hudson commented on HIVE-4148:
--

Integrated in Hive-trunk-h0.21 #2025 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2025/])
HIVE-4187. QL build-grammar target fails after HIVE-4148 (Gunther 
Hagleitner via cws) (Revision 1459014)

 Result = FAILURE
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1459014
Files : 
* /hive/trunk/ivy/libraries.properties
* /hive/trunk/metastore/ivy.xml


 Cleanup aisle ivy
 ---

 Key: HIVE-4148
 URL: https://issues.apache.org/jira/browse/HIVE-4148
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4148.patch


 Lots of duplicated dependencies in the modules' ivy configs. Makes compiling 
 slow and maintenance hard. This patch cleans up these dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608858#comment-13608858
 ] 

Hudson commented on HIVE-4187:
--

Integrated in Hive-trunk-h0.21 #2025 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2025/])
HIVE-4187. QL build-grammar target fails after HIVE-4148 (Gunther 
Hagleitner via cws) (Revision 1459014)

 Result = FAILURE
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1459014
Files : 
* /hive/trunk/ivy/libraries.properties
* /hive/trunk/metastore/ivy.xml


 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical
 Fix For: 0.11.0

 Attachments: HIVE-4187.1.patch, HIVE-4187.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4015) Add ORC file to the grammar as a file format

2013-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608859#comment-13608859
 ] 

Hudson commented on HIVE-4015:
--

Integrated in Hive-trunk-h0.21 #2025 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2025/])
HIVE-4015. Add ORC file to the grammar as a file format. (Gunther 
Hagleitner via kevinwilfong) (Revision 1459030)

 Result = FAILURE
kevinwilfong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1459030
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
* /hive/trunk/ql/src/test/queries/clientpositive/orc_create.q
* /hive/trunk/ql/src/test/queries/clientpositive/orc_createas1.q
* /hive/trunk/ql/src/test/results/clientpositive/orc_create.q.out
* /hive/trunk/ql/src/test/results/clientpositive/orc_createas1.q.out


 Add ORC file to the grammar as a file format
 

 Key: HIVE-4015
 URL: https://issues.apache.org/jira/browse/HIVE-4015
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4015.1.patch, HIVE-4015.2.patch, HIVE-4015.3.patch, 
 HIVE-4015.4.patch, HIVE-4015.5.patch


 It would be much more convenient for users if we enable them to use ORC as a 
 file format in the HQL grammar. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4139) MiniDFS shim does not work for hadoop 2

2013-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608860#comment-13608860
 ] 

Hudson commented on HIVE-4139:
--

Integrated in Hive-trunk-h0.21 #2025 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2025/])
HIVE-4139 : MiniDFS shim does not work for hadoop 2 (Gunther Hagleitner via 
Ashutosh Chauhan) (Revision 1459072)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1459072
Files : 
* /hive/trunk/build-common.xml
* /hive/trunk/build.properties
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java
* /hive/trunk/shims/ivy.xml
* 
/hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java
* 
/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java
* 
/hive/trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
* 
/hive/trunk/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java
* 
/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java


 MiniDFS shim does not work for hadoop 2
 ---

 Key: HIVE-4139
 URL: https://issues.apache.org/jira/browse/HIVE-4139
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4139.1.patch, HIVE-4139.2.patch, HIVE-4139.3.patch, 
 HIVE-4139.4.patch


 There's an incompatibility between hadoop 1 and 2 with respect to the 
 MiniDfsCluster class. That causes the hadoop 2 line MiniMR tests to fail with a 
 MethodNotFound exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hive-trunk-h0.21 - Build # 2025 - Failure

2013-03-21 Thread Apache Jenkins Server
Changes for Build #2012
[hashutosh] HIVE-3862 : testHBaseNegativeCliDriver_cascade_dbdrop fails on 
hadoop-1 (Gunther Hagleitner via Ashutosh Chauhan)


Changes for Build #2013
[kevinwilfong] HIVE-4125. Expose metastore JMX metrics. (Samuel Yuan via 
kevinwilfong)

[hashutosh] HIVE-2935 : Implement HiveServer2 Core code changes  (4th patch of 
4) (Carl Steinbach and others via Ashutosh Chauhan)

[kevinwilfong] HIVE-4096. problem in hive.map.groupby.sorted with distincts. 
(njain via kevinwilfong)

[hashutosh] HIVE-2935 : Implement HiveServer2 Beeline .q.out files (3rd patch 
of 4) (Carl Steinbach and others via Ashutosh Chauhan)

[hashutosh] HIVE-2935 : Implement HiveServer2 Beeline code changes (2nd patch 
of 4) (Carl Steinbach and others via Ashutosh Chauhan)

[hashutosh] HIVE-2935 : Implement HiveServer2 (1st patch of 4) (Carl Steinbach 
and others via Ashutosh Chauhan)

[hashutosh] HIVE-3717 : Hive wont compile with -Dhadoop.mr.rev=20S (Gunther 
Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4148 : Cleanup aisle ivy (Gunther Hagleitner via Ashutosh 
Chauhan)


Changes for Build #2014

Changes for Build #2015

Changes for Build #2016
[ecapriolo] Hive-4141 InspectorFactories use static HashMaps which fail in 
concurrent modification (Brock Noland via egc)

Submitted by: Brock Noland  
Reviewed by: Edward Capriolo
Approved by: Edward Capriolo

[kevinwilfong] HIVE-4176. disable TestBeeLineDriver in ptest util. 
(kevinwilfong reviewed by njain, ashutoshc)

[hashutosh] HIVE-4169 : union_remove_*.q fail on hadoop 2 (Gunther Hagleitner 
via Ashutosh Chauhan)


Changes for Build #2017
[cws] HIVE-4145. Create hcatalog stub directory and add it to the build (Carl 
Steinbach via cws)

[kevinwilfong] HIVE-4162. disable TestBeeLineDriver. (Thejas M Nair via 
kevinwilfong)


Changes for Build #2018

Changes for Build #2019

Changes for Build #2020

Changes for Build #2021

Changes for Build #2022

Changes for Build #2023
[namit] HIVE-4154 NPE reading column of empty string from ORC file
(Kevin Wilfong via namit)

[hashutosh] HIVE-4186 : NPE in ReduceSinkDeDuplication (Harish Butani via 
Ashutosh Chauhan)


Changes for Build #2024
[namit] HIVE-4146 bug with hive.auto.convert.join.noconditionaltask with outer 
joins
(Namit via Gang Tim Liu)


Changes for Build #2025
[hashutosh] HIVE-4139 : MiniDFS shim does not work for hadoop 2 (Gunther 
Hagleitner via Ashutosh Chauhan)

[kevinwilfong] HIVE-4015. Add ORC file to the grammar as a file format. 
(Gunther Hagleitner via kevinwilfong)

[cws] HIVE-4187. QL build-grammar target fails after HIVE-4148 (Gunther 
Hagleitner via cws)




5 tests failed.
FAILED:  
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... 
-Dtest.silent=false to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get 
more logs.
at junit.framework.Assert.fail(Assert.java:47)
at 
org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:5935)
at 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter(TestCliDriver.java:4416)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:154)
at junit.framework.TestCase.runBare(TestCase.java:127)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)


FAILED:  
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_db_table

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... 
-Dtest.silent=false to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get 
more logs.
at junit.framework.Assert.fail(Assert.java:47)
at 
org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:5935)
at 

[jira] [Updated] (HIVE-3820) Consider creating a literal like D or BD for representing Decimal type constants

2013-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3820:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Gunther!

 Consider creating a literal like D or BD for representing Decimal type 
 constants
 

 Key: HIVE-3820
 URL: https://issues.apache.org/jira/browse/HIVE-3820
 Project: Hive
  Issue Type: Bug
Reporter: Mark Grover
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-3820.1.patch, HIVE-3820.2.patch, 
 HIVE-3820.D8823.1.patch


 When the HIVE-2693 gets committed, users are going to see this behavior:
 {code}
 hive> select cast(3.14 as decimal) from decimal_3 limit 1;
 3.140000000000000124344978758017532527446746826171875
 {code}
 That's intuitively incorrect, but it happens because 3.14 (a double) is being 
 converted to BigDecimal, which causes a precision mismatch.
 We should consider creating a new literal for expressing constants of Decimal 
 type as Gunther suggested in HIVE-2693.
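
A short sketch of how such a literal might read if the BD suffix mentioned above 
were adopted; the syntax is illustrative only, not a statement of what was 
committed:

{code}
-- Illustrative only: assumes the proposed BD suffix for decimal constants.
-- A suffixed literal lets the parser build a DECIMAL directly instead of
-- going through the double value shown in the cast above.
SELECT 3.14BD FROM decimal_3 LIMIT 1;   -- expected: 3.14
{code}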

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Nick Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Collins updated HIVE-3348:
---

Affects Version/s: 0.10.0
   Status: Patch Available  (was: Open)

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.10.0
Reporter: Namit Jain
Assignee: Namit Jain

 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Nick Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Collins updated HIVE-3348:
---

Affects Version/s: (was: 0.10.0)

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: semicolon.patch


 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Nick Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Collins updated HIVE-3348:
---

Attachment: semicolon.patch

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: semicolon.patch


 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Nick Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Collins updated HIVE-3348:
---

Attachment: (was: semicolon.patch)

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive-3348.patch


 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Nick Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Collins updated HIVE-3348:
---

Attachment: hive-3348.patch

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive-3348.patch


 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4180) Filter getting dropped with PTFOperator

2013-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4180.


Resolution: Fixed

Committed to branch. Thanks, Harish for review!

 Filter getting dropped with PTFOperator
 ---

 Key: HIVE-4180
 URL: https://issues.apache.org/jira/browse/HIVE-4180
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan

 Another case where filter push down was not handled correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609121#comment-13609121
 ] 

Gang Tim Liu commented on HIVE-4206:


+1

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch, hive.4206.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-21 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4188:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed, thanks Prasad.

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar
 Fix For: 0.11.0

 Attachments: HIVE-4188-1.patch, HIVE-4188-2.patch


 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name> 
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed. Thanks Tim

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch, hive.4206.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4206) Sort merge join does not work for outer joins for 7 inputs

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4206:
-

Affects Version/s: 0.11.0

 Sort merge join does not work for outer joins for 7 inputs
 --

 Key: HIVE-4206
 URL: https://issues.apache.org/jira/browse/HIVE-4206
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4206.1.patch, hive.4206.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4193) OVER clauses with BETWEEN in the window definition produce wrong results

2013-03-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4193:
-

Attachment: Range_5_diff

A diff from last night's run for the query:

select i, f, avg(d) over (partition by i order by f desc rows between 5 
preceding and 5 following) from over100k;

Based on the output this doesn't look like a sort stability issue.

 OVER clauses with BETWEEN in the window definition produce wrong results
 

 Key: HIVE-4193
 URL: https://issues.apache.org/jira/browse/HIVE-4193
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
 Attachments: Range_5_diff


 Window queries that define a windowing clause that has a termination row 
 often (though not all) return incorrect results.  For example, from our test 
 queries all of the following return incorrect results:
 {code}
 select s, sum(f) over (partition by t order by b 
rows between current row and unbounded following) 
 from over100k;
 select s, avg(f) over (partition by b order by d 
rows between 5 preceding and current row) 
 from over100k;
 select s, avg(f) over (partition by bin order by s 
rows between current row and 5 following) 
 from over100k;
 select s, avg(d) over (partition by i order by f desc 
rows between 5 preceding and 5 following) 
 from over100k;
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira



[jira] [Commented] (HIVE-4212) sort merge join should work for outer joins for more than 8 inputs

2013-03-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609144#comment-13609144
 ] 

Namit Jain commented on HIVE-4212:
--

https://reviews.facebook.net/D9615

 sort merge join should work for outer joins for more than 8 inputs
 --

 Key: HIVE-4212
 URL: https://issues.apache.org/jira/browse/HIVE-4212
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4212.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4193) OVER clauses with BETWEEN in the window definition produce wrong results

2013-03-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609148#comment-13609148
 ] 

Alan Gates commented on HIVE-4193:
--

Note: other than the query listed in the above comments, all of the other 
queries using ROWS BETWEEN now produce correct results.


 OVER clauses with BETWEEN in the window definition produce wrong results
 

 Key: HIVE-4193
 URL: https://issues.apache.org/jira/browse/HIVE-4193
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
 Attachments: Range_5_diff


 Window queries that define a windowing clause that has a termination row 
 often (though not all) return incorrect results.  For example, from our test 
 queries all of the following return incorrect results:
 {code}
 select s, sum(f) over (partition by t order by b 
rows between current row and unbounded following) 
 from over100k;
 select s, avg(f) over (partition by b order by d 
rows between 5 preceding and current row) 
 from over100k;
 select s, avg(f) over (partition by bin order by s 
rows between current row and 5 following) 
 from over100k;
 select s, avg(d) over (partition by i order by f desc 
rows between 5 preceding and 5 following) 
 from over100k;
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4193) OVER clauses with BETWEEN in the window definition produce wrong results

2013-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609156#comment-13609156
 ] 

Ashutosh Chauhan commented on HIVE-4193:


One way this query is different from the others is that it has desc. Alan, can you run 
this query without desc and see if it still produces wrong results? If not, we 
likely have a bug in order by desc.
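
For reference, the variant without the descending sort would be the following (a sketch derived directly from the query quoted earlier in this thread):

{code}
-- same window query as above, with the "desc" removed from the order by
select i, f, avg(d) over (partition by i order by f rows between 5 preceding and 5 following)
from over100k;
{code}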

 OVER clauses with BETWEEN in the window definition produce wrong results
 

 Key: HIVE-4193
 URL: https://issues.apache.org/jira/browse/HIVE-4193
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
 Attachments: Range_5_diff


 Window queries that define a windowing clause that has a termination row 
 often (though not all) return incorrect results.  For example, from our test 
 queries all of the following return incorrect results:
 {code}
 select s, sum(f) over (partition by t order by b 
rows between current row and unbounded following) 
 from over100k;
 select s, avg(f) over (partition by b order by d 
rows between 5 preceding and current row) 
 from over100k;
 select s, avg(f) over (partition by bin order by s 
rows between current row and 5 following) 
 from over100k;
 select s, avg(d) over (partition by i order by f desc 
rows between 5 preceding and 5 following) 
 from over100k;
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4193) OVER clauses with BETWEEN in the window definition produce wrong results

2013-03-21 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved HIVE-4193.
--

Resolution: Invalid

The desc definitely causes an issue when you forget to put it into the 
verification query. Once you do that, all is good. I'm closing this as invalid 
because the rest of the queries work now (the result of other patches checked 
in, I assume) and this last one was a test error.
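
To make the test error concrete, here is a minimal sketch of the mismatch, assuming the verification step recomputes the expected rolling averages with its own sort:

{code}
-- windowed query: each partition is ordered by f DESC before averaging 5 preceding / 5 following rows
select i, f, avg(d) over (partition by i order by f desc rows between 5 preceding and 5 following)
from over100k;
-- any verification query must also order by f desc; otherwise the expected
-- averages are computed over differently ordered rows and will not match
{code}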

 OVER clauses with BETWEEN in the window definition produce wrong results
 

 Key: HIVE-4193
 URL: https://issues.apache.org/jira/browse/HIVE-4193
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
 Attachments: Range_5_diff


 Window queries that define a windowing clause that has a termination row 
 often (though not all) return incorrect results.  For example, from our test 
 queries all of the following return incorrect results:
 {code}
 select s, sum(f) over (partition by t order by b 
rows between current row and unbounded following) 
 from over100k;
 select s, avg(f) over (partition by b order by d 
rows between 5 preceding and current row) 
 from over100k;
 select s, avg(f) over (partition by bin order by s 
rows between current row and 5 following) 
 from over100k;
 select s, avg(d) over (partition by i order by f desc 
rows between 5 preceding and 5 following) 
 from over100k;
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: HiveServer2 released in some branch

2013-03-21 Thread Rodrigo Trujillo

Hi Thejas,

thank you for the explanation.
So, HiveServer2 is going to be part of Hive 0.11, and I have read that HCatalog 
is going to be merged in version 0.11 too.

What is the target date for Hive 0.11? I searched for this information or a 
schedule, but did not find anything. Do you, or someone else, know? Or can you 
point me to a schedule page, etc.?

Regards,

Rodrigo


On 03/20/2013 05:03 PM, Thejas Nair wrote:

Rodrigo,
New features usually go into a new version (0.xx). (0.xx.x versions are 
usually meant for releases with bug fixes.)
So HiveServer2 is likely to be part of the next release (0.11), and not 
part of a 0.9.x or 0.10.x release.
This is the usual practice that I have seen, and I haven't heard of any 
plans to the contrary.


-Thejas (not a hive PMC member/committer)



On 3/20/13 6:14 AM, Rodrigo Trujillo wrote:


Hi all,

I see that the patches for HiveServer2 are committed to trunk.
Any idea whether this feature is going to be released in the current
branches (0.9.x or 0.10.x)?

Best regards,

Rodrigo Trujillo







[jira] [Created] (HIVE-4213) List bucketing error too restrictive

2013-03-21 Thread Mark Grover (JIRA)
Mark Grover created HIVE-4213:
-

 Summary: List bucketing error too restrictive
 Key: HIVE-4213
 URL: https://issues.apache.org/jira/browse/HIVE-4213
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mark Grover
 Fix For: 0.11.0


With the introduction of List bucketing, we introduced a config validation step 
where we say:
{code}
  SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(
  10199,
  "hive.mapred.supports.subdirectories must be true"
  + " if any one of following is true: hive.internal.ddl.list.bucketing.enable,"
  + " hive.optimize.listbucketing and mapred.input.dir.recursive"),
{code}

This seems overly restrictive because there are use cases where people may 
want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't 
care about list bucketing.

Is that not true?

For example, here is the unit test code for {{clientpositive/recursive_dir.q}}
{code}
CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING);
CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING)
LOCATION 'pfile:${system:test.tmp.dir}/fact_tz';

INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1')
SELECT key+11 FROM src WHERE key=484;

ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE');
ALTER TABLE fact_daily ADD PARTITION (ds='1')
LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1';

set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

SELECT * FROM fact_daily WHERE ds='1';

SELECT count(1) FROM fact_daily WHERE ds='1';
{code}

The unit test doesn't seem to be concerned about list bucketing but wants to 
set {{mapred.input.dir.recursive}} to {{true}}.
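
As a concrete illustration of the reported restriction, the combination below would presumably trip error 10199 even though list bucketing is not involved (a sketch based on the check quoted above; the table name is reused from the test):

{code}
-- recursive input dirs wanted, list bucketing not used
set mapred.input.dir.recursive=true;
-- hive.mapred.supports.subdirectories left at its default (false),
-- so the SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING validation fails
SELECT count(1) FROM fact_daily WHERE ds='1';
{code}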

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4213) List bucketing error too restrictive

2013-03-21 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609262#comment-13609262
 ] 

Mark Grover commented on HIVE-4213:
---

[~gangtimliu] I would love to hear your thoughts on this! Thanks in advance!

 List bucketing error too restrictive
 

 Key: HIVE-4213
 URL: https://issues.apache.org/jira/browse/HIVE-4213
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mark Grover
 Fix For: 0.11.0


 With the introduction of List bucketing, we introduced a config validation 
 step where we say:
 {code}
   SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(
   10199,
   hive.mapred.supports.subdirectories must be true
   +  if any one of following is true: 
 hive.internal.ddl.list.bucketing.enable,
   +  hive.optimize.listbucketing and mapred.input.dir.recursive),
 {code}
 This seems overly restrictive because there are use cases where people may 
 want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't 
 care about list bucketing.
 Is that not true?
 For example, here is the unit test code for {{clientpositive/recursive_dir.q}}
 {code}
 CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING);
 CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING)
 LOCATION 'pfile:${system:test.tmp.dir}/fact_tz';
 INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1')
 SELECT key+11 FROM src WHERE key=484;
 ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE');
 ALTER TABLE fact_daily ADD PARTITION (ds='1')
 LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1';
 set hive.mapred.supports.subdirectories=true;
 set mapred.input.dir.recursive=true;
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 SELECT * FROM fact_daily WHERE ds='1';
 SELECT count(1) FROM fact_daily WHERE ds='1';
 {code}
 The unit test doesn't seem to be concerned about list bucketing but wants to 
 set {{mapred.input.dir.recursive}} to {{true}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4070) Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive

2013-03-21 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609278#comment-13609278
 ] 

Mark Grover commented on HIVE-4070:
---

Thanks Gwen. In that case, I am ok with documenting it and resolving this JIRA 
as won't fix.

Is that ok with you as well, [~mackrorysd]?
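
For anyone hitting this before it is documented, the usual workaround for the case-sensitive LIKE discussed in this issue is to normalize case on both sides (the table and column names below are hypothetical):

{code}
-- LIKE in Hive matches case-sensitively; lower() both operands for a case-insensitive match
SELECT * FROM my_table WHERE lower(col) LIKE lower('Foo%');
{code}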

 Like operator in Hive is case sensitive while in MySQL (and most likely other 
 DBs) it's case insensitive
 

 Key: HIVE-4070
 URL: https://issues.apache.org/jira/browse/HIVE-4070
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.10.0
Reporter: Mark Grover
Assignee: Mark Grover
Priority: Trivial
 Fix For: 0.11.0


 Hive's like operator seems to be case sensitive.
 See 
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLike.java#L164
 However, MySQL's like operator is case insensitive. I don't have other DB's 
 (like PostgreSQL) installed and handy but I am guessing their LIKE is case 
 insensitive as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HIVE-4180) Filter getting dropped with PTFOperator

2013-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reopened HIVE-4180:



The previous patch was too restrictive: it left the filter in its original 
location, whereas what we want is for the filter to come up to the PTFOp and not 
get pushed beyond it. This is exactly what ScriptPPD does, so we can reuse it. 
LimitPPD also uses ScriptPPD for a similar reason. 

 Filter getting dropped with PTFOperator
 ---

 Key: HIVE-4180
 URL: https://issues.apache.org/jira/browse/HIVE-4180
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan

 Another case where filter push down was not handled correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4214) OVER accepts general expression instead of just function

2013-03-21 Thread Alan Gates (JIRA)
Alan Gates created HIVE-4214:


 Summary: OVER accepts general expression instead of just function
 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates


The query:

select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;

runs (and produces meaningless output).

Over should not allow the arithmetic expression.  Only a UDAF or PTF function 
should be valid there.  The correct way to write this query is 

select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4214) OVER accepts general expression instead of just function

2013-03-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609307#comment-13609307
 ] 

Alan Gates commented on HIVE-4214:
--

When the query is re-written to properly put the / 10.0 after the OVER, it 
fails with:

{code}
NoViableAltException(15@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN 
identifier ( COMMA identifier )* RPAREN ) )?])
{code}

 OVER accepts general expression instead of just function
 

 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates

 The query:
 select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
 runs (and produces meaningless output).
 Over should not allow the arithmetic expression.  Only a UDAF or PTF function 
 should be valid there.  The correct way to write this query is 
 select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4180) Filter getting dropped with PTFOperator

2013-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609318#comment-13609318
 ] 

Ashutosh Chauhan commented on HIVE-4180:


This is easier to explain via an example. Consider the following query:
{noformat}
select s_store_name,
   i_product_name,
   rnk
from (
select ss_store_sk,
   ss_item_sk,
   rank() over (partition by ss_store_sk order by item_total desc) as rnk
  from (
select ss_store_sk,
   ss_item_sk,
   sum(ss_sales_price) as item_total
  from store_sales
  join date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
  where date_dim.d_year = 2000
and date_dim.d_moy = 12
and date_dim.d_dom = 24
  group by ss_store_sk,
   ss_item_sk
 ) item_sales
) item_rank
join item on (item.i_item_sk = item_rank.ss_item_sk)
join store on (item_rank.ss_store_sk = store.s_store_sk)
where rnk = 3;
{noformat}

Running explain before and after the latest patch will prove this point. With the 
previous patch, the filter (rnk = 3) will not be moved and will be applied last 
after the outermost joins, whereas what we really want is for this filter to get 
pushed up to the PTFOperator and applied right after it. The latest patch gets us 
that plan.

 Filter getting dropped with PTFOperator
 ---

 Key: HIVE-4180
 URL: https://issues.apache.org/jira/browse/HIVE-4180
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan

 Another case where filter push down was not handled correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4215) Refactor ppd.OpProcFactory

2013-03-21 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-4215:
--

 Summary: Refactor ppd.OpProcFactory
 Key: HIVE-4215
 URL: https://issues.apache.org/jira/browse/HIVE-4215
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Ashutosh Chauhan
Priority: Minor


I was working on predicate pushdown recently and found the code a bit hard to 
follow. Refactored to get rid of all compiler warnings and to make it easier to 
read. No semantic changes in the code itself. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4215) Refactor ppd.OpProcFactory

2013-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4215:
---

Assignee: Ashutosh Chauhan

 Refactor ppd.OpProcFactory
 --

 Key: HIVE-4215
 URL: https://issues.apache.org/jira/browse/HIVE-4215
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor

 I was working on predicate pushdown recently and found the code a bit hard to 
 follow. Refactored to get rid of all compiler warnings and to make it easier to 
 read. No semantic changes in the code itself. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4215) Refactor ppd.OpProcFactory

2013-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609343#comment-13609343
 ] 

Ashutosh Chauhan commented on HIVE-4215:


https://reviews.facebook.net/D9639

 Refactor ppd.OpProcFactory
 --

 Key: HIVE-4215
 URL: https://issues.apache.org/jira/browse/HIVE-4215
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor

 I was working on predicate pushdown recently and found the code a bit hard to 
 follow. Refactored to get rid of all compiler warnings and to make it easier to 
 read. No semantic changes in the code itself. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4215) Refactor ppd.OpProcFactory

2013-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609367#comment-13609367
 ] 

Ashutosh Chauhan commented on HIVE-4215:


Saw some failures while running tests. Investigating.

 Refactor ppd.OpProcFactory
 --

 Key: HIVE-4215
 URL: https://issues.apache.org/jira/browse/HIVE-4215
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor

 I was working on predicate pushdown recently and found the code a bit hard to 
 follow. Refactored to get rid of all compiler warnings and to make it easier to 
 read. No semantic changes in the code itself. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4216) TestHBaseMinimrCliDriver throws wierd error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely

2013-03-21 Thread Viraj Bhat (JIRA)
Viraj Bhat created HIVE-4216:


 Summary: TestHBaseMinimrCliDriver throws wierd error with HBase 
0.94.5 and Hadoop 23 and test is stuck infinitely
 Key: HIVE-4216
 URL: https://issues.apache.org/jira/browse/HIVE-4216
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.9.0
Reporter: Viraj Bhat


After upgrading to Hadoop 23 and HBase 0.94.5 compiled for Hadoop 23, 
TestHBaseMinimrCliDriver fails after performing the following steps.

Update hbase_bulk.m with the following properties:
set mapreduce.totalorderpartitioner.naturalorder=false;
set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst;
Otherwise I keep seeing a "_partition.lst not found" exception in the mappers, 
even though set total.order.partitioner.path=/tmp/hbpartition.lst is set.
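
Putting the old and new property names side by side may make the rename clearer (a sketch; the values are copied from this description):

{code}
-- pre-Hadoop-0.23 property name, apparently no longer picked up here
set total.order.partitioner.path=/tmp/hbpartition.lst;
-- Hadoop 0.23 property names that hbase_bulk.m needs instead
set mapreduce.totalorderpartitioner.naturalorder=false;
set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst;
{code}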

When the test runs, the 3-reducer phase of the second query fails with the 
following error, but the MiniMRCluster keeps spinning up new reducers and the 
test is stuck infinitely.
{code}
insert overwrite table hbsort
 select distinct value,
  case when key=103 then cast(null as string) else key end,
  case when key=103 then ''
   else cast(key+1 as string) end
 from src
 cluster by value;
{code}

The stack trace I see in the syslog for the Node Manager is the following:

{quote}
13-03-20 16:26:48,942 FATAL [IPC Server handler 17 on 55996] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1363821864968_0003_r_02_0 - exited : java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row (tag=0) 
{key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:448)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row (tag=0) 
{key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at 
org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
... 7 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.mapreduce.TaskID$CharTaskTypeMaps.getRepresentingCharacter(TaskID.java:265)
at org.apache.hadoop.mapreduce.TaskID.appendTo(TaskID.java:153)
at 
org.apache.hadoop.mapreduce.TaskAttemptID.appendTo(TaskAttemptID.java:119)
at 
org.apache.hadoop.mapreduce.TaskAttemptID.toString(TaskAttemptID.java:151)
at java.lang.String.valueOf(String.java:2826)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getTaskAttemptPath(FileOutputCommitter.java:209)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.init(FileOutputCommitter.java:69)
at 
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.getRecordWriter(HFileOutputFormat.java:90)
at 
org.apache.hadoop.hive.hbase.HiveHFileOutputFormat.getFileWriter(HiveHFileOutputFormat.java:67)
at 
org.apache.hadoop.hive.hbase.HiveHFileOutputFormat.getHiveRecordWriter(HiveHFileOutputFormat.java:104)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:246)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:234)
... 14 more
{quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4216) TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely

2013-03-21 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated HIVE-4216:
-

Summary: TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and 
Hadoop 23 and test is stuck infinitely  (was: TestHBaseMinimrCliDriver throws 
wierd error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely)

 TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 
 and test is stuck infinitely
 

 Key: HIVE-4216
 URL: https://issues.apache.org/jira/browse/HIVE-4216
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.9.0
Reporter: Viraj Bhat

 After upgrading to Hadoop 23 and HBase 0.94.5 compiled for Hadoop 23, 
 TestHBaseMinimrCliDriver fails after performing the following steps.
 Update hbase_bulk.m with the following properties
 set mapreduce.totalorderpartitioner.naturalorder=false;
 set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst;
 Otherwise I keep seeing: _partition.lst not found exception in the mappers, 
 even though set total.order.partitioner.path=/tmp/hbpartition.lst is set.
 When the test runs, the 3-reducer phase of the second query fails with the 
 following error, but the MiniMRCluster keeps spinning up new reducers and the 
 test is stuck infinitely.
 {code}
 insert overwrite table hbsort
  select distinct value,
   case when key=103 then cast(null as string) else key end,
   case when key=103 then ''
else cast(key+1 as string) end
  from src
  cluster by value;
 {code}
 The stack trace I see in the syslog for the Node Manager is the following:
 {quote}
 13-03-20 16:26:48,942 FATAL [IPC Server handler 17 on 55996] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
 attempt_1363821864968_0003_r_02_0 - exited : java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row (tag=0) 
 {key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
 at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:448)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row (tag=0) 
 {key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
 at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
 ... 7 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
 at 
 org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
 at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
 ... 7 more
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.TaskID$CharTaskTypeMaps.getRepresentingCharacter(TaskID.java:265)
 at org.apache.hadoop.mapreduce.TaskID.appendTo(TaskID.java:153)
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.appendTo(TaskAttemptID.java:119)
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.toString(TaskAttemptID.java:151)
 at java.lang.String.valueOf(String.java:2826)
 at 
 org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getTaskAttemptPath(FileOutputCommitter.java:209)
 at 
 org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.init(FileOutputCommitter.java:69)
 at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.getRecordWriter(HFileOutputFormat.java:90)
 at 
 org.apache.hadoop.hive.hbase.HiveHFileOutputFormat.getFileWriter(HiveHFileOutputFormat.java:67)
 at 
 

[jira] [Updated] (HIVE-4216) TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely

2013-03-21 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated HIVE-4216:
-

Description: 
After upgrading to Hadoop 23 and HBase 0.94.5 compiled for Hadoop 23, 
TestHBaseMinimrCliDriver fails after performing the following steps.

Update hbase_bulk.m with the following properties
set mapreduce.totalorderpartitioner.naturalorder=false;
set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst;
Otherwise I keep seeing: _partition.lst not found exception in the mappers, 
even though set total.order.partitioner.path=/tmp/hbpartition.lst is set.

When the test runs, the 3-reducer phase of the second query fails with the 
following error, but the MiniMRCluster keeps spinning up new reducers and the 
test is stuck infinitely.
{code}
insert overwrite table hbsort
 select distinct value,
  case when key=103 then cast(null as string) else key end,
  case when key=103 then ''
   else cast(key+1 as string) end
 from src
 cluster by value;
{code}

The stack trace I see in the syslog for the Node Manager is the following:

==
13-03-20 16:26:48,942 FATAL [IPC Server handler 17 on 55996] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1363821864968_0003_r_02_0 - exited : java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row (tag=0) 
{key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:448)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row (tag=0) 
{key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at 
org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at 
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
... 7 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.mapreduce.TaskID$CharTaskTypeMaps.getRepresentingCharacter(TaskID.java:265)
at org.apache.hadoop.mapreduce.TaskID.appendTo(TaskID.java:153)
at 
org.apache.hadoop.mapreduce.TaskAttemptID.appendTo(TaskAttemptID.java:119)
at 
org.apache.hadoop.mapreduce.TaskAttemptID.toString(TaskAttemptID.java:151)
at java.lang.String.valueOf(String.java:2826)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getTaskAttemptPath(FileOutputCommitter.java:209)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.init(FileOutputCommitter.java:69)
at 
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.getRecordWriter(HFileOutputFormat.java:90)
at 
org.apache.hadoop.hive.hbase.HiveHFileOutputFormat.getFileWriter(HiveHFileOutputFormat.java:67)
at 
org.apache.hadoop.hive.hbase.HiveHFileOutputFormat.getHiveRecordWriter(HiveHFileOutputFormat.java:104)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:246)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:234)
... 14 more
==

  was:
After upgrading to Hadoop 23 and HBase 0.94.5 compiled for Hadoop 23. The 
TestHBaseMinimrCliDriver, fails after performing the following steps

Update hbase_bulk.m with the following properties
set mapreduce.totalorderpartitioner.naturalorder=false;
set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst;
Otherwise I keep seeing: 

[jira] [Created] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)
Carl Steinbach created HIVE-4217:


 Summary: Fix show_create_table_*.q test failures
 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609471#comment-13609471
 ] 

Carl Steinbach commented on HIVE-4217:
--

The TestCliDriver.show_create_table_*.q tests are currently failing on
trunk. These failures are caused by HIVE-4187 which I did not adequately
test before committing.

 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Review Request: HIVE-4217. Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10066/
---

Review request for hive.


Description
---

This is a fix for HIVE-4217. The patch modifies DDLTask to use StringTemplate4 
instead of relying on the old version, which has a dependency on antlr-2.7.7.


This addresses bug HIVE-4217.
https://issues.apache.org/jira/browse/HIVE-4217


Diffs
-

  ivy/libraries.properties 49b4b61 
  metastore/ivy.xml a13cff7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 6895759 

Diff: https://reviews.apache.org/r/10066/diff/


Testing
---


Thanks,

Carl Steinbach



[jira] [Updated] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4217:
-

Attachment: HIVE-4217.1.patch.txt

Review request: https://reviews.apache.org/r/10066/

 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4217.1.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4217:
-

Status: Patch Available  (was: Open)

 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4217.1.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609542#comment-13609542
 ] 

Kevin Wilfong commented on HIVE-4217:
-

Can you remove the entry in eclipse-templates/.classpath for the stringtemplate 
jar as well?

 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4217.1.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4213) List bucketing error too restrictive

2013-03-21 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609586#comment-13609586
 ] 

Gang Tim Liu commented on HIVE-4213:


[~mgrover]

I am a little confused. Please correct me if I am wrong. The current logic is not 
restrictive. 

For example, the following combination is legal: 
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
set hive.optimize.listbucketing=false;

 List bucketing error too restrictive
 

 Key: HIVE-4213
 URL: https://issues.apache.org/jira/browse/HIVE-4213
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mark Grover
 Fix For: 0.11.0


 With the introduction of List bucketing, we introduced a config validation 
 step where we say:
 {code}
   SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(
   10199,
   hive.mapred.supports.subdirectories must be true
   +  if any one of following is true: 
 hive.internal.ddl.list.bucketing.enable,
   +  hive.optimize.listbucketing and mapred.input.dir.recursive),
 {code}
 This seems overly restrictive because there are use cases where people may 
 want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't 
 care about list bucketing.
 Is that not true?
 For example, here is the unit test code for {{clientpositive/recursive_dir.q}}
 {code}
 CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING);
 CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING)
 LOCATION 'pfile:${system:test.tmp.dir}/fact_tz';
 INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1')
 SELECT key+11 FROM src WHERE key=484;
 ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE');
 ALTER TABLE fact_daily ADD PARTITION (ds='1')
 LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1';
 set hive.mapred.supports.subdirectories=true;
 set mapred.input.dir.recursive=true;
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 SELECT * FROM fact_daily WHERE ds='1';
 SELECT count(1) FROM fact_daily WHERE ds='1';
 {code}
 The unit test doesn't seem to be concerned about list bucketing but wants to 
 set {{mapred.input.dir.recursive}} to {{true}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: HIVE-4217. Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10066/
---

(Updated March 21, 2013, 10:26 p.m.)


Review request for hive.


Changes
---

Updated eclipse classpath template.


Description
---

This is a fix for HIVE-4217. The patch modifies DDLTask to use StringTemplate4 
instead of relying on the old version, which has a dependency on antlr-2.7.7.


This addresses bug HIVE-4217.
https://issues.apache.org/jira/browse/HIVE-4217


Diffs (updated)
-

  eclipse-templates/.classpath a94463e 
  ivy/libraries.properties 49b4b61 
  metastore/ivy.xml a13cff7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 6895759 

Diff: https://reviews.apache.org/r/10066/diff/


Testing
---


Thanks,

Carl Steinbach



[jira] [Updated] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4217:
-

Attachment: HIVE-4217.2.patch.txt

New patch with updated eclipse classpath template.

 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4217.1.patch.txt, HIVE-4217.2.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609598#comment-13609598
 ] 

Kevin Wilfong commented on HIVE-4217:
-

+1

 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4217.1.patch.txt, HIVE-4217.2.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4217:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I went ahead an committed this to trunk without waiting for the tests to 
finish. I figured this was acceptable since the build is already broken, and 
this patch aims to fix it.


 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.11.0

 Attachments: HIVE-4217.1.patch.txt, HIVE-4217.2.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609652#comment-13609652
 ] 

Carl Steinbach edited comment on HIVE-4217 at 3/21/13 11:01 PM:


I went ahead and committed this to trunk without waiting for the tests to 
finish. I figured this was acceptable since the build is already broken, and 
this patch aims to fix it.


  was (Author: cwsteinbach):
I went ahead an committed this to trunk without waiting for the tests to 
finish. I figured this was acceptable since the build is already broken, and 
this patch aims to fix it.

  
 Fix show_create_table_*.q test failures
 ---

 Key: HIVE-4217
 URL: https://issues.apache.org/jira/browse/HIVE-4217
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.11.0

 Attachments: HIVE-4217.1.patch.txt, HIVE-4217.2.patch.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Review request: HIVE-4119

2013-03-21 Thread Shreepadma Venugopalan
https://issues.apache.org/jira/browse/HIVE-4119

Thanks.
Shreepadma


[jira] [Commented] (HIVE-4214) OVER accepts general expression instead of just function

2013-03-21 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609681#comment-13609681
 ] 

Harish Butani commented on HIVE-4214:
-

The behavior today is similar to UDAF evaluation and group by:
- the UDAF is evaluated in the PTFOp
- the non-UDAF parts of an expression are evaluated in the child Select Operator.
- the over clause is associated with all UDAFs in the Select Clause.
- So you can express things like the following (see our test cases from 
windowing_expressions.q)
{noformat}
select p_mfgr, p_retailprice, p_size,
round(sum(p_retailprice),2) = round((sum(lag(p_retailprice,1)) - first_value(p_retailprice)) + last_value(p_retailprice),2) over(distribute by p_mfgr sort by p_retailprice),
max(p_retailprice) - min(p_retailprice) = last_value(p_retailprice) - first_value(p_retailprice)
  over(distribute by p_mfgr sort by p_retailprice)
from part;

select p_mfgr, p_retailprice, p_size,
rank() over (distribute by p_mfgr sort by p_retailprice) as r,
sum(p_retailprice) over (distribute by p_mfgr sort by p_retailprice rows between unbounded preceding and current row) as s2,
sum(p_retailprice) - 5 over (distribute by p_mfgr sort by p_retailprice rows between unbounded preceding and current row) as s1
from part;
{noformat}

- Can you post what output you are expecting for the first query you posted?
- We have to look at whether we can switch to the second form you posted. This 
would make the first example much more cumbersome to write.

 OVER accepts general expression instead of just function
 

 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates

 The query:
 select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
 runs (and produces meaningless output).
 Over should not allow the arithmetic expression.  Only a UDAF or PTF function 
 should be valid there.  The correct way to write this query is 
 select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4218) Store resource the user has explicitly added

2013-03-21 Thread Jinyi Yao (JIRA)
Jinyi Yao created HIVE-4218:
---

 Summary: Store resource the user has explicitly added
 Key: HIVE-4218
 URL: https://issues.apache.org/jira/browse/HIVE-4218
 Project: Hive
  Issue Type: Improvement
Reporter: Jinyi Yao
Priority: Minor


It would be useful to track which resources the user has explicitly added. This 
can be logged and later enable usage-pattern analysis to optimize system 
performance. It needs to differentiate built-in resources from those explicitly 
added by the user via add jar, add file, etc.
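
A rough sketch of the session commands whose resources would need to be tracked (the jar and file paths are hypothetical):

{code}
-- explicitly added by the user: these are the resources to track/log
ADD JAR /tmp/my_udfs.jar;
ADD FILE /tmp/lookup.txt;
-- shows what has been added to the current session
LIST JARS;
LIST FILES;
{code}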

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4144) Add select database() command to show the current database

2013-03-21 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4144:
--

Attachment: HIVE-4144.D9597.2.patch

navis updated the revision HIVE-4144 [jira] Add select database() command to 
show the current database.

  Extended to accept limit clause

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D9597

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D9597?vs=30219id=30321#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
  ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/OneNullRowInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/UDFCurrentDB.java
  ql/src/test/queries/clientpositive/select_dummy_source.q
  ql/src/test/queries/clientpositive/udf_current_database.q
  ql/src/test/results/clientpositive/select_dummy_source.q.out
  ql/src/test/results/clientpositive/udf_current_database.q.out

To: JIRA, navis
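
For illustration, the statements this revision is meant to support would look roughly like the following (the function name is taken from the affected test udf_current_database.q; the exact invocation is an assumption):

{code}
-- shows the database currently in use
select current_database();
-- per the change note above, a limit clause is now accepted as well
select current_database() limit 1;
{code}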


 Add select database() command to show the current database
 

 Key: HIVE-4144
 URL: https://issues.apache.org/jira/browse/HIVE-4144
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Mark Grover
Assignee: Navis
 Attachments: HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch


 A recent hive-user mailing list conversation asked about having a command to 
 show the current database.
 http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E
 MySQL seems to have a command to do so:
 {code}
 select database();
 {code}
 http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database
 We should look into having something similar in Hive.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4219) explain dependency does not capture the input table

2013-03-21 Thread Namit Jain (JIRA)
Namit Jain created HIVE-4219:


 Summary: explain dependency does not capture the input table
 Key: HIVE-4219
 URL: https://issues.apache.org/jira/browse/HIVE-4219
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain


hive> explain dependency select * from srcpart where ds is not null;
OK
{"input_partitions":[{"partitionName":"default@srcpart@ds=2008-04-08/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-08/hr=12"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=12"}],"input_tables":[]}

input_tables should contain srcpart

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3820) Consider creating a literal like D or BD for representing Decimal type constants

2013-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609941#comment-13609941
 ] 

Hudson commented on HIVE-3820:
--

Integrated in Hive-trunk-h0.21 #2026 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2026/])
HIVE-3820 Consider creating a literal like D or BD for representing Decimal 
type constants (Gunther Hagleitner via Ashutosh Chauhan) (Revision 1459298)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1459298
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
* /hive/trunk/ql/src/test/queries/clientpositive/literal_decimal.q
* /hive/trunk/ql/src/test/results/clientpositive/literal_decimal.q.out
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaBigDecimalObjectInspector.java


 Consider creating a literal like D or BD for representing Decimal type 
 constants
 

 Key: HIVE-3820
 URL: https://issues.apache.org/jira/browse/HIVE-3820
 Project: Hive
  Issue Type: Bug
Reporter: Mark Grover
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-3820.1.patch, HIVE-3820.2.patch, 
 HIVE-3820.D8823.1.patch


 When the HIVE-2693 gets committed, users are going to see this behavior:
 {code}
 hive> select cast(3.14 as decimal) from decimal_3 limit 1;
 3.140000000000000124344978758017532527446746826171875
 {code}
 That's intuitively incorrect, but it happens because 3.14 (a double) is being 
 converted to BigDecimal, which introduces a precision mismatch.
 We should consider creating a new literal for expressing constants of Decimal 
 type as Gunther suggested in HIVE-2693.
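 As a plain-Java illustration of the precision mismatch described above (not
 Hive code), constructing a BigDecimal from the double literal carries the
 binary rounding error, while constructing it from the decimal text, which is
 in effect what a dedicated D/BD-style literal would let the parser do, keeps
 the exact value:
 {code}
 import java.math.BigDecimal;

 public class DecimalLiteralDemo {
   public static void main(String[] args) {
     // Built from the double 3.14: the binary rounding error becomes visible.
     System.out.println(new BigDecimal(3.14));
     // -> 3.140000000000000124344978758017532527446746826171875

     // Built from the text "3.14": the exact decimal value is preserved.
     System.out.println(new BigDecimal("3.14"));
     // -> 3.14
   }
 }
 {code}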

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4219) explain dependency does not capture the input table

2013-03-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609944#comment-13609944
 ] 

Namit Jain commented on HIVE-4219:
--

This is a problem only for 'select * from T where condition on partition' 
queries, where a map-reduce job is not invoked.

 explain dependency does not capture the input table
 ---

 Key: HIVE-4219
 URL: https://issues.apache.org/jira/browse/HIVE-4219
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain

 hive> explain dependency select * from srcpart where ds is not null;
 OK
 {"input_partitions":[{"partitionName":"default@srcpart@ds=2008-04-08/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-08/hr=12"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=12"}],"input_tables":[]}
 input_tables should contain srcpart

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hive-trunk-h0.21 - Build # 2026 - Still Failing

2013-03-21 Thread Apache Jenkins Server
Changes for Build #2012
[hashutosh] HIVE-3862 : testHBaseNegativeCliDriver_cascade_dbdrop fails on 
hadoop-1 (Gunther Hagleitner via Ashutosh Chauhan)


Changes for Build #2013
[kevinwilfong] HIVE-4125. Expose metastore JMX metrics. (Samuel Yuan via 
kevinwilfong)

[hashutosh] HIVE-2935 : Implement HiveServer2 Core code changes  (4th patch of 
4) (Carl Steinbach and others via Ashutosh Chauhan)

[kevinwilfong] HIVE-4096. problem in hive.map.groupby.sorted with distincts. 
(njain via kevinwilfong)

[hashutosh] HIVE-2935 : Implement HiveServer2 Beeline .q.out files (3rd patch 
of 4) (Carl Steinbach and others via Ashutosh Chauhan)

[hashutosh] HIVE-2935 : Implement HiveServer2 Beeline code changes (2nd patch 
of 4) (Carl Steinbach and others via Ashutosh Chauhan)

[hashutosh] HIVE-2935 : Implement HiveServer2 (1st patch of 4) (Carl Steinbach 
and others via Ashutosh Chauhan)

[hashutosh] HIVE-3717 : Hive wont compile with -Dhadoop.mr.rev=20S (Gunther 
Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4148 : Cleanup aisle ivy (Gunther Hagleitner via Ashutosh 
Chauhan)


Changes for Build #2014

Changes for Build #2015

Changes for Build #2016
[ecapriolo] Hive-4141 InspectorFactories use static HashMaps which fail in 
concurrent modification (Brock Noland via egc)

Submitted by: Brock Noland  
Reviewed by: Edward Capriolo
Approved by: Edward Capriolo

[kevinwilfong] HIVE-4176. disable TestBeeLineDriver in ptest util. 
(kevinwilfong reviewed by njain, ashutoshc)

[hashutosh] HIVE-4169 : union_remove_*.q fail on hadoop 2 (Gunther Hagleitner 
via Ashutosh Chauhan)


Changes for Build #2017
[cws] HIVE-4145. Create hcatalog stub directory and add it to the build (Carl 
Steinbach via cws)

[kevinwilfong] HIVE-4162. disable TestBeeLineDriver. (Thejas M Nair via 
kevinwilfong)


Changes for Build #2018

Changes for Build #2019

Changes for Build #2020

Changes for Build #2021

Changes for Build #2022

Changes for Build #2023
[namit] HIVE-4154 NPE reading column of empty string from ORC file
(Kevin Wilfong via namit)

[hashutosh] HIVE-4186 : NPE in ReduceSinkDeDuplication (Harish Butani via 
Ashutosh Chauhan)


Changes for Build #2024
[namit] HIVE-4146 bug with hive.auto.convert.join.noconditionaltask with outer 
joins
(Namit via Gang Tim Liu)


Changes for Build #2025
[hashutosh] HIVE-4139 : MiniDFS shim does not work for hadoop 2 (Gunther 
Hagleitner via Ashutosh Chauhan)

[kevinwilfong] HIVE-4015. Add ORC file to the grammar as a file format. 
(Gunther Hagleitner via kevinwilfong)

[cws] HIVE-4187. QL build-grammar target fails after HIVE-4148 (Gunther 
Hagleitner via cws)


Changes for Build #2026
[hashutosh] HIVE-3820 Consider creating a literal like D or BD for representing 
Decimal type constants (Gunther Hagleitner via Ashutosh Chauhan)




2 tests failed.
REGRESSION:  org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union12

Error Message:
Timeout occurred. Please note the time in the report does not reflect the time 
until the timeout.

Stack Trace:
junit.framework.AssertionFailedError: Timeout occurred. Please note the time in 
the report does not reflect the time until the timeout.
at 
net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259)
at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268)
at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:299)
at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244)


REGRESSION:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1

Error Message:
Forked Java VM exited abnormally. Please note the time in the report does not 
reflect the time until the VM exit.

Stack Trace:
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please 
note the time in the report does not reflect the time until the VM exit.
at 
net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259)
at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268)
at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:299)
at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2026)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2026/ to 
view the results.

[jira] [Commented] (HIVE-4219) explain dependency does not capture the input table

2013-03-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609951#comment-13609951
 ] 

Namit Jain commented on HIVE-4219:
--

https://reviews.facebook.net/D9657

 explain dependency does not capture the input table
 ---

 Key: HIVE-4219
 URL: https://issues.apache.org/jira/browse/HIVE-4219
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4219.1.patch


 hive> explain dependency select * from srcpart where ds is not null;
 OK
 {"input_partitions":[{"partitionName":"default@srcpart@ds=2008-04-08/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-08/hr=12"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=12"}],"input_tables":[]}
 input_tables should contain srcpart

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4219) explain dependency does not capture the input table

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4219:
-

Attachment: hive.4219.1.patch

 explain dependency does not capture the input table
 ---

 Key: HIVE-4219
 URL: https://issues.apache.org/jira/browse/HIVE-4219
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4219.1.patch


 hive> explain dependency select * from srcpart where ds is not null;
 OK
 {"input_partitions":[{"partitionName":"default@srcpart@ds=2008-04-08/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-08/hr=12"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=11"},{"partitionName":"default@srcpart@ds=2008-04-09/hr=12"}],"input_tables":[]}
 input_tables should contain srcpart

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4220) TimestampWritable.toString throws array index exception sometimes

2013-03-21 Thread Navis (JIRA)
Navis created HIVE-4220:
---

 Summary: TimestampWritable.toString throws array index exception 
sometimes
 Key: HIVE-4220
 URL: https://issues.apache.org/jira/browse/HIVE-4220
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis


{noformat}
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
java.lang.ArrayIndexOutOfBoundsException: 45
at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:215)
at 
org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:170)
at 
org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:288)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:348)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:194)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1449)
at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:193)
... 11 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 45
at 
sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:436)
at 
java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2081)
at 
java.util.GregorianCalendar.computeFields(GregorianCalendar.java:1996)
at java.util.Calendar.setTimeInMillis(Calendar.java:1110)
at java.util.Calendar.setTime(Calendar.java:1076)
at java.text.SimpleDateFormat.format(SimpleDateFormat.java:875)
at java.text.SimpleDateFormat.format(SimpleDateFormat.java:868)
at java.text.DateFormat.format(DateFormat.java:316)
at 
org.apache.hadoop.hive.serde2.io.TimestampWritable.toString(TimestampWritable.java:327)
at 
org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.writeUTF8(LazyTimestamp.java:95)
at 
org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:234)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365)
at 
org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:96)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:474)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:468)
at 
org.apache.hadoop.hive.ql.exec.FetchTask.fetchAndPush(FetchTask.java:222)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:188)
... 13 more
{noformat}

The date formatter in TimestampWritable is declared static and shared, but it 
is not thread-safe.
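A minimal sketch of one possible fix (an assumption for illustration, not
necessarily the committed change): give each thread its own SimpleDateFormat
via a ThreadLocal instead of sharing a single static instance across
HiveServer2 handler threads.
{code}
import java.text.SimpleDateFormat;
import java.util.Date;

public final class ThreadSafeTimestampFormatter {
  // One SimpleDateFormat per thread; SimpleDateFormat itself is not thread-safe.
  private static final ThreadLocal<SimpleDateFormat> FORMAT =
      new ThreadLocal<SimpleDateFormat>() {
        @Override
        protected SimpleDateFormat initialValue() {
          return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        }
      };

  private ThreadSafeTimestampFormatter() {
  }

  public static String format(Date date) {
    // Each caller formats with its thread-local instance, so concurrent
    // fetches can no longer corrupt the formatter's internal state.
    return FORMAT.get().format(date);
  }
}
{code}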

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3348:
-

Status: Open  (was: Patch Available)

Have you run all the tests?

The output file is empty.

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive-3348.patch


 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-3348) semi-colon in comments in .q file does not work

2013-03-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-3348:


Assignee: Nick Collins  (was: Namit Jain)

 semi-colon in comments in .q file does not work
 ---

 Key: HIVE-3348
 URL: https://issues.apache.org/jira/browse/HIVE-3348
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Nick Collins
 Attachments: hive-3348.patch


 -- comment ;
 -- comment
 select count(1) from src;
 The above test file fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira