[jira] [Updated] (HIVE-3810) HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile

2012-12-16 Thread qiangwang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qiangwang updated HIVE-3810:


Summary: HiveHistory.log need to replace '\r' with space before writing 
Entry.value to historyfile  (was: HiveHistory need replace '\r' with space 
before writing Entry.value to historyfile)

 HiveHistory.log need to replace '\r' with space before writing Entry.value to 
 historyfile
 -

 Key: HIVE-3810
 URL: https://issues.apache.org/jira/browse/HIVE-3810
 Project: Hive
  Issue Type: Bug
  Components: Logging
Reporter: qiangwang
Priority: Minor

 HiveHistory.log replaces '\n' with a space before writing Entry.value to the 
 history file:
 val = val.replace('\n', ' ');
 but HiveHistoryViewer uses BufferedReader.readLine, which treats '\n', '\r', 
 and '\r\n' as line delimiters, to parse the history file.
 If val contains '\r', HiveHistoryViewer.parseLine is likely to fail, usually 
 because RecordTypes.valueOf(recType) throws 
 java.lang.IllegalArgumentException.
 HiveHistory.log needs to replace '\r' with a space as well:
 - val = val.replace('\n', ' ');
 + val = val.replaceAll("\r|\n", " ");
 or
 - val = val.replace('\n', ' ');
 + val = val.replace('\r', ' ').replace('\n', ' ');

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3810) HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile

2012-12-16 Thread qiangwang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qiangwang updated HIVE-3810:


Description: 
HiveHistory.log replaces '\n' with a space before writing Entry.value to the 
history file:

val = val.replace('\n', ' ');

but HiveHistory.parseHiveHistory uses BufferedReader.readLine, which treats 
'\n', '\r', and '\r\n' as line delimiters, to parse the history file.

If val contains '\r', HiveHistory.parseLine is likely to fail, usually because 
RecordTypes.valueOf(recType) throws java.lang.IllegalArgumentException.

HiveHistory.log needs to replace '\r' with a space as well:

- val = val.replace('\n', ' ');
+ val = val.replaceAll("\r|\n", " ");
or
- val = val.replace('\n', ' ');
+ val = val.replace('\r', ' ').replace('\n', ' ');


  was:
HiveHistory.log replaces '\n' with a space before writing Entry.value to the 
history file:

val = val.replace('\n', ' ');

but HiveHistoryViewer uses BufferedReader.readLine, which treats '\n', '\r', 
and '\r\n' as line delimiters, to parse the history file.

If val contains '\r', HiveHistoryViewer.parseLine is likely to fail, usually 
because RecordTypes.valueOf(recType) throws 
java.lang.IllegalArgumentException.

HiveHistory.log needs to replace '\r' with a space as well:

- val = val.replace('\n', ' ');
+ val = val.replaceAll("\r|\n", " ");
or
- val = val.replace('\n', ' ');
+ val = val.replace('\r', ' ').replace('\n', ' ');



 HiveHistory.log need to replace '\r' with space before writing Entry.value to 
 historyfile
 -

 Key: HIVE-3810
 URL: https://issues.apache.org/jira/browse/HIVE-3810
 Project: Hive
  Issue Type: Bug
  Components: Logging
Reporter: qiangwang
Priority: Minor

 HiveHistory.log replaces '\n' with a space before writing Entry.value to the 
 history file:
 val = val.replace('\n', ' ');
 but HiveHistory.parseHiveHistory uses BufferedReader.readLine, which treats 
 '\n', '\r', and '\r\n' as line delimiters, to parse the history file.
 If val contains '\r', HiveHistory.parseLine is likely to fail, usually because 
 RecordTypes.valueOf(recType) throws java.lang.IllegalArgumentException.
 HiveHistory.log needs to replace '\r' with a space as well:
 - val = val.replace('\n', ' ');
 + val = val.replaceAll("\r|\n", " ");
 or
 - val = val.replace('\n', ' ');
 + val = val.replace('\r', ' ').replace('\n', ' ');
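
The failure mode described in this issue can be reproduced outside Hive with a small sketch. The class name, the helper methods, and the record layout used in main are assumptions made for illustration; only the two replace calls and the use of BufferedReader.readLine come from the report.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it is not part of Hive.
final class CarriageReturnDemo {

    // Mirrors the behaviour quoted in the report: only '\n' is replaced.
    static String sanitizeNewlineOnly(String val) {
        return val.replace('\n', ' ');
    }

    // Mirrors the second proposed fix: replace both '\r' and '\n'.
    static String sanitizeBoth(String val) {
        return val.replace('\r', ' ').replace('\n', ' ');
    }

    // Reads text back one record per BufferedReader.readLine() call,
    // which treats '\n', '\r' and "\r\n" as line terminators.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String value = "select *\rfrom src\nlimit 1";
        // Assumed record layout, for illustration only.
        String broken = "QueryStart QUERY_STRING=\"" + sanitizeNewlineOnly(value) + "\"\n";
        String fixed = "QueryStart QUERY_STRING=\"" + sanitizeBoth(value) + "\"\n";
        System.out.println(readLines(broken).size()); // 2: the record is split at '\r'
        System.out.println(readLines(fixed).size());  // 1: the record stays on one line
    }
}
```

With only '\n' replaced, the second line handed to the parser starts in the middle of the value, so its first token is not a valid record type.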



[jira] [Commented] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2012-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533425#comment-13533425
 ] 

Namit Jain commented on HIVE-3778:
--

Code changes look good.

 Add MapJoinDesc.isBucketMapJoin() as part of explain plan
 -

 Key: HIVE-3778
 URL: https://issues.apache.org/jira/browse/HIVE-3778
 Project: Hive
  Issue Type: Bug
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
Priority: Minor
 Attachments: HIVE-3778.patch.3


 This is follow up of HIVE-3767:
 Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Updated] (HIVE-3806) Ptest failing due to Argument list too long errors

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3806:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed. Thanks Bhushan

 Ptest failing due to Argument list too long errors
 

 Key: HIVE-3806
 URL: https://issues.apache.org/jira/browse/HIVE-3806
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
 Attachments: HIVE-3806.1.patch.txt


 ptest creates a really huge shell command to delete from each test host those 
 .q files that it should not be running. For TestCliDriver, the command has 
 become long enough that it exceeds the length allowed by the shell. We 
 should rewrite it so that the same semantics are captured in a shorter command.
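
One generic way to stay under a command-length limit is to split the file list into several shorter commands. The sketch below shows only that batching idea; it is not the change attached to this issue, and the class name, the "rm -f" prefix, and the length limit are all assumptions. File names are assumed to contain no spaces or shell metacharacters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not taken from ptest.
final class CommandBatcher {

    private static final String PREFIX = "rm -f";

    // Builds one or more "rm -f file1 file2 ..." commands, starting a new
    // command whenever appending the next file would exceed maxLength.
    static List<String> buildCommands(List<String> files, int maxLength) {
        List<String> commands = new ArrayList<String>();
        StringBuilder current = new StringBuilder(PREFIX);
        for (String file : files) {
            boolean hasFiles = current.length() > PREFIX.length();
            if (hasFiles && current.length() + 1 + file.length() > maxLength) {
                commands.add(current.toString());
                current = new StringBuilder(PREFIX);
            }
            current.append(' ').append(file);
        }
        if (current.length() > PREFIX.length()) {
            commands.add(current.toString());
        }
        return commands;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.q", "b.q", "c.q");
        // With a limit of 15 characters the three files need two commands.
        for (String command : buildCommands(files, 15)) {
            System.out.println(command);
        }
    }
}
```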



[jira] [Commented] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533433#comment-13533433
 ] 

Namit Jain commented on HIVE-3796:
--

+1

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.



[jira] [Assigned] (HIVE-446) Implement TRUNCATE

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-446:
---

Assignee: Navis  (was: Andrew Chalfant)

 Implement TRUNCATE
 --

 Key: HIVE-446
 URL: https://issues.apache.org/jira/browse/HIVE-446
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Prasad Chakka
Assignee: Navis
 Attachments: HIVE-446.D7371.1.patch


 truncate the data but leave the table and metadata intact.



[jira] [Updated] (HIVE-446) Implement TRUNCATE

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-446:


Status: Open  (was: Patch Available)

comments on phabricator

 Implement TRUNCATE
 --

 Key: HIVE-446
 URL: https://issues.apache.org/jira/browse/HIVE-446
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Prasad Chakka
Assignee: Navis
 Attachments: HIVE-446.D7371.1.patch


 truncate the data but leave the table and metadata intact.



[jira] [Commented] (HIVE-3752) Add a non-sql API in hive to access data.

2012-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533444#comment-13533444
 ] 

Namit Jain commented on HIVE-3752:
--

Nitay, along with the patch, can you create a document on the Apache Hive 
cwiki with the proposed API?
If you don't have wiki permissions, please create an account and send me your 
id.
I will give you the required permissions.

 Add a non-sql API in hive to access data.
 -

 Key: HIVE-3752
 URL: https://issues.apache.org/jira/browse/HIVE-3752
 Project: Hive
  Issue Type: Improvement
Reporter: Nitay Joffe
Assignee: Nitay Joffe

 We would like to add an input/output format for accessing Hive data in Hadoop 
 directly without having to use e.g. a transform. Using a transform
 means having to do a whole map-reduce step with its own disk accesses and its 
 imposed structure. It also means needing to have Hive be the base 
 infrastructure for the entire system being developed which is not the right 
 fit as we only need a small part of it (access to the data).
 So we propose adding an API level InputFormat and OutputFormat to Hive that 
 will make it trivially easy to select a table with partition spec and read 
 from / write to it. We chose this design to make it compatible with Hadoop so 
 that existing systems that work with Hadoop's IO API will just work out of 
 the box.
 We need this system for the Giraph graph processing system 
 (http://giraph.apache.org/) as running graph jobs which read/write from Hive 
 is a common use case.
 [~namitjain] [~aching] [~kevinwilfong] [~apresta]
 Input-side (HiveApiInputFormat) review: https://reviews.facebook.net/D7401



Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #231

2012-12-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/231/



Hive-trunk-h0.21 - Build # 1858 - Still Failing

2012-12-16 Thread Apache Jenkins Server
Changes for Build #1854

Changes for Build #1855

Changes for Build #1856
[kevinwilfong] HIVE-3766. Enable adding hooks to hive meta store init. (Jean Xu 
via kevinwilfong)


Changes for Build #1857

Changes for Build #1858



6 tests failed.
REGRESSION:  org.apache.hadoop.hive.ql.exec.TestStatsPublisherEnhanced.testStatsPublisherOneStat

Error Message:
null

Stack Trace:
junit.framework.AssertionFailedError: null
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at junit.framework.Assert.assertTrue(Assert.java:27)
at org.apache.hadoop.hive.ql.exec.TestStatsPublisherEnhanced.testStatsPublisherOneStat(TestStatsPublisherEnhanced.java:81)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:79)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


REGRESSION:  org.apache.hadoop.hive.ql.exec.TestStatsPublisherEnhanced.testStatsPublisher

Error Message:
null

Stack Trace:
junit.framework.AssertionFailedError: null
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at junit.framework.Assert.assertTrue(Assert.java:27)
at org.apache.hadoop.hive.ql.exec.TestStatsPublisherEnhanced.testStatsPublisher(TestStatsPublisherEnhanced.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:79)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


REGRESSION:  org.apache.hadoop.hive.ql.exec.TestStatsPublisherEnhanced.testStatsPublisherMultipleUpdates

Error Message:
null

Stack Trace:
junit.framework.AssertionFailedError: null
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at junit.framework.Assert.assertTrue(Assert.java:27)
at org.apache.hadoop.hive.ql.exec.TestStatsPublisherEnhanced.testStatsPublisherMultipleUpdates(TestStatsPublisherEnhanced.java:190)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at 

[jira] [Commented] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2012-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533630#comment-13533630
 ] 

Namit Jain commented on HIVE-3778:
--

[~gangtimliu], the tests passed fine.
I wanted to wait for HIVE-3784 before getting this in, since HIVE-3784 is a 
bigger patch and will conflict with this one (log files).
If I get any comments on HIVE-3784 and need to refresh that patch anyway, I 
will commit HIVE-3778.

 Add MapJoinDesc.isBucketMapJoin() as part of explain plan
 -

 Key: HIVE-3778
 URL: https://issues.apache.org/jira/browse/HIVE-3778
 Project: Hive
  Issue Type: Bug
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
Priority: Minor
 Attachments: HIVE-3778.patch.3


 This is follow up of HIVE-3767:
 Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3784) de-emphasize mapjoin hint

2012-12-16 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533631#comment-13533631
 ] 

Mark Grover commented on HIVE-3784:
---

Agreed!

 de-emphasize mapjoin hint
 -

 Key: HIVE-3784
 URL: https://issues.apache.org/jira/browse/HIVE-3784
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3784.1.patch, hive.3784.2.patch, hive.3784.3.patch, 
 hive.3784.4.patch, hive.3784.5.patch


 hive.auto.convert.join has been around for a long time, and is pretty stable.
 When mapjoin hint was created, the above parameter did not exist.
 The only reason for the user to specify a mapjoin currently is if they want
 it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
 Eventually, that should also go away, but that may take some time to 
 stabilize.
 There are many rules in SemanticAnalyzer to handle the following trees:
 ReduceSink -> MapJoin
 Union      -> MapJoin
 MapJoin    -> MapJoin
 This should not be supported anymore. In any of the above scenarios, the
 user can get the mapjoin behavior by setting hive.auto.convert.join to true
 and not specifying the hint. This will simplify the code a lot.
 What does everyone think?



[jira] [Updated] (HIVE-3492) Provide ALTER for partition changing bucket number

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3492:
-

   Resolution: Fixed
Fix Version/s: 0.11
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Navis

 Provide ALTER for partition changing bucket number 
 ---

 Key: HIVE-3492
 URL: https://issues.apache.org/jira/browse/HIVE-3492
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11

 Attachments: HIVE-3492.1.patch.txt, HIVE-3492.2.patch.txt, 
 HIVE-3492.D5589.2.patch, HIVE-3492.D5589.3.patch


 As a follow-up to HIVE-3283, the bucket number of a partition can be 
 set/changed individually by a query like 'ALTER table srcpart 
 PARTITION(ds='1999') SET BUCKETNUM 5'.



[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3562:
-

Status: Open  (was: Patch Available)

comments on phabricator

 Some limit can be pushed down to map stage
 --

 Key: HIVE-3562
 URL: https://issues.apache.org/jira/browse/HIVE-3562
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch


 Queries with a limit clause (with a reasonable number), for example
 {noformat}
 select * from src order by key limit 10;
 {noformat}
 produce the operator tree
 TS-SEL-RS-EXT-LIMIT-FS
 But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
 TS-SEL-RS(TOP-N)-EXT-LIMIT-FS
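
The idea of computing LIMIT partially on the map side can be sketched with a bounded heap: each mapper forwards only its own top N keys, and the final LIMIT is still applied after the shuffle. This is a generic illustration, not the code in the attached patches; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; this is not Hive's ReduceSink implementation.
final class TopNSketch {

    // Keeps only the `limit` smallest keys seen so far. A max-heap is used
    // so the largest retained key can be evicted when a smaller one arrives.
    static List<String> topN(Iterable<String> keys, int limit) {
        if (limit <= 0) {
            return new ArrayList<String>();
        }
        PriorityQueue<String> heap =
                new PriorityQueue<String>(limit, Collections.<String>reverseOrder());
        for (String key : keys) {
            if (heap.size() < limit) {
                heap.add(key);
            } else if (key.compareTo(heap.peek()) < 0) {
                heap.poll();
                heap.add(key);
            }
        }
        List<String> result = new ArrayList<String>(heap);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        int limit = 2;
        List<String> mapper1 = Arrays.asList("k5", "k1", "k9");
        List<String> mapper2 = Arrays.asList("k3", "k7", "k2");
        // Each mapper ships at most `limit` keys instead of all of them.
        List<String> shuffled = new ArrayList<String>(topN(mapper1, limit));
        shuffled.addAll(topN(mapper2, limit));
        // The final LIMIT is still applied on the reduce side.
        System.out.println(topN(shuffled, limit)); // [k1, k2]
    }
}
```

The result matches what a global sort followed by LIMIT would return, while at most limit keys per mapper go through the shuffle.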



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-16 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533638#comment-13533638
 ] 

Phabricator commented on HIVE-3562:
---

njain has commented on the revision HIVE-3562 [jira] Some limit can be pushed 
down to map stage.

  The general direction looks OK

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java:79 
TODO
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java:45 
spelling: operator
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java:79 
Followup: add a new method in Operator.
  ql/src/test/queries/clientpositive/limit_pushdown.q:26 Looks like this 
optimization should also help if the limit is in a sub-query:
  Can you add a test ?

  something like:

  select ..  from
  (select key, count(1) from src group by key order by key limit 2) subq
  join
  (select key, count(1) from src group by key order by key limit 2) subq2 ..


  The optimization should be applied to both the sub-queries

REVISION DETAIL
  https://reviews.facebook.net/D5967

BRANCH
  DPAL-1910

To: JIRA, tarball, navis
Cc: njain


 Some limit can be pushed down to map stage
 --

 Key: HIVE-3562
 URL: https://issues.apache.org/jira/browse/HIVE-3562
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch


 Queries with a limit clause (with a reasonable number), for example
 {noformat}
 select * from src order by key limit 10;
 {noformat}
 produce the operator tree
 TS-SEL-RS-EXT-LIMIT-FS
 But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
 TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Updated] (HIVE-3467) BucketMapJoinOptimizer should optimize joins on partition columns

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3467:
-

Fix Version/s: 0.11
Affects Version/s: (was: 0.10.0)
   Status: Open  (was: Patch Available)

comments on phabricator

 BucketMapJoinOptimizer should optimize joins on partition columns
 -

 Key: HIVE-3467
 URL: https://issues.apache.org/jira/browse/HIVE-3467
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Kevin Wilfong
Assignee: Zhenxiao Luo
 Fix For: 0.11

 Attachments: HIVE-3467.1.patch.txt, HIVE-3467.2.patch.txt, 
 HIVE-3467.3.patch.txt


 Consider the query:
 SELECT * FROM t1 JOIN t2 on t1.part = t2.part and t1.key = t2.key;
 Where t1 and t2 are partitioned by part and bucketed by key.
 Suppose part takes values 1 and 2 and t1 and t2 are bucketed into 2 buckets.
 The bucket map join optimizer will put the first bucket of part=1 and part=2 
 partitions of t2 into the same mapper as that of part=1 partition of t1.  It 
 will do the same for the part=2 partition of t1.
 It could take advantage of the partition values and send the first bucket of 
 only the part=1 partitions of t1 and t2 into one mapper and the first bucket 
 of only the part=2 partitions into another.
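
The difference between the current and the proposed assignment can be written out as a small sketch. The class, the method names, and the file naming scheme are hypothetical; the sketch only enumerates which t2 bucket files would accompany one bucket of a t1 partition under each scheme.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; this is not BucketMapJoinOptimizer code.
final class BucketMapJoinSketch {

    // Assumed file naming, for illustration only.
    static String file(String table, int part, int bucket) {
        return table + "/part=" + part + "/bucket_" + bucket;
    }

    // Behaviour described in the report: the mapper for one bucket of a t1
    // partition gets the same bucket from every t2 partition.
    static List<String> currentInputs(int[] t2Parts, int bucket) {
        List<String> files = new ArrayList<String>();
        for (int part : t2Parts) {
            files.add(file("t2", part, bucket));
        }
        return files;
    }

    // Proposed behaviour: since the join also matches on part, only the
    // t2 partition with the same part value is needed.
    static List<String> prunedInputs(int[] t2Parts, int t1Part, int bucket) {
        List<String> files = new ArrayList<String>();
        for (int part : t2Parts) {
            if (part == t1Part) {
                files.add(file("t2", part, bucket));
            }
        }
        return files;
    }

    public static void main(String[] args) {
        int[] parts = {1, 2};
        // Mapper for the first bucket of t1's part=1 partition.
        System.out.println(currentInputs(parts, 0));   // both t2 partitions
        System.out.println(prunedInputs(parts, 1, 0)); // only t2 part=1
    }
}
```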



[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-16 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533646#comment-13533646
 ] 

Mark Grover commented on HIVE-2693:
---

I was playing around with this and noticed that group by on Decimal column 
doesn't work because BinarySortableSerDe doesn't support DECIMAL type. It's 
questionable if it would ever support it since BigDecimal is arbitrarily long 
and BinarySortableSerDe dictates data to be serialized so that the value can be 
compared byte by byte with the same order. That would have been ok but it seems 
like {{getReduceKeyTableDesc}} in {{PlanUtils.java}} is hardcoded to use 
BinarySortableSerDe. I will have to poke around some more to see how Group By 
can be made to work with Decimal type (backed by BigDecimal).

In any case, the last patch didn't apply cleanly on trunk, so I fixed some 
merge conflicts and am attaching a new patch (#11) which is a refresh of patch 
10 that applies cleanly on trunk as of today.
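
The byte-by-byte ordering requirement mentioned above can be illustrated with plain java.math.BigDecimal. The sketch uses a deliberately naive serialization (the bytes of the unscaled value, ignoring the scale) that is an assumption for illustration; it is not how BinarySortableSerDe encodes any type. It only shows that such a serialization does not sort in numeric order.

```java
import java.math.BigDecimal;

// Hypothetical illustration; not related to BinarySortableSerDe's actual encoding.
final class DecimalByteOrderSketch {

    // Naive serialization assumed for illustration: the two's-complement
    // bytes of the unscaled value, with the scale dropped.
    static byte[] naiveBytes(BigDecimal d) {
        return d.unscaledValue().toByteArray();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        BigDecimal small = new BigDecimal("1.5"); // unscaled value 15, scale 1
        BigDecimal large = new BigDecimal("2");   // unscaled value 2, scale 0
        System.out.println(small.compareTo(large) < 0);                             // true
        System.out.println(compareBytes(naiveBytes(small), naiveBytes(large)) < 0); // false
    }
}
```

An encoding usable for sort keys would have to account for sign, scale, and variable length so that byte order and numeric order agree.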

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-1.patch.txt, 
 HIVE-2693-all.patch, HIVE-2693-fix.patch, HIVE-2693.patch, 
 HIVE-2693-take3.patch, HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.



[jira] [Updated] (HIVE-2693) Add DECIMAL data type

2012-12-16 Thread Mark Grover (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Grover updated HIVE-2693:
--

Attachment: HIVE-2693-11.patch

Refresh of patch 10.

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-1.patch.txt, 
 HIVE-2693-all.patch, HIVE-2693-fix.patch, HIVE-2693.patch, 
 HIVE-2693-take3.patch, HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.



[jira] [Commented] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2012-12-16 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533649#comment-13533649
 ] 

Gang Tim Liu commented on HIVE-3778:


Yes sir


 Add MapJoinDesc.isBucketMapJoin() as part of explain plan
 -

 Key: HIVE-3778
 URL: https://issues.apache.org/jira/browse/HIVE-3778
 Project: Hive
  Issue Type: Bug
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
Priority: Minor
 Attachments: HIVE-3778.patch.3


 This is follow up of HIVE-3767:
 Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Updated] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3803:
-

Attachment: hive.3803.2.patch

 explain dependency should show the dependencies hierarchically in presence of 
 views
 ---

 Key: HIVE-3803
 URL: https://issues.apache.org/jira/browse/HIVE-3803
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3803.1.patch, hive.3803.2.patch






[jira] [Updated] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2012-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3803:
-

Description: It should also include tables whose partitions are being 
accessed

 explain dependency should show the dependencies hierarchically in presence of 
 views
 ---

 Key: HIVE-3803
 URL: https://issues.apache.org/jira/browse/HIVE-3803
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3803.1.patch, hive.3803.2.patch


 It should also include tables whose partitions are being accessed



[jira] [Created] (HIVE-3811) explain dependency should work with views

2012-12-16 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3811:


 Summary: explain dependency should work with views
 Key: HIVE-3811
 URL: https://issues.apache.org/jira/browse/HIVE-3811
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain


View partitions should also show up
