date:20100224

[jira] Updated: (HIVE-1184) Expression Not In Group By Key error is sometimes masked

2010-02-24 Thread Paul Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1184:


Attachment: HIVE-1184.2.patch

From an offline conversation - the solution still works for cases like 
"concat(value, concat(value))" because the error is only set with the first 
"value" - node processors terminate early when the global error is set. 

* Added better comments.
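
The early-termination behaviour described above can be pictured with a small sketch. This is not the code in HIVE-1184.2.patch: `WalkContext`, `ExprNode`, and `GroupByChecker` are hypothetical names, and the sketch assumes only what the comment states, namely that the error is recorded once and that processors stop as soon as a global error exists.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical shared state for one walk over an expression tree. */
class WalkContext {
    private String error; // null until the first error is recorded

    String getError() {
        return error;
    }

    /** Records only the first error; later calls leave it untouched. */
    void setError(String message) {
        if (error == null) {
            error = message;
        }
    }
}

/** Hypothetical expression node: a column reference or a function call. */
class ExprNode {
    final String text;
    final List<ExprNode> children;

    ExprNode(String text, ExprNode... children) {
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

class GroupByChecker {
    /** Checks that every leaf column is covered by some group-by expression. */
    static void check(ExprNode node, List<String> groupByExprs, WalkContext ctx) {
        if (ctx.getError() != null) {
            return; // terminate early: an error is already set
        }
        if (groupByExprs.contains(node.text)) {
            return; // the whole subtree matches a group-by key
        }
        if (node.children.isEmpty()) {
            ctx.setError("Expression Not In Group By Key " + node.text);
            return;
        }
        for (ExprNode child : node.children) {
            check(child, groupByExprs, ctx);
        }
    }
}
```

With this shape, both orderings from the issue description report the bare `value` operand, because the nested `concat(value)` either matches the group-by key or is never visited once the error is set.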

> Expression Not In Group By Key error is sometimes masked
> 
>
> Key: HIVE-1184
> URL: https://issues.apache.org/jira/browse/HIVE-1184
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Paul Yang
>Assignee: Paul Yang
> Attachments: HIVE-1184.1.patch, HIVE-1184.2.patch
>
>
> Depending on the order of expressions, the error message for an expression not 
> in the group-by key is not displayed; instead it is null.
> {code}
> hive> select concat(value, concat(value)) from src group by concat(value);
> FAILED: Error in semantic analysis: null
> hive> select concat(concat(value), value) from src group by concat(value);
> FAILED: Error in semantic analysis: line 1:29 Expression Not In Group By Key value
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-990) Incorporate CheckStyle into Hive's build.xml

2010-02-24 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838193#action_12838193
 ] 

Carl Steinbach commented on HIVE-990:
-

Quoting from http://g.oswego.edu/dl/html/javaCodingStd.html:

??Minimize direct internal access to instance variables inside methods. Use 
protected access and update methods instead (or sometimes public ones if they 
exist anyway).??

??Rationale: While inconvenient and sometimes overkill, this allows you to vary 
synchronization and notification policies associated with variable access and 
change in the class and/or its subclasses, which is otherwise a serious 
impediment to extensibility in concurrent OO programming.??

This advice is just as applicable in single-threaded situations. Declaring 
instance variables as protected allows subclasses and classes within the same 
package to become tightly-coupled to the specifics of your class's 
implementation. This violates the whole point of encapsulation.

For other problems associated with protected instance variables read this: 
http://java.sys-con.com/node/46344
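
A minimal sketch of the guideline quoted above, using hypothetical classes that are not part of Hive: the field stays private, subclasses go through protected accessors, and a subclass can therefore change the synchronization policy without ever touching the field.

```java
/** Hypothetical class: state is private, subclasses use the accessors. */
class Counter {
    private int count; // not visible to subclasses or package neighbours

    protected int getCount() {
        return count;
    }

    protected void setCount(int newCount) {
        count = newCount;
    }

    public void increment() {
        setCount(getCount() + 1);
    }

    public int current() {
        return getCount();
    }
}

/** A subclass varies the access policy by overriding the accessors only. */
class SynchronizedCounter extends Counter {
    @Override
    protected synchronized int getCount() {
        return super.getCount();
    }

    @Override
    protected synchronized void setCount(int newCount) {
        super.setCount(newCount);
    }

    @Override
    public synchronized void increment() {
        super.increment();
    }
}
```

Had `count` been declared protected, any subclass or same-package class could read and write it directly, and the synchronized variant could not guarantee that all access goes through its locks.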


> Incorporate CheckStyle into Hive's build.xml
> 
>
> Key: HIVE-990
> URL: https://issues.apache.org/jira/browse/HIVE-990
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.6.0
>
> Attachments: checkstyle-errors.html, HIVE-990.patch
>
>
> Hadoop and Pig both have CheckStyle integrated into their build. This is 
> useful for catching
> a variety of errors as well as for enforcing a specific coding style and 
> maintaining good code hygiene.
> We just need to snatch Hadoop's checkstyle.xml and integrate it into Hive's 
> build.xml file.

[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread He Yongqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838191#action_12838191
 ] 

He Yongqiang commented on HIVE-1194:


Thanks Zheng. Yes, we should do that.

> sorted merge join
> -
>
> Key: HIVE-1194
> URL: https://issues.apache.org/jira/browse/HIVE-1194
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: He Yongqiang
> Fix For: 0.6.0
>
>
> If the input tables are sorted on the join key, and a mapjoin is being 
> performed, it is useful to exploit the sorted properties of the table.
> This can lead to substantial cpu savings - this needs to work across bucketed 
> map joins also.
> Since sorted properties of a table are not currently enforced, a new 
> parameter can be added to request the sort-merge join.
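
To illustrate why sorted inputs save CPU, here is a minimal sketch of a merge join over two key arrays that are already sorted ascending. It is a generic textbook merge join, not the operator proposed in this issue; `SortMergeJoinSketch` and its `join` method are hypothetical names, and rows are reduced to bare integer keys.

```java
import java.util.ArrayList;
import java.util.List;

class SortMergeJoinSketch {
    /**
     * Joins two key arrays that are each sorted ascending.
     * Returns {leftIndex, rightIndex} pairs for every matching combination.
     */
    static List<int[]> join(int[] left, int[] right) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                i++; // left key is too small, advance left
            } else if (left[i] > right[j]) {
                j++; // right key is too small, advance right
            } else {
                int key = left[i];
                // Find the end of the run of equal keys on the right side.
                int runEnd = j;
                while (runEnd < right.length && right[runEnd] == key) {
                    runEnd++;
                }
                // Pair every left row carrying this key with the whole right run.
                while (i < left.length && left[i] == key) {
                    for (int k = j; k < runEnd; k++) {
                        out.add(new int[] {i, k});
                    }
                    i++;
                }
                j = runEnd;
            }
        }
        return out;
    }
}
```

Each input is scanned once and no hash table is built, which is where the savings over a hash-based map join would come from when the sort order can be trusted.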

[jira] Resolved: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Ning Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang resolved HIVE-1195.
--

Resolution: Fixed

Committed to 0.5.1 and trunk. Thanks Zheng!

> Increase ObjectInspector[] length on demand
> ---
>
> Key: HIVE-1195
> URL: https://issues.apache.org/jira/browse/HIVE-1195
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.1, 0.6.0
>
> Attachments: HIVE-1195-branch-0.5.patch, HIVE-1195.1.patch, 
> HIVE-1195.2.branch-0.5.patch, HIVE-1195.2.patch
>
>
> {code}
> Operator.java
>   protected transient ObjectInspector[] inputObjInspectors = new ObjectInspector[Short.MAX_VALUE];
> {code}
> An array of 32K elements takes 256KB memory under 64-bit Java.
> We are seeing hive client going out of memory because of that.
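
The arithmetic behind the figure above: `Short.MAX_VALUE` is 32767, so with 8-byte references (an assumption; compressed references would halve it) the array body is 32767 × 8 = 262,136 bytes, just under 256 KB for every Operator instance. A minimal sketch of growing the array on demand follows. It is not the committed patch: `GrowableSlots` is a hypothetical class, and plain `Object` stands in for `ObjectInspector` to keep the example self-contained.

```java
import java.util.Arrays;

class GrowableSlots {
    // Start small instead of pre-allocating Short.MAX_VALUE entries.
    private Object[] slots = new Object[1];

    /** Stores a value, enlarging the backing array only when the index requires it. */
    void set(int index, Object value) {
        if (index >= slots.length) {
            // Grow to fit the index, at least doubling to amortize repeated growth.
            int newLength = Math.max(index + 1, slots.length * 2);
            slots = Arrays.copyOf(slots, newLength);
        }
        slots[index] = value;
    }

    /** Returns the stored value, or null for an index that was never set. */
    Object get(int index) {
        return index < slots.length ? slots[index] : null;
    }

    int capacity() {
        return slots.length;
    }
}
```

An operator with a single parent then holds a one-element array, and only operators with many inputs pay for a larger one.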

[jira] Commented: (HIVE-990) Incorporate CheckStyle into Hive's build.xml

2010-02-24 Thread Paul Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838177#action_12838177
 ] 

Paul Yang commented on HIVE-990:


By default, the VisibilityModifier check catches protected variables 
(http://checkstyle.sf.net/config_design.html). Is the use of 'protected' 
discouraged? If so, what's the reason?

[jira] Commented: (HIVE-1137) build references IVY_HOME incorrectly

2010-02-24 Thread John Sichi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838175#action_12838175
 ] 

John Sichi commented on HIVE-1137:
--

+1


> build references IVY_HOME incorrectly
> -
>
> Key: HIVE-1137
> URL: https://issues.apache.org/jira/browse/HIVE-1137
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: Carl Steinbach
> Fix For: 0.6.0
>
> Attachments: HIVE-1137.patch
>
>
> The build references env.IVY_HOME, but doesn't actually import env as it 
> should (via ).
> It's not clear what the IVY_HOME reference is for since the build doesn't 
> even use ivy.home (instead, it installs under the build/ivy directory).
> It looks like someone copied bits and pieces from the "Automatically" section 
> here:
> http://ant.apache.org/ivy/history/latest-milestone/install.html

[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2010-02-24 Thread Jerome Boulon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838173#action_12838173
 ] 

Jerome Boulon commented on HIVE-259:


- From my point of view, changing variable access to private in the state 
object will not make the code more readable ...
- I'll change all variable names to lower case to match Java style; the current 
names are based on the Oracle definition.

@Zheng - I'm not using an ArrayList but a String to avoid unnecessary 
object creation (for every single row) ... it would be even better if the 
constructor could have been used, but I haven't found how to do that. If we care 
about 1 extra empty ArrayList per mapper/spill in memory then we should care 
about creating (1 ArrayList + 1 Integer object per percentile) per row.

@Zheng - Regarding the test case, that is what I had in mind when I asked you 
how to create my own table, and that is exactly the reason why I posted the jb2.* files.


> Add PERCENTILE aggregate function
> -
>
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>Assignee: Jerome Boulon
> Attachments: HIVE-259-2.patch, HIVE-259.1.patch, HIVE-259.patch, 
> jb2.txt, Percentile.xlsx
>
>
> Compute at least the 25th, 50th, and 75th percentiles

[jira] Commented: (HIVE-1032) Better Error Messages for Execution Errors

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838156#action_12838156
 ] 

Zheng Shao commented on HIVE-1032:
--

That makes sense to me. As long as it's compilable with 0.17 it should be OK.

Sorry, there is one last thing :) Can you run "ant checkstyle" and fix the 
checkstyle warnings introduced by this patch (especially in the new files)?

> Better Error Messages for Execution Errors
> --
>
> Key: HIVE-1032
> URL: https://issues.apache.org/jira/browse/HIVE-1032
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Paul Yang
>Assignee: Paul Yang
> Attachments: HIVE-1032.1.patch, HIVE-1032.2.patch, HIVE-1032.3.patch, 
> HIVE-1032.4.patch, HIVE-1032.5.patch
>
>
> Three common errors that occur during execution are:
> 1. Map-side group-by causing an out of memory exception due to large 
> aggregation hash tables
> 2. ScriptOperator failing due to the user's script throwing an exception or 
> otherwise returning a non-zero error code
> 3. Incorrectly specifying the join order of small and large tables, causing 
> the large table to be loaded into memory and producing an out of memory 
> exception.
> These errors are typically discovered by manually examining the error log 
> files of the failed task. This task proposes to create a feature that would 
> automatically read the error logs and output a probable cause and solution to 
> the command line.
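
A minimal sketch of the idea in the description: scan the text of a failed task's log and map it to a probable cause. This is not the approach taken in the attached patches. `ErrorLogHeuristics` and the script-failure pattern are hypothetical, and treating the operator class names as log signatures is an assumption made only for illustration.

```java
import java.util.regex.Pattern;

class ErrorLogHeuristics {
    // Hypothetical signature for a failed user script; not taken from real Hive logs.
    private static final Pattern SCRIPT_FAILURE =
        Pattern.compile("ScriptOperator.*(exception|error code)", Pattern.CASE_INSENSITIVE);

    /** Returns a probable cause for the given log text, or null if no rule matches. */
    static String diagnose(String log) {
        boolean outOfMemory = log.contains("java.lang.OutOfMemoryError");
        if (outOfMemory && log.contains("GroupByOperator")) {
            return "Map-side group-by ran out of memory; the aggregation hash table may be too large.";
        }
        if (outOfMemory && log.contains("MapJoinOperator")) {
            return "Map join ran out of memory; check the join order of the small and large tables.";
        }
        if (SCRIPT_FAILURE.matcher(log).find()) {
            return "The user script threw an exception or returned a non-zero error code.";
        }
        return null; // unknown failure: fall back to manual log inspection
    }
}
```

The rules are checked in a fixed order and the first match wins, so the three cases listed in the description stay easy to extend with further signatures.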

[jira] Created: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.

2010-02-24 Thread Arvind Prabhakar (JIRA)

When checkstyle is activated for Hive in Eclipse environment, it shows all 
checkstyle problems as errors.
-

 Key: HIVE-1198
 URL: https://issues.apache.org/jira/browse/HIVE-1198
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Build Infrastructure
 Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin 
5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010)
Reporter: Arvind Prabhakar
Priority: Minor


As of now, checkstyle plugin reports all problems as errors. This causes an 
overwhelming number of errors to show up (3000+) which masks real errors that 
might be there. Since all the checkstyle violations are not going to be fixed 
in one shot, it is desirable to lower the severity of checkstyle violations to 
warnings so that the plugin can be kept enabled. This will encourage developers 
to spot checkstyle violations in the files they touch and potentially fix them 
as they go, and it will point out new violations as they code.

[jira] Commented: (HIVE-1032) Better Error Messages for Execution Errors

2010-02-24 Thread Paul Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838149#action_12838149
 ] 

Paul Yang commented on HIVE-1032:
-

Because this patch uses features of HIVE-873, this will not work with hadoop 
0.17. If you want, I can send you the broken queries I used to test on 0.20.

[jira] Commented: (HIVE-1189) Add package-info.java to Hive

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838148#action_12838148
 ] 

Zheng Shao commented on HIVE-1189:
--

I am checking the BuildVersion which contains everything.
I need to think of a way to do a negative test.


> Add package-info.java to Hive
> -
>
> Key: HIVE-1189
> URL: https://issues.apache.org/jira/browse/HIVE-1189
> Project: Hadoop Hive
>  Issue Type: New Feature
>Affects Versions: 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.6.0
>
> Attachments: HIVE-1189.1.patch
>
>
> Hadoop automatically generates build/src/org/apache/hadoop/package-info.java 
> with information like this:
> {code}
> /*
>  * Generated by src/saveVersion.sh
>  */
> @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568",
>  user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", 
> url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20")
> package org.apache.hadoop;
> {code}
> Hive should do the same thing so that we can easily know the version of the 
> code at runtime.
> This will help us identify whether we are still running the same version of 
> Hive, if we serialize the plan and later continue the execution (See 
> HIVE-1100).
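
A minimal sketch of how such an annotation can be read back at runtime. It is not the code in HIVE-1189.1.patch: `BuildVersionAnnotation` and `BuildVersionHolder` are hypothetical names, the version values are placeholders, and the annotation is attached to a class rather than a package so the example fits in one file. For a real `package-info.java`, the equivalent lookup would go through `Package.getAnnotation`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical annotation; RUNTIME retention makes it visible to reflection. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.PACKAGE, ElementType.TYPE})
@interface BuildVersionAnnotation {
    String version();

    String revision();
}

/** Placeholder values; a build script would generate the real ones. */
@BuildVersionAnnotation(version = "x.y.z-placeholder", revision = "000000")
class BuildVersionHolder {
    /** Returns a printable build description, or "unknown" if the annotation is missing. */
    static String describe() {
        BuildVersionAnnotation info =
            BuildVersionHolder.class.getAnnotation(BuildVersionAnnotation.class);
        if (info == null) {
            return "unknown";
        }
        return info.version() + " (r" + info.revision() + ")";
    }
}
```

Comparing such a string saved with a serialized plan against the running code's value is one way the version check mentioned above could be done.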

[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838135#action_12838135
 ] 

Zheng Shao commented on HIVE-259:
-

1. We are converting "25,50,99" to ArrayList. Why don't we directly 
accept an int array (or a double array to allow 99.9)?

In the query, the user can say:

SELECT percentile(mycol, array(25, 50, 99)) FROM mytable;

2. Get rid of State.initDone.  We can set "ArrayList percentiles" to 
null first. That saves some space in memory as well as network when we transfer 
the state from mapper to reducer.

3. In Java, variable names should be lowercased.

4. We should change the test case to be non-trivial.
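
To make points 1 and 4 concrete, here is a minimal sketch that takes the requested percentiles as a double array and returns one value per entry. It uses the nearest-rank definition, which is an assumption and not necessarily what the patch or Oracle computes; `PercentileSketch` is a hypothetical name.

```java
import java.util.Arrays;

class PercentileSketch {
    /**
     * Returns the nearest-rank percentile of the values for each entry in percents.
     * Each requested percentile must lie in the range [0, 100].
     */
    static double[] percentiles(double[] values, double[] percents) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] result = new double[percents.length];
        for (int i = 0; i < percents.length; i++) {
            double p = percents[i];
            if (p < 0 || p > 100) {
                throw new IllegalArgumentException("percentile out of range: " + p);
            }
            // Nearest rank is 1-based: the smallest rank covering p percent of the rows.
            int rank = (int) Math.ceil(p / 100.0 * sorted.length);
            result[i] = sorted[Math.max(rank, 1) - 1];
        }
        return result;
    }
}
```

A non-trivial test input, such as the values 1 through 10 with percentiles 25, 50, and 99, yields three different results, which is the property a trivial test case cannot check.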


[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838132#action_12838132
 ] 

Zheng Shao commented on HIVE-1194:
--

If it does not inherit any methods, shall we add an AbstractMapJoinOperator as 
the common parent?
That AbstractMapJoinOperator can be converted to MapJoinOperator (or 
HashBasedMapJoinOperator, to be accurate) or SortMergeJoinOperator depending on 
the configuration/table properties.
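
The suggested hierarchy can be sketched as follows. These are hypothetical stand-ins with shortened names, not Hive's operator classes, and the boolean flag stands in for whatever configuration or table property would drive the choice.

```java
/** Hypothetical common parent holding what both join flavours share. */
abstract class AbstractMapJoinOp {
    /** Short description of how the join locates matching rows. */
    abstract String strategy();

    /** Shared behaviour, e.g. code reused by optimization and task generation. */
    String describe() {
        return "map-side join using " + strategy();
    }
}

/** Hash-based variant: builds an in-memory or disk-backed hash table. */
class HashMapJoinOp extends AbstractMapJoinOp {
    @Override
    String strategy() {
        return "hash lookup";
    }
}

/** Sort-merge variant: relies on sorted inputs and needs no hash table. */
class SortMergeJoinOp extends AbstractMapJoinOp {
    @Override
    String strategy() {
        return "sorted merge";
    }
}

class JoinOpFactory {
    /** Picks the concrete operator from a (hypothetical) configuration flag. */
    static AbstractMapJoinOp create(boolean useSortMerge) {
        return useSortMerge ? new SortMergeJoinOp() : new HashMapJoinOp();
    }
}
```

With a common abstract parent, the sort-merge variant inherits the shared code without also inheriting hash-table state it does not need.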


[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838130#action_12838130
 ] 

Namit Jain commented on HIVE-1194:
--

A new optimization step will be created which will convert the mapjoin to a 
sortmergejoin.

[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread He Yongqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838122#action_12838122
 ] 

He Yongqiang commented on HIVE-1194:


Yes. It does not need that storage. 
The main reason for letting it extend mapjoinop is that we can then reuse 
the mapjoinop code for optimization and task generation.

[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838121#action_12838121
 ] 

Namit Jain commented on HIVE-1194:
--

Yes, but it happens on the mapper. It is a special type of mapjoin.
It will end up overriding all the methods of map-join, but keeping it this 
way keeps the hierarchy correct.

[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838120#action_12838120
 ] 

Zheng Shao commented on HIVE-1194:
--

Why does SortMergeJoinOperator extend MapJoinOperator?
It seems to me that SortMergeJoinOperator does NOT need the 
in-memory/disk-backed HashMap that MapJoinOperator has, correct?


[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838118#action_12838118
 ] 

Zheng Shao commented on HIVE-259:
-

Also see http://wiki.apache.org/hadoop/Hive/HowToContribute#Coding_Convention

[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2010-02-24 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838119#action_12838119
 ] 

Zheng Shao commented on HIVE-259:
-

The test cases look a bit too trivial, or do the results have problems? They 
always return the same number for the 3 different percentile values.


[jira] Created: (HIVE-1197) create a new input format where a mapper spans a file

2010-02-24 Thread Namit Jain (JIRA)

create a new input format where a mapper spans a file
-

 Key: HIVE-1197
 URL: https://issues.apache.org/jira/browse/HIVE-1197
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.6.0


This will be needed for Sort merge joins.



[jira] Commented: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Ning Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838114#action_12838114
 ] 

Ning Zhang commented on HIVE-1195:
--

Cool. I'll take the new patches to test. 

[jira] Updated: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Ning Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1195:
-

Status: Open  (was: Patch Available)

[jira] Commented: (HIVE-1194) sorted merge join

2010-02-24 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838113#action_12838113
 ] 

Namit Jain commented on HIVE-1194:
--

Based on an offline discussion with Yongqiang, we were thinking of the following:

There will be a new mapping in MapredWork ->
Operator -> MapredLocalWork

This will be populated for SortMergeJoinOperator only.

SortMergeJoinOperator is a new operator which extends MapJoinOperator, and has 
the same name as a MapJoinOperator.

MapJoinProcessor needs to create a SortMergeJoinOperator instead of a 
MapJoinOperator when it sees the new configuration parameter.

MapJoinFactory methods need to change to create Operator->MapredLocalWork 
instead of MapredLocalWork in MapredWork.

[jira] Commented: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Ning Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838112#action_12838112
 ] 

Ning Zhang commented on HIVE-1195:
--

Zheng, join26.q, join_map_ppr.q, union16.q, and union9.q failed on trunk. Can 
you take a look?

[jira] Updated: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-1195:
-

Attachment: HIVE-1195.2.patch
HIVE-1195.2.branch-0.5.patch

Fixed an obvious bug which caused unit test failures.

[jira] Created: (HIVE-1196) Railroad Diagrams for Hive Language Manual

2010-02-24 Thread Carl Steinbach (JIRA)

Railroad Diagrams for Hive Language Manual
--

 Key: HIVE-1196
 URL: https://issues.apache.org/jira/browse/HIVE-1196
 Project: Hadoop Hive
  Issue Type: Task
  Components: Documentation
Reporter: Carl Steinbach
Priority: Minor


Add railroad diagrams (syntax diagrams) to the Hive Language Manual.

* The [ANTLRWorks IDE|http://www.antlr.org/works/index.html] generates railroad 
diagrams and allows you to export them as EPS.
* [Clapham|http://sourceforge.net/projects/clapham/] is another tool for 
generating railroad diagrams based on BNF style inputs.


[jira] Commented: (HIVE-535) Memory-efficient hash-based Aggregation

2010-02-24 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838095#action_12838095
 ] 

Carl Steinbach commented on HIVE-535:
-

The folks working on Mahout seem to think the CERN license is compatible with 
Apache. They have already imported cern.colt*, cern.jet* and cern.clhep into 
their source tree. See MAHOUT-222.

Check out the update to their LICENSE.txt file: 
http://svn.apache.org/repos/asf/lucene/mahout/trunk/LICENSE.txt

> Memory-efficient hash-based Aggregation
> ---
>
> Key: HIVE-535
> URL: https://issues.apache.org/jira/browse/HIVE-535
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Zheng Shao
>
> Currently there is a lot of memory overhead in the hash-based aggregation in 
> GroupByOperator.
> The net result is that GroupByOperator cannot store many entries in its 
> HashTable, flushes frequently, and cannot achieve a very good partial 
> aggregation result.
> Here are some initial thoughts (some of them are from Joydeep a long time ago):
> A1. Serialize the key of the HashTable. This will eliminate the 16-byte 
> per-object overhead of Java in keys (depending on how many objects there are 
> in the key, the saving can be substantial).
> A2. Use more memory-efficient hash tables - java.util.HashMap has about 64 
> bytes of overhead per entry.
> A3. Use primitive array to store aggregation results. Basically, the UDAF 
> should manage the array of aggregation results, so UDAFCount should manage a 
> long[], UDAFAvg should manage a double[] and a long[]. The external code 
> should pass an index to iterate/merge/terminate an aggregation result. This 
> will eliminate the 16-byte per-object overhead of Java.
> More ideas are welcome.
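
To make idea A3 concrete, here is a minimal sketch of an average aggregator that keeps the state of every group in primitive arrays and is addressed by a slot index. The class name ArrayBackedAvg and its method names are hypothetical and are not Hive's UDAF interface; the sketch only illustrates the layout of a double[] plus a long[] described above.

```java
import java.util.Arrays;

/** Hypothetical sketch: per-group AVG state held in primitive arrays, not in objects. */
final class ArrayBackedAvg {
    private double[] sums = new double[16];
    private long[] counts = new long[16];
    private int used = 0;

    /** Reserves a slot for a new group and returns its index. */
    int newSlot() {
        if (used == sums.length) {
            // Double both arrays together so a slot index stays valid in each.
            sums = Arrays.copyOf(sums, used * 2);
            counts = Arrays.copyOf(counts, used * 2);
        }
        return used++;
    }

    /** Folds one input value into the group stored at the given slot. */
    void iterate(int slot, double value) {
        sums[slot] += value;
        counts[slot]++;
    }

    /** Folds a partial (sum, count) pair into the group stored at the given slot. */
    void merge(int slot, double partialSum, long partialCount) {
        sums[slot] += partialSum;
        counts[slot] += partialCount;
    }

    /** Returns the average for the slot, or NaN if the group has seen no values. */
    double terminate(int slot) {
        return counts[slot] == 0 ? Double.NaN : sums[slot] / counts[slot];
    }

    public static void main(String[] args) {
        ArrayBackedAvg avg = new ArrayBackedAvg();
        int slot = avg.newSlot();
        avg.iterate(slot, 1.0);
        avg.iterate(slot, 3.0);
        avg.merge(slot, 6.0, 2);
        System.out.println(avg.terminate(slot)); // prints 2.5
    }
}
```

In this layout the hash table only has to map a key to an int slot, so no per-group aggregation object is allocated.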


[jira] Updated: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Ning Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1195:
-

Attachment: HIVE-1195-branch-0.5.patch

Uploading a patch for branch 0.5. Zheng, can you double check?

> Increase ObjectInspector[] length on demand
> ---
>
> Key: HIVE-1195
> URL: https://issues.apache.org/jira/browse/HIVE-1195
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.1, 0.6.0
>
> Attachments: HIVE-1195-branch-0.5.patch, HIVE-1195.1.patch
>
>
> {code}
> Operator.java
>   protected transient ObjectInspector[] inputObjInspectors = new 
> ObjectInspector[Short.MAX_VALUE];
> {code}
> An array of 32K elements takes 256KB memory under 64-bit Java.
> We are seeing hive client going out of memory because of that.


[jira] Commented: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Ning Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838080#action_12838080
 ] 

Ning Zhang commented on HIVE-1195:
--

+1 Will commit after tests. 

> Increase ObjectInspector[] length on demand
> ---
>
> Key: HIVE-1195
> URL: https://issues.apache.org/jira/browse/HIVE-1195
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.1, 0.6.0
>
> Attachments: HIVE-1195.1.patch
>
>
> {code}
> Operator.java
>   protected transient ObjectInspector[] inputObjInspectors = new 
> ObjectInspector[Short.MAX_VALUE];
> {code}
> An array of 32K elements takes 256KB memory under 64-bit Java.
> We are seeing hive client going out of memory because of that.


[jira] Updated: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-1195:
-

Fix Version/s: 0.6.0
   0.5.1
   Status: Patch Available  (was: Open)

> Increase ObjectInspector[] length on demand
> ---
>
> Key: HIVE-1195
> URL: https://issues.apache.org/jira/browse/HIVE-1195
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.1, 0.6.0
>
> Attachments: HIVE-1195.1.patch
>
>
> {code}
> Operator.java
>   protected transient ObjectInspector[] inputObjInspectors = new 
> ObjectInspector[Short.MAX_VALUE];
> {code}
> An array of 32K elements takes 256KB memory under 64-bit Java.
> We are seeing hive client going out of memory because of that.


[jira] Updated: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-1195:
-

Attachment: HIVE-1195.1.patch

> Increase ObjectInspector[] length on demand
> ---
>
> Key: HIVE-1195
> URL: https://issues.apache.org/jira/browse/HIVE-1195
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: HIVE-1195.1.patch
>
>
> {code}
> Operator.java
>   protected transient ObjectInspector[] inputObjInspectors = new 
> ObjectInspector[Short.MAX_VALUE];
> {code}
> An array of 32K elements takes 256KB memory under 64-bit Java.
> We are seeing hive client going out of memory because of that.


[jira] Created: (HIVE-1195) Increase ObjectInspector[] length on demand

2010-02-24 Thread Zheng Shao (JIRA)

Increase ObjectInspector[] length on demand
---

 Key: HIVE-1195
 URL: https://issues.apache.org/jira/browse/HIVE-1195
 Project: Hadoop Hive
  Issue Type: Improvement
Affects Versions: 0.5.0, 0.6.0
Reporter: Zheng Shao
Assignee: Zheng Shao


{code}
Operator.java
  protected transient ObjectInspector[] inputObjInspectors = new 
ObjectInspector[Short.MAX_VALUE];
{code}

An array of 32K elements takes 256KB memory under 64-bit Java.
We are seeing hive client going out of memory because of that.
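
For illustration, here is a minimal sketch of what growing the array on demand could look like. It is not the attached patch: the class GrowableSlots and its methods are hypothetical, and a generic type stands in for ObjectInspector so the example compiles on its own. The 256KB figure follows from Short.MAX_VALUE (32767) references at 8 bytes each on a 64-bit JVM without compressed references.

```java
import java.util.Arrays;

/** Hypothetical sketch: a slot array that grows only when a larger index is used. */
final class GrowableSlots<T> {
    // Start with one slot instead of preallocating Short.MAX_VALUE entries.
    private Object[] slots = new Object[1];

    /** Stores value at index, enlarging the array first if the index does not fit. */
    void set(int index, T value) {
        if (index >= slots.length) {
            // Grow to at least index + 1; doubling amortizes repeated growth.
            int newLength = Math.max(index + 1, slots.length * 2);
            slots = Arrays.copyOf(slots, newLength);
        }
        slots[index] = value;
    }

    /** Returns the value at index, or null if nothing was ever stored there. */
    @SuppressWarnings("unchecked")
    T get(int index) {
        return index < slots.length ? (T) slots[index] : null;
    }

    /** Current length of the backing array. */
    int capacity() {
        return slots.length;
    }

    public static void main(String[] args) {
        GrowableSlots<String> inspectors = new GrowableSlots<>();
        inspectors.set(0, "parent-0");
        inspectors.set(5, "parent-5");
        System.out.println(inspectors.capacity()); // prints 6
    }
}
```

With this approach an operator that has a single parent holds a one-element array, and only operators with many parents pay for a larger one.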



[jira] Created: (HIVE-1194) sorted merge join

2010-02-24 Thread Namit Jain (JIRA)

sorted merge join
-

 Key: HIVE-1194
 URL: https://issues.apache.org/jira/browse/HIVE-1194
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang
 Fix For: 0.6.0


If the input tables are sorted on the join key and a mapjoin is being 
performed, it is useful to exploit the sorted property of the tables.
This can lead to substantial CPU savings. It needs to work across bucketed 
map joins as well.

Since the sorted properties of a table are not currently enforced, a new parameter 
can be added to request the sort-merge join explicitly.
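
As a rough illustration of why sorted inputs save CPU, here is a minimal sketch of the textbook merge-join step over two key arrays that are already sorted in ascending order. It is not Hive's operator code; the class SortMergeJoinSketch and its join method are hypothetical, and rows are reduced to int keys to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: textbook merge join over two ascending-sorted key arrays. */
final class SortMergeJoinSketch {

    /** Returns {leftIndex, rightIndex} pairs for every pair of rows with equal keys. */
    static List<int[]> join(int[] left, int[] right) {
        List<int[]> matches = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                i++; // left key is too small, advance left
            } else if (left[i] > right[j]) {
                j++; // right key is too small, advance right
            } else {
                int key = left[i];
                // Find the end of the run of equal keys on the right side.
                int runEnd = j;
                while (runEnd < right.length && right[runEnd] == key) {
                    runEnd++;
                }
                // Pair every left row in the run with every right row in the run.
                while (i < left.length && left[i] == key) {
                    for (int k = j; k < runEnd; k++) {
                        matches.add(new int[] {i, k});
                    }
                    i++;
                }
                j = runEnd;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int[] left = {1, 2, 2, 4};
        int[] right = {2, 2, 3, 4};
        System.out.println(join(left, right).size()); // prints 5
    }
}
```

Each input is scanned once from front to back and no hash table is built, which is where the savings come from. The result is only correct if both inputs really are sorted, hence the need for an explicit parameter while sortedness is not enforced.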



[jira] Created: (HIVE-1193) ensure sorting properties for a table

2010-02-24 Thread Namit Jain (JIRA)

ensure sorting properties for a table
-

 Key: HIVE-1193
 URL: https://issues.apache.org/jira/browse/HIVE-1193
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
 Fix For: 0.6.0


If a table is sorted and data is being inserted into it, we currently do not 
make sure that the data is sorted. Doing so might be useful for some downstream 
operations. This cannot be made the default because of backward compatibility, 
but an option can be added to enable it.


Build failed in Hudson: Hive-trunk-h0.20 #198

2010-02-24 Thread Apache Hudson Server

See 

--
[...truncated 13323 lines...]
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHO

[ANNOUNCE] Hive 0.5.0 released

2010-02-24 Thread Zheng Shao

Hi folks,

We have released Hive 0.5.0.
It should be available from the download page within 24 hours (it is still
waiting to be mirrored):

http://hadoop.apache.org/hive/releases.html#Download

-- 
Yours,
Zheng
