[jira] Created: (HIVE-953) script_broken_pipe3.q broken

2009-11-24 Thread Namit Jain (JIRA)
script_broken_pipe3.q broken


 Key: HIVE-953
 URL: https://issues.apache.org/jira/browse/HIVE-953
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0
Reporter: Namit Jain
Assignee: Paul Yang


The negative test script_broken_pipe3.q is broken if we allow partial 
consumption.
For now, I have disabled partial consumption. Can you take a look ?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HIVE-947) Add run length encoding into RCFile's block header

2009-11-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-947.
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed. Thanks Yongqiang

> Add run length encoding into RCFile's block header 
> ---
>
> Key: HIVE-947
> URL: https://issues.apache.org/jira/browse/HIVE-947
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
>Priority: Minor
> Attachments: hive-947-2009-11-22.patch
>
>
> When RCFile constructing rows, it needs to get column value's length via 
> calling readVLong(). And this should be avoided for fix length or most fix 
> length columns. 
> This also should not influence old rcfile files, which means it should also 
> work correctly on files with previous RCFile format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-947) Add run length encoding into RCFile's block header

2009-11-24 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-947:
--

Attachment: (was: hive-947-2009-11-24.patch)

> Add run length encoding into RCFile's block header 
> ---
>
> Key: HIVE-947
> URL: https://issues.apache.org/jira/browse/HIVE-947
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
>Priority: Minor
> Attachments: hive-947-2009-11-22.patch
>
>
> When RCFile constructing rows, it needs to get column value's length via 
> calling readVLong(). And this should be avoided for fix length or most fix 
> length columns. 
> This also should not influence old rcfile files, which means it should also 
> work correctly on files with previous RCFile format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-947) Add run length encoding into RCFile's block header

2009-11-24 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-947:
--

Attachment: hive-947-2009-11-24.patch

> Add run length encoding into RCFile's block header 
> ---
>
> Key: HIVE-947
> URL: https://issues.apache.org/jira/browse/HIVE-947
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
>Priority: Minor
> Attachments: hive-947-2009-11-22.patch, hive-947-2009-11-24.patch
>
>
> When RCFile constructing rows, it needs to get column value's length via 
> calling readVLong(). And this should be avoided for fix length or most fix 
> length columns. 
> This also should not influence old rcfile files, which means it should also 
> work correctly on files with previous RCFile format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-951) Selectively include EXTERNAL TABLE source files via REGEX

2009-11-24 Thread Avram Aelony (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782213#action_12782213
 ] 

Avram Aelony commented on HIVE-951:
---

I want to echo Carl's point about copying to a new location being **extremely** 
inconvenient.  

Sometimes this inconvenience is impossible and a clear show stopper.  

In my mind, Namit's point about interchangeability can be resolved by clear 
documentation, whereas it is not always possible to copy data around from 
bucket to bucket especially if the data is quite large.  In my case, being 
forced to stage large volumes of existing S3 data to other S3 data buckets 
(copying the same data) for the purpose of Hive analysis has really slowed Hive 
adoption.  

Engineers output data to S3 with their own map/reduce efficiencies in mind 
without regard for Hive's preferred data organization so analysts are looking 
forward to having this feature so we can use Hive again.

Can't wait for this feature!  

> Selectively include EXTERNAL TABLE source files via REGEX
> -
>
> Key: HIVE-951
> URL: https://issues.apache.org/jira/browse/HIVE-951
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Carl Steinbach
>
> CREATE EXTERNAL TABLE should allow users to cherry-pick files via regular 
> expression. 
> CREATE EXTERNAL TABLE was designed to allow users to access data that exists 
> outside of Hive, and
> currently makes the assumption that all of the files located under the 
> supplied path should be included
> in the new table. Users frequently encounter directories containing multiple
> datasets, or directories that contain data in heterogeneous schemas, and it's 
> often
> impractical or impossible to adjust the layout of the directory to meet the 
> requirements of 
> CREATE EXTERNAL TABLE. A good example of this problem is creating an external 
> table based
> on the contents of an S3 bucket. 
> One way to solve this problem is to extend the syntax of CREATE EXTERNAL TABLE
> as follows:
> CREATE EXTERNAL TABLE
> ...
> LOCATION path [file_regex]
> ...
> For example:
> {code:sql}
> CREATE EXTERNAL TABLE mytable1 ( a string, b string, c string )
> STORED AS TEXTFILE
> LOCATION 's3://my.bucket/' 'folder/2009.*\.bz2$';
> {code}
> Creates mytable1 which includes all files in s3:/my.bucket with a filename 
> matching 'folder/2009*.bz2'
> {code:sql}
> CREATE EXTERNAL TABLE mytable2 ( d string, e int, f int, g int )
> STORED AS TEXTFILE 
> LOCATION 'hdfs://data/' 'xyz.*2009.bz2$';
> {code}
> Creates mytable2 including all files matching 'xyz*2009.bz2' located 
> under hdfs://data/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2009-11-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782120#action_12782120
 ] 

Todd Lipcon commented on HIVE-259:
--

An easy way to do this that would work for a ton of data sets would to be 
essentially do counting sort. If you have only a few thousand distinct values 
in the column to be analyzed, just make a hashtable, count up how many you see, 
and then in the single reducer use the histogram to figure out the percentile. 
This should work great for datasets like age, and even for sets like "number of 
days since user signed up". For sets that are truly continuous, would be useful 
when combined with a binning UDF to discretize it.

Sadly it's not general case, but would be an easy first step.

> Add PERCENTILE aggregate function
> -
>
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>
> Compute atleast 25, 50, 75th percentiles

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-928) ScriptOperator does not set CLASSPATH of spawned process.

2009-11-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782084#action_12782084
 ] 

Carl Steinbach commented on HIVE-928:
-

Yes, I'll post an updated patch by the end of the day. Sorry for the wait.

> ScriptOperator does not set CLASSPATH of spawned process.
> -
>
> Key: HIVE-928
> URL: https://issues.apache.org/jira/browse/HIVE-928
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-928, HIVE-928.patch
>
>
> ScriptOperator does not set the CLASSPATH of the the spawned script process. 
> The practical implication of this is that Java JARs that are added using the 
> "add JAR" command 
> will not be accessible to TRANSFORM/MAP/REDUCE operators unless the user can 
> guess the
> location of the JAR archive on each node.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-928) ScriptOperator does not set CLASSPATH of spawned process.

2009-11-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782082#action_12782082
 ] 

Namit Jain commented on HIVE-928:
-

@Carl, are you working on modifying the patch so that only the JARs that were 
added by the user appear on the classpath of the child process

> ScriptOperator does not set CLASSPATH of spawned process.
> -
>
> Key: HIVE-928
> URL: https://issues.apache.org/jira/browse/HIVE-928
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-928, HIVE-928.patch
>
>
> ScriptOperator does not set the CLASSPATH of the the spawned script process. 
> The practical implication of this is that Java JARs that are added using the 
> "add JAR" command 
> will not be accessible to TRANSFORM/MAP/REDUCE operators unless the user can 
> guess the
> location of the JAR archive on each node.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Hive-trunk-h0.20 #109

2009-11-24 Thread Apache Hudson Server
See 

Changes:

[heyongqiang] custom mappers/reducers should not be initialized at compile time

--
[...truncated 10878 lines...]
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}

[jira] Commented: (HIVE-951) Selectively include EXTERNAL TABLE source files via REGEX

2009-11-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782019#action_12782019
 ] 

Namit Jain commented on HIVE-951:
-

I don't think implementation will present a major problem, I am just concerned 
about the 
user demand on this. Many times, the users use hive and hadoop interchangeably 
to access
the data, and this might be difficult for them.

> Selectively include EXTERNAL TABLE source files via REGEX
> -
>
> Key: HIVE-951
> URL: https://issues.apache.org/jira/browse/HIVE-951
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Carl Steinbach
>
> CREATE EXTERNAL TABLE should allow users to cherry-pick files via regular 
> expression. 
> CREATE EXTERNAL TABLE was designed to allow users to access data that exists 
> outside of Hive, and
> currently makes the assumption that all of the files located under the 
> supplied path should be included
> in the new table. Users frequently encounter directories containing multiple
> datasets, or directories that contain data in heterogeneous schemas, and it's 
> often
> impractical or impossible to adjust the layout of the directory to meet the 
> requirements of 
> CREATE EXTERNAL TABLE. A good example of this problem is creating an external 
> table based
> on the contents of an S3 bucket. 
> One way to solve this problem is to extend the syntax of CREATE EXTERNAL TABLE
> as follows:
> CREATE EXTERNAL TABLE
> ...
> LOCATION path [file_regex]
> ...
> For example:
> {code:sql}
> CREATE EXTERNAL TABLE mytable1 ( a string, b string, c string )
> STORED AS TEXTFILE
> LOCATION 's3://my.bucket/' 'folder/2009.*\.bz2$';
> {code}
> Creates mytable1 which includes all files in s3:/my.bucket with a filename 
> matching 'folder/2009*.bz2'
> {code:sql}
> CREATE EXTERNAL TABLE mytable2 ( d string, e int, f int, g int )
> STORED AS TEXTFILE 
> LOCATION 'hdfs://data/' 'xyz.*2009.bz2$';
> {code}
> Creates mytable2 including all files matching 'xyz*2009.bz2' located 
> under hdfs://data/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Hive-trunk-h0.19 #285

2009-11-24 Thread Apache Hudson Server
See 

Changes:

[heyongqiang] custom mappers/reducers should not be initialized at compile time

--
[...truncated 10849 lines...]
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}

Build failed in Hudson: Hive-trunk-h0.18 #285

2009-11-24 Thread Apache Hudson Server
See 

Changes:

[heyongqiang] custom mappers/reducers should not be initialized at compile time

--
[...truncated 10866 lines...]
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}

Build failed in Hudson: Hive-trunk-h0.17 #282

2009-11-24 Thread Apache Hudson Server
See 

--
[...truncated 8752 lines...]
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit

[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2009-11-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781824#action_12781824
 ] 

Carl Steinbach commented on HIVE-259:
-

This would be a very useful function to have.

For the sake of completeness (and without much additional effort) it would be 
nice to provide both PERCENTILE_DISC and PERCENTILE_CONT.

PERCENTILE_CONT: 
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions110.htm
PERCENTILE_DISC:  
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions111.htm


> Add PERCENTILE aggregate function
> -
>
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>
> Compute atleast 25, 50, 75th percentiles

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-952) Support analytic NTILE function

2009-11-24 Thread Carl Steinbach (JIRA)
Support analytic NTILE function
---

 Key: HIVE-952
 URL: https://issues.apache.org/jira/browse/HIVE-952
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Carl Steinbach


The NTILE function divides a set of ordered rows into equally sized buckets and 
assigns a bucket number to each row.
Useful for calculating tertiles, quartiles, quintiles, etc.

Example:

{code:sql}
SELECT last_name, salary,
NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees
WHERE department_id = 100;
{code}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.