date:20100106

[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

2010-01-06 Thread Patrick Angeles (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797520#action_12797520
 ] 

Patrick Angeles commented on HIVE-1027:
---


1) In general, XPath queries return a list of nodes. What does xpath_double 
(e.g.) return if the XPath evaluates to multiple nodes?

Only xpath() returns multiple nodes (list).

xpath_string() returns the text of the first matching node (and its subnodes, 
if any).
- xpath_string('aab1b2','a') returns 'aab1b2'
- xpath_string('aab1b2','b') returns 'b1'

xpath_double() and xpath_float() return the numeric value of the text of the 
first matching node, or NaN if the text value is not numeric.
xpath_int(), xpath_long() and xpath_short() return the numeric value of the 
text of the first matching node, or 0 if the text value is not numeric, or 
MAX_INT, MAX_LONG or MAX_SHORT respectively if the value overflows.
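
The conversion rules above can be sketched as follows. This is a minimal Python illustration of the described behaviour, not the Java code in the patch: the helper names to_double and to_integral are hypothetical, "numeric" is approximated by a plain decimal literal, and the handling of negative overflow and rounding is an assumption because the comment does not cover either.

```python
import math
import re

# Java-style upper bounds, named after the limits in the comment above.
MAX_SHORT = 2**15 - 1
MAX_INT = 2**31 - 1
MAX_LONG = 2**63 - 1

# Plain decimal literal; anything else counts as "not numeric" in this sketch.
_NUMERIC = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")


def to_double(text):
    """Numeric value of text, or NaN if the text is not numeric."""
    if _NUMERIC.fullmatch(text):
        return float(text)
    return math.nan


def to_integral(text, max_value):
    """Integer value of text: 0 if not numeric, max_value on overflow."""
    value = to_double(text)
    if math.isnan(value):
        return 0
    if value > max_value:
        return max_value
    min_value = -max_value - 1
    if value < min_value:
        # Assumption: the comment does not say what happens on negative overflow.
        return min_value
    return int(value)  # truncates toward zero (an assumption about rounding)
```

For example, to_integral("abc", MAX_INT) gives 0, while a value too large for a short, such as to_integral("40000", MAX_SHORT), is clamped to MAX_SHORT.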

2) Is the XPath query parsed for every input row, or only parsed once?

The XPath expression is compiled and cached. It is reused if the next 
expression matches the previous one; otherwise, it is recompiled. So the XML 
is parsed for every input row, but the XPath expression is precompiled and 
reused in the vast majority of use cases.
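
The single-entry cache described above might look like the sketch below. It is written in Python for brevity and is not taken from the patch: the class name LastExpressionCache, the injected compile_fn and the compile_count counter are all hypothetical, and in the real UDF the compile step would be the Java XPath API rather than a stand-in function.

```python
class LastExpressionCache:
    """Remembers only the most recently compiled expression."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # hypothetical stand-in for the real compile call
        self._source = None          # text of the last expression seen
        self._compiled = None        # its compiled form
        self.compile_count = 0       # exposed only to make the behaviour visible

    def get(self, expression):
        # Recompile only when the expression differs from the previous one.
        if expression != self._source:
            self._compiled = self._compile(expression)
            self._source = expression
            self.compile_count += 1
        return self._compiled


# Stand-in "compiler": splits a path into steps.
cache = LastExpressionCache(lambda expr: tuple(expr.split("/")))
for _ in range(1000):            # one call per input row, same expression
    cache.get("a/b")
```

With a constant expression, as in a typical query, the loop compiles once however many rows are processed. Alternating between two expressions row by row would defeat a cache of this size.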

3a) Do you support DTD and XMLSchema?

Not sure how these would apply, as the Java XPath API is schema agnostic (no 
validation being performed). However, malformed xml (e.g., '1') 
will result in a runtime exception being thrown.

3b) What about namespace and backward axes in XPath?

Namespaces are not currently supported, but could easily be added later.

Backward axes are supported:

> select xpath (' id="2">','/descendant::c/ancestor::b/@id') from t1 limit 1 ;
["1","2"]

4) If XPath evaluates to empty list, do you return NULL or empty string (in 
case of xpath())?

When no match is found:
- xpath() returns an empty list.
- xpath_string() returns an empty string.
- xpath_int(), xpath_float(), etc. return 0.
- xpath_boolean() returns false.
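
As a rough illustration of the list and string cases, here is a Python sketch built on the standard library's ElementTree, which supports only a small subset of XPath. The function names mirror the UDF names but are hypothetical re-creations, the sample document is made up, and the wrapper element stands in for the XPath document node so that a path such as 'a' can match the root element.

```python
import xml.etree.ElementTree as ET


def _select(xml, path):
    """Parse xml and return the elements matching path."""
    doc = ET.Element("document")        # stand-in for the document node
    doc.append(ET.fromstring(xml))      # raises ET.ParseError on malformed XML
    return doc.findall(path)


def xpath_list(xml, path):
    """Text of every matching node; an empty list when nothing matches."""
    return ["".join(node.itertext()) for node in _select(xml, path)]


def xpath_string(xml, path):
    """Text of the first matching node and its subnodes; '' when nothing matches."""
    nodes = _select(xml, path)
    return "".join(nodes[0].itertext()) if nodes else ""


SAMPLE = "<a><b>b1</b><b>b2</b><c>c1</c></a>"   # made-up document
print(xpath_list(SAMPLE, "a/b"))     # both b nodes
print(xpath_string(SAMPLE, "a/b"))   # first b node only
print(xpath_list(SAMPLE, "a/x"))     # no match
```

Malformed input such as "<a><b>1</a>" raises ET.ParseError here, which loosely mirrors the runtime exception mentioned in the answer to 3a.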

> Create UDFs for XPath expression evaluation
> ---
>
> Key: HIVE-1027
> URL: https://issues.apache.org/jira/browse/HIVE-1027
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Patrick Angeles
>Assignee: Patrick Angeles
>Priority: Minor
> Attachments: hive-1027.patch, udf_xpath.patch
>
>
> Create UDFs for evaluating XPath expressions against XML documents.
> Examples:
> > SELECT xpath_double ('<a><b class="odd">1</b><b class="even">2</b><b class="odd">4</b><c>8</c></a>', 'sum(a/b[@class="odd"])') FROM src LIMIT 1 ;
> 5.0
> > SELECT xpath_string ('<a><b>b1</b><b>b2</b></a>', 'a/b[2]') FROM src LIMIT 1 ;
> b2
> > SELECT xpath ('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>', 'a/c/text()') FROM src LIMIT 1 ;
> ["c1","c2"]
> Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, 
> xpath_double/xpath_number, xpath_string, xpath

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-984) Building Hive occasionally fails with Ivy error: hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797512#action_12797512
 ] 

Carl Steinbach commented on HIVE-984:
-

Any chance this can go into 0.5.0?

> Building Hive occasionally fails with Ivy error: 
> hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
> ---
>
> Key: HIVE-984
> URL: https://issues.apache.org/jira/browse/HIVE-984
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-984.patch
>
>
> Folks keep running into this problem when building Hive from source:
> {noformat}
> [ivy:retrieve]
> [ivy:retrieve] :: problems summary ::
> [ivy:retrieve]  WARNINGS
> [ivy:retrieve]  [FAILED ]
> hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
> expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72
> (138662ms)
> [ivy:retrieve]  [FAILED ]
> hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
> expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72
> (138662ms)
> [ivy:retrieve]   hadoop-resolver: tried
> [ivy:retrieve]
> http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
> [ivy:retrieve]  ::
> [ivy:retrieve]  ::  FAILED DOWNLOADS::
> [ivy:retrieve]  :: ^ see resolution messages for details  ^ ::
> [ivy:retrieve]  ::
> [ivy:retrieve]  :: hadoop#core;0.20.1!hadoop.tar.gz(source)
> [ivy:retrieve]  ::
> [ivy:retrieve]
> [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
> {noformat}
> The problem appears to be either with a) the Hive build scripts, b) ivy, or 
> c) archive.apache.org
> Besides fixing the actual bug, one other option worth considering is to add 
> the Hadoop jars to the
> Hive source repository.
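
In the log above, the expected= field holds what looks like a file name followed by a colon rather than a hex digest. One possible reading, offered only as an assumption, is that the published checksum file uses a "name: HEX PAIRS" layout that the checksum parser did not anticipate. The Python sketch below shows a tolerant way to check a download by hand; the function names and the two layouts it accepts are hypothetical and are not part of Ivy or the Hive build.

```python
import hashlib
import re

_HEX32 = re.compile(r"\b[0-9a-fA-F]{32}\b")


def extract_md5(checksum_text):
    """Return the lowercase MD5 digest found in checksum_text, or None."""
    text = checksum_text.strip()
    # Layout 1: "<digest>  <file name>" or a bare digest.
    match = _HEX32.search(text)
    if match:
        return match.group(0).lower()
    # Layout 2 (assumed): "<file name>: 71 9E 16 ..." with spaced byte pairs.
    _, colon, tail = text.partition(":")
    if colon:
        joined = re.sub(r"\s+", "", tail)
        if re.fullmatch(r"[0-9a-fA-F]{32}", joined):
            return joined.lower()
    return None


def md5_matches(data, checksum_text):
    """True if the MD5 of data equals the digest recorded in checksum_text."""
    expected = extract_md5(checksum_text)
    return expected is not None and hashlib.md5(data).hexdigest() == expected
```

In practice data would be the bytes of the downloaded archive and checksum_text the contents of its checksum file. A None result from extract_md5 means the checksum file itself needs a closer look.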


[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797489#action_12797489
 ] 

Carl Steinbach commented on HIVE-1031:
--

Sorry, didn't realize that HIVE-996 got committed. I will update the testcase.

> "DESCRIBE FUNCTION array" throws ParseException
> ---
>
> Key: HIVE-1031
> URL: https://issues.apache.org/jira/browse/HIVE-1031
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-1031.patch
>
>
> {noformat}
> hive> describe function array;
> describe function array;
> FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe 
> statement
> hive> describe function 'array';
> describe function 'array';
> OK
> array(n0, n1...) - Creates an array with the given elements 
> Time taken: 0.396 seconds
> hive> describe function map;
> describe function map;
> FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe 
> statement
> hive> describe function 'map';
> describe function 'map';
> OK
> map(key0, value0, key1, value1...) - Creates a map with the given key/value 
> pairs 
> Time taken: 0.054 seconds
> hive> describe function case;
> describe function case;
> FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe 
> statement
> hive> describe function 'case';
> describe function 'case';
> OK
> There is no documentation for function case
> Time taken: 0.072 seconds
> {noformat}


[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797488#action_12797488
 ] 

Carl Steinbach commented on HIVE-1031:
--

@Namit: difficult, because there is overlap between this and HIVE-996. Would 
you like me to roll this change into HIVE-996?



[jira] Created: (HIVE-1032) Better Error Messages for Execution Errors

2010-01-06 Thread Paul Yang (JIRA)

Better Error Messages for Execution Errors
--

 Key: HIVE-1032
 URL: https://issues.apache.org/jira/browse/HIVE-1032
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Paul Yang
Assignee: Paul Yang


Three common errors that occur during execution are:

1. Map-side group-by causing an out of memory exception due to large 
aggregation hash tables

2. ScriptOperator failing due to the user's script throwing an exception or 
otherwise returning a non-zero error code

3. Incorrectly specifying the join order of small and large tables, causing the 
large table to be loaded into memory and producing an out of memory exception.

These errors are typically discovered by manually examining the error log files 
of the failed task. This task proposes to create a feature that would 
automatically read the error logs and output a probable cause and solution to 
the command line.
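
A minimal sketch of such a log scanner is shown below, in Python for brevity. The rule table is entirely hypothetical: the patterns assume that the relevant operator class names and OutOfMemoryError appear in the task log, and the hints are illustrative suggestions rather than messages from any Hive release.

```python
import re

# (pattern, probable cause, suggested action); all entries are illustrative.
RULES = [
    (re.compile(r"OutOfMemoryError.*GroupByOperator", re.S),
     "map-side group-by ran out of memory building its aggregation hash table",
     "consider disabling map-side aggregation (hive.map.aggr=false)"),
    (re.compile(r"OutOfMemoryError.*JoinOperator", re.S),
     "a large table was loaded into memory during the join",
     "consider reordering the join so the largest table comes last"),
    (re.compile(r"ScriptOperator"),
     "the user script threw an exception or returned a non-zero exit code",
     "check the script's stderr output and exit status"),
]


def diagnose(log_text):
    """Return (cause, hint) for the first matching rule, or None."""
    for pattern, cause, hint in RULES:
        if pattern.search(log_text):
            return cause, hint
    return None
```

The first matching rule wins, so more specific patterns belong earlier in the table. A None result would mean falling back to pointing the user at the raw log.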


[jira] Commented: (HIVE-675) add database/scheme support Hive QL

2010-01-06 Thread Alex Loddengaard (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797486#action_12797486
 ] 

Alex Loddengaard commented on HIVE-675:
---

I will be out of the office Thursday, 1/7, through Wednesday, 1/13,
back in the office Thursday, 1/14.  I will be checking email fairly
consistently in the evenings.

Please contact Christophe Bisciglia (christo...@cloudera.com) with any
support or training emergencies.  Otherwise, you'll hear from me soon.

Thanks,

Alex


> add database/scheme support Hive QL
> ---
>
> Key: HIVE-675
> URL: https://issues.apache.org/jira/browse/HIVE-675
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Prasad Chakka
>Assignee: He Yongqiang
> Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, 
> hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, 
> hive-675-2009-9-8.patch
>
>
> Currently all Hive tables reside in single namespace (default). Hive should 
> support multiple namespaces (databases or schemas) such that users can create 
> tables in their specific namespaces. These name spaces can have different 
> warehouse directories (with a default naming scheme) and possibly different 
> properties.
> There is already some support for this in metastore but Hive query parser 
> should have this feature as well.


[jira] Commented: (HIVE-675) add database/scheme support Hive QL

2010-01-06 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797477#action_12797477
 ] 

Zheng Shao commented on HIVE-675:
-

Yongqiang, you can start adapting the patch to trunk right now.

The plan is to branch 0.5 on 1/7/2010 (tomorrow). After that we can quickly 
review this diff and get it in.



[jira] Assigned: (HIVE-686) add UDF substring_index

2010-01-06 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao reassigned HIVE-686:
---

Assignee: Larry Ogrodnek

> add UDF substring_index
> ---
>
> Key: HIVE-686
> URL: https://issues.apache.org/jira/browse/HIVE-686
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Larry Ogrodnek
> Attachments: HIVE-686.patch
>
>
> add UDF substring_index
> look at
> http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html
> for details
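
For reference, the MySQL function being mirrored is SUBSTRING_INDEX(str, delim, count): a positive count returns everything to the left of the count-th delimiter counting from the left, and a negative count returns everything to the right of the count-th delimiter counting from the right. The Python sketch below illustrates those semantics only; it is not the attached patch, and returning an empty string for an empty delimiter is an assumption.

```python
def substring_index(text, delim, count):
    """Sketch of MySQL-style SUBSTRING_INDEX semantics."""
    if count == 0 or delim == "":
        return ""                      # empty-delimiter case is an assumption
    parts = text.split(delim)
    if count > 0:
        # Keep the first `count` pieces, i.e. everything before the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: keep the last |count| pieces.
    return delim.join(parts[count:])


print(substring_index("www.mysql.com", ".", 2))    # www.mysql
print(substring_index("www.mysql.com", ".", -2))   # mysql.com
```

When the absolute value of count exceeds the number of delimiters, the whole string comes back unchanged.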


[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-996:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Carl

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.4.patch, 
> HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797469#action_12797469
 ] 

Namit Jain commented on HIVE-1031:
--

Can you add a test, or change the existing test to remove the quotes?


[jira] Updated: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1031:
-

Status: Patch Available  (was: Open)

* Updated the list of "sysFuncNames" in the Hive grammar file.

{noformat}
hive> describe function array;
describe function array;
OK
array(n0, n1...) - Creates an array with the given elements 
Time taken: 0.051 seconds
hive> describe function map;
describe function map;
OK
map(key0, value0, key1, value1...) - Creates a map with the given key/value 
pairs 
Time taken: 0.069 seconds
{noformat}
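
The effect of the change can be pictured with a toy model, shown here in Python. It is not the Hive grammar: the keyword set, the function parse_describe_function and its keyword_function_names parameter are all hypothetical, and the model only captures the idea that a reserved word is rejected as a bare function name unless it is explicitly listed as allowed, while a quoted name is always accepted.

```python
KEYWORDS = {"array", "map", "case", "select", "from"}   # illustrative subset


def parse_describe_function(statement, keyword_function_names=frozenset()):
    """Return the function name from a 'describe function <name>' statement."""
    tokens = statement.strip().rstrip(";").split()
    if len(tokens) != 3 or [t.lower() for t in tokens[:2]] != ["describe", "function"]:
        raise SyntaxError("expected: describe function <name>")
    name = tokens[2]
    # A quoted name is a string literal, so keywords inside it are harmless.
    if len(name) >= 2 and name[0] == name[-1] == "'":
        return name[1:-1]
    lowered = name.lower()
    # A bare keyword is accepted only if it is listed as a function name.
    if lowered in KEYWORDS and lowered not in keyword_function_names:
        raise SyntaxError(f"cannot recognize input '{name}' in describe statement")
    return lowered
```

Passing keyword_function_names={"array", "map"} plays the role of the updated list: the bare forms then parse, while a keyword left off the list still fails unless it is quoted.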



[jira] Updated: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1031:
-

Attachment: HIVE-1031.patch


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797428#action_12797428
 ] 

Carl Steinbach commented on HIVE-996:
-

@Namit: updated udaf_max.q.out and udaf_min.q.out:

Index: ql/src/test/results/clientpositive/udaf_max.q.out
===
--- ql/src/test/results/clientpositive/udaf_max.q.out   (revision 0)
+++ ql/src/test/results/clientpositive/udaf_max.q.out   (revision 0)
@@ -0,0 +1,20 @@
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr

Index: ql/src/test/results/clientpositive/udaf_min.q.out
===
--- ql/src/test/results/clientpositive/udaf_min.q.out   (revision 0)
+++ ql/src/test/results/clientpositive/udaf_min.q.out   (revision 0)
@@ -0,0 +1,20 @@
+PREHOOK: query: DESCRIBE FUNCTION min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr
+PREHOOK: query: DESCRIBE FUNCTION min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

2010-01-06 Thread Ning Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797427#action_12797427
 ] 

Ning Zhang commented on HIVE-1027:
--

This is cool stuff. Just some questions:
1) In general, XPath queries return a list of nodes. What does xpath_double 
(e.g.) return if the XPath evaluates to multiple nodes?
2) Is the XPath query parsed for every input row, or only parsed once?
3) Do you support DTD and XMLSchema? What about namespace and backward axes in 
XPath?
4) If XPath evaluates to empty list, do you return NULL or empty string (in 
case of xpath())?
 


[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-996:


Attachment: HIVE-996.4.patch


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797420#action_12797420
 ] 

Namit Jain commented on HIVE-1027:
--

+1

looks good - will commit if the tests pass


[jira] Assigned: (HIVE-1027) Create UDFs for XPath expression evaluation

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-1027:


Assignee: Patrick Angeles


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797414#action_12797414
 ] 

Namit Jain commented on HIVE-996:
-

Index: ql/src/test/results/clientpositive/udaf_max.q.out
===
--- ql/src/test/results/clientpositive/udaf_max.q.out   (revision 0)
+++ ql/src/test/results/clientpositive/udaf_max.q.out   (revision 0)
@@ -0,0 +1,20 @@
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max



It is still the old one

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-996:


Attachment: HIVE-996.3.patch

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797407#action_12797407
 ] 

Carl Steinbach commented on HIVE-996:
-

* Updated udaf_max.q.out and udaf_min.q.out


> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797402#action_12797402
 ] 

Carl Steinbach commented on HIVE-996:
-

@namit: sorry, regenerating the patch...

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797396#action_12797396
 ] 

Namit Jain commented on HIVE-996:
-

Don't you need to fix the output files udaf_max.q.out/udaf_min.q.out?

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Resolved: (HIVE-988) mapjoin should throw an error if the input is too large

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-988.
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]

> mapjoin should throw an error if the input is too large
> ---
>
> Key: HIVE-988
> URL: https://issues.apache.org/jira/browse/HIVE-988
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Ning Zhang
> Fix For: 0.5.0
>
> Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, 
> HIVE-988_4.patch
>
>
> If the input to the map join is larger than a specific threshold, it may lead 
> to a very slow execution of the join.
> It is better to throw an error, and let the user redo his query as a non 
> map-join query.
> However, the current map-reduce framework will retry the mapper 4 times 
> before actually killing the job.
> Based on an offline discussion with Dhruba, Ning and myself, we came up with 
> the following algorithm:
> Keep a threshold in the mapper for the number of rows to be processed for 
> map-join. If the number of rows
> exceeds that threshold, set a counter and kill that mapper.
> The client (ExecDriver) monitors that job continuously - if this counter is 
> set, it kills the job and also
> shows an appropriate error message to the user, so that he can retry the 
> query without the map join.
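
The algorithm in the description above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the HIVE-988 patches: the class names, the in-memory counter standing in for a job counter, and the message text are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch only; none of these names come from the Hive code base. */
class MapJoinRowLimitSketch {

    /** Hypothetical stand-in for a job-level counter visible to mapper and client. */
    static final class FatalErrorCounter {
        private final AtomicLong value = new AtomicLong();

        void increment() {
            value.incrementAndGet();
        }

        boolean isSet() {
            return value.get() > 0;
        }
    }

    /** Mapper side: counts rows and sets the counter once the threshold is exceeded. */
    static final class Mapper {
        private final long maxRows;
        private final FatalErrorCounter counter;
        private long rowsSeen;
        private boolean done;

        Mapper(long maxRows, FatalErrorCounter counter) {
            this.maxRows = maxRows;
            this.counter = counter;
        }

        /** Returns false once the mapper has given up and should stop reading input. */
        boolean processRow() {
            if (done) {
                return false;
            }
            rowsSeen++;
            if (rowsSeen > maxRows) {
                counter.increment(); // signal the client instead of letting the task be retried
                done = true;
                return false;
            }
            return true;
        }
    }

    /** Client side: one polling step that decides whether the whole job should be killed. */
    static String pollOnce(FatalErrorCounter counter) {
        return counter.isSet()
                ? "KILL_JOB: map-join input too large; retry the query without the map join"
                : "CONTINUE";
    }

    public static void main(String[] args) {
        FatalErrorCounter counter = new FatalErrorCounter();
        Mapper mapper = new Mapper(3, counter);
        for (int i = 0; i < 5; i++) {
            if (!mapper.processRow()) {
                break;
            }
        }
        System.out.println(pollOnce(counter));
    }
}
```

The point of the counter is that the client, not the framework's retry logic, decides when to stop: the mapper only records that the limit was hit, and the monitoring loop acts on it.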


[jira] Updated: (HIVE-988) mapjoin should throw an error if the input is too large

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-988:


Status: In Progress  (was: Patch Available)

Committed. Thanks Ning

> mapjoin should throw an error if the input is too large
> ---
>
> Key: HIVE-988
> URL: https://issues.apache.org/jira/browse/HIVE-988
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Ning Zhang
> Fix For: 0.5.0
>
> Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, 
> HIVE-988_4.patch
>
>
> If the input to the map join is larger than a specific threshold, it may lead 
> to a very slow execution of the join.
> It is better to throw an error, and let the user redo his query as a non 
> map-join query.
> However, the current map-reduce framework will retry the mapper 4 times 
> before actually killing the job.
> Based on an offline discussion with Dhruba, Ning and myself, we came up with 
> the following algorithm:
> Keep a threshold in the mapper for the number of rows to be processed for 
> map-join. If the number of rows
> exceeds that threshold, set a counter and kill that mapper.
> The client (ExecDriver) monitors that job continuously - if this counter is 
> set, it kills the job and also
> shows an appropriate error message to the user, so that he can retry the 
> query without the map join.


[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-1030:
-

Attachment: HIVE-1030.2.patch

Good catch Ning. Here is another place to fix (HIVE-1030.2.patch).

I guess HashMapWrapper runs in mapper or reducer.
It should be OK because in mapper/reducer, Hadoop automatically sets 
java.io.tmpdir according to 
http://hadoop.apache.org/common/docs/r0.18.3/mapred_tutorial.html
And java.io.tmpdir is used by File.createTempFile.
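
As a small illustration of the point about java.io.tmpdir: in the standard library, File.createTempFile(prefix, suffix) places the file in the directory named by that system property, while the three-argument overload takes an explicit directory. The sketch below shows both; the class name, method names and the "hive-plan" prefix are made up for the example and are not taken from the patch.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative sketch only; names and the file prefix are hypothetical. */
class TmpDirSketch {

    /** Creates a temp file in the JVM default directory (the java.io.tmpdir property). */
    static File createInDefaultTmp(String prefix) {
        try {
            File f = File.createTempFile(prefix, ".tmp");
            f.deleteOnExit(); // without cleanup, files like this pile up
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temp file in an explicitly chosen directory, e.g. a scratch directory. */
    static File createIn(File dir, String prefix) {
        try {
            File f = File.createTempFile(prefix, ".tmp", dir);
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("default:  " + createInDefaultTmp("hive-plan").getParent());
        File scratch = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("explicit: " + createIn(scratch, "hive-plan").getParent());
    }
}
```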


> Hive should use scratchDir instead of system temporary directory for storing 
> plans
> --
>
> Key: HIVE-1030
> URL: https://issues.apache.org/jira/browse/HIVE-1030
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.0
>
> Attachments: HIVE-1030.1.patch, HIVE-1030.2.patch
>
>
> Otherwise these plan files never get deleted.


[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-996:


Attachment: HIVE-996.2.patch

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797371#action_12797371
 ] 

Carl Steinbach commented on HIVE-996:
-

* Added annotations for UDAFMax and UDAFMin.
* Correctly handle the GenericUDAFBridge case in 
FunctionInfo.getFunctionClass().
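
To illustrate why the bridge case needs special handling: if documentation is read from an annotation on the function class, a bridge object has to be unwrapped first, otherwise the lookup inspects the bridge class and finds nothing. The sketch below is a guess at the general shape only. The Doc annotation, the Bridge class and the description text are hypothetical stand-ins, not Hive's actual types or strings.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Illustrative sketch only; Doc, Bridge and LegacyMax are hypothetical stand-ins. */
class BridgeUnwrapSketch {

    /** Hypothetical documentation annotation, readable at runtime via reflection. */
    @Retention(RetentionPolicy.RUNTIME)
    @interface Doc {
        String value();
    }

    /** Old-style function class carrying the annotation (placeholder text). */
    @Doc("max(expr) - placeholder description")
    static class LegacyMax {
    }

    /** Stand-in for a bridge that adapts an old-style class to a newer interface. */
    static class Bridge {
        private final Class<?> wrapped;

        Bridge(Class<?> wrapped) {
            this.wrapped = wrapped;
        }

        Class<?> getWrappedClass() {
            return wrapped;
        }
    }

    /** Returns the class that actually implements the function, unwrapping bridges. */
    static Class<?> getFunctionClass(Object registered) {
        if (registered instanceof Bridge) {
            return ((Bridge) registered).getWrappedClass();
        }
        return registered.getClass();
    }

    /** Returns the documentation text, or null if the class has no annotation. */
    static String describe(Object registered) {
        Doc doc = getFunctionClass(registered).getAnnotation(Doc.class);
        return doc == null ? null : doc.value();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Bridge(LegacyMax.class)));
    }
}
```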


> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.2.patch, HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797361#action_12797361
 ] 

Carl Steinbach commented on HIVE-996:
-

@Namit: working on it now.

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797359#action_12797359
 ] 

Namit Jain commented on HIVE-996:
-

@Carl, can you address Zheng's comments ?

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797355#action_12797355
 ] 

Namit Jain commented on HIVE-996:
-

+1

looks good - will commit if the tests pass

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797353#action_12797353
 ] 

Namit Jain commented on HIVE-1031:
--

Seems to be a problem with reserved words.

> "DESCRIBE FUNCTION array" throws ParseException
> ---
>
> Key: HIVE-1031
> URL: https://issues.apache.org/jira/browse/HIVE-1031
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
>
> {noformat}
> hive> describe function array;
> describe function array;
> FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe 
> statement
> hive> describe function 'array';
> describe function 'array';
> OK
> array(n0, n1...) - Creates an array with the given elements 
> Time taken: 0.396 seconds
> hive> describe function map;
> describe function map;
> FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe 
> statement
> hive> describe function 'map';
> describe function 'map';
> OK
> map(key0, value0, key1, value1...) - Creates a map with the given key/value 
> pairs 
> Time taken: 0.054 seconds
> hive> describe function case;
> describe function case;
> FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe 
> statement
> hive> describe function 'case';
> describe function 'case';
> OK
> There is no documentation for function case
> Time taken: 0.072 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Zheng Shao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797351#action_12797351
 ] 

Zheng Shao commented on HIVE-996:
-

Overall it looks good.

1. Can you add annotation for UDAFMax and UDAFMin?
2. We need to treat GenericUDAFBridge specially in 
FunctionInfo.getFunctionClass (you already did it for GenericUDFBridge).

I will leave the rest to Namit.


> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797347#action_12797347
 ] 

Namit Jain commented on HIVE-996:
-

Great, I will take a look at it right away

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Created: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException

2010-01-06 Thread Carl Steinbach (JIRA)

"DESCRIBE FUNCTION array" throws ParseException
---

 Key: HIVE-1031
 URL: https://issues.apache.org/jira/browse/HIVE-1031
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach


{noformat}
hive> describe function array;
describe function array;
FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe 
statement

hive> describe function 'array';
describe function 'array';
OK
array(n0, n1...) - Creates an array with the given elements 
Time taken: 0.396 seconds
hive> describe function map;
describe function map;
FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe 
statement

hive> describe function 'map';
describe function 'map';
OK
map(key0, value0, key1, value1...) - Creates a map with the given key/value 
pairs 
Time taken: 0.054 seconds
hive> describe function case;
describe function case;
FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe 
statement

hive> describe function 'case';
describe function 'case';
OK
There is no documentation for function case
Time taken: 0.072 seconds

{noformat}


[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-996:


Status: Patch Available  (was: Open)

* Fix 'describe function' and 'describe function extended' for UDTFs and UDAFs.
* Differentiate between the case where a function does not exist and 
documentation for the function does not exist.
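
The second bullet can be pictured with a small lookup that keeps the two cases apart. This is only a sketch under assumptions: the map-based registry and the "does not exist" wording are invented for illustration, while the "There is no documentation" message and the conv description are copied from the output quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; the registry shape and the first message are hypothetical. */
class DescribeLookupSketch {

    /**
     * Looks up a function description.
     * A missing key means the function is unknown; a null value means the
     * function is registered but has no documentation.
     */
    static String describe(Map<String, String> registry, String name) {
        if (!registry.containsKey(name)) {
            return "Function " + name + " does not exist"; // hypothetical wording
        }
        String doc = registry.get(name);
        if (doc == null) {
            return "There is no documentation for function " + name;
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> registry = new HashMap<>();
        registry.put("conv", "conv(num, from_base, to_base) - convert num from from_base to to_base");
        registry.put("max", null); // registered, but undocumented
        System.out.println(describe(registry, "conv"));
        System.out.println(describe(registry, "max"));
        System.out.println(describe(registry, "nosuchfunction"));
    }
}
```

Returning a distinct message for each case avoids the original failure mode, where a null somewhere in the lookup surfaced as a NullPointerException instead of a readable answer.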


> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts

2010-01-06 Thread Edward Capriolo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797340#action_12797340
 ] 

Edward Capriolo commented on HIVE-1015:
---

JAVA, JAVA, JAVA. I love it. Even our 'external scripts' can be java now :)

> Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
> ---
>
> Key: HIVE-1015
> URL: https://issues.apache.org/jira/browse/HIVE-1015
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Carl Steinbach
>Assignee: Larry Ogrodnek
> Fix For: 0.5.0
>
> Attachments: HIVE-1015.patch, HIVE-1015.patch
>
>
> Larry Ogrodnek has written a set of wrapper classes that make it possible
> to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that
> more closely resembles conventional Hadoop MR programs.
> A blog post describing this library can be found here: 
> http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html
> The source code (with Apache license) is available here: 
> http://github.com/ogrodnek/shmrj
> We should add this to contrib.


[jira] Commented: (HIVE-978) Hive jars should follow Hadoop naming and include version

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797339#action_12797339
 ] 

Namit Jain commented on HIVE-978:
-

Looks good to me.

@Edward, can you take care of it ?

> Hive jars should follow Hadoop naming and include version
> -
>
> Key: HIVE-978
> URL: https://issues.apache.org/jira/browse/HIVE-978
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 0.5.0
>Reporter: Chad Metcalf
>Assignee: Chad Metcalf
>Priority: Minor
> Fix For: 0.5.0
>
> Attachments: HIVE-978v1.patch, HIVE-978v2.patch, HIVE-978v3.patch, 
> HIVE-978v4.patch
>
>
> This is a simple patch on the ant build files to change jar naming from
> hive_foo.jar to hive-foo-VERSION.jar
> This matches the convention followed by hadoop jars. This naming scheme is 
> important for packaging, repositories, etc.
> Testing done:
> ant test
> ant tar
> Things look right.


[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF

2010-01-06 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-996:


Attachment: HIVE-996.patch

> "describe function" throws NPE when when called on UDTF or UDAF
> ---
>
> Key: HIVE-996
> URL: https://issues.apache.org/jira/browse/HIVE-996
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.5.0
>
> Attachments: HIVE-996.patch
>
>
> {noformat}
> hive> describe function explode;
> describe function explode;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function sum;
> describe function sum;
> FAILED: Error in metadata: java.lang.NullPointerException
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive> describe function conv;
> describe function conv;
> OK
> conv(num, from_base, to_base) - convert num from from_base to to_base
> Time taken: 0.042 seconds
> {noformat}


[jira] Commented: (HIVE-988) mapjoin should throw an error if the input is too large

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797334#action_12797334
 ] 

Namit Jain commented on HIVE-988:
-

+1

will commit if the tests pass

> mapjoin should throw an error if the input is too large
> ---
>
> Key: HIVE-988
> URL: https://issues.apache.org/jira/browse/HIVE-988
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Ning Zhang
> Fix For: 0.5.0
>
> Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, 
> HIVE-988_4.patch
>
>
> If the input to the map join is larger than a specific threshold, it may lead 
> to a very slow execution of the join.
> It is better to throw an error, and let the user redo his query as a non 
> map-join query.
> However, the current map-reduce framework will retry the mapper 4 times 
> before actually killing the job.
> Based on an offline discussion with Dhruba, Ning and myself, we came up with 
> the following algorithm:
> Keep a threshold in the mapper for the number of rows to be processed for 
> map-join. If the number of rows
> exceeds that threshold, set a counter and kill that mapper.
> The client (ExecDriver) monitors that job continuously - if this counter is 
> set, it kills the job and also
> shows an appropriate error message to the user, so that he can retry the 
> query without the map join.


[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1030:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Zheng

> Hive should use scratchDir instead of system temporary directory for storing 
> plans
> --
>
> Key: HIVE-1030
> URL: https://issues.apache.org/jira/browse/HIVE-1030
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.0
>
> Attachments: HIVE-1030.1.patch
>
>
> Otherwise these plan files never get deleted.


[jira] Resolved: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1015.
--

   Resolution: Fixed
Fix Version/s: 0.5.0
 Hadoop Flags: [Reviewed]

Committed. Thanks Larry

> Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
> ---
>
> Key: HIVE-1015
> URL: https://issues.apache.org/jira/browse/HIVE-1015
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Carl Steinbach
>Assignee: Larry Ogrodnek
> Fix For: 0.5.0
>
> Attachments: HIVE-1015.patch, HIVE-1015.patch
>
>
> Larry Ogrodnek has written a set of wrapper classes that make it possible
> to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that
> more closely resembles conventional Hadoop MR programs.
> A blog post describing this library can be found here: 
> http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html
> The source code (with Apache license) is available here: 
> http://github.com/ogrodnek/shmrj
> We should add this to contrib.

[jira] Updated: (HIVE-988) mapjoin should throw an error if the input is too large

2010-01-06 Thread Ning Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-988:


Attachment: HIVE-988_4.patch

According to offline discussions with Namit, here are the new changes:
1) Change Operator.fatalError to a static variable so that all operators share it.
2) Change Operator.getDone() to check fatalError as well.
3) Change ExecMapper.map() to check operator.getDone() and exit early if it is set.
4) Change ExecDriver to hold a success variable; ExecDriver.progress will set 
its status rather than getting it from RunningJob.isSuccessful(). This solves 
the case where the counter was incremented but the RunningJob finished without 
the counter being checked.
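
The flow in points 1) to 3) can be sketched in plain Java with no Hive or Hadoop dependencies. This is only an illustration under stated assumptions: the class MapJoinGuard, its method names and the threshold values are hypothetical, and the actual patch works through Operator.fatalError, Operator.getDone(), ExecMapper.map() and a job counter that ExecDriver polls.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the shared fatal-error flag and the row threshold. */
final class MapJoinGuard {
    // Shared by every caller, in the spirit of a static Operator.fatalError.
    private static final AtomicBoolean FATAL_ERROR = new AtomicBoolean(false);
    private static final AtomicLong ROWS_SEEN = new AtomicLong();

    private MapJoinGuard() {}

    /** Counts one row and sets the fatal flag once the threshold is exceeded. */
    static void processRow(long maxRows) {
        if (ROWS_SEEN.incrementAndGet() > maxRows) {
            FATAL_ERROR.set(true); // assumption: the real code would also bump a job counter here
        }
    }

    /** A fatal error also counts as "done", like the getDone() change. */
    static boolean isDone() {
        return FATAL_ERROR.get();
    }

    /** Feeds rows until done and returns how many rows were handed to processRow. */
    static long runMapper(long totalRows, long maxRows) {
        long processed = 0;
        for (long i = 0; i < totalRows; i++) {
            if (isDone()) {
                break; // early exit, like the ExecMapper.map() change
            }
            processRow(maxRows);
            processed++;
        }
        return processed;
    }

    /** Clears the shared state so the sketch can be run again. */
    static void reset() {
        FATAL_ERROR.set(false);
        ROWS_SEEN.set(0);
    }
}
```

With a threshold of 3 and 10 input rows, runMapper stops after the fourth row, which is the row that trips the flag; a client would then see isDone() return true and could report the error instead of letting the task be retried.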

> mapjoin should throw an error if the input is too large
> ---
>
> Key: HIVE-988
> URL: https://issues.apache.org/jira/browse/HIVE-988
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Ning Zhang
> Fix For: 0.5.0
>
> Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, 
> HIVE-988_4.patch
>
>
> If the input to the map join is larger than a specific threshold, it may lead 
> to a very slow execution of the join.
> It is better to throw an error, and let the user redo his query as a non 
> map-join query.
> However, the current map-reduce framework will retry the mapper 4 times 
> before actually killing the job.
> Based on an offline discussion with Dhruba, Ning and myself, we came up with 
> the following algorithm:
> Keep a threshold in the mapper for the number of rows to be processed for 
> map-join. If the number of rows
> exceeds that threshold, set a counter and kill that mapper.
> The client (ExecDriver) monitors that job continuously - if this counter is 
> set, it kills the job and also
> shows an appropriate error message to the user, so that he can retry the 
> query without the map join.

[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797289#action_12797289
 ] 

Namit Jain commented on HIVE-820:
-

We should be consistent across different fields.

serialization.format=9,line.delim= ,field.delim= 

We should use the same format for all of them. We can choose the decimal format 
for all of them. Since it is an existing problem, this need not be a blocker for 
0.5.
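
As an illustration of the decimal option, here is a small Java sketch. The class and method names are hypothetical and this is not the code in hive_820.patch; it only shows each delimiter rendered as the decimal code of its character, the same way serialization.format=9 already stands for a tab.

```java
/** Hypothetical helper: prints delimiters as decimal character codes. */
final class DelimiterFormat {
    private DelimiterFormat() {}

    /** Renders both delimiters in decimal so no raw control character reaches the output. */
    static String describe(char lineDelim, char fieldDelim) {
        return "line.delim=" + (int) lineDelim + ",field.delim=" + (int) fieldDelim;
    }
}
```

For a table with \n as the line delimiter and \t as the field delimiter, describe('\n', '\t') yields line.delim=10,field.delim=9 on a single line.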



> Describe Extended Line Breaks When Delimiter is \n
> --
>
> Key: HIVE-820
> URL: https://issues.apache.org/jira/browse/HIVE-820
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0
>Reporter: Matt Pestritto
>Assignee: Matt Pestritto
>Priority: Minor
> Attachments: hive_820.patch
>
>
> For tables whose fields are delimited with \t and whose lines break on \n, the 
> output of describe extended is not contiguous.
> Line.delim outputs an actual \n, which breaks the display output, so when using 
> the hiveservice you have to do another FetchOne to get the rest of the line.
> For example:
> Original Output:
> Detailed Table InformationTable(tableName:cobra_merchandise, 
> dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, 
> type:string, comment:null), FieldSchema(name:client_merch_type_tid, 
> type:string, comment:null), FieldSchema(name:description, type:string, 
> comment:null), FieldSchema(name:client_description, type:string, 
> comment:null), FieldSchema(name:price, type:string, comment:null), 
> FieldSchema(name:cost, type:string, comment:null), 
> FieldSchema(name:start_date, type:string, comment:null), 
> FieldSchema(name:end_date, type:string, comment:null)], 
> location:hdfs://mustique:9000/user/hive/warehouse/m, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=9,line.delim=
> ,field.delim=}), bucketCols:[], sortCols:[], parameters:{}), 
> partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], 
> parameters:{})   
> Proposed Output:
> Detailed Table InformationTable(tableName:cobra_merchandise, 
> dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, 
> type:string, comment:null), FieldSchema(name:client_merch_type_tid, 
> type:string, comment:null), FieldSchema(name:description, type:string, 
> comment:null), FieldSchema(name:client_description, type:string, 
> comment:null), FieldSchema(name:price, type:string, comment:null), 
> FieldSchema(name:cost, type:string, comment:null), 
> FieldSchema(name:start_date, type:string, comment:null), 
> FieldSchema(name:end_date, type:string, comment:null)], 
> location:hdfs://mustique:9000/user/hive/warehouse/m, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=9,line.delim=,field.delim=}), 
> bucketCols:[], sortCols:[], parameters:{}), 
> partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], 
> parameters:{})   

[jira] Updated: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-820:


Fix Version/s: (was: 0.5.0)

> Describe Extended Line Breaks When Delimiter is \n
> --
>
> Key: HIVE-820
> URL: https://issues.apache.org/jira/browse/HIVE-820
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0
>Reporter: Matt Pestritto
>Assignee: Matt Pestritto
>Priority: Minor
> Attachments: hive_820.patch
>
>
> For tables whose fields are delimited with \t and whose lines break on \n, the 
> output of describe extended is not contiguous.
> Line.delim outputs an actual \n, which breaks the display output, so when using 
> the hiveservice you have to do another FetchOne to get the rest of the line.
> For example:
> Original Output:
> Detailed Table InformationTable(tableName:cobra_merchandise, 
> dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, 
> type:string, comment:null), FieldSchema(name:client_merch_type_tid, 
> type:string, comment:null), FieldSchema(name:description, type:string, 
> comment:null), FieldSchema(name:client_description, type:string, 
> comment:null), FieldSchema(name:price, type:string, comment:null), 
> FieldSchema(name:cost, type:string, comment:null), 
> FieldSchema(name:start_date, type:string, comment:null), 
> FieldSchema(name:end_date, type:string, comment:null)], 
> location:hdfs://mustique:9000/user/hive/warehouse/m, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=9,line.delim=
> ,field.delim=}), bucketCols:[], sortCols:[], parameters:{}), 
> partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], 
> parameters:{})   
> Proposed Output:
> Detailed Table InformationTable(tableName:cobra_merchandise, 
> dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, 
> type:string, comment:null), FieldSchema(name:client_merch_type_tid, 
> type:string, comment:null), FieldSchema(name:description, type:string, 
> comment:null), FieldSchema(name:client_description, type:string, 
> comment:null), FieldSchema(name:price, type:string, comment:null), 
> FieldSchema(name:cost, type:string, comment:null), 
> FieldSchema(name:start_date, type:string, comment:null), 
> FieldSchema(name:end_date, type:string, comment:null)], 
> location:hdfs://mustique:9000/user/hive/warehouse/m, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=9,line.delim=,field.delim=}), 
> bucketCols:[], sortCols:[], parameters:{}), 
> partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], 
> parameters:{})   

[jira] Commented: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Ning Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797288#action_12797288
 ] 

Ning Zhang commented on HIVE-1030:
--

Currently there are other places that use File.createTempFile in the persistent 
data structures (HashMapWrapper and RowContainer). We have 
File.deleteOnExit(true) set to ensure the temp file gets deleted when the job is 
killed or exits normally. Is there any issue there as well? Should we also 
convert those to use ScratchDir?
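
For context, a minimal Java sketch of creating a temp file under a chosen directory rather than the system default. The class name, file prefix and directory are hypothetical and this is not the code in HashMapWrapper or RowContainer. Note that java.io.File.deleteOnExit() takes no arguments and, per its Javadoc, deletion is attempted only on normal JVM termination.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper: temp files go under a caller-supplied scratch directory. */
final class ScratchTempFile {
    private ScratchTempFile() {}

    /** Creates the directory if needed, then a temp file inside it. */
    static File create(File scratchDir) {
        if (!scratchDir.isDirectory() && !scratchDir.mkdirs()) {
            throw new UncheckedIOException(new IOException("cannot create " + scratchDir));
        }
        try {
            // The three-argument overload places the file in scratchDir
            // instead of the directory named by java.io.tmpdir.
            File f = File.createTempFile("plan", ".tmp", scratchDir);
            // Best effort only: this does not run if the JVM is killed abruptly.
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping such files under one scratch directory means a later cleanup of that directory can catch whatever deleteOnExit() missed.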

> Hive should use scratchDir instead of system temporary directory for storing 
> plans
> --
>
> Key: HIVE-1030
> URL: https://issues.apache.org/jira/browse/HIVE-1030
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.0
>
> Attachments: HIVE-1030.1.patch
>
>
> Otherwise these plan files never get deleted.

[jira] Assigned: (HIVE-683) add UDF field

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-683:
---

Assignee: Larry Ogrodnek

> add UDF field
> -
>
> Key: HIVE-683
> URL: https://issues.apache.org/jira/browse/HIVE-683
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Larry Ogrodnek
> Fix For: 0.5.0
>
> Attachments: HIVE-683.patch, HIVE-683.patch, HIVE-683.patch
>
>
> add UDF field
> look at
> http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html
> for details

[jira] Commented: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797254#action_12797254
 ] 

Namit Jain commented on HIVE-1030:
--

+1

will commit if the tests pass

> Hive should use scratchDir instead of system temporary directory for storing 
> plans
> --
>
> Key: HIVE-1030
> URL: https://issues.apache.org/jira/browse/HIVE-1030
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.0
>
> Attachments: HIVE-1030.1.patch
>
>
> Otherwise these plan files never get deleted.

[jira] Commented: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797253#action_12797253
 ] 

Namit Jain commented on HIVE-1015:
--

+1

looks good - will commit if the tests pass

> Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
> ---
>
> Key: HIVE-1015
> URL: https://issues.apache.org/jira/browse/HIVE-1015
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Carl Steinbach
>Assignee: Larry Ogrodnek
> Attachments: HIVE-1015.patch, HIVE-1015.patch
>
>
> Larry Ogrodnek has written a set of wrapper classes that make it possible
> to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that
> more closely resembles conventional Hadoop MR programs.
> A blog post describing this library can be found here: 
> http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html
> The source code (with Apache license) is available here: 
> http://github.com/ogrodnek/shmrj
> We should add this to contrib.

[jira] Assigned: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-1015:


Assignee: Larry Ogrodnek

> Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
> ---
>
> Key: HIVE-1015
> URL: https://issues.apache.org/jira/browse/HIVE-1015
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Carl Steinbach
>Assignee: Larry Ogrodnek
> Attachments: HIVE-1015.patch, HIVE-1015.patch
>
>
> Larry Ogrodnek has written a set of wrapper classes that make it possible
> to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that
> more closely resembles conventional Hadoop MR programs.
> A blog post describing this library can be found here: 
> http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html
> The source code (with Apache license) is available here: 
> http://github.com/ogrodnek/shmrj
> We should add this to contrib.

[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-1030:
-

Attachment: HIVE-1030.1.patch

> Hive should use scratchDir instead of system temporary directory for storing 
> plans
> --
>
> Key: HIVE-1030
> URL: https://issues.apache.org/jira/browse/HIVE-1030
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.0
>
> Attachments: HIVE-1030.1.patch
>
>
> Otherwise these plan files never get deleted.

[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Zheng Shao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-1030:
-

Status: Patch Available  (was: Open)

> Hive should use scratchDir instead of system temporary directory for storing 
> plans
> --
>
> Key: HIVE-1030
> URL: https://issues.apache.org/jira/browse/HIVE-1030
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.5.0
>
> Attachments: HIVE-1030.1.patch
>
>
> Otherwise these plan files never get deleted.

[jira] Created: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans

2010-01-06 Thread Zheng Shao (JIRA)

Hive should use scratchDir instead of system temporary directory for storing 
plans
--

 Key: HIVE-1030
 URL: https://issues.apache.org/jira/browse/HIVE-1030
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Zheng Shao
Assignee: Zheng Shao
 Fix For: 0.5.0


Otherwise these plan files never get deleted.

[jira] Commented: (HIVE-675) add database/scheme support Hive QL

2010-01-06 Thread He Yongqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797238#action_12797238
 ] 

He Yongqiang commented on HIVE-675:
---

Hi Jeff,

Sorry for the delay. This is actually on hold until the 0.5 release because it 
has many related jiras. Let's commit it after 0.5 is released. 

> add database/scheme support Hive QL
> ---
>
> Key: HIVE-675
> URL: https://issues.apache.org/jira/browse/HIVE-675
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Prasad Chakka
>Assignee: He Yongqiang
> Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, 
> hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, 
> hive-675-2009-9-8.patch
>
>
> Currently all Hive tables reside in single namespace (default). Hive should 
> support multiple namespaces (databases or schemas) such that users can create 
> tables in their specific namespaces. These name spaces can have different 
> warehouse directories (with a default naming scheme) and possibly different 
> properties.
> There is already some support for this in metastore but Hive query parser 
> should have this feature as well.

[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

2010-01-06 Thread Patrick Angeles (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Angeles updated HIVE-1027:
--

Attachment: hive-1027.patch

updated patch... includes show_functions.q.out

> Create UDFs for XPath expression evaluation
> ---
>
> Key: HIVE-1027
> URL: https://issues.apache.org/jira/browse/HIVE-1027
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Patrick Angeles
>Priority: Minor
> Attachments: hive-1027.patch, udf_xpath.patch
>
>
> Create UDFs for evaluating XPath expressions against XML documents.
> Examples:
> > SELECT xpath_double ('<a><b class="odd">1</b><b class="even">2</b><b class="odd">4</b><c>8</c></a>', 
> > 'sum(a/b[@class="odd"])') FROM src LIMIT 1 ;
> 5.0
> > SELECT xpath_string ('<a><b>b1</b><b>b2</b></a>', 'a/b[2]') FROM src LIMIT 1 ;
> b2
> > SELECT xpath ('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>', 
> > 'a/c/text()') FROM src LIMIT 1 ;
> ["c1","c2"]
> Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, 
> xpath_double/xpath_number, xpath_string, xpath
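
The kind of evaluation these UDFs perform can be sketched with the standard javax.xml.xpath API. This is an illustration only, not the code in the attached patch: the class XPathSketch and its method names are hypothetical, and the XML inputs used with it are assumptions chosen to match the results listed in the description.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper showing string and numeric XPath evaluation. */
final class XPathSketch {
    private XPathSketch() {}

    /** Returns the string value of the first node matched by path. */
    static String evalString(String xml, String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(path, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            // Malformed XML or an invalid expression ends up here.
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns the numeric value of path; non-numeric text yields NaN. */
    static double evalDouble(String xml, String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Double) xpath.evaluate(
                    path, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For instance, evalString("<a><b>b1</b><b>b2</b></a>", "a/b[2]") gives b2. The sketch builds a new XPath object per call for simplicity; a real UDF would likely reuse compiled expressions across rows.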

[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

2010-01-06 Thread Patrick Angeles (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Angeles updated HIVE-1027:
--

Status: Patch Available  (was: Open)

Updated patch (this one includes show_functions.q.out).

> Create UDFs for XPath expression evaluation
> ---
>
> Key: HIVE-1027
> URL: https://issues.apache.org/jira/browse/HIVE-1027
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Patrick Angeles
>Priority: Minor
> Attachments: udf_xpath.patch
>
>
> Create UDFs for evaluating XPath expressions against XML documents.
> Examples:
> > SELECT xpath_double ('<a><b class="odd">1</b><b class="even">2</b><b class="odd">4</b><c>8</c></a>', 
> > 'sum(a/b[@class="odd"])') FROM src LIMIT 1 ;
> 5.0
> > SELECT xpath_string ('<a><b>b1</b><b>b2</b></a>', 'a/b[2]') FROM src LIMIT 1 ;
> b2
> > SELECT xpath ('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>', 
> > 'a/c/text()') FROM src LIMIT 1 ;
> ["c1","c2"]
> Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, 
> xpath_double/xpath_number, xpath_string, xpath

[jira] Commented: (HIVE-675) add database/scheme support Hive QL

2010-01-06 Thread Jeff Hammerbacher (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797225#action_12797225
 ] 

Jeff Hammerbacher commented on HIVE-675:


Hey,

Is this patch in a state where it could go into 0.5? If it's not going to be 
polished off, please let us know.

Thanks,
Jeff

> add database/scheme support Hive QL
> ---
>
> Key: HIVE-675
> URL: https://issues.apache.org/jira/browse/HIVE-675
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Prasad Chakka
>Assignee: He Yongqiang
> Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, 
> hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, 
> hive-675-2009-9-8.patch
>
>
> Currently all Hive tables reside in single namespace (default). Hive should 
> support multiple namespaces (databases or schemas) such that users can create 
> tables in their specific namespaces. These name spaces can have different 
> warehouse directories (with a default naming scheme) and possibly different 
> properties.
> There is already some support for this in metastore but Hive query parser 
> should have this feature as well.

[jira] Updated: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts

2010-01-06 Thread Larry Ogrodnek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Larry Ogrodnek updated HIVE-1015:
-

Attachment: HIVE-1015.patch

Here's a new patch with a .q file using an example mapper and reducer.

I also removed the dependency of these classes on Apache Commons Lang, since 
there was only a single use of StringUtils.join(), and it's one less thing to 
specify on the classpath in the USING clause.
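
For reference, a single join call is easy to replace with a few lines of plain Java. The sketch below is an assumption about what such a replacement could look like; the class and method names are hypothetical and not taken from the patch.

```java
/** Hypothetical stand-in for a single StringUtils.join() call. */
final class TabJoin {
    private TabJoin() {}

    /** Joins the parts with the separator between them; an empty input gives "". */
    static String join(Iterable<String> parts, String separator) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String part : parts) {
            if (!first) {
                sb.append(separator);
            }
            sb.append(part);
            first = false;
        }
        return sb.toString();
    }
}
```

A script that emits tab-separated output rows could call join(fields, "\t") without any extra jar on the classpath.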

Thanks.

> Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
> ---
>
> Key: HIVE-1015
> URL: https://issues.apache.org/jira/browse/HIVE-1015
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Carl Steinbach
> Attachments: HIVE-1015.patch, HIVE-1015.patch
>
>
> Larry Ogrodnek has written a set of wrapper classes that make it possible
> to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that
> more closely resembles conventional Hadoop MR programs.
> A blog post describing this library can be found here: 
> http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html
> The source code (with Apache license) is available here: 
> http://github.com/ogrodnek/shmrj
> We should add this to contrib.

[jira] Commented: (HIVE-478) Surface "processor time" for queries

2010-01-06 Thread Namit Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797164#action_12797164
 ] 

Namit Jain commented on HIVE-478:
-

Can you set the configuration parameter hive.task.progress to true? It will 
dump the total time taken by each operator.
Please check whether this meets your requirements; we can enhance it to add 
more stuff.

> Surface "processor time" for queries
> 
>
> Key: HIVE-478
> URL: https://issues.apache.org/jira/browse/HIVE-478
> Project: Hadoop Hive
>  Issue Type: Wish
>  Components: Logging, Query Processor
>Reporter: Adam Kramer
>
> We currently list real-time metrics of how long queries take--"finished in: 
> 1min 13sec" appears on the job tracker. However, this is affected by a lot 
> more than just the quality or implementation of the query. For example, 
> number of mappers used varies a lot when you use subqueries versus 
> single-query aggregation, as does the amount of work necessary.
> For implementation comparisons (e.g., "should I use this version of the query 
> or that one"), it would be great to know the processor time used instead of 
> the real time used...both in terms of "mapper cpu seconds" and "reducer cpu 
> seconds."

[jira] Resolved: (HIVE-683) add UDF field

2010-01-06 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-683.
-

   Resolution: Fixed
Fix Version/s: 0.5.0
 Hadoop Flags: [Reviewed]

Committed. Thanks Larry

> add UDF field
> -
>
> Key: HIVE-683
> URL: https://issues.apache.org/jira/browse/HIVE-683
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
> Fix For: 0.5.0
>
> Attachments: HIVE-683.patch, HIVE-683.patch, HIVE-683.patch
>
>
> add UDF field
> look at
> http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html
> for details

[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

2010-01-06 Thread Matt Pestritto (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797150#action_12797150
 ] 

Matt Pestritto commented on HIVE-820:
-

All -

Do we have a decision on what you want the output to show? A few different 
ideas were being thrown around.

I would rather replace only the characters that would break the output (tab, 
\n) with something meaningful than, as Edward stated, always show the octal 
representation, which would require an ASCII table to figure out what the 
delimiter is. If something is | (pipe) delimited, I would always need to look 
it up even though that is a printable character.
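
The approach described above could look roughly like the following Java sketch. It is an illustration under assumptions, not the code in hive_820.patch: the class and method names are hypothetical, and only tab and newline are replaced, with printable delimiters such as | left untouched.

```java
/** Hypothetical helper: escapes only the characters that break single-line output. */
final class DelimiterEscaper {
    private DelimiterEscaper() {}

    /** Replaces a raw newline with the two characters \n and a raw tab with \t. */
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\n':
                    sb.append("\\n");
                    break;
                case '\t':
                    sb.append("\\t");
                    break;
                default:
                    sb.append(c); // printable characters pass through unchanged
            }
        }
        return sb.toString();
    }
}
```

Applied to the parameters string, a raw newline after line.delim= becomes the visible text \n, so the whole description stays on one line and one fetch returns it.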

I'll wait for feedback from the FB team and make the changes.

Thanks.

> Describe Extended Line Breaks When Delimiter is \n
> --
>
> Key: HIVE-820
> URL: https://issues.apache.org/jira/browse/HIVE-820
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0
>Reporter: Matt Pestritto
>Assignee: Matt Pestritto
>Priority: Minor
> Fix For: 0.5.0
>
> Attachments: hive_820.patch
>
>
> For tables whose fields are delimited with \t and whose lines break on \n, the 
> output of describe extended is not contiguous.
> Line.delim outputs an actual \n, which breaks the display output, so when using 
> the hiveservice you have to do another FetchOne to get the rest of the line.
> For example:
> Original Output:
> Detailed Table InformationTable(tableName:cobra_merchandise, 
> dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, 
> type:string, comment:null), FieldSchema(name:client_merch_type_tid, 
> type:string, comment:null), FieldSchema(name:description, type:string, 
> comment:null), FieldSchema(name:client_description, type:string, 
> comment:null), FieldSchema(name:price, type:string, comment:null), 
> FieldSchema(name:cost, type:string, comment:null), 
> FieldSchema(name:start_date, type:string, comment:null), 
> FieldSchema(name:end_date, type:string, comment:null)], 
> location:hdfs://mustique:9000/user/hive/warehouse/m, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=9,line.delim=
> ,field.delim=}), bucketCols:[], sortCols:[], parameters:{}), 
> partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], 
> parameters:{})   
> Proposed Output:
> Detailed Table InformationTable(tableName:cobra_merchandise, 
> dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, 
> retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, 
> type:string, comment:null), FieldSchema(name:client_merch_type_tid, 
> type:string, comment:null), FieldSchema(name:description, type:string, 
> comment:null), FieldSchema(name:client_description, type:string, 
> comment:null), FieldSchema(name:price, type:string, comment:null), 
> FieldSchema(name:cost, type:string, comment:null), 
> FieldSchema(name:start_date, type:string, comment:null), 
> FieldSchema(name:end_date, type:string, comment:null)], 
> location:hdfs://mustique:9000/user/hive/warehouse/m, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=9,line.delim=,field.delim=}), 
> bucketCols:[], sortCols:[], parameters:{}), 
> partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], 
> parameters:{})   
