[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797520#action_12797520 ]

Patrick Angeles commented on HIVE-1027:
---

1) In general XPath queries return a list of nodes. What is the semantics of xpath_double (eg.) return if XPath evaluates to multiple nodes.

Only xpath() returns multiple nodes (as a list). xpath_string() returns the text of the first matching node (and its subnodes, if any).
- xpath_string('aab1b2','a') returns 'aab1b2'
- xpath_string('aab1b2','b') returns 'b1'

xpath_double()/float() return the numeric value of the text of the first matching node, or NaN if the text value is not numeric.
xpath_int()/long()/short() return the numeric value of the text of the first matching node, or 0 if the text value is not numeric, or MAX_INT, MAX_LONG, MAX_SHORT respectively if the value overflows.

2) Is the XPath query parsed for every input row, or only parsed once?

The XPath expression is compiled and cached. It is reused if the next expression matches the previous one; otherwise, it is recompiled. So the XML is always parsed for every input row, but the XPath expression is precompiled and reused in the vast majority of use cases.

3a) Do you support DTD and XMLSchema?

Not sure how these would apply, as the Java XPath API is schema agnostic (no validation is performed). However, malformed XML (e.g., '1') will result in a runtime exception being thrown.

3b) What about namespace and backward axes in XPath?

Namespaces are not currently supported, but could easily be added later. Backward axes are supported:
> select xpath (' id="2">','/descendant::c/ancestor::b/@id') from t1 limit 1 ;
["1","2"]

4) If XPath evaluates to empty list, do you return NULL or empty string (in case of xpath())?

When no match is found:
xpath() returns an empty list.
xpath_string() returns an empty string.
xpath_int(), float(), etc. will return 0.
xpath_boolean() will return false.
> Create UDFs for XPath expression evaluation > --- > > Key: HIVE-1027 > URL: https://issues.apache.org/jira/browse/HIVE-1027 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Patrick Angeles >Assignee: Patrick Angeles >Priority: Minor > Attachments: hive-1027.patch, udf_xpath.patch > > > Create UDFs for evaluating XPath expressions against XML documents. > Examples: > > SELECT xpath_double ('12 > class="odd">48', 'sum(a/b...@class="odd"])') FROM src LIMIT > > 1 ; > 5.0 > > SELECT xpath_string ('b1b2', 'a/b[2]') FROM src LIMIT > > 1 ; > b2 > > SELECT xpath ('b1b2b3c1c2', > > 'a/c/text()') FROM src LIMIT 1 ; > ["c1","c2"] > Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, > xpath_double/xpath_number, xpath_string, xpath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
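The rules in the comment above can be sketched in a few lines of Python. This is only an illustration of the behaviour as described in the thread, not the code from the patch: the function names, the INT_LIMITS table and the stand-in "compile" function are all hypothetical, and the treatment of negative overflow is left out because the thread does not describe it.

```python
import math
import re

# Hypothetical upper bounds for the integer-returning variants.
INT_LIMITS = {"short": 2**15 - 1, "int": 2**31 - 1, "long": 2**63 - 1}

# Optional sign, digits, optional fraction: the plain decimal form of a number.
NUMERIC = re.compile(r"\s*-?(?:\d+(?:\.\d*)?|\.\d+)\s*")


def to_double(text):
    """Numeric value of the text, or NaN when the text is not numeric."""
    if NUMERIC.fullmatch(text) is None:
        return math.nan
    return float(text)


def to_integer(text, kind="int"):
    """0 for non-numeric text, the maximum of the type on overflow."""
    value = to_double(text)
    if math.isnan(value):
        return 0
    if value >= INT_LIMITS[kind]:
        return INT_LIMITS[kind]
    # Negative overflow is not described in the thread, so it is not clamped.
    return int(value)


class LastExpressionCache:
    """Keeps only the most recently compiled expression."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn  # stand-in for a real XPath compiler
        self._source = None
        self._compiled = None
        self.compile_count = 0

    def get(self, source):
        # Recompile only when the expression differs from the previous one.
        if self._compiled is None or source != self._source:
            self._compiled = self._compile_fn(source)
            self._source = source
            self.compile_count += 1
        return self._compiled


# Usage: a query applies one expression to every row, so it compiles once.
cache = LastExpressionCache(lambda path: tuple(path.split("/")))
for _ in range(3):
    cache.get("a/b")
print(cache.compile_count, to_integer("abc"), to_integer("40000", "short"))
```

A single-entry cache is enough for the common case because the expression is normally a constant in the query, so consecutive rows always ask for the same one.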
[jira] Commented: (HIVE-984) Building Hive occasionally fails with Ivy error: hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
[ https://issues.apache.org/jira/browse/HIVE-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797512#action_12797512 ] Carl Steinbach commented on HIVE-984: - Any chance this can go into 0.5.0? > Building Hive occasionally fails with Ivy error: > hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5: > --- > > Key: HIVE-984 > URL: https://issues.apache.org/jira/browse/HIVE-984 > Project: Hadoop Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Attachments: HIVE-984.patch > > > Folks keep running into this problem when building Hive from source: > {noformat} > [ivy:retrieve] > [ivy:retrieve] :: problems summary :: > [ivy:retrieve] WARNINGS > [ivy:retrieve] [FAILED ] > hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5: > expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72 > (138662ms) > [ivy:retrieve] [FAILED ] > hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5: > expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72 > (138662ms) > [ivy:retrieve] hadoop-resolver: tried > [ivy:retrieve] > http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz > [ivy:retrieve] :: > [ivy:retrieve] :: FAILED DOWNLOADS:: > [ivy:retrieve] :: ^ see resolution messages for details ^ :: > [ivy:retrieve] :: > [ivy:retrieve] :: hadoop#core;0.20.1!hadoop.tar.gz(source) > [ivy:retrieve] :: > [ivy:retrieve] > [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS > {noformat} > The problem appears to be either with a) the Hive build scripts, b) ivy, or > c) archive.apache.org > Besides fixing the actual bug, one other option worth considering is to add > the Hadoop jars to the > Hive source repository. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
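In the log above the resolver reports expected=hadoop-0.20.1.tar.gz, that is, a file name where a digest would be expected. One possible reading, which is an assumption and not something the thread confirms, is that the checksum file has a layout the resolver did not anticipate. The sketch below shows how a download could be checked by hand against a checksum file in a few layouts. The helper names and the set of accepted layouts are hypothetical.

```python
import hashlib
import re

HEX_MD5 = re.compile(r"[0-9a-f]{32}")


def md5_of_file(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large tarballs are not held in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(checksum_text):
    """Extract a digest from 'HASH', 'HASH  name' or 'name: HA SH ...' text.

    Returns None when no 32-digit hex value can be found.
    """
    text = checksum_text.strip()
    first_token = text.split()[0] if text else ""
    after_colon = text.rsplit(":", 1)[-1]
    for candidate in (first_token, after_colon):
        # Drop any spacing between hex pairs and normalise the case.
        compact = "".join(candidate.split()).lower()
        if HEX_MD5.fullmatch(compact):
            return compact
    return None


def checksum_matches(path, checksum_text):
    """True only when a digest was found and it equals the file's MD5."""
    expected = expected_md5(checksum_text)
    return expected is not None and expected == md5_of_file(path)
```

Returning None instead of guessing keeps "no usable digest in the checksum file" distinct from "digest present but different", which are the two situations the build log does not separate.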
[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
[ https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797489#action_12797489 ] Carl Steinbach commented on HIVE-1031: -- Sorry, didn't realize that HIVE-996 got committed. I will update the testcase. > "DESCRIBE FUNCTION array" throws ParseException > --- > > Key: HIVE-1031 > URL: https://issues.apache.org/jira/browse/HIVE-1031 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Attachments: HIVE-1031.patch > > > {noformat} > hive> describe function array; > describe function array; > FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe > statement > hive> describe function 'array'; > describe function 'array'; > OK > array(n0, n1...) - Creates an array with the given elements > Time taken: 0.396 seconds > hive> describe function map; > describe function map; > FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe > statement > hive> describe function 'map'; > describe function 'map'; > OK > map(key0, value0, key1, value1...) - Creates a map with the given key/value > pairs > Time taken: 0.054 seconds > hive> describe function case; > describe function case; > FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe > statement > hive> describe function 'case'; > describe function 'case'; > OK > There is no documentation for function case > Time taken: 0.072 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
[ https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797488#action_12797488 ] Carl Steinbach commented on HIVE-1031: -- @Namit: difficult, because there is overlap between this and HIVE-996. Would you like me to roll this change into HIVE-996? > "DESCRIBE FUNCTION array" throws ParseException > --- > > Key: HIVE-1031 > URL: https://issues.apache.org/jira/browse/HIVE-1031 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Attachments: HIVE-1031.patch > > > {noformat} > hive> describe function array; > describe function array; > FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe > statement > hive> describe function 'array'; > describe function 'array'; > OK > array(n0, n1...) - Creates an array with the given elements > Time taken: 0.396 seconds > hive> describe function map; > describe function map; > FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe > statement > hive> describe function 'map'; > describe function 'map'; > OK > map(key0, value0, key1, value1...) - Creates a map with the given key/value > pairs > Time taken: 0.054 seconds > hive> describe function case; > describe function case; > FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe > statement > hive> describe function 'case'; > describe function 'case'; > OK > There is no documentation for function case > Time taken: 0.072 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1032) Better Error Messages for Execution Errors
Better Error Messages for Execution Errors
--
Key: HIVE-1032
URL: https://issues.apache.org/jira/browse/HIVE-1032
Project: Hadoop Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Paul Yang
Assignee: Paul Yang

Three common errors that occur during execution are:
1. Map-side group-by causing an out of memory exception due to large aggregation hash tables
2. ScriptOperator failing due to the user's script throwing an exception or otherwise returning a non-zero error code
3. Incorrectly specifying the join order of small and large tables, causing the large table to be loaded into memory and producing an out of memory exception.

These errors are typically discovered by manually examining the error log files of the failed task. This task proposes to create a feature that would automatically read the error logs and output a probable cause and solution to the command line.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
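One way the proposed feature could work is a table of heuristics, each pairing log patterns with a probable cause and a suggestion. The sketch below is a rough illustration under stated assumptions: the regular expressions, the sample log lines and the suggestion texts are all hypothetical and are not taken from Hive's actual log output or from any patch on this issue.

```python
import re
from collections import namedtuple

Diagnosis = namedtuple("Diagnosis", ["cause", "suggestion"])

# Each heuristic: (cause, patterns that must ALL appear in the log, suggestion).
# The patterns are illustrative guesses at what a task log might contain.
HEURISTICS = [
    (
        "map-side group-by ran out of memory",
        [r"java\.lang\.OutOfMemoryError", r"GroupByOperator"],
        "The aggregation hash table may be too large for the map task.",
    ),
    (
        "user script failed",
        [r"ScriptOperator"],
        "Check the script's exit code and its error output.",
    ),
    (
        "join loaded the large table into memory",
        [r"java\.lang\.OutOfMemoryError", r"JoinOperator"],
        "Check the order of the small and large tables in the join.",
    ),
]


def diagnose(log_text):
    """Return every heuristic whose patterns all occur in the log text."""
    findings = []
    for cause, patterns, suggestion in HEURISTICS:
        if all(re.search(pattern, log_text) for pattern in patterns):
            findings.append(Diagnosis(cause, suggestion))
    return findings


# Usage with a made-up log excerpt.
sample = "java.lang.OutOfMemoryError: Java heap space\n  at GroupByOperator"
for finding in diagnose(sample):
    print(finding.cause + ": " + finding.suggestion)
```

Keeping the heuristics in a plain list means new error signatures can be added without touching the matching logic.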
[jira] Commented: (HIVE-675) add database/scheme support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797486#action_12797486 ] Alex Loddengaard commented on HIVE-675: --- I will be out of the office Thursday, 1/7, through Wednesday, 1/13, back in the office Thursday, 1/14. I will be checking email fairly consistently in the evenings. Please contact Christophe Bisciglia (christo...@cloudera.com) with any support or training emergencies. Otherwise, you'll hear from me soon. Thanks, Alex > add database/scheme support Hive QL > --- > > Key: HIVE-675 > URL: https://issues.apache.org/jira/browse/HIVE-675 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Prasad Chakka >Assignee: He Yongqiang > Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, > hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, > hive-675-2009-9-8.patch > > > Currently all Hive tables reside in single namespace (default). Hive should > support multiple namespaces (databases or schemas) such that users can create > tables in their specific namespaces. These name spaces can have different > warehouse directories (with a default naming scheme) and possibly different > properties. > There is already some support for this in metastore but Hive query parser > should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/scheme support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797477#action_12797477 ] Zheng Shao commented on HIVE-675: - Yongqiang, you can start adapting the patch to trunk right now. The plan is to branch 0.5 on 1/7/2010 (tomorrow). After that we can quickly review this diff and get it in. > add database/scheme support Hive QL > --- > > Key: HIVE-675 > URL: https://issues.apache.org/jira/browse/HIVE-675 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Prasad Chakka >Assignee: He Yongqiang > Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, > hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, > hive-675-2009-9-8.patch > > > Currently all Hive tables reside in single namespace (default). Hive should > support multiple namespaces (databases or schemas) such that users can create > tables in their specific namespaces. These name spaces can have different > warehouse directories (with a default naming scheme) and possibly different > properties. > There is already some support for this in metastore but Hive query parser > should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-686) add UDF substring_index
[ https://issues.apache.org/jira/browse/HIVE-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao reassigned HIVE-686: --- Assignee: Larry Ogrodnek > add UDF substring_index > --- > > Key: HIVE-686 > URL: https://issues.apache.org/jira/browse/HIVE-686 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Larry Ogrodnek > Attachments: HIVE-686.patch > > > add UDFsubstring_index > look at > http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html > for details -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-996: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Carl > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.4.patch, > HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
[ https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797469#action_12797469 ] Namit Jain commented on HIVE-1031: -- can you add a test or change the existing test to remove quotes ? > "DESCRIBE FUNCTION array" throws ParseException > --- > > Key: HIVE-1031 > URL: https://issues.apache.org/jira/browse/HIVE-1031 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Attachments: HIVE-1031.patch > > > {noformat} > hive> describe function array; > describe function array; > FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe > statement > hive> describe function 'array'; > describe function 'array'; > OK > array(n0, n1...) - Creates an array with the given elements > Time taken: 0.396 seconds > hive> describe function map; > describe function map; > FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe > statement > hive> describe function 'map'; > describe function 'map'; > OK > map(key0, value0, key1, value1...) - Creates a map with the given key/value > pairs > Time taken: 0.054 seconds > hive> describe function case; > describe function case; > FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe > statement > hive> describe function 'case'; > describe function 'case'; > OK > There is no documentation for function case > Time taken: 0.072 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
[ https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1031: - Status: Patch Available (was: Open) * Updated the list of "sysFuncNames" in the Hive grammar file. {noformat} hive> describe function array; describe function array; OK array(n0, n1...) - Creates an array with the given elements Time taken: 0.051 seconds hive> describe function map; describe function map; OK map(key0, value0, key1, value1...) - Creates a map with the given key/value pairs Time taken: 0.069 seconds {noformat} > "DESCRIBE FUNCTION array" throws ParseException > --- > > Key: HIVE-1031 > URL: https://issues.apache.org/jira/browse/HIVE-1031 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Attachments: HIVE-1031.patch > > > {noformat} > hive> describe function array; > describe function array; > FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe > statement > hive> describe function 'array'; > describe function 'array'; > OK > array(n0, n1...) - Creates an array with the given elements > Time taken: 0.396 seconds > hive> describe function map; > describe function map; > FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe > statement > hive> describe function 'map'; > describe function 'map'; > OK > map(key0, value0, key1, value1...) - Creates a map with the given key/value > pairs > Time taken: 0.054 seconds > hive> describe function case; > describe function case; > FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe > statement > hive> describe function 'case'; > describe function 'case'; > OK > There is no documentation for function case > Time taken: 0.072 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
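The behaviour in the transcripts above, where a bare reserved word fails to parse but the quoted form works, and the fix of listing such words among the accepted function names, can be illustrated with a toy parser. This is not Hive's grammar: RESERVED, SYS_FUNC_NAMES and parse_describe_function are hypothetical names, and the reserved-word set is a tiny made-up subset.

```python
import re

# Toy subset of reserved words, for illustration only.
RESERVED = {"array", "map", "case", "select", "from"}

# Hypothetical whitelist of reserved words that are also function names.
SYS_FUNC_NAMES = frozenset({"array", "map"})


def parse_describe_function(statement, sys_func_names=frozenset()):
    """Return the function name from 'describe function NAME;'.

    A quoted name is always accepted. A bare name is rejected when it is
    a reserved word that the whitelist does not allow.
    """
    tokens = re.findall(r"'[^']*'|\w+", statement.strip().rstrip(";"))
    if len(tokens) != 3 or [t.lower() for t in tokens[:2]] != ["describe", "function"]:
        raise ValueError("not a describe function statement")
    name = tokens[2]
    if name.startswith("'"):
        return name.strip("'")
    lowered = name.lower()
    if lowered in RESERVED and lowered not in sys_func_names:
        raise ValueError("cannot recognize input %r in describe statement" % lowered)
    return lowered


# Usage: the same statement fails without the whitelist and parses with it.
print(parse_describe_function("describe function array;", SYS_FUNC_NAMES))
print(parse_describe_function("describe function 'case';"))
```

In this sketch 'case' stays outside the whitelist, which matches the updated transcript showing only array and map.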
[jira] Updated: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
[ https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1031: - Attachment: HIVE-1031.patch > "DESCRIBE FUNCTION array" throws ParseException > --- > > Key: HIVE-1031 > URL: https://issues.apache.org/jira/browse/HIVE-1031 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Attachments: HIVE-1031.patch > > > {noformat} > hive> describe function array; > describe function array; > FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe > statement > hive> describe function 'array'; > describe function 'array'; > OK > array(n0, n1...) - Creates an array with the given elements > Time taken: 0.396 seconds > hive> describe function map; > describe function map; > FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe > statement > hive> describe function 'map'; > describe function 'map'; > OK > map(key0, value0, key1, value1...) - Creates a map with the given key/value > pairs > Time taken: 0.054 seconds > hive> describe function case; > describe function case; > FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe > statement > hive> describe function 'case'; > describe function 'case'; > OK > There is no documentation for function case > Time taken: 0.072 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797428#action_12797428 ] Carl Steinbach commented on HIVE-996: - @Namit: updated udaf_max.q.out and udaf_min.q.out:

Index: ql/src/test/results/clientpositive/udaf_max.q.out
===
--- ql/src/test/results/clientpositive/udaf_max.q.out (revision 0)
+++ ql/src/test/results/clientpositive/udaf_max.q.out (revision 0)
@@ -0,0 +1,20 @@
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+max(expr) - Returns the maximum value of expr
Index: ql/src/test/results/clientpositive/udaf_min.q.out
===
--- ql/src/test/results/clientpositive/udaf_min.q.out (revision 0)
+++ ql/src/test/results/clientpositive/udaf_min.q.out (revision 0)
@@ -0,0 +1,20 @@
+PREHOOK: query: DESCRIBE FUNCTION min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr
+PREHOOK: query: DESCRIBE FUNCTION min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED min
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED min
+POSTHOOK: type: DESCFUNCTION
+min(expr) - Returns the minimum value of expr

> "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.4.patch, > HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797427#action_12797427 ] Ning Zhang commented on HIVE-1027: -- This is cool stuff. Just some questions: 1) In general XPath queries return a list of nodes. What is the semantics of xpath_double (eg.) return if XPath evaluates to multiple nodes. 2) Is the XPath query parsed for every input row, or only parsed once? 3) Do you support DTD and XMLSchema? What about namespace and backward axes in XPath? 4) If XPath evaluates to empty list, do you return NULL or empty string (in case of xpath())? > Create UDFs for XPath expression evaluation > --- > > Key: HIVE-1027 > URL: https://issues.apache.org/jira/browse/HIVE-1027 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Patrick Angeles >Assignee: Patrick Angeles >Priority: Minor > Attachments: hive-1027.patch, udf_xpath.patch > > > Create UDFs for evaluating XPath expressions against XML documents. > Examples: > > SELECT xpath_double ('12 > class="odd">48', 'sum(a/b...@class="odd"])') FROM src LIMIT > > 1 ; > 5.0 > > SELECT xpath_string ('b1b2', 'a/b[2]') FROM src LIMIT > > 1 ; > b2 > > SELECT xpath ('b1b2b3c1c2', > > 'a/c/text()') FROM src LIMIT 1 ; > ["c1","c2"] > Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, > xpath_double/xpath_number, xpath_string, xpath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-996: Attachment: HIVE-996.4.patch > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.4.patch, > HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797420#action_12797420 ] Namit Jain commented on HIVE-1027: -- +1 looks good - will commit if the tests pass > Create UDFs for XPath expression evaluation > --- > > Key: HIVE-1027 > URL: https://issues.apache.org/jira/browse/HIVE-1027 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Patrick Angeles >Assignee: Patrick Angeles >Priority: Minor > Attachments: hive-1027.patch, udf_xpath.patch > > > Create UDFs for evaluating XPath expressions against XML documents. > Examples: > > SELECT xpath_double ('12 > class="odd">48', 'sum(a/b...@class="odd"])') FROM src LIMIT > > 1 ; > 5.0 > > SELECT xpath_string ('b1b2', 'a/b[2]') FROM src LIMIT > > 1 ; > b2 > > SELECT xpath ('b1b2b3c1c2', > > 'a/c/text()') FROM src LIMIT 1 ; > ["c1","c2"] > Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, > xpath_double/xpath_number, xpath_string, xpath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1027) Create UDFs for XPath expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain reassigned HIVE-1027: Assignee: Patrick Angeles > Create UDFs for XPath expression evaluation > --- > > Key: HIVE-1027 > URL: https://issues.apache.org/jira/browse/HIVE-1027 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Patrick Angeles >Assignee: Patrick Angeles >Priority: Minor > Attachments: hive-1027.patch, udf_xpath.patch > > > Create UDFs for evaluating XPath expressions against XML documents. > Examples: > > SELECT xpath_double ('12 > class="odd">48', 'sum(a/b...@class="odd"])') FROM src LIMIT > > 1 ; > 5.0 > > SELECT xpath_string ('b1b2', 'a/b[2]') FROM src LIMIT > > 1 ; > b2 > > SELECT xpath ('b1b2b3c1c2', > > 'a/c/text()') FROM src LIMIT 1 ; > ["c1","c2"] > Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, > xpath_double/xpath_number, xpath_string, xpath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797414#action_12797414 ] Namit Jain commented on HIVE-996: -

Index: ql/src/test/results/clientpositive/udaf_max.q.out
===
--- ql/src/test/results/clientpositive/udaf_max.q.out (revision 0)
+++ ql/src/test/results/clientpositive/udaf_max.q.out (revision 0)
@@ -0,0 +1,20 @@
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max
+PREHOOK: query: DESCRIBE FUNCTION max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max
+PREHOOK: query: DESCRIBE FUNCTION EXTENDED max
+PREHOOK: type: DESCFUNCTION
+POSTHOOK: query: DESCRIBE FUNCTION EXTENDED max
+POSTHOOK: type: DESCFUNCTION
+There is no documentation for function max

It is still the old one

> "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-996: Attachment: HIVE-996.3.patch > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797407#action_12797407 ] Carl Steinbach commented on HIVE-996: - * Updated udaf_max.q.out and udaf_min.q.out > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.3.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797402#action_12797402 ] Carl Steinbach commented on HIVE-996: - @namit: sorry, regenerating the patch... > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797396#action_12797396 ] Namit Jain commented on HIVE-996: - Don't you need to fix the output files udaf_max.q.out/udaf_min.q.out? > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-988) mapjoin should throw an error if the input is too large
[ https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain resolved HIVE-988. - Resolution: Fixed Hadoop Flags: [Reviewed] > mapjoin should throw an error if the input is too large > --- > > Key: HIVE-988 > URL: https://issues.apache.org/jira/browse/HIVE-988 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Ning Zhang > Fix For: 0.5.0 > > Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, > HIVE-988_4.patch > > > If the input to the map join is larger than a specific threshold, it may lead > to a very slow execution of the join. > It is better to throw an error, and let the user redo his query as a non > map-join query. > However, the current map-reduce framework will retry the mapper 4 times > before actually killing the job. > Based on a offline discussion with Dhruba, Ning and myself, we came up with > the following algorithm: > Keep a threshold in the mapper for the number of rows to be processed for > map-join. If the number of rows > exceeds that threshold, set a counter and kill that mapper. > The client (ExecDriver) monitors that job continuously - if this counter is > set, it kills the job and also > shows an appropriate error message to the user, so that he can retry the > query without the map join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-988) mapjoin should throw an error if the input is too large
[ https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-988: Status: In Progress (was: Patch Available) Committed. Thanks Ning > mapjoin should throw an error if the input is too large > --- > > Key: HIVE-988 > URL: https://issues.apache.org/jira/browse/HIVE-988 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Ning Zhang > Fix For: 0.5.0 > > Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, > HIVE-988_4.patch > > > If the input to the map join is larger than a specific threshold, it may lead > to a very slow execution of the join. > It is better to throw an error, and let the user redo his query as a non > map-join query. > However, the current map-reduce framework will retry the mapper 4 times > before actually killing the job. > Based on a offline discussion with Dhruba, Ning and myself, we came up with > the following algorithm: > Keep a threshold in the mapper for the number of rows to be processed for > map-join. If the number of rows > exceeds that threshold, set a counter and kill that mapper. > The client (ExecDriver) monitors that job continuously - if this counter is > set, it kills the job and also > shows an appropriate error message to the user, so that he can retry the > query without the map join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
[ https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1030: - Attachment: HIVE-1030.2.patch Good catch, Ning. Here is another place to fix (HIVE-1030.2.patch). I guess HashMapWrapper runs in the mapper or reducer. That should be OK because, in a mapper/reducer, Hadoop automatically sets java.io.tmpdir according to http://hadoop.apache.org/common/docs/r0.18.3/mapred_tutorial.html, and java.io.tmpdir is what File.createTempFile uses. > Hive should use scratchDir instead of system temporary directory for storing > plans > -- > > Key: HIVE-1030 > URL: https://issues.apache.org/jira/browse/HIVE-1030 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 0.5.0 > > Attachments: HIVE-1030.1.patch, HIVE-1030.2.patch > > > Otherwise these plan files never get deleted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
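For readers following the java.io.tmpdir point above, here is a minimal sketch of how the JDK picks the directory for temporary files. It is not taken from either HIVE-1030 patch; the class name TmpDirSketch and both helper methods are made up for illustration. The only JDK facts relied on are that File.createTempFile(prefix, suffix) creates the file under the directory named by the java.io.tmpdir system property, and that the three-argument overload takes an explicit directory instead.

```java
import java.io.File;
import java.io.IOException;

public class TmpDirSketch {
    /** Creates a temp file in the JVM default temp directory (java.io.tmpdir). */
    static File createInDefaultTmp(String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix);
        f.deleteOnExit(); // avoid leaving files behind, the problem HIVE-1030 describes
        return f;
    }

    /** Creates a temp file in an explicit directory, e.g. a scratch directory. */
    static File createInDir(File dir, String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix, dir);
        f.deleteOnExit();
        return f;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        File plan = createInDefaultTmp("plan", ".xml");
        System.out.println("created " + plan.getAbsolutePath());
    }
}
```

Inside a Hadoop task the property already points at the task's own working area, which is why the no-directory overload is presumably safe there, while client-side code needs the explicit-directory form to land in scratchDir.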
[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-996: Attachment: HIVE-996.2.patch > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797371#action_12797371 ] Carl Steinbach commented on HIVE-996: - * Added annotations for UDAFMax and UDAFMin. * Correctly handle the GenericUDAFBridge case in FunctionInfo.getFunctionClass(). > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.2.patch, HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797361#action_12797361 ] Carl Steinbach commented on HIVE-996: - @Namit: working on it now. > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797359#action_12797359 ] Namit Jain commented on HIVE-996: - @Carl, can you address Zheng's comments ? > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797355#action_12797355 ] Namit Jain commented on HIVE-996: - +1 looks good - will commit if the tests pass > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
[ https://issues.apache.org/jira/browse/HIVE-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797353#action_12797353 ] Namit Jain commented on HIVE-1031: -- Seems to be a problem with reserved words. > "DESCRIBE FUNCTION array" throws ParseException > --- > > Key: HIVE-1031 > URL: https://issues.apache.org/jira/browse/HIVE-1031 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > > {noformat} > hive> describe function array; > describe function array; > FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe > statement > hive> describe function 'array'; > describe function 'array'; > OK > array(n0, n1...) - Creates an array with the given elements > Time taken: 0.396 seconds > hive> describe function map; > describe function map; > FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe > statement > hive> describe function 'map'; > describe function 'map'; > OK > map(key0, value0, key1, value1...) - Creates a map with the given key/value > pairs > Time taken: 0.054 seconds > hive> describe function case; > describe function case; > FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe > statement > hive> describe function 'case'; > describe function 'case'; > OK > There is no documentation for function case > Time taken: 0.072 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797351#action_12797351 ] Zheng Shao commented on HIVE-996: - Overall it looks good. 1. Can you add annotation for UDAFMax and UDAFMin? 2. We need to treat GenericUDAFBridge specially in FunctionInfo.getFunctionClass (you already did it for GenericUDFBridge). I will leave the rest to Namit. > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797347#action_12797347 ] Namit Jain commented on HIVE-996: - Great, I will take a look at it right away > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1031) "DESCRIBE FUNCTION array" throws ParseException
"DESCRIBE FUNCTION array" throws ParseException --- Key: HIVE-1031 URL: https://issues.apache.org/jira/browse/HIVE-1031 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach {noformat} hive> describe function array; describe function array; FAILED: Parse Error: line 1:18 cannot recognize input 'array' in describe statement hive> describe function 'array'; describe function 'array'; OK array(n0, n1...) - Creates an array with the given elements Time taken: 0.396 seconds hive> describe function map; describe function map; FAILED: Parse Error: line 1:18 cannot recognize input 'map' in describe statement hive> describe function 'map'; describe function 'map'; OK map(key0, value0, key1, value1...) - Creates a map with the given key/value pairs Time taken: 0.054 seconds hive> describe function case; describe function case; FAILED: Parse Error: line 1:18 cannot recognize input 'case' in describe statement hive> describe function 'case'; describe function 'case'; OK There is no documentation for function case Time taken: 0.072 seconds {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-996: Status: Patch Available (was: Open) * Fix 'describe function' and 'describe function extended' for UDTFs and UDAFs. * Differentiate between the case where a function does not exist and documentation for the function does not exist. > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
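The second bullet above (telling "function does not exist" apart from "function has no documentation") can be sketched as follows. This is an illustration only, not the code in HIVE-996.patch: the class DescribeFunctionSketch, its in-memory registry, and the wording of the "does not exist" message are all assumptions; only the "There is no documentation for function ..." text appears in the test output quoted on this list.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class DescribeFunctionSketch {
    // Hypothetical registry: function name -> description, with a null value
    // meaning "registered, but no documentation attached".
    private final Map<String, String> registry = new HashMap<>();

    void register(String name, String docOrNull) {
        registry.put(name.toLowerCase(Locale.ROOT), docOrNull);
    }

    /** Returns the text a DESCRIBE FUNCTION call would print in this sketch. */
    String describe(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        if (!registry.containsKey(key)) {
            // Unknown function: report that instead of dereferencing null.
            return "Function " + name + " does not exist";
        }
        String doc = registry.get(key);
        if (doc == null) {
            // Known function without documentation.
            return "There is no documentation for function " + name;
        }
        return doc;
    }

    public static void main(String[] args) {
        DescribeFunctionSketch d = new DescribeFunctionSketch();
        d.register("conv", "conv(num, from_base, to_base) - convert num from from_base to to_base");
        d.register("max", null);
        System.out.println(d.describe("conv"));
        System.out.println(d.describe("max"));
        System.out.println(d.describe("nosuchfunc"));
    }
}
```

The point of the two separate checks is that a missing entry and a null description are different conditions, so neither path ends in a NullPointerException.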
[jira] Commented: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
[ https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797340#action_12797340 ] Edward Capriolo commented on HIVE-1015: --- JAVA, JAVA, JAVA. I love it. Even our 'external scripts' can be java now :) > Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts > --- > > Key: HIVE-1015 > URL: https://issues.apache.org/jira/browse/HIVE-1015 > Project: Hadoop Hive > Issue Type: Improvement > Components: Contrib >Reporter: Carl Steinbach >Assignee: Larry Ogrodnek > Fix For: 0.5.0 > > Attachments: HIVE-1015.patch, HIVE-1015.patch > > > Larry Ogrodnek has written a set of wrapper classes that make it possible > to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that > more closely resembles conventional Hadoop MR programs. > A blog post describing this library can be found here: > http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html > The source code (with Apache license) is available here: > http://github.com/ogrodnek/shmrj > We should add this to contrib. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-978) Hive jars should follow Hadoop naming and include version
[ https://issues.apache.org/jira/browse/HIVE-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797339#action_12797339 ] Namit Jain commented on HIVE-978: - Looks good to me. @Edward, can you take care of it ? > Hive jars should follow Hadoop naming and include version > - > > Key: HIVE-978 > URL: https://issues.apache.org/jira/browse/HIVE-978 > Project: Hadoop Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 0.5.0 >Reporter: Chad Metcalf >Assignee: Chad Metcalf >Priority: Minor > Fix For: 0.5.0 > > Attachments: HIVE-978v1.patch, HIVE-978v2.patch, HIVE-978v3.patch, > HIVE-978v4.patch > > > This is a simple patch on the ant build files to change jar naming from > hive_foo.jar to hive-foo-VERSION.jar > This matches the convention followed by hadoop jars. This naming scheme is > important for packaging, repositories, etc. > Testing done: > ant test > ant tar > Things look right. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-996) "describe function" throws NPE when when called on UDTF or UDAF
[ https://issues.apache.org/jira/browse/HIVE-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-996: Attachment: HIVE-996.patch > "describe function" throws NPE when when called on UDTF or UDAF > --- > > Key: HIVE-996 > URL: https://issues.apache.org/jira/browse/HIVE-996 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Carl Steinbach >Assignee: Carl Steinbach > Fix For: 0.5.0 > > Attachments: HIVE-996.patch > > > {noformat} > hive> describe function explode; > describe function explode; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function sum; > describe function sum; > FAILED: Error in metadata: java.lang.NullPointerException > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > hive> describe function conv; > describe function conv; > OK > conv(num, from_base, to_base) - convert num from from_base to to_base > Time taken: 0.042 seconds > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-988) mapjoin should throw an error if the input is too large
[ https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797334#action_12797334 ] Namit Jain commented on HIVE-988: - +1 will commit if the tests pass > mapjoin should throw an error if the input is too large > --- > > Key: HIVE-988 > URL: https://issues.apache.org/jira/browse/HIVE-988 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Ning Zhang > Fix For: 0.5.0 > > Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, > HIVE-988_4.patch > > > If the input to the map join is larger than a specific threshold, it may lead > to a very slow execution of the join. > It is better to throw an error, and let the user redo his query as a non > map-join query. > However, the current map-reduce framework will retry the mapper 4 times > before actually killing the job. > Based on a offline discussion with Dhruba, Ning and myself, we came up with > the following algorithm: > Keep a threshold in the mapper for the number of rows to be processed for > map-join. If the number of rows > exceeds that threshold, set a counter and kill that mapper. > The client (ExecDriver) monitors that job continuously - if this counter is > set, it kills the job and also > shows an appropriate error message to the user, so that he can retry the > query without the map join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
[ https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1030: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Zheng > Hive should use scratchDir instead of system temporary directory for storing > plans > -- > > Key: HIVE-1030 > URL: https://issues.apache.org/jira/browse/HIVE-1030 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 0.5.0 > > Attachments: HIVE-1030.1.patch > > > Otherwise these plan files never get deleted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
[ https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain resolved HIVE-1015. -- Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: [Reviewed] Committed. Thanks Larry > Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts > --- > > Key: HIVE-1015 > URL: https://issues.apache.org/jira/browse/HIVE-1015 > Project: Hadoop Hive > Issue Type: Improvement > Components: Contrib >Reporter: Carl Steinbach >Assignee: Larry Ogrodnek > Fix For: 0.5.0 > > Attachments: HIVE-1015.patch, HIVE-1015.patch > > > Larry Ogrodnek has written a set of wrapper classes that make it possible > to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that > more closely resembles conventional Hadoop MR programs. > A blog post describing this library can be found here: > http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html > The source code (with Apache license) is available here: > http://github.com/ogrodnek/shmrj > We should add this to contrib. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-988) mapjoin should throw an error if the input is too large
[ https://issues.apache.org/jira/browse/HIVE-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-988: Attachment: HIVE-988_4.patch According to offline discussions with Namit, here are the new changes: 1) change Operator.fatalError as a static variable so all operators share it. 2) change Operator.getDone() to check fatalError as well. 3) change ExecMapper.map() to check the operator.getDone() and early exit if so. 4) change the ExecDriver to hold a success variable and ExecDriver.progress will set it status rather than getting it from RunningJob.isSuccessful(). So it solves the case where the Counter was incrmented but the RunningJob is finished without checking for the counter. > mapjoin should throw an error if the input is too large > --- > > Key: HIVE-988 > URL: https://issues.apache.org/jira/browse/HIVE-988 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Ning Zhang > Fix For: 0.5.0 > > Attachments: HIVE-988.patch, HIVE-988_2.patch, HIVE-988_3.patch, > HIVE-988_4.patch > > > If the input to the map join is larger than a specific threshold, it may lead > to a very slow execution of the join. > It is better to throw an error, and let the user redo his query as a non > map-join query. > However, the current map-reduce framework will retry the mapper 4 times > before actually killing the job. > Based on a offline discussion with Dhruba, Ning and myself, we came up with > the following algorithm: > Keep a threshold in the mapper for the number of rows to be processed for > map-join. If the number of rows > exceeds that threshold, set a counter and kill that mapper. > The client (ExecDriver) monitors that job continuously - if this counter is > set, it kills the job and also > shows an appropriate error message to the user, so that he can retry the > query without the map join. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
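The first three changes listed above (a shared fatal-error flag, getDone() consulting it, and the mapper exiting early) can be sketched roughly as follows. This is only an illustration and not the code in HIVE-988_4.patch: SketchOperator, SketchMapper, maxRows and rowsProcessed are hypothetical names, and the real Operator and ExecMapper classes are considerably more involved.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a Hive operator; not the real Operator class.
class SketchOperator {
    // Change (1): one static flag shared by every operator instance.
    static final AtomicBoolean fatalError = new AtomicBoolean(false);

    private final long maxRows; // assumed row threshold for the map-join input
    private long rowsSeen = 0;
    private boolean done = false;

    SketchOperator(long maxRows) {
        this.maxRows = maxRows;
    }

    // Change (2): an operator is done if it finished or any operator failed fatally.
    boolean getDone() {
        return done || fatalError.get();
    }

    // Count rows and raise the shared flag once the threshold is exceeded.
    void process(Object row) {
        rowsSeen++;
        if (rowsSeen > maxRows) {
            fatalError.set(true);
        }
    }
}

// Hypothetical stand-in for the mapper that drives the operator.
class SketchMapper {
    private final SketchOperator op;
    long rowsProcessed = 0;

    SketchMapper(SketchOperator op) {
        this.op = op;
    }

    // Change (3): skip all further work as soon as the operator reports done.
    void map(Object row) {
        if (op.getDone()) {
            return;
        }
        op.process(row);
        rowsProcessed++;
    }
}
```

With a threshold of 2, the third row trips the flag and later calls to map() return immediately. The client-side part (change 4, where the driver reads the counter and kills the job) is not shown.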
[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n
[ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797289#action_12797289 ] Namit Jain commented on HIVE-820: - We should be consistent across different fields. serialization.format=9,line.delim= ,field.delim= We should use the same format for all of them. We can choose the decimal format for all of them. Since it is an existing problem, this need not be a blocker for 0.5. > Describe Extended Line Breaks When Delimiter is \n > -- > > Key: HIVE-820 > URL: https://issues.apache.org/jira/browse/HIVE-820 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0 >Reporter: Matt Pestritto >Assignee: Matt Pestritto >Priority: Minor > Attachments: hive_820.patch > > > Tables defined delimited with \t and breaks using \n has output of describe > extended that is not contiguous. > Line.delim outputs an actual \n which breaks the display output so using the > hiveservice you have to do another FetchOne to get the rest of the line. > For example. 
> Original Output: > Detailed Table InformationTable(tableName:cobra_merchandise, > dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, > type:string, comment:null), FieldSchema(name:client_merch_type_tid, > type:string, comment:null), FieldSchema(name:description, type:string, > comment:null), FieldSchema(name:client_description, type:string, > comment:null), FieldSchema(name:price, type:string, comment:null), > FieldSchema(name:cost, type:string, comment:null), > FieldSchema(name:start_date, type:string, comment:null), > FieldSchema(name:end_date, type:string, comment:null)], > location:hdfs://mustique:9000/user/hive/warehouse/m, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=9,line.delim= > ,field.delim=}), bucketCols:[], sortCols:[], parameters:{}), > partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], > parameters:{}) > Proposed Output: > Detailed Table InformationTable(tableName:cobra_merchandise, > dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, > type:string, comment:null), FieldSchema(name:client_merch_type_tid, > type:string, comment:null), FieldSchema(name:description, type:string, > comment:null), FieldSchema(name:client_description, type:string, > comment:null), FieldSchema(name:price, type:string, comment:null), > FieldSchema(name:cost, type:string, comment:null), > FieldSchema(name:start_date, type:string, comment:null), > FieldSchema(name:end_date, type:string, comment:null)], > location:hdfs://mustique:9000/user/hive/warehouse/m, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=9,line.delim=,field.delim=}), > bucketCols:[], sortCols:[], parameters:{}), > partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], > parameters:{}) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
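The "decimal format for all of them" suggestion in the comment above could look roughly like the sketch below. It is an assumption-laden illustration, not the contents of hive_820.patch: DelimFormatSketch and its methods are hypothetical, and it assumes that only keys ending in ".delim" hold raw delimiter characters while serialization.format is already stored as a decimal code.

```java
import java.util.Map;

// Hypothetical helper; it does not exist in Hive.
class DelimFormatSketch {
    // Render a single-character delimiter as its decimal character code.
    // Empty or multi-character values are left untouched (an assumption).
    static String toDecimal(String value) {
        if (value.length() != 1) {
            return value;
        }
        return Integer.toString(value.charAt(0));
    }

    // Print the SerDe parameters on one line, converting only the *.delim entries.
    static String render(Map<String, String> params) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            String v = e.getKey().endsWith(".delim") ? toDecimal(e.getValue()) : e.getValue();
            sb.append(e.getKey()).append('=').append(v);
            first = false;
        }
        return sb.append('}').toString();
    }
}
```

Under these assumptions a newline line delimiter prints as 10 and a tab field delimiter as 9, so the whole parameter map stays on a single output line.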
[jira] Updated: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n
[ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-820: Fix Version/s: (was: 0.5.0) > Describe Extended Line Breaks When Delimiter is \n > -- > > Key: HIVE-820 > URL: https://issues.apache.org/jira/browse/HIVE-820 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0 >Reporter: Matt Pestritto >Assignee: Matt Pestritto >Priority: Minor > Attachments: hive_820.patch > > > Tables defined delimited with \t and breaks using \n has output of describe > extended that is not contiguous. > Line.delim outputs an actual \n which breaks the display output so using the > hiveservice you have to do another FetchOne to get the rest of the line. > For example. > Original Output: > Detailed Table InformationTable(tableName:cobra_merchandise, > dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, > type:string, comment:null), FieldSchema(name:client_merch_type_tid, > type:string, comment:null), FieldSchema(name:description, type:string, > comment:null), FieldSchema(name:client_description, type:string, > comment:null), FieldSchema(name:price, type:string, comment:null), > FieldSchema(name:cost, type:string, comment:null), > FieldSchema(name:start_date, type:string, comment:null), > FieldSchema(name:end_date, type:string, comment:null)], > location:hdfs://mustique:9000/user/hive/warehouse/m, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=9,line.delim= > ,field.delim=}), bucketCols:[], sortCols:[], parameters:{}), > partitionKeys:[FieldSchema(name:client_tid, type:int, 
comment:null)], > parameters:{}) > Proposed Output: > Detailed Table InformationTable(tableName:cobra_merchandise, > dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, > type:string, comment:null), FieldSchema(name:client_merch_type_tid, > type:string, comment:null), FieldSchema(name:description, type:string, > comment:null), FieldSchema(name:client_description, type:string, > comment:null), FieldSchema(name:price, type:string, comment:null), > FieldSchema(name:cost, type:string, comment:null), > FieldSchema(name:start_date, type:string, comment:null), > FieldSchema(name:end_date, type:string, comment:null)], > location:hdfs://mustique:9000/user/hive/warehouse/m, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=9,line.delim=,field.delim=}), > bucketCols:[], sortCols:[], parameters:{}), > partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], > parameters:{}) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
[ https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797288#action_12797288 ] Ning Zhang commented on HIVE-1030: -- Currently there are other places using File.createTempFile in the persistent data structures (HashMapWrapper and RowContainer). We call File.deleteOnExit() to ensure the temp file is deleted when the job is killed or exits normally. Is there any issue there as well? Should we also convert those to use scratchDir? > Hive should use scratchDir instead of system temporary directory for storing > plans > -- > > Key: HIVE-1030 > URL: https://issues.apache.org/jira/browse/HIVE-1030 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 0.5.0 > > Attachments: HIVE-1030.1.patch > > > Otherwise these plan files never get deleted.
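For reference, the pattern under discussion (creating the temporary file inside a chosen scratch directory rather than the system default, and registering it for deletion at JVM exit) can be sketched with the standard java.io.File API. ScratchFileSketch, createPlanFile and the "plan"/".xml" names are hypothetical and are not taken from HIVE-1030.1.patch.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper; the real change lives in the attached patch.
class ScratchFileSketch {
    // Create a temp file under scratchDir instead of java.io.tmpdir.
    static File createPlanFile(File scratchDir) {
        if (!scratchDir.isDirectory() && !scratchDir.mkdirs()) {
            throw new UncheckedIOException(
                new IOException("cannot create scratch dir " + scratchDir));
        }
        try {
            File plan = File.createTempFile("plan", ".xml", scratchDir);
            // Best effort only: this runs on normal JVM shutdown,
            // not when the process is killed forcibly.
            plan.deleteOnExit();
            return plan;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since deleteOnExit() is only honored on an orderly JVM shutdown, a scratch directory that is cleaned up as a whole is a useful second line of defence.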
[jira] Assigned: (HIVE-683) add UDF field
[ https://issues.apache.org/jira/browse/HIVE-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain reassigned HIVE-683: --- Assignee: Larry Ogrodnek > add UDF field > - > > Key: HIVE-683 > URL: https://issues.apache.org/jira/browse/HIVE-683 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain >Assignee: Larry Ogrodnek > Fix For: 0.5.0 > > Attachments: HIVE-683.patch, HIVE-683.patch, HIVE-683.patch > > > add UDF field > look at > http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html > for details
[jira] Commented: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
[ https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797254#action_12797254 ] Namit Jain commented on HIVE-1030: -- +1 will commit if the tests pass > Hive should use scratchDir instead of system temporary directory for storing > plans > -- > > Key: HIVE-1030 > URL: https://issues.apache.org/jira/browse/HIVE-1030 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 0.5.0 > > Attachments: HIVE-1030.1.patch > > > Otherwise these plan files never get deleted.
[jira] Commented: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
[ https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797253#action_12797253 ] Namit Jain commented on HIVE-1015: -- +1 looks good - will commit if the tests pass > Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts > --- > > Key: HIVE-1015 > URL: https://issues.apache.org/jira/browse/HIVE-1015 > Project: Hadoop Hive > Issue Type: Improvement > Components: Contrib >Reporter: Carl Steinbach >Assignee: Larry Ogrodnek > Attachments: HIVE-1015.patch, HIVE-1015.patch > > > Larry Ogrodnek has written a set of wrapper classes that make it possible > to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that > more closely resembles conventional Hadoop MR programs. > A blog post describing this library can be found here: > http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html > The source code (with Apache license) is available here: > http://github.com/ogrodnek/shmrj > We should add this to contrib. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
[ https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain reassigned HIVE-1015: Assignee: Larry Ogrodnek > Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts > --- > > Key: HIVE-1015 > URL: https://issues.apache.org/jira/browse/HIVE-1015 > Project: Hadoop Hive > Issue Type: Improvement > Components: Contrib >Reporter: Carl Steinbach >Assignee: Larry Ogrodnek > Attachments: HIVE-1015.patch, HIVE-1015.patch > > > Larry Ogrodnek has written a set of wrapper classes that make it possible > to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that > more closely resembles conventional Hadoop MR programs. > A blog post describing this library can be found here: > http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html > The source code (with Apache license) is available here: > http://github.com/ogrodnek/shmrj > We should add this to contrib. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
[ https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1030: - Attachment: HIVE-1030.1.patch > Hive should use scratchDir instead of system temporary directory for storing > plans > -- > > Key: HIVE-1030 > URL: https://issues.apache.org/jira/browse/HIVE-1030 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 0.5.0 > > Attachments: HIVE-1030.1.patch > > > Otherwise these plan files never get deleted.
[jira] Updated: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
[ https://issues.apache.org/jira/browse/HIVE-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-1030: - Status: Patch Available (was: Open) > Hive should use scratchDir instead of system temporary directory for storing > plans > -- > > Key: HIVE-1030 > URL: https://issues.apache.org/jira/browse/HIVE-1030 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 0.5.0 > > Attachments: HIVE-1030.1.patch > > > Otherwise these plan files never get deleted.
[jira] Created: (HIVE-1030) Hive should use scratchDir instead of system temporary directory for storing plans
Hive should use scratchDir instead of system temporary directory for storing plans -- Key: HIVE-1030 URL: https://issues.apache.org/jira/browse/HIVE-1030 Project: Hadoop Hive Issue Type: Bug Reporter: Zheng Shao Assignee: Zheng Shao Fix For: 0.5.0 Otherwise these plan files never get deleted.
[jira] Commented: (HIVE-675) add database/scheme support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797238#action_12797238 ] He Yongqiang commented on HIVE-675: --- Hi Jeff, Sorry for the delay. This is currently on hold until the 0.5 release because it touches many related JIRAs. Let's commit it after 0.5 is released. > add database/scheme support Hive QL > --- > > Key: HIVE-675 > URL: https://issues.apache.org/jira/browse/HIVE-675 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Prasad Chakka >Assignee: He Yongqiang > Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, > hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, > hive-675-2009-9-8.patch > > > Currently all Hive tables reside in single namespace (default). Hive should > support multiple namespaces (databases or schemas) such that users can create > tables in their specific namespaces. These name spaces can have different > warehouse directories (with a default naming scheme) and possibly different > properties. > There is already some support for this in metastore but Hive query parser > should have this feature as well.
[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Angeles updated HIVE-1027: -- Attachment: hive-1027.patch updated patch... includes show_functions.q.out > Create UDFs for XPath expression evaluation > --- > > Key: HIVE-1027 > URL: https://issues.apache.org/jira/browse/HIVE-1027 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Patrick Angeles >Priority: Minor > Attachments: hive-1027.patch, udf_xpath.patch > > > Create UDFs for evaluating XPath expressions against XML documents. > Examples: > > SELECT xpath_double ('12 > class="odd">48', 'sum(a/b...@class="odd"])') FROM src LIMIT > > 1 ; > 5.0 > > SELECT xpath_string ('b1b2', 'a/b[2]') FROM src LIMIT > > 1 ; > b2 > > SELECT xpath ('b1b2b3c1c2', > > 'a/c/text()') FROM src LIMIT 1 ; > ["c1","c2"] > Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, > xpath_double/xpath_number, xpath_string, xpath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
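As a rough illustration of what an xpath_string-style evaluation involves, the sketch below uses the standard javax.xml.xpath API. It is not the code from hive-1027.patch: XPathStringSketch and xpathString are hypothetical names, and because the XML literals in the archived examples above lost their tags, the XML used in the usage note is an assumed input rather than the original one.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper; the actual UDF classes are in the attached patch.
class XPathStringSketch {
    // Parse the XML text and return the string value of the first node
    // matched by the expression (empty string when nothing matches).
    static String xpathString(String xml, String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            // Raised for malformed XML as well as for invalid expressions.
            throw new IllegalArgumentException("cannot evaluate " + expr, e);
        }
    }
}
```

For an assumed input such as `<a><b>b1</b><b>b2</b></a>`, the expression `a/b[2]` selects the second `b` element and yields its text. A production UDF would presumably compile and reuse the expression across rows instead of building a new XPath object per call.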
[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Angeles updated HIVE-1027: -- Status: Patch Available (was: Open) Updated patch (this one includes show_functions.q.out). > Create UDFs for XPath expression evaluation > --- > > Key: HIVE-1027 > URL: https://issues.apache.org/jira/browse/HIVE-1027 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Patrick Angeles >Priority: Minor > Attachments: udf_xpath.patch > > > Create UDFs for evaluating XPath expressions against XML documents. > Examples: > > SELECT xpath_double ('12 > class="odd">48', 'sum(a/b...@class="odd"])') FROM src LIMIT > > 1 ; > 5.0 > > SELECT xpath_string ('b1b2', 'a/b[2]') FROM src LIMIT > > 1 ; > b2 > > SELECT xpath ('b1b2b3c1c2', > > 'a/c/text()') FROM src LIMIT 1 ; > ["c1","c2"] > Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, > xpath_double/xpath_number, xpath_string, xpath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/scheme support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797225#action_12797225 ] Jeff Hammerbacher commented on HIVE-675: Hey, Is this patch in a state where it could go into 0.5? If it's not going to be polished off, please let us know. Thanks, Jeff > add database/scheme support Hive QL > --- > > Key: HIVE-675 > URL: https://issues.apache.org/jira/browse/HIVE-675 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Prasad Chakka >Assignee: He Yongqiang > Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, > hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, > hive-675-2009-9-8.patch > > > Currently all Hive tables reside in single namespace (default). Hive should > support multiple namespaces (databases or schemas) such that users can create > tables in their specific namespaces. These name spaces can have different > warehouse directories (with a default naming scheme) and possibly different > properties. > There is already some support for this in metastore but Hive query parser > should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1015) Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts
[ https://issues.apache.org/jira/browse/HIVE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Larry Ogrodnek updated HIVE-1015: - Attachment: HIVE-1015.patch Here's a new patch with a .q file using an example mapper and reducer. I also removed the dependency of these classes on Apache Commons Lang, since there was only a single use of StringUtils.join(), and it's one less thing to specify on the classpath in the USING clause. Thanks. > Java MapReduce wrapper for TRANSFORM/MAP/REDUCE scripts > --- > > Key: HIVE-1015 > URL: https://issues.apache.org/jira/browse/HIVE-1015 > Project: Hadoop Hive > Issue Type: Improvement > Components: Contrib >Reporter: Carl Steinbach > Attachments: HIVE-1015.patch, HIVE-1015.patch > > > Larry Ogrodnek has written a set of wrapper classes that make it possible > to write Hive TRANSFORM/MAP/REDUCE scripts in Java in a style that > more closely resembles conventional Hadoop MR programs. > A blog post describing this library can be found here: > http://dev.bizo.com/2009/10/hive-map-reduce-in-java.html > The source code (with Apache license) is available here: > http://github.com/ogrodnek/shmrj > We should add this to contrib.
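To give a feel for what such a wrapper does, here is a minimal sketch of a TRANSFORM-style driver: it reads tab-separated rows from an input stream, hands each one to a user-supplied mapper, and writes the emitted records back as tab-separated lines. RowCollector, RowMapper and TransformRunnerSketch are hypothetical names chosen for this illustration and are not the classes in HIVE-1015.patch or the shmrj library; the tab-separated row format is also an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical output sink handed to the mapper.
interface RowCollector {
    void collect(String[] record);
}

// Hypothetical mapper interface in the style of a Hadoop MR mapper.
interface RowMapper {
    void map(String[] record, RowCollector out);
}

class TransformRunnerSketch {
    // Feed each tab-separated input line to the mapper and print what it emits.
    static void run(RowMapper mapper, Reader in, Writer out) {
        PrintWriter pw = new PrintWriter(out);
        // String.join keeps the sketch free of external libraries.
        RowCollector collector = rec -> pw.print(String.join("\t", rec) + "\n");
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                // A limit of -1 keeps trailing empty columns.
                mapper.map(line.split("\t", -1), collector);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pw.flush();
    }
}
```

A script entry point would call run() with readers and writers wrapped around System.in and System.out, so the class can be launched from the USING clause with only its own jar on the classpath.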
[jira] Commented: (HIVE-478) Surface "processor time" for queries
[ https://issues.apache.org/jira/browse/HIVE-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797164#action_12797164 ] Namit Jain commented on HIVE-478: - Can you set the configuration parameter hive.task.progress to true? It will dump the total time taken by each operator. Please check whether this meets your requirements; we can enhance it to report more. > Surface "processor time" for queries > > > Key: HIVE-478 > URL: https://issues.apache.org/jira/browse/HIVE-478 > Project: Hadoop Hive > Issue Type: Wish > Components: Logging, Query Processor >Reporter: Adam Kramer > > We currently list real-time metrics of how long queries take--"finished in: > 1min 13sec" appears on the job tracker. However, this is affected by a lot > more than just the quality or implementation of the query. For example, > number of mappers used varies a lot when you use subqueries versus > single-query aggregation, as does the amount of work necessary. > For implementation comparisons (e.g., "should I use this version of the query > or that one"), it would be great to know the processor time used instead of > the real time used...both in terms of "mapper cpu seconds" and "reducer cpu > seconds."
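The distinction the request draws between real time and processor time can be illustrated with the JDK's ThreadMXBean. This sketch only shows how the two quantities differ for one thread; it says nothing about how Hive or Hadoop would collect per-task CPU seconds, and CpuTimeSketch is a hypothetical name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper contrasting wall-clock time with CPU time.
class CpuTimeSketch {
    // Returns {wallNanos, cpuNanos}; cpuNanos is -1 when the JVM
    // cannot report CPU time for the current thread.
    static long[] measure(Runnable work) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean supported = bean.isCurrentThreadCpuTimeSupported();
        long wallStart = System.nanoTime();
        long cpuStart = supported ? bean.getCurrentThreadCpuTime() : -1;
        work.run();
        long cpuEnd = supported ? bean.getCurrentThreadCpuTime() : -1;
        long wall = System.nanoTime() - wallStart;
        long cpu = (cpuStart >= 0 && cpuEnd >= 0) ? cpuEnd - cpuStart : -1;
        return new long[] { wall, cpu };
    }
}
```

A task that spends most of its time waiting (on I/O or a busy cluster) shows a large wall-clock value and a small CPU value, which is why the CPU figure is the more useful one for comparing query implementations.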
[jira] Resolved: (HIVE-683) add UDF field
[ https://issues.apache.org/jira/browse/HIVE-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain resolved HIVE-683. - Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: [Reviewed] Committed. Thanks Larry > add UDF field > - > > Key: HIVE-683 > URL: https://issues.apache.org/jira/browse/HIVE-683 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Reporter: Namit Jain > Fix For: 0.5.0 > > Attachments: HIVE-683.patch, HIVE-683.patch, HIVE-683.patch > > > add UDF field > look at > http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html > for details -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n
[ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797150#action_12797150 ] Matt Pestritto commented on HIVE-820: - All - Do we have a decision on what you want the output to show ? A few different ideas were being thrown around. I would rather replace only characters that would break the output ( tab, \n ) with something meaningful vs, as Edward stated, always showing the octal representation which would require an ascii table to figure out what the delimiter is. If something is | ( pipe ) delimited, I always need to look it up when that is a printable character. I'll wait for feedback from the FB team and make the changes. Thanks. > Describe Extended Line Breaks When Delimiter is \n > -- > > Key: HIVE-820 > URL: https://issues.apache.org/jira/browse/HIVE-820 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0 >Reporter: Matt Pestritto >Assignee: Matt Pestritto >Priority: Minor > Fix For: 0.5.0 > > Attachments: hive_820.patch > > > Tables defined delimited with \t and breaks using \n has output of describe > extended that is not contiguous. > Line.delim outputs an actual \n which breaks the display output so using the > hiveservice you have to do another FetchOne to get the rest of the line. > For example. 
> Original Output: > Detailed Table InformationTable(tableName:cobra_merchandise, > dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, > type:string, comment:null), FieldSchema(name:client_merch_type_tid, > type:string, comment:null), FieldSchema(name:description, type:string, > comment:null), FieldSchema(name:client_description, type:string, > comment:null), FieldSchema(name:price, type:string, comment:null), > FieldSchema(name:cost, type:string, comment:null), > FieldSchema(name:start_date, type:string, comment:null), > FieldSchema(name:end_date, type:string, comment:null)], > location:hdfs://mustique:9000/user/hive/warehouse/m, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=9,line.delim= > ,field.delim=}), bucketCols:[], sortCols:[], parameters:{}), > partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], > parameters:{}) > Proposed Output: > Detailed Table InformationTable(tableName:cobra_merchandise, > dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, > type:string, comment:null), FieldSchema(name:client_merch_type_tid, > type:string, comment:null), FieldSchema(name:description, type:string, > comment:null), FieldSchema(name:client_description, type:string, > comment:null), FieldSchema(name:price, type:string, comment:null), > FieldSchema(name:cost, type:string, comment:null), > FieldSchema(name:start_date, type:string, comment:null), > FieldSchema(name:end_date, type:string, comment:null)], > location:hdfs://mustique:9000/user/hive/warehouse/m, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:{serialization.format=9,line.delim=,field.delim=}), > bucketCols:[], sortCols:[], parameters:{}), > partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], > parameters:{}) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
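The approach preferred in the comment above, replacing only the characters that break the output (tab, \n) and leaving printable delimiters such as | as they are, might be sketched like this. DelimEscapeSketch is a hypothetical helper, and the choice of backslash escapes as the "something meaningful" is an assumption, since the thread had not settled on an output format.

```java
// Hypothetical helper; the thread had not yet agreed on the final format.
class DelimEscapeSketch {
    // Replace line-breaking control characters with visible escape sequences
    // and pass every other character through unchanged.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\t': sb.append("\\t"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

A pipe delimiter is printed as-is, while a newline delimiter is shown as the two characters \ and n, so the describe extended output stays on one line without needing an ASCII table to read it.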