[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914346#action_12914346
 ] 

Ning Zhang commented on HIVE-1378:
--

@John, should we run a survey on the hive-user mailing list to see how many 
people are still using pre-0.20 Hadoop before dropping support? 

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from the JDBC Hive driver, the 
 return value for map, array, and struct needs to be a string.
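The idea can be sketched outside Hive: render complex result values as a single string cell before handing them to the client. A minimal Python illustration (not Hive's actual driver code; the JSON-style rendering is an assumption):

```python
import json

def complex_to_string(value):
    # Render a map/array/struct result as a single string cell, the way a
    # JDBC-style driver might before handing it to the client. JSON-style
    # rendering is an assumption for illustration.
    if isinstance(value, (dict, list)):
        return json.dumps(value, separators=(",", ":"))
    return str(value)

print(complex_to_string(["a", "b"]))   # -> ["a","b"]
print(complex_to_string({"k": 1}))     # -> {"k":1}
print(complex_to_string(42))           # -> 42
```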

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Xing Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Jin updated HIVE-1659:
---

Attachment: HIVE-1659.patch

This patch contains an implementation of the parse_url_tuple function, as well 
as a test file and test output.

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract one atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 
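The proposed behavior can be illustrated with a Python stand-in (not the Hive implementation; the part names HOST/PATH/QUERY are assumptions) that parses once and returns several components:

```python
from urllib.parse import urlparse

def parse_url_tuple(url, *parts):
    # Illustrative stand-in for the proposed UDTF: parse the URL once and
    # return several components as a tuple. The part names are assumptions.
    p = urlparse(url)
    fields = {"HOST": p.netloc, "PATH": p.path, "QUERY": p.query}
    return tuple(fields.get(name, "") for name in parts)

host, path, query = parse_url_tuple(
    "http://example.com/a/b?k=v", "HOST", "PATH", "QUERY")
print(host, path, query)  # example.com /a/b k=v
```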

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914358#action_12914358
 ] 

Ning Zhang commented on HIVE-1659:
--

Xing, this patch doesn't apply cleanly against the latest trunk. Can you 'svn up' 
and regenerate the patch? You may need to resolve any conflicts after 'svn up'.

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract one atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914360#action_12914360
 ] 

Ning Zhang commented on HIVE-1659:
--

Also when you generate the patch, you need to run 'svn diff' at the root 
directory of the hive trunk. 

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract one atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)

2010-09-24 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-537:
-

Attachment: patch-537-2.txt

Patch incorporating review comments:

Changes include:
* Added UDF create_union to create a union object. Added a test query using 
create_union.
* Added the UNIONTYPE keyword to Hive.g. Added a test query to create a table 
with a union column.
* Fixed a couple of minor bugs in LazySimpleSerde and LazyUnion.
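The assumed semantics of create_union, picking the value at the tag's position and carrying the tag with it, can be sketched in Python (an illustration only, not the UDF's actual code):

```python
def create_union(tag, *values):
    # Assumed semantics for illustration: select the value whose position
    # matches the tag, and keep the tag alongside it.
    return (tag, values[tag])

print(create_union(0, 3.14, "hello"))  # (0, 3.14)
print(create_union(1, 3.14, "hello"))  # (1, 'hello')
```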


 Hive TypeInfo/ObjectInspector to support union (besides struct, array, and 
 map)
 ---

 Key: HIVE-537
 URL: https://issues.apache.org/jira/browse/HIVE-537
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Amareshwari Sriramadasu
 Attachments: HIVE-537.1.patch, patch-537-1.txt, patch-537-2.txt, 
 patch-537.txt


 There are already some cases inside the code that we use heterogeneous data: 
 JoinOperator, and UnionOperator (in the sense that different parents can pass 
 in records with different ObjectInspectors).
 We currently use Operator's parentID to distinguish that. However that 
 approach does not extend to more complex plans that might be needed in the 
 future.
 We will support the union type like this:
 {code}
 TypeDefinition:
   type: primitivetype | structtype | arraytype | maptype | uniontype
   uniontype: union<tag : type (, tag : type)*>
 Example:
   union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>
 Example of serialized data format:
   We will first store the tag byte before we serialize the object. On
 deserialization, we will first read out the tag byte; then we know the
 current type of the following object, so we can deserialize it
 successfully.
 Interface for ObjectInspector:
 interface UnionObjectInspector {
   /** Returns the array of OIs, one for each of the tags. */
   ObjectInspector[] getObjectInspectors();
   /** Returns the tag of the object. */
   byte getTag(Object o);
   /** Returns the field based on the tag value associated with the object. */
   Object getField(Object o);
 };
 An example serialization format (using a delimited format, with ' ' as the
 first-level delimiter and '=' as the second-level delimiter):
 userid:int, log:union<0:struct<touserid:int,message:string>,1:string>
 123 1=login
 123 0=243=helloworld
 123 1=logout
 {code}
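A minimal Python sketch of the tag-byte scheme described above (the per-tag codecs are invented stand-ins for illustration, not Hive's actual serde):

```python
import struct

# Per-tag codecs: invented stand-ins, not Hive's LazySimpleSerDe.
WRITERS = {0: lambda v: struct.pack(">i", v),   # tag 0: int
           1: lambda v: v.encode("utf-8")}      # tag 1: string
READERS = {0: lambda b: struct.unpack(">i", b)[0],
           1: lambda b: b.decode("utf-8")}

def write_union(tag, value):
    # Store the tag byte first, then the serialized object.
    return bytes([tag]) + WRITERS[tag](value)

def read_union(data):
    # Read the tag byte first; it tells us how to deserialize the rest.
    tag = data[0]
    return tag, READERS[tag](data[1:])

assert read_union(write_union(0, 42)) == (0, 42)
assert read_union(write_union(1, "login")) == (1, "login")
```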

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Jeff Hammerbacher (JIRA)
Move HWI out to Github
--

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher


I have seen HWI cause a number of build and test errors, and it's now going to 
cost us some extra work for integration with security. We've worked on hundreds 
of clusters at Cloudera and I've never seen anyone use HWI. With the Beeswax UI 
available in Hue, it's unlikely that anyone would prefer to stick with HWI. I 
think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)

2010-09-24 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-537:
-

   Status: Patch Available  (was: Open)
Fix Version/s: 0.7.0

All the tests passed with the patch.

Zheng, Can you have a look at the updated patch? Thanks.

 Hive TypeInfo/ObjectInspector to support union (besides struct, array, and 
 map)
 ---

 Key: HIVE-537
 URL: https://issues.apache.org/jira/browse/HIVE-537
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Amareshwari Sriramadasu
 Fix For: 0.7.0

 Attachments: HIVE-537.1.patch, patch-537-1.txt, patch-537-2.txt, 
 patch-537.txt


 There are already some cases inside the code that we use heterogeneous data: 
 JoinOperator, and UnionOperator (in the sense that different parents can pass 
 in records with different ObjectInspectors).
 We currently use Operator's parentID to distinguish that. However that 
 approach does not extend to more complex plans that might be needed in the 
 future.
 We will support the union type like this:
 {code}
 TypeDefinition:
   type: primitivetype | structtype | arraytype | maptype | uniontype
   uniontype: union<tag : type (, tag : type)*>
 Example:
   union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>
 Example of serialized data format:
   We will first store the tag byte before we serialize the object. On
 deserialization, we will first read out the tag byte; then we know the
 current type of the following object, so we can deserialize it
 successfully.
 Interface for ObjectInspector:
 interface UnionObjectInspector {
   /** Returns the array of OIs, one for each of the tags. */
   ObjectInspector[] getObjectInspectors();
   /** Returns the tag of the object. */
   byte getTag(Object o);
   /** Returns the field based on the tag value associated with the object. */
   Object getField(Object o);
 };
 An example serialization format (using a delimited format, with ' ' as the
 first-level delimiter and '=' as the second-level delimiter):
 userid:int, log:union<0:struct<touserid:int,message:string>,1:string>
 123 1=login
 123 0=243=helloworld
 123 1=logout
 {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift

2010-09-24 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1526:
-

Status: Patch Available  (was: Open)

 Hive should depend on a release version of Thrift
 -

 Key: HIVE-1526
 URL: https://issues.apache.org/jira/browse/HIVE-1526
 Project: Hadoop Hive
  Issue Type: Task
  Components: Build Infrastructure, Clients
Reporter: Carl Steinbach
Assignee: Todd Lipcon
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, 
 libthrift.jar


 Hive should depend on a release version of Thrift, and ideally it should use 
 Ivy to resolve this dependency.
 The Thrift folks are working on adding Thrift artifacts to a maven repository 
 here: https://issues.apache.org/jira/browse/THRIFT-363

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift

2010-09-24 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1526:
-

Attachment: HIVE-1526.2.patch.txt

HIVE-1526.2.patch.txt:
* Manage slf4j dependencies with Ivy.
* Added slf4j dependencies to eclipse classpath.
* Added thriftif macro to ${hive.root}/build.xml which triggers recompilation 
of all thrift stubs.
* Modified odbc/Makefile to use Thrift libs and headers in THRIFT_HOME instead 
of the ones that were checked into service/include.
* Modified odbc/Makefile to build thrift generated cpp artifacts in ql/src
* Removed thrift headers/code from service/include (HIVE-1527)
* Added some missing #includes to the hiveclient source files in odbc/src/cpp.

Testing:
* Tested eclipse launch configurations.
* Built CPP hiveclient lib and tested against HiveServer using HiveClientTestC 
program.


 Hive should depend on a release version of Thrift
 -

 Key: HIVE-1526
 URL: https://issues.apache.org/jira/browse/HIVE-1526
 Project: Hadoop Hive
  Issue Type: Task
  Components: Build Infrastructure, Clients
Reporter: Carl Steinbach
Assignee: Todd Lipcon
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, 
 libthrift.jar


 Hive should depend on a release version of Thrift, and ideally it should use 
 Ivy to resolve this dependency.
 The Thrift folks are working on adding Thrift artifacts to a maven repository 
 here: https://issues.apache.org/jira/browse/THRIFT-363

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-842) Authentication Infrastructure for Hive

2010-09-24 Thread Venkatesh S (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914408#action_12914408
 ] 

Venkatesh S commented on HIVE-842:
--

 Should the metastore always take HDFS actions as the user making the RPC?
Yes, the metastore will run as a super-user (a Hadoop proxy user), enabling do-as 
operations, and will impersonate the target user while accessing data on HDFS.

 If we see that Hadoop Security is enabled, should we enable SASL on the 
 metastore thrift server by default?
I'd think so.

 should there be an option whereby the metastore uses a keytab to authenticate 
 to HDFS, but doesn't require users to authenticate to it?
Wouldn't this leave a hole as it currently exists?

 Authentication Infrastructure for Hive
 --

 Key: HIVE-842
 URL: https://issues.apache.org/jira/browse/HIVE-842
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Server Infrastructure
Reporter: Edward Capriolo
Assignee: Todd Lipcon
 Attachments: HiveSecurityThoughts.pdf


 This issue deals with the authentication (user name,password) infrastructure. 
 Not the authorization components that specify what a user should be able to 
 do.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Review Request: HIVE-1530: Include hive-default.xml configuration file in hive-common.jar

2010-09-24 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/902/
---

Review request for Hive Developers.


Summary
---

HIVE-1530.1.patch.txt:
* Move conf/hive-default.xml to common/resources/hive-default.xml and modify 
the build so that this gets included in hive-common-xxx.jar
* Copy contents of conf/hive-default.xml to conf/hive-site.xml.template
* Modify HiveConf so that it logs an INFO level message with the location of 
the hive-default.xml and hive-site.xml files which were loaded.
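The resulting lookup order, packaged defaults first and then site overrides, can be sketched as follows (a Python analogy, not HiveConf's actual code; the property names are placeholders):

```python
def load_conf(defaults, site):
    # Defaults ship read-only inside the jar (hive-default.xml); user
    # overrides come from the conf dir (hive-site.xml). Site entries win.
    conf = dict(defaults)
    conf.update(site)
    return conf

conf = load_conf({"hive.exec.parallel": "false", "hive.root.logger": "INFO"},
                 {"hive.exec.parallel": "true"})
print(conf["hive.exec.parallel"])  # true
```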


This addresses bug HIVE-1530.
http://issues.apache.org/jira/browse/HIVE-1530


Diffs
-

  build.xml 4b345b5 
  common/build.xml d9ac07e 
  common/resources/hive-default.xml PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 47b7518 
  conf/hive-default.xml 1465317 
  conf/hive-site.xml.template PRE-CREATION 

Diff: http://review.cloudera.org/r/902/diff


Testing
---


Thanks,

Carl



[jira] Updated: (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR

2010-09-24 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1530:
-

   Status: Patch Available  (was: Open)
 Assignee: Carl Steinbach
Fix Version/s: 0.7.0

 Include hive-default.xml and hive-log4j.properties in hive-common JAR
 -

 Key: HIVE-1530
 URL: https://issues.apache.org/jira/browse/HIVE-1530
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.7.0

 Attachments: HIVE-1530.1.patch.txt


 hive-common-*.jar should include hive-default.xml and hive-log4j.properties,
 and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The
 hive-default.xml file that currently sits in the conf/ directory should be 
 removed.
 Motivations for this change:
 * We explicitly tell users that they should never modify hive-default.xml yet 
 give them the opportunity to do so by placing the file in the conf dir.
 * Many users are familiar with the Hadoop configuration mechanism that does 
 not require *-default.xml files to be present in the HADOOP_CONF_DIR, and 
 assume that the same is true for HIVE_CONF_DIR.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR

2010-09-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914419#action_12914419
 ] 

HBase Review Board commented on HIVE-1530:
--

Message from: Carl Steinbach c...@cloudera.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/902/
---

Review request for Hive Developers.


Summary
---

HIVE-1530.1.patch.txt:
* Move conf/hive-default.xml to common/resources/hive-default.xml and modify 
the build so that this gets included in hive-common-xxx.jar
* Copy contents of conf/hive-default.xml to conf/hive-site.xml.template
* Modify HiveConf so that it logs an INFO level message with the location of 
the hive-default.xml and hive-site.xml files which were loaded.


This addresses bug HIVE-1530.
http://issues.apache.org/jira/browse/HIVE-1530


Diffs
-

  build.xml 4b345b5 
  common/build.xml d9ac07e 
  common/resources/hive-default.xml PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 47b7518 
  conf/hive-default.xml 1465317 
  conf/hive-site.xml.template PRE-CREATION 

Diff: http://review.cloudera.org/r/902/diff


Testing
---


Thanks,

Carl




 Include hive-default.xml and hive-log4j.properties in hive-common JAR
 -

 Key: HIVE-1530
 URL: https://issues.apache.org/jira/browse/HIVE-1530
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.7.0

 Attachments: HIVE-1530.1.patch.txt


 hive-common-*.jar should include hive-default.xml and hive-log4j.properties,
 and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The
 hive-default.xml file that currently sits in the conf/ directory should be 
 removed.
 Motivations for this change:
 * We explicitly tell users that they should never modify hive-default.xml yet 
 give them the opportunity to do so by placing the file in the conf dir.
 * Many users are familiar with the Hadoop configuration mechanism that does 
 not require *-default.xml files to be present in the HADOOP_CONF_DIR, and 
 assume that the same is true for HIVE_CONF_DIR.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-24 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914509#action_12914509
 ] 

John Sichi commented on HIVE-1378:
--

We did already, and no one responded, so I think Facebook was probably the last 
holdout.

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from the JDBC Hive driver, the 
 return value for map, array, and struct needs to be a string.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Hive-trunk-h0.19 #549

2010-09-24 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/549/

--
[...truncated 12224 lines...]
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.seq
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/complex.seq
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/json.txt
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out
 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket0.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket1.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket20.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket21.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1

Build failed in Hudson: Hive-trunk-h0.18 #549

2010-09-24 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/549/changes

Changes:

[heyongqiang] HIVE-1661. Default values for parameters (Siying Dong via He 
Yongqiang)

--
[...truncated 30288 lines...]
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.seq
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/complex.seq
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/json.txt
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out
 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket0.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket1.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket20.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket21.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 

[jira] Commented: (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR

2010-09-24 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914528#action_12914528
 ] 

Philip Zeyliger commented on HIVE-1530:
---

+1.  I'm a big fan of this change.

We've repeatedly had customers using an old, weird, or non-existent 
hive-default, and that's caused issues that are quite tricky to debug.

 Include hive-default.xml and hive-log4j.properties in hive-common JAR
 -

 Key: HIVE-1530
 URL: https://issues.apache.org/jira/browse/HIVE-1530
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.7.0

 Attachments: HIVE-1530.1.patch.txt


 hive-common-*.jar should include hive-default.xml and hive-log4j.properties,
 and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The
 hive-default.xml file that currently sits in the conf/ directory should be 
 removed.
 Motivations for this change:
 * We explicitly tell users that they should never modify hive-default.xml yet 
 give them the opportunity to do so by placing the file in the conf dir.
 * Many users are familiar with the Hadoop configuration mechanism that does 
 not require *-default.xml files to be present in the HADOOP_CONF_DIR, and 
 assume that the same is true for HIVE_CONF_DIR.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-24 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914573#action_12914573
 ] 

John Sichi commented on HIVE-1264:
--

Todd, I'm getting one test failure when running ant test.  Probably 
hbase-handler/build.xml needs a fix.  You should be able to repro it with

ant test -Dtestcase=TestHBaseMinimrCliDriver


testCliDriver_hbase_bulk    Error: org/apache/hadoop/hdfs/MiniDFSCluster 

java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/MiniDFSCluster
at org.apache.hadoop.hive.shims.Hadoop20Shims.getMiniDfs(Hadoop20Shims.java:90)
at org.apache.hadoop.hive.ql.QTestUtil.init(QTestUtil.java:224)
at org.apache.hadoop.hive.hbase.HBaseQTestUtil.init(HBaseQTestUtil.java:30)
at 
org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.setUp(TestHBaseMinimrCliDriver.java:43)
at junit.framework.TestCase.runBare(TestCase.java:125)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.extensions.TestSetup.run(TestSetup.java:23)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.hdfs.MiniDFSCluster
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)

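The stack trace suggests that MiniDFSCluster, which ships in the Hadoop test jar, is missing from the hbase-handler test classpath. A possible direction for the fix, sketched with hypothetical property and path names since the actual layout of hbase-handler/build.xml may differ:

```xml
<!-- hypothetical fragment for hbase-handler/build.xml: put the hadoop test
     jar, which contains org.apache.hadoop.hdfs.MiniDFSCluster, on the
     classpath used when running the minimr test driver -->
<path id="test.classpath">
  <pathelement location="${hadoop.root}/hadoop-${hadoop.version}-test.jar"/>
  <path refid="classpath"/>
</path>
```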
 Make Hive work with Hadoop security
 ---

 Key: HIVE-1264
 URL: https://issues.apache.org/jira/browse/HIVE-1264
 Project: Hadoop Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Jeff Hammerbacher
Assignee: Todd Lipcon
 Attachments: hive-1264-fb-mirror.txt, hive-1264.txt, 
 HiveHadoop20S_patch.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1264) Make Hive work with Hadoop security

2010-09-24 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1264:
-

Status: Open  (was: Patch Available)

 Make Hive work with Hadoop security
 ---

 Key: HIVE-1264
 URL: https://issues.apache.org/jira/browse/HIVE-1264
 Project: Hadoop Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Jeff Hammerbacher
Assignee: Todd Lipcon
 Attachments: hive-1264-fb-mirror.txt, hive-1264.txt, 
 HiveHadoop20S_patch.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Xing Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Jin updated HIVE-1659:
---

Attachment: HIVE-1659.patch2

generate patch after 'svn up' and in root directory

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch, HIVE-1659.patch2


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract an atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 
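A usage sketch of the proposed function (table and column names here are hypothetical; the LATERAL VIEW form follows how other Hive UDTFs are invoked):

```sql
-- today: the URL is re-parsed for every extracted part
SELECT parse_url(url, 'HOST'), parse_url(url, 'PATH'), parse_url(url, 'QUERY')
FROM url_log;

-- proposed: parse once and emit all requested parts as a single tuple
SELECT t.host, t.path, t.query
FROM url_log
LATERAL VIEW parse_url_tuple(url, 'HOST', 'PATH', 'QUERY') t AS host, path, query;
```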

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Jay Booth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914577#action_12914577
 ] 

Jay Booth commented on HIVE-1668:
-

Are you guys contributing Hue to ASF?  It seems hasty to remove functionality 
in favor of a replacement if that replacement isn't going to be shipped with 
mainline Hive.

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914584#action_12914584
 ] 

Edward Capriolo commented on HIVE-1668:
---

Jeff,
I disagree. The build and test errors are not insurmountable. In fact, some if 
not most of the errors were cascading changes that were not tested properly. 
For example:

https://issues.apache.org/jira/browse/HIVE-1183 was a fix I had to do because 
someone broke it in https://issues.apache.org/jira/browse/HIVE-978, where 
someone wanted all jars to be named whatever.${version} and did not bother to 
look across all the shell script files that start up Hive. 

https://issues.apache.org/jira/browse/HIVE-1294: again, someone changed some 
shell scripts and only tested the CLI.

https://issues.apache.org/jira/browse/HIVE-752: again, someone broke HWI 
without testing it.

https://issues.apache.org/jira/browse/HIVE-1615: not really anyone's fault, but 
there is no API stability across Hive. I do not see why one method went away 
and another similar method took its place.

I have, of course, been talking about moving HWI to Wicket for a while; moving 
from JSP to Servlet/Java code will fix errors, but the little time I do have I 
usually have to spend detecting and cleaning up other breakages.

Hue and Beeswax I honestly do not know, but it sounds like you need extra 
magical stuff to make them work, while HWI works with Hive on its own (unless 
people break it).

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914598#action_12914598
 ] 

Ning Zhang commented on HIVE-1378:
--

Before we decide to drop support for pre-0.20, we should open a separate JIRA 
with a list of things that need to be cleaned up: e.g., no longer downloading 
and building hadoop 0.17. 

In the meantime, the changes in this patch needed for pre-0.20 compatibility 
should be minimal. Steven, can you take a look at the code and see how much 
work is required to be compatible with 0.17?

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914602#action_12914602
 ] 

Ning Zhang commented on HIVE-1659:
--

Xing, there is a diff in show_functions.q. You need to overwrite the .out file 
so that it includes the new function. The following command will update the 
.out file. 

 ant test -Dtestcase=TestCliDriver -Dqfile=show_functions.q -Doverwrite=true

Can you regenerate the patch after that?

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch, HIVE-1659.patch2


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract an atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914605#action_12914605
 ] 

Edward Capriolo commented on HIVE-1668:
---

Plus, not to get too far off topic, but there is a huge portion of the Hadoop 
community that thinks: Security? So what? Who cares? I am not going to run 
Active Directory or Kerberos just so I can say "my Hadoop is secure." It adds 
latency to many processes and complexity to the overall design of Hadoop, and 
it does not even encrypt data in transit. Many people are going to elect not to 
use Hadoop security for those reasons. Is extra work a reason not to do 
something? Are we going to move the Hive Thrift server out to GitHub too 
because of the burden of extra work? It is a lot of extra work for me when 
Hadoop renames all its JMX counters or tells me all my code is deprecated 
because of our new slick mapreduce.* API. I have learned to roll with the 
punches.

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Xing Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Jin updated HIVE-1659:
---

Attachment: HIVE-1659.patch3

Updated show_functions.q and ran 'svn up' again. Thanks. Do I need to run all 
unit tests on my machine?

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch, HIVE-1659.patch2, HIVE-1659.patch3


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract an atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-24 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1659:
-

Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed. Thanks Xing!

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang
 Attachments: HIVE-1659.patch, HIVE-1659.patch2, HIVE-1659.patch3


 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract an atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all needed information, 
 and return a tuple in a UDTF. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914653#action_12914653
 ] 

Jeff Hammerbacher commented on HIVE-1668:
-

bq. Are you guys contributing Hue to ASF?

No.

bq. It seems hasty to remove functionality in favor of a replacement if that 
replacement isn't going to be shipped with mainline Hive. 

There's certainly precedent in other projects. For example, indexed HBase was 
moved out to Github for very similar reasons: while it provided a useful 
feature, it did so in a somewhat flaky way.

bq. I have learned to roll with the punches.

That's not a great argument for keeping code that's onerous to maintain in 
trunk.

Just trying to be pragmatic here.

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914669#action_12914669
 ] 

Edward Capriolo commented on HIVE-1668:
---

{quote}That's not a great argument for keeping code that's onerous to maintain 
in trunk.{quote}
It's not onerous to maintain. As you can see from the tickets I pointed out, it 
broke because it was not tested. 

For example, in https://issues.apache.org/jira/browse/HIVE-752: when designing 
shim classes that specify a class name in a string, one has to make sure the 
class name is correct. I know it was an oversight, but I am not sure anyone 
fired up the CLI to verify the class name.

As for https://issues.apache.org/jira/browse/HIVE-978, I specifically mentioned 
in the patch how to test this and why it should be tested, and it still turned 
out not to work right. 

Pragmatic is the perfect word. HWI was never meant to be fancy. Anyone who has 
Hive can build and run the web interface, with no extra dependencies. To use 
Beeswax, it looks like you need Hue, which means you need to go somewhere else 
to get it and install it. It also seems you need to patch or load extra plugins 
into your namenode and datanode, like 
org.apache.hadoop.thriftfs.NamenodePlugin. It looks like 
(http://archive.cloudera.com/cdh/3/hue/manual.html#_install_hue) you need: 

gcc                  gcc
libxml2-devel        libxml2-dev
libxslt-devel        libxslt-dev
mysql-devel          libmysqlclient-dev
python-devel         python-dev
python-setuptools    python-setuptools
sqlite-devel         libsqlite3-dev 

The pragmatic approach is to use the web interface provided by Hive. Users do 
not need anything external like Python and do not have to make any changes to 
their environment. That is why I think it should stay part of the Hive 
distribution. 
 
I'm -1 on taking it out.  

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914673#action_12914673
 ] 

Jeff Hammerbacher commented on HIVE-1668:
-

bq. Anyone who has hive can build and run the web interface.

Empirically, they don't. The value of the web interface to users is not nearly 
as high as the pain it causes the developers for maintenance.

Moving it to Github will help determine if there's demand for the interface. It 
should also help mature the product for eventual inclusion in trunk.

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1669) non-deterministic display of storage parameter in test

2010-09-24 Thread Ning Zhang (JIRA)
non-deterministic display of storage parameter in test
--

 Key: HIVE-1669
 URL: https://issues.apache.org/jira/browse/HIVE-1669
 Project: Hadoop Hive
  Issue Type: Test
Reporter: Ning Zhang


With the change to beautify 'desc extended table', the storage parameters are 
displayed in a non-deterministic order (since the underlying implementation is 
a HashMap). 
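The usual remedy is to sort the keys before display, so the rendered order no longer depends on hash-map iteration. A minimal sketch in Python (the Hive fix would do the equivalent in Java; the function name is illustrative):

```python
def format_storage_params(params):
    """Render storage parameters deterministically.

    Iterating a hash map yields an arbitrary key order, which makes
    golden-file test output flap from run to run; sorting the keys
    makes the displayed string stable.
    """
    return ", ".join(f"{key}={params[key]}" for key in sorted(params))

print(format_storage_params({"totalSize": "5812", "numFiles": "1"}))
# numFiles=1, totalSize=5812
```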

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1361) table/partition level statistics

2010-09-24 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914728#action_12914728
 ] 

He Yongqiang commented on HIVE-1361:


+1 running tests.

 table/partition level statistics
 

 Key: HIVE-1361
 URL: https://issues.apache.org/jira/browse/HIVE-1361
 Project: Hadoop Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Ning Zhang
Assignee: Ahmed M Aly
 Fix For: 0.7.0

 Attachments: HIVE-1361.2.patch, HIVE-1361.2_java_only.patch, 
 HIVE-1361.3.patch, HIVE-1361.4.java_only.patch, HIVE-1361.4.patch, 
 HIVE-1361.5.java_only.patch, HIVE-1361.5.patch, HIVE-1361.java_only.patch, 
 HIVE-1361.patch, stats0.patch


 At the first step, we gather table-level stats for non-partitioned table and 
 partition-level stats for partitioned table. Future work could extend the 
 table level stats to partitioned table as well. 
 There are 3 major milestones in this subtask: 
  1) extend the insert statement to gather table/partition level stats 
 on-the-fly.
  2) extend metastore API to support storing and retrieving stats for a 
 particular table/partition. 
  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
 existing tables/partitions. 
 The proposed stats are:
 Partition-level stats: 
   - number of rows
   - total size in bytes
   - number of files
   - max, min, average row sizes
   - max, min, average file sizes
 Table-level stats in addition to partition level stats:
   - number of partitions
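Milestone 3 might surface in HiveQL roughly as follows (the table names are hypothetical, and the exact grammar is whatever the subtask finally lands):

```sql
-- gather partition-level stats for one partition of a partitioned table
ANALYZE TABLE page_views PARTITION (ds='2010-09-24') COMPUTE STATISTICS;

-- gather table-level stats for a non-partitioned table
ANALYZE TABLE users COMPUTE STATISTICS;
```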

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1669) non-deterministic display of storage parameter in test

2010-09-24 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914727#action_12914727
 ] 

He Yongqiang commented on HIVE-1669:


Ning, can you post a fix for this after I commit the statistics JIRA (HIVE-1361)?

 non-deterministic display of storage parameter in test
 --

 Key: HIVE-1669
 URL: https://issues.apache.org/jira/browse/HIVE-1669
 Project: Hadoop Hive
  Issue Type: Test
Reporter: Ning Zhang

 With the change to beautify 'desc extended table', the storage parameters are 
 displayed in a non-deterministic order (since the underlying implementation is 
 a HashMap). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-24 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong updated HIVE-1378:
--

Attachment: HIVE-1378.7.patch

This new patch should be compatible with Hadoop 0.17.2.1.

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.7.patch, 
 HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1668) Move HWI out to Github

2010-09-24 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914741#action_12914741
 ] 

Edward Capriolo commented on HIVE-1668:
---

{quote}It should also help mature the product for eventual inclusion in 
trunk.{quote}
Why would we move something from Hive out to GitHub, just to move it back into 
Hive?

{quote}Empirically, they don't. The value of the web interface to users is not 
nearly as high as the pain it causes the developers for maintenance.{quote}
Who are these developers who maintain it? Has anyone ever added a feature 
besides me? I'm not complaining.

http://blog.milford.io/2010/06/getting-the-hive-web-interface-hwi-to-work-on-centos/
{quote}The Hive Web Interface is a pretty sweet deal.{quote} 
Sounds like people like it. 

Why are we debating the past state of HWI? It works now. If someone reports a 
bug, I typically investigate and patch it the same day.

I challenge anyone to open a ticket on core-user called "remove the namenode 
web interface to GitHub" and try to say that  now offers a better namenode 
interface using Python. The ticket would instantly get a RESOLVED: WILL NOT 
FIX. Why is this any different? 

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1670) MapJoin throws EOFException when the mapjoined table has 0 columns selected

2010-09-24 Thread Ning Zhang (JIRA)
MapJoin throws EOFException when the mapjoined table has 0 columns selected
-

 Key: HIVE-1670
 URL: https://issues.apache.org/jira/browse/HIVE-1670
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang


select /*+mapjoin(b) */ sum(a.key) from src a join src b on (a.key=b.key); 
throws EOFException

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1670) MapJoin throws EOFException when the mapjoined table has 0 columns selected

2010-09-24 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1670:
-

Attachment: HIVE-1670.patch

 MapJoin throws EOFException when the mapjoined table has 0 columns selected
 -

 Key: HIVE-1670
 URL: https://issues.apache.org/jira/browse/HIVE-1670
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1670.patch


 select /*+mapjoin(b) */ sum(a.key) from src a join src b on (a.key=b.key); 
 throws EOFException

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1670) MapJoin throws EOFException when the mapjoined table has 0 columns selected

2010-09-24 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1670:
-

Status: Patch Available  (was: Open)

 MapJoin throws EOFException when the mapjoined table has 0 columns selected
 -

 Key: HIVE-1670
 URL: https://issues.apache.org/jira/browse/HIVE-1670
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1670.patch


 select /*+mapjoin(b) */ sum(a.key) from src a join src b on (a.key=b.key); 
 throws EOFException

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914756#action_12914756
 ] 

Ning Zhang commented on HIVE-1378:
--

+1. testing.

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.7.patch, 
 HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.