[jira] Updated: (HIVE-1650) TestContribNegativeCliDriver fails

2010-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1650:
-

Attachment: hive.1650.1.patch

> TestContribNegativeCliDriver fails
> --
>
> Key: HIVE-1650
> URL: https://issues.apache.org/jira/browse/HIVE-1650
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.1650.1.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1650) TestContribNegativeCliDriver fails

2010-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1650:
-

Status: Patch Available  (was: Open)

> TestContribNegativeCliDriver fails
> --
>
> Key: HIVE-1650
> URL: https://issues.apache.org/jira/browse/HIVE-1650
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.1650.1.patch
>
>





[jira] Created: (HIVE-1650) TestContribNegativeCliDriver fails

2010-09-16 Thread Namit Jain (JIRA)
TestContribNegativeCliDriver fails
--

 Key: HIVE-1650
 URL: https://issues.apache.org/jira/browse/HIVE-1650
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain







[jira] Updated: (HIVE-1616) Add ProtocolBuffersStructObjectInspector

2010-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1616:
-

   Status: Resolved  (was: Patch Available)
 Hadoop Flags: [Reviewed]
Fix Version/s: 0.7.0
   Resolution: Fixed

Committed. Thanks Johan

> Add ProtocolBuffersStructObjectInspector
> 
>
> Key: HIVE-1616
> URL: https://issues.apache.org/jira/browse/HIVE-1616
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Johan Oskarsson
>Assignee: Johan Oskarsson
>Priority: Minor
> Fix For: 0.7.0
>
> Attachments: HIVE-1616.patch
>
>
> Much like there is a ThriftStructObjectInspector that ignores the isset 
> booleans there is a need for a ProtocolBuffersStructObjectInspector that 
> ignores has*. This can then be used together with Twitter's elephant-bird.




[jira] Commented: (HIVE-1361) table/partition level statistics

2010-09-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910470#action_12910470
 ] 

Namit Jain commented on HIVE-1361:
--

Will take a look 

> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hadoop Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Ning Zhang
>Assignee: Ahmed M Aly
> Fix For: 0.7.0
>
> Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch
>
>
> At the first step, we gather table-level stats for non-partitioned table and 
> partition-level stats for partitioned table. Future work could extend the 
> table level stats to partitioned table as well. 
> There are 3 major milestones in this subtask: 
>  1) extend the insert statement to gather table/partition level stats 
> on-the-fly.
>  2) extend metastore API to support storing and retrieving stats for a 
> particular table/partition. 
>  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
> existing tables/partitions. 
> The proposed stats are:
> Partition-level stats: 
>   - number of rows
>   - total size in bytes
>   - number of files
>   - max, min, average row sizes
>   - max, min, average file sizes
> Table-level stats in addition to partition level stats:
>   - number of partitions
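The per-partition and table-level stats proposed above reduce to simple aggregations over per-file metadata. A minimal Python sketch of that arithmetic (illustrative only, not Hive code; it assumes each partition is given as a list of (row_count, size_in_bytes) pairs, one per file):

```python
def partition_stats(files):
    """Partition-level stats from a list of (num_rows, size_in_bytes)
    pairs, one pair per file in the partition."""
    rows = [r for r, _ in files]
    sizes = [s for _, s in files]
    return {
        "num_rows": sum(rows),
        "total_size": sum(sizes),
        "num_files": len(files),
        "max_file_size": max(sizes),
        "min_file_size": min(sizes),
        "avg_file_size": sum(sizes) / float(len(sizes)),
    }

def table_stats(partitions):
    """Table-level stats aggregate the partition-level stats and add
    the number of partitions, as proposed above."""
    per_part = [partition_stats(f) for f in partitions.values()]
    return {
        "num_partitions": len(partitions),
        "num_rows": sum(p["num_rows"] for p in per_part),
        "total_size": sum(p["total_size"] for p in per_part),
        "num_files": sum(p["num_files"] for p in per_part),
    }
```

Row-size stats (max/min/average) follow the same pattern per row rather than per file.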




[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-16 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910455#action_12910455
 ] 

Ning Zhang commented on HIVE-1378:
--

Steven, there are conflicts when applying this patch. Can you regenerate it?

> Return value for map, array, and struct needs to return a string 
> -
>
> Key: HIVE-1378
> URL: https://issues.apache.org/jira/browse/HIVE-1378
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Drivers
>Reporter: Jerome Boulon
>Assignee: Steven Wong
> Fix For: 0.7.0
>
> Attachments: HIVE-1378.patch
>
>
> In order to be able to select/display any data from JDBC Hive driver, return 
> value for map, array, and struct needs to return a string




[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-16 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910444#action_12910444
 ] 

Ning Zhang commented on HIVE-1378:
--

Will take a look.

> Return value for map, array, and struct needs to return a string 
> -
>
> Key: HIVE-1378
> URL: https://issues.apache.org/jira/browse/HIVE-1378
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Drivers
>Reporter: Jerome Boulon
>Assignee: Steven Wong
> Fix For: 0.7.0
>
> Attachments: HIVE-1378.patch
>
>
> In order to be able to select/display any data from JDBC Hive driver, return 
> value for map, array, and struct needs to return a string




[jira] Created: (HIVE-1649) Ability to update counters and status from TRANSFORM scripts

2010-09-16 Thread Carl Steinbach (JIRA)
Ability to update counters and status from TRANSFORM scripts


 Key: HIVE-1649
 URL: https://issues.apache.org/jira/browse/HIVE-1649
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Carl Steinbach


Hadoop Streaming supports the ability to update counters and status by writing 
specially coded messages to the script's stderr stream.

A streaming process can use stderr to emit counter information. 
{{reporter:counter:<group>,<counter>,<amount>}} should be sent to stderr to 
update the counter.
A streaming process can use stderr to emit status information. To set a 
status, {{reporter:status:<message>}} should be sent to stderr.

Hive should support these same features with its TRANSFORM mechanism.
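Under the Hadoop Streaming protocol described above, a TRANSFORM script would write records to stdout and emit counter/status updates as specially coded lines on stderr. A small Python sketch (the group and counter names are illustrative, not part of any Hive API):

```python
import sys

def update_counter(group, counter, amount):
    # Specially coded stderr message that Hadoop Streaming recognizes.
    sys.stderr.write("reporter:counter:%s,%s,%d\n" % (group, counter, amount))

def set_status(message):
    sys.stderr.write("reporter:status:%s\n" % message)

def transform(lines):
    # Normal TRANSFORM output: one processed record per input line.
    set_status("transform script running")
    for line in lines:
        update_counter("MyTransform", "RECORDS_PROCESSED", 1)
        yield line.upper()

# In a real job the driver loop would be:
#   for out in transform(sys.stdin): sys.stdout.write(out)
```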




[jira] Commented: (HIVE-1633) CombineHiveInputFormat fails with "cannot find dir for emptyFile"

2010-09-16 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910435#action_12910435
 ] 

Amareshwari Sriramadasu commented on HIVE-1633:
---

Sorry if I misunderstood your comment. I looked for 
hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/ in 
partToPartitionInfo shown in the exception. Only 
hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/ 
appears. 
hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/emptyFile
 does not appear in partToPartitionInfo. 
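The disagreement here boils down to how a file path such as .../1/emptyFile should be matched against a partToPartitionInfo map keyed by its parent directory. A hedged Python sketch of that prefix lookup (illustrative only, not the actual CombineHiveInputFormat code):

```python
import posixpath

def find_partition_info(path, part_to_partition_info):
    """Walk up the path's parent directories until a key in the map
    matches; i.e. a file like .../1/emptyFile should resolve to the
    entry for its directory .../1/."""
    # Normalize keys so ".../1" and ".../1/" compare equal.
    table = {k.rstrip("/"): v for k, v in part_to_partition_info.items()}
    candidate = path.rstrip("/")
    while candidate and candidate != "/":
        if candidate in table:
            return table[candidate]
        candidate = posixpath.dirname(candidate)  # try the parent dir next
    raise KeyError("cannot find dir for " + path)
```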

> CombineHiveInputFormat fails with "cannot find dir for emptyFile"
> -
>
> Key: HIVE-1633
> URL: https://issues.apache.org/jira/browse/HIVE-1633
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Reporter: Amareshwari Sriramadasu
>





hive: NoSuchMethodError

2010-09-16 Thread ZhangGang bertzhang

hello everyone,
 
hadoop: 0.20.2
hive: 0.5.0
How can I solve the following problem?
 
hive> create table test(AA STRING, BB STRING);
Exception in thread "main" java.lang.NoSuchMethodError: 
org.apache.commons.lang.StringUtils.endsWith(Ljava/lang/String;Ljava/lang/String;)Z
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:172)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 
help! 

[jira] Commented: (HIVE-1633) CombineHiveInputFormat fails with "cannot find dir for emptyFile"

2010-09-16 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910431#action_12910431
 ] 

He Yongqiang commented on HIVE-1633:


so 'xxx' part is not the same in 
"hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/" 
and 
"hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/emptyFile"
?

> CombineHiveInputFormat fails with "cannot find dir for emptyFile"
> -
>
> Key: HIVE-1633
> URL: https://issues.apache.org/jira/browse/HIVE-1633
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Reporter: Amareshwari Sriramadasu
>





[jira] Commented: (HIVE-1633) CombineHiveInputFormat fails with "cannot find dir for emptyFile"

2010-09-16 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910430#action_12910430
 ] 

Amareshwari Sriramadasu commented on HIVE-1633:
---

It appears only once as 
"hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/". 
There is no 
"hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/emptyFile"

> CombineHiveInputFormat fails with "cannot find dir for emptyFile"
> -
>
> Key: HIVE-1633
> URL: https://issues.apache.org/jira/browse/HIVE-1633
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Reporter: Amareshwari Sriramadasu
>





[jira] Updated: (HIVE-1361) table/partition level statistics

2010-09-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1361:
-

Fix Version/s: 0.7.0
Affects Version/s: (was: 0.6.0)
  Component/s: Query Processor

> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hadoop Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Ning Zhang
>Assignee: Ahmed M Aly
> Fix For: 0.7.0
>
> Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch
>
>
> At the first step, we gather table-level stats for non-partitioned table and 
> partition-level stats for partitioned table. Future work could extend the 
> table level stats to partitioned table as well. 
> There are 3 major milestones in this subtask: 
>  1) extend the insert statement to gather table/partition level stats 
> on-the-fly.
>  2) extend metastore API to support storing and retrieving stats for a 
> particular table/partition. 
>  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
> existing tables/partitions. 
> The proposed stats are:
> Partition-level stats: 
>   - number of rows
>   - total size in bytes
>   - number of files
>   - max, min, average row sizes
>   - max, min, average file sizes
> Table-level stats in addition to partition level stats:
>   - number of partitions




[jira] Updated: (HIVE-33) [Hive]: Add ability to compute statistics on hive tables

2010-09-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-33:
---

Issue Type: New Feature  (was: Bug)

> [Hive]: Add ability to compute statistics on hive tables
> 
>
> Key: HIVE-33
> URL: https://issues.apache.org/jira/browse/HIVE-33
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ashish Thusoo
>Assignee: Ahmed M Aly
>
> Add commands to collect partition and column level statistics in hive.




[jira] Commented: (HIVE-1616) Add ProtocolBuffersStructObjectInspector

2010-09-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910421#action_12910421
 ] 

Namit Jain commented on HIVE-1616:
--

Will commit if the tests pass

> Add ProtocolBuffersStructObjectInspector
> 
>
> Key: HIVE-1616
> URL: https://issues.apache.org/jira/browse/HIVE-1616
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Johan Oskarsson
>Assignee: Johan Oskarsson
>Priority: Minor
> Attachments: HIVE-1616.patch
>
>
> Much like there is a ThriftStructObjectInspector that ignores the isset 
> booleans there is a need for a ProtocolBuffersStructObjectInspector that 
> ignores has*. This can then be used together with Twitter's elephant-bird.




[jira] Commented: (HIVE-1617) ScriptOperator's AutoProgressor can lead to an infinite loop

2010-09-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910419#action_12910419
 ] 

Namit Jain commented on HIVE-1617:
--

Otherwise it looks good

> ScriptOperator's AutoProgressor can lead to an infinite loop
> 
>
> Key: HIVE-1617
> URL: https://issues.apache.org/jira/browse/HIVE-1617
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1617.1.patch
>
>
> With the default settings, the auto progressor can result in an infinite loop.
> There should be another configurable parameter that stops the auto progressor 
> if the script has not made any progress.
> The default can be an hour or so - this way we will not get stuck indefinitely.
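The proposed fix - keep sending heartbeats only while the script is actually producing output, bounded by a configurable idle timeout - can be sketched as follows (parameter names are made up for illustration; see the attached patch for Hive's actual implementation):

```python
import threading
import time

class AutoProgressor:
    """Periodically reports progress, but stops after max_idle_secs
    without the wrapped script producing any output (a hypothetical
    knob mirroring the configurable timeout proposed above)."""

    def __init__(self, report, interval_secs=1.0, max_idle_secs=3600.0):
        self.report = report              # heartbeat callback
        self.interval_secs = interval_secs
        self.max_idle_secs = max_idle_secs
        self.last_progress = time.time()

    def notify_progress(self):
        # Called whenever the script emits a row.
        self.last_progress = time.time()

    def _run(self):
        while True:
            if time.time() - self.last_progress > self.max_idle_secs:
                break  # stop heartbeats; the task can now time out normally
            self.report()
            time.sleep(self.interval_secs)

    def start(self):
        t = threading.Thread(target=self._run)
        t.daemon = True
        t.start()
        return t
```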




[jira] Updated: (HIVE-1617) ScriptOperator's AutoProgressor can lead to an infinite loop

2010-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1617:
-

Status: Open  (was: Patch Available)

> ScriptOperator's AutoProgressor can lead to an infinite loop
> 
>
> Key: HIVE-1617
> URL: https://issues.apache.org/jira/browse/HIVE-1617
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1617.1.patch
>
>
> With the default settings, the auto progressor can result in an infinite loop.
> There should be another configurable parameter that stops the auto progressor 
> if the script has not made any progress.
> The default can be an hour or so - this way we will not get stuck indefinitely.




[jira] Updated: (HIVE-1361) table/partition level statistics

2010-09-16 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1361:
-

Status: Patch Available  (was: Open)

> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hadoop Hive
>  Issue Type: Sub-task
>Affects Versions: 0.6.0
>Reporter: Ning Zhang
>Assignee: Ahmed M Aly
> Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch
>
>
> At the first step, we gather table-level stats for non-partitioned table and 
> partition-level stats for partitioned table. Future work could extend the 
> table level stats to partitioned table as well. 
> There are 3 major milestones in this subtask: 
>  1) extend the insert statement to gather table/partition level stats 
> on-the-fly.
>  2) extend metastore API to support storing and retrieving stats for a 
> particular table/partition. 
>  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
> existing tables/partitions. 
> The proposed stats are:
> Partition-level stats: 
>   - number of rows
>   - total size in bytes
>   - number of files
>   - max, min, average row sizes
>   - max, min, average file sizes
> Table-level stats in addition to partition level stats:
>   - number of partitions




[jira] Commented: (HIVE-474) Support for distinct selection on two or more columns

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910404#action_12910404
 ] 

John Sichi commented on HIVE-474:
-

We'll take a look at this one.

> Support for distinct selection on two or more columns
> -
>
> Key: HIVE-474
> URL: https://issues.apache.org/jira/browse/HIVE-474
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Alexis Rondeau
>Assignee: Mafish
> Attachments: hive-474.0.4.2rc.patch
>
>
> The ability to select distinct several, individual columns as by example: 
> select count(distinct user), count(distinct session) from actions;   
> Currently returns the following failure: 
> FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns 
> not Supported user
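What the rejected query asks for - several independent distinct counts computed in one pass - can be expressed outside Hive as a single scan keeping one set of seen values per column. A small Python sketch of the intended semantics (not a Hive workaround, just what the query should return):

```python
def multi_distinct_counts(rows, columns):
    """One pass over rows (dicts), keeping a set of seen values per
    column; returns {column: number of distinct values}, i.e. the
    semantics of count(distinct user), count(distinct session)."""
    seen = {c: set() for c in columns}
    for row in rows:
        for c in columns:
            seen[c].add(row[c])
    return {c: len(vals) for c, vals in seen.items()}
```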




[jira] Commented: (HIVE-1646) Hive 0.5 Build Crashing

2010-09-16 Thread Steven Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910400#action_12910400
 ] 

Steven Wong commented on HIVE-1646:
---

@Stephen I just looked at the saved output of my past ant runs. It hardly took 
any time. :)

Ant test will fail if any test case fails.


> Hive 0.5 Build Crashing
> ---
>
> Key: HIVE-1646
> URL: https://issues.apache.org/jira/browse/HIVE-1646
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.5.0
> Environment: SLES 10 SP2, SLES 11, RHEL 5.4, ANT 1.8.1 and ANT 1.7.1, 
> SUN JDK 1.6.14
>Reporter: Stephen Watt
> Fix For: 0.5.1
>
>
> I've tried this on a variety of configurations. Operating Systems SLES 10 
> SP2, SLES 11, RHEL 5.4 on a variety of machines using both ANT 1.8.1 and ANT 
> 1.7.1 and SUN JDK 1.6.14. I've tried building this by going to the Hive 
> Release page and download hive-0.5.0-src and using that. I've tried building 
> by obtaining the branch tag release using svn checkout 
> http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0/ 
> hive-0.5.0-dev. Always the same thing:
> When I run the Hive 0.5 build it runs for just under 2 hours and then crashes 
> with the following message (tail end of ant.log):
> - - -
> [junit] diff 
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build/ql/test/logs/negative/wrong_distinct2.q.out
>  
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/ql/src/test/results/compiler/errors/wrong_distinct2.q.out
> [junit] Done query: wrong_distinct2.q
> [junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.974 sec
> [junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
> [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
> BUILD FAILED
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!
> Total time: 94 minutes 43 seconds
> - - -
> My build script is very simplistic :
> #!/bin/sh
> # Set Build Dependencies
> set PATH=$PATH:/home/hive/Java-Versions/jdk1.6.0_14/bin/
> export JAVA_HOME=/home/hive/Java-Versions/jdk1.6.0_14
> export BUILD_DIR=/home/hive/hive-0.5.0-build
> export ANT_HOME=$BUILD_DIR/apache-ant-1.8.1
> export HIVE_INSTALL=$BUILD_DIR/hive-0.5.0-dev/
> export PATH=$PATH:$ANT_HOME/bin
> # Run Build and Unit Test
> cd $HIVE_INSTALL
> ant clean test tar -logfile ant.log




[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910389#action_12910389
 ] 

HBase Review Board commented on HIVE-1264:
--

Message from: "Carl Steinbach" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/860/#review1251
---

Ship it!


+1 Looks good to me.


build.properties


If this is the convention going forward then it's probably more appropriate 
to rename the old style as "oldstyle-name" instead of introducing a 
"newstyle-name".


- Carl





> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>





[jira] Commented: (HIVE-1617) ScriptOperator's AutoProgressor can lead to an infinite loop

2010-09-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910386#action_12910386
 ] 

Namit Jain commented on HIVE-1617:
--

Can you add a negative testcase that times out?

> ScriptOperator's AutoProgressor can lead to an infinite loop
> 
>
> Key: HIVE-1617
> URL: https://issues.apache.org/jira/browse/HIVE-1617
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1617.1.patch
>
>
> With the default settings, the auto progressor can result in an infinite loop.
> There should be another configurable parameter that stops the auto progressor 
> if the script has not made any progress.
> The default can be an hour or so - this way we will not get stuck indefinitely.




Re: Review Request: HIVE-1264. Shims for Hadoop 0.20 with security

2010-09-16 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/860/#review1251
---

Ship it!


+1 Looks good to me.


build.properties


If this is the convention going forward then it's probably more appropriate 
to rename the old style as "oldstyle-name" instead of introducing a 
"newstyle-name".


- Carl


On 2010-09-16 15:35:00, Todd Lipcon wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/860/
> ---
> 
> (Updated 2010-09-16 15:35:00)
> 
> 
> Review request for Hive Developers and John Sichi.
> 
> 
> Summary
> ---
> 
> Adds a shim layer for secure Hadoop, currently pulling a secure CDH3b3 
> prerelease snapshot
> 
> 
> This addresses bug HIVE-1264.
> http://issues.apache.org/jira/browse/HIVE-1264
> 
> 
> Diffs
> -
> 
>   build-common.xml 0b76688 
>   build.properties 3e392f7 
>   build.xml 4b345b5 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d2c7123 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java b2966de 
>   shims/build.xml b339871 
>   shims/ivy.xml de56e4f 
>   shims/src/0.17/java/org/apache/hadoop/hive/shims/Hadoop17Shims.java 17110ab 
>   shims/src/0.18/java/org/apache/hadoop/hive/shims/Hadoop18Shims.java 9cc8d56 
>   shims/src/0.19/java/org/apache/hadoop/hive/shims/Hadoop19Shims.java c643108 
>   shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 0675a79 
>   shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
> PRE-CREATION 
>   shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java 
> PRE-CREATION 
>   shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 4310942 
>   shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java c847d69 
> 
> Diff: http://review.cloudera.org/r/860/diff
> 
> 
> Testing
> ---
> 
> Able to run MR jobs on our secure cluster with standalone (i.e. no separate 
> metastore, etc.)
> 
> 
> Thanks,
> 
> Todd
> 
>
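The shim layer in this patch selects an implementation class based on the running Hadoop version, with a separate shim for the secure 0.20 branch. A rough Python illustration of the ShimLoader idea (class names follow the diff above; the version-detection detail is simplified and the keys are illustrative):

```python
class Hadoop20Shims:
    secure = False

class Hadoop20SShims:
    secure = True

# Map a major-version string to its shim class, as ShimLoader does;
# the secure 0.20 branch gets its own "0.20S" entry.
HADOOP_SHIM_CLASSES = {
    "0.20": Hadoop20Shims,
    "0.20S": Hadoop20SShims,
}

def get_major_version(version, secure):
    # e.g. "0.20.2" -> "0.20"; secure builds get the "S" suffix.
    major = ".".join(version.split(".")[:2])
    return major + "S" if secure else major

def load_shims(version, secure=False):
    key = get_major_version(version, secure)
    try:
        return HADOOP_SHIM_CLASSES[key]()
    except KeyError:
        raise RuntimeError("Could not load shims for Hadoop version " + version)
```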



[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910381#action_12910381
 ] 

HBase Review Board commented on HIVE-1264:
--

Message from: "Todd Lipcon" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/860/
---

Review request for Hive Developers and John Sichi.


Summary
---

Adds a shim layer for secure Hadoop, currently pulling a secure CDH3b3 
prerelease snapshot


This addresses bug HIVE-1264.
http://issues.apache.org/jira/browse/HIVE-1264


Diffs
-

  build-common.xml 0b76688 
  build.properties 3e392f7 
  build.xml 4b345b5 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d2c7123 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java b2966de 
  shims/build.xml b339871 
  shims/ivy.xml de56e4f 
  shims/src/0.17/java/org/apache/hadoop/hive/shims/Hadoop17Shims.java 17110ab 
  shims/src/0.18/java/org/apache/hadoop/hive/shims/Hadoop18Shims.java 9cc8d56 
  shims/src/0.19/java/org/apache/hadoop/hive/shims/Hadoop19Shims.java c643108 
  shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 0675a79 
  shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
PRE-CREATION 
  shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java 
PRE-CREATION 
  shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 4310942 
  shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java c847d69 

Diff: http://review.cloudera.org/r/860/diff


Testing
---

Able to run MR jobs on our secure cluster with standalone (i.e. no separate 
metastore, etc.)


Thanks,

Todd




> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>





[jira] Commented: (HIVE-1361) table/partition level statistics

2010-09-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910380#action_12910380
 ] 

HBase Review Board commented on HIVE-1361:
--

Message from: "Carl Steinbach" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/862/
---

Review request for Hive Developers.


Summary
---

HIVE-1361


This addresses bug HIVE-1361.
http://issues.apache.org/jira/browse/HIVE-1361


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 997199 
  
trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsAggregator.java
 PRE-CREATION 
  
trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsPublisher.java
 PRE-CREATION 
  
trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsSetupConstants.java
 PRE-CREATION 
  trunk/ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobCloseFeedBack.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Stat.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsAggregator.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsFactory.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsPublisher.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsSetupConst.java 
PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 
PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 
PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsSetupConstants.java
 PRE-CREATION 

Diff: http://review.cloudera.org/r/862/diff


Testing
---


Thanks,

Carl




> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hadoop Hive
>  Issue Type: Sub-task
>Affects Versions: 0.6.0
>Reporter: Ning Zhang
>Assignee: Ahmed M Aly
> Attachments: HIVE-1361.java_only.patch, HIVE-13

[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910377#action_12910377
 ] 

John Sichi commented on HIVE-1264:
--

See here for existing checksum file conventions:

http://mirror.facebook.net/facebook/hive-deps/hadoop/core/


> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910376#action_12910376
 ] 

John Sichi commented on HIVE-1264:
--

OK, can you add a checksum file to your directory, and then I'll ask our ops to 
create the mirror?  Once that's done, we'll need one more patch which 
references the FB location as default.


> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>





[jira] Updated: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods

2010-09-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1615:
-

Fix Version/s: 0.6.0
Affects Version/s: (was: 0.6.0)
   (was: 0.7.0)

> Web Interface JSP needs Refactoring for removed meta store methods
> --
>
> Key: HIVE-1615
> URL: https://issues.apache.org/jira/browse/HIVE-1615
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Blocker
> Fix For: 0.6.0, 0.7.0
>
> Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt
>
>
> Some meta store methods being called from JSP have been removed. We really 
> should prioritize compiling JSP into servlet code again.




[jira] Commented: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods

2010-09-16 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910375#action_12910375
 ] 

Carl Steinbach commented on HIVE-1615:
--

This needs to be backported to 0.6. I verified that hive-1615.patch.2.txt 
applies cleanly to the 0.6 branch.

> Web Interface JSP needs Refactoring for removed meta store methods
> --
>
> Key: HIVE-1615
> URL: https://issues.apache.org/jira/browse/HIVE-1615
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt
>
>
> Some meta store methods being called from JSP have been removed. We really 
> should prioritize compiling JSP into servlet code again.




[jira] Updated: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods

2010-09-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1615:
-

Affects Version/s: 0.7.0

> Web Interface JSP needs Refactoring for removed meta store methods
> --
>
> Key: HIVE-1615
> URL: https://issues.apache.org/jira/browse/HIVE-1615
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt
>
>
> Some meta store methods being called from JSP have been removed. We really 
> should prioritize compiling JSP into servlet code again.




Review Request: HIVE-1361: table/partition level statistics

2010-09-16 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/862/
---

Review request for Hive Developers.


Summary
---

HIVE-1361


This addresses bug HIVE-1361.
http://issues.apache.org/jira/browse/HIVE-1361


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 997199 
  
trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsAggregator.java
 PRE-CREATION 
  
trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsPublisher.java
 PRE-CREATION 
  
trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsSetupConstants.java
 PRE-CREATION 
  trunk/ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobCloseFeedBack.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Stat.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 997199 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsAggregator.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsFactory.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsPublisher.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsSetupConst.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsSetupConstants.java PRE-CREATION 

Diff: http://review.cloudera.org/r/862/diff


Testing
---


Thanks,

Carl



[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910373#action_12910373
 ] 

Todd Lipcon commented on HIVE-1264:
---

Submitted to RB: https://review.cloudera.org/r/860/

Regarding the snapshot - it's fine by me to pull from there, I think the 
people.apache.org web server is reasonably stable. If it turns out to be flaky 
it's also cool if you want to mirror it - FB is probably more reliable than ASF.

> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>





[jira] Commented: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910372#action_12910372
 ] 

John Sichi commented on HIVE-1615:
--

+1. Will commit when tests pass.


> Web Interface JSP needs Refactoring for removed meta store methods
> --
>
> Key: HIVE-1615
> URL: https://issues.apache.org/jira/browse/HIVE-1615
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 0.6.0
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt
>
>
> Some meta store methods being called from JSP have been removed. We really 
> should prioritize compiling JSP into servlet code again.




[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1364:
-

Status: Open  (was: Patch Available)

Canceling patch since I think we should widen TYPE_NAME and also drop it from 
the PRIMARY KEY on the COLUMNS table.

> Increase the maximum length of SERDEPROPERTIES values (currently 767 
> characters)
> 
>
> Key: HIVE-1364
> URL: https://issues.apache.org/jira/browse/HIVE-1364
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.5.0
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.6.0, 0.7.0
>
> Attachments: HIVE-1364.2.patch.txt, HIVE-1364.patch
>
>
> The value component of a SERDEPROPERTIES key/value pair is currently limited
> to a maximum length of 767 characters. I believe that the motivation for 
> limiting the length to 
> 767 characters is that this value is the maximum allowed length of an index in
> a MySQL database running on the InnoDB engine: 
> http://bugs.mysql.com/bug.php?id=13315
> * The Metastore OR mapping currently limits many fields (including 
> SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
> the fact that these fields are not indexed.
> * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
> * We can expect many users to hit the 767 character limit on 
> SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
> serdeproperty to map a table that has many columns.
> I propose increasing the maximum allowed length of 
> SERDEPROPERTIES.PARAM_VALUE to 8192.
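A minimal sketch of the kind of schema change being proposed (assuming a 
MySQL-backed metastore; serde properties live in the SERDE_PARAMS table of the 
metastore schema):

```sql
-- Widen PARAM_VALUE beyond the 767-character ORM default.
-- (MySQL 5.0.3+ allows VARCHAR lengths up to 65,535, subject to row-size limits.)
ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE VARCHAR(8192);
```

The 767-character limit only matters for indexed InnoDB columns, so widening an 
unindexed column like PARAM_VALUE should be safe.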




[jira] Updated: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-1264:
--

   Status: Patch Available  (was: Open)
Affects Version/s: 0.7.0

> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>





Review Request: HIVE-1264. Shims for Hadoop 0.20 with security

2010-09-16 Thread Todd Lipcon

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/860/
---

Review request for Hive Developers and John Sichi.


Summary
---

Adds a shim layer for secure Hadoop, currently pulling a secure CDH3b3 
prerelease snapshot


This addresses bug HIVE-1264.
http://issues.apache.org/jira/browse/HIVE-1264


Diffs
-

  build-common.xml 0b76688 
  build.properties 3e392f7 
  build.xml 4b345b5 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d2c7123 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java b2966de 
  shims/build.xml b339871 
  shims/ivy.xml de56e4f 
  shims/src/0.17/java/org/apache/hadoop/hive/shims/Hadoop17Shims.java 17110ab 
  shims/src/0.18/java/org/apache/hadoop/hive/shims/Hadoop18Shims.java 9cc8d56 
  shims/src/0.19/java/org/apache/hadoop/hive/shims/Hadoop19Shims.java c643108 
  shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 0675a79 
  shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
PRE-CREATION 
  shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java 
PRE-CREATION 
  shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 4310942 
  shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java c847d69 

Diff: http://review.cloudera.org/r/860/diff


Testing
---

Able to run MR jobs on our secure cluster standalone (i.e., no separate 
metastore, etc.)


Thanks,

Todd



[jira] Updated: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-09-16 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1648:
-

Parent: HIVE-33
Issue Type: Sub-task  (was: New Feature)

> Automatically gathering stats when reading a table/partition
> 
>
> Key: HIVE-1648
> URL: https://issues.apache.org/jira/browse/HIVE-1648
> Project: Hadoop Hive
>  Issue Type: Sub-task
>Reporter: Ning Zhang
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
> gather stats. This requires an additional scan of the data. Stats gathering 
> can be piggy-backed on the TableScanOperator whenever a table/partition is 
> scanned (given no LIMIT operator). 




[jira] Commented: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910361#action_12910361
 ] 

John Sichi commented on HIVE-1364:
--

(Actually TYPE_NAME is what was needed for HIVE-1632, not FTYPE, but I think we 
should address both.)


> Increase the maximum length of SERDEPROPERTIES values (currently 767 
> characters)
> 
>
> Key: HIVE-1364
> URL: https://issues.apache.org/jira/browse/HIVE-1364
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.5.0
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.6.0, 0.7.0
>
> Attachments: HIVE-1364.2.patch.txt, HIVE-1364.patch
>
>
> The value component of a SERDEPROPERTIES key/value pair is currently limited
> to a maximum length of 767 characters. I believe that the motivation for 
> limiting the length to 
> 767 characters is that this value is the maximum allowed length of an index in
> a MySQL database running on the InnoDB engine: 
> http://bugs.mysql.com/bug.php?id=13315
> * The Metastore OR mapping currently limits many fields (including 
> SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
> the fact that these fields are not indexed.
> * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
> * We can expect many users to hit the 767 character limit on 
> SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
> serdeproperty to map a table that has many columns.
> I propose increasing the maximum allowed length of 
> SERDEPROPERTIES.PARAM_VALUE to 8192.




[jira] Created: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-09-16 Thread Ning Zhang (JIRA)
Automatically gathering stats when reading a table/partition


 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Ning Zhang


HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
gather stats. This requires an additional scan of the data. Stats gathering can 
be piggy-backed on the TableScanOperator whenever a table/partition is scanned 
(given no LIMIT operator). 




[jira] Commented: (HIVE-33) [Hive]: Add ability to compute statistics on hive tables

2010-09-16 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910358#action_12910358
 ] 

Ning Zhang commented on HIVE-33:


Patches for HIVE-1361 are ready for review. Comments are welcome!

> [Hive]: Add ability to compute statistics on hive tables
> 
>
> Key: HIVE-33
> URL: https://issues.apache.org/jira/browse/HIVE-33
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ashish Thusoo
>Assignee: Ahmed M Aly
>
> Add commands to collect partition and column level statistics in hive.




[jira] Commented: (HIVE-1361) table/partition level statistics

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910350#action_12910350
 ] 

John Sichi commented on HIVE-1361:
--

Yay for Java-only patch :)

> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hadoop Hive
>  Issue Type: Sub-task
>Affects Versions: 0.6.0
>Reporter: Ning Zhang
>Assignee: Ahmed M Aly
> Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch
>
>
> As a first step, we gather table-level stats for non-partitioned tables and 
> partition-level stats for partitioned tables. Future work could extend the 
> table-level stats to partitioned tables as well. 
> There are 3 major milestones in this subtask: 
>  1) extend the insert statement to gather table/partition level stats 
> on-the-fly.
>  2) extend metastore API to support storing and retrieving stats for a 
> particular table/partition. 
>  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
> existing tables/partitions. 
> The proposed stats are:
> Partition-level stats: 
>   - number of rows
>   - total size in bytes
>   - number of files
>   - max, min, average row sizes
>   - max, min, average file sizes
> Table-level stats in addition to partition level stats:
>   - number of partitions




[jira] Updated: (HIVE-1361) table/partition level statistics

2010-09-16 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1361:
-

Attachment: HIVE-1361.patch
HIVE-1361.java_only.patch

Uploading a full version (HIVE-1361.patch) and a Java-code-only version 
(HIVE-1361.java_only.patch). 

This patch is based on Ahmed's previous patch and implements the following 
features:
  1) Automatically gather stats (currently the number of rows) whenever an 
INSERT OVERWRITE TABLE is issued. Each mapper/reducer pushes its partial stats 
to either MySQL/Derby through JDBC, or to HBase. The INSERT OVERWRITE statement 
can be anything, including dynamic partition inserts, multi-table inserts, and 
inserts into bucketized partitions. A StatsTask is responsible for aggregating 
the partial stats at the end of the query and updating the metastore.
  2) The stats of a table/partition are exposed to the user through 'DESC 
EXTENDED' on the table/partition. They are stored as storage parameters 
(numRows, numFiles, numPartitions). 
  3) A new command 'ANALYZE TABLE [PARTITION (partition spec)] COMPUTE 
STATISTICS' scans the table/partition and gathers stats in a fashion similar 
to the INSERT OVERWRITE command, except that the plan has only one MR job 
consisting of a TableScanOperator and a StatsTask. The partition spec can be 
full or partial, similar to what dynamic partition insert uses; this allows 
the user to analyze a subset of, or all, partitions of a table. The resulting 
stats are stored in the same parameters in the metastore.

Tested locally (unit tests) with JDBC:Derby and HBase, and on a cluster with 
JDBC:MySQL. 

Will run the full unit tests again. 
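The new statement in (3) can be exercised roughly as follows (a sketch based on 
the description above; table name and partition values are hypothetical):

```sql
-- Analyze a single partition (full partition spec)
ANALYZE TABLE T PARTITION (ds='2010-09-16') COMPUTE STATISTICS;

-- Analyze all partitions matching a partial partition spec
ANALYZE TABLE T PARTITION (ds) COMPUTE STATISTICS;

-- Gathered stats appear as storage parameters (numRows, numFiles, numPartitions)
DESC EXTENDED T;
```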

> table/partition level statistics
> 
>
> Key: HIVE-1361
> URL: https://issues.apache.org/jira/browse/HIVE-1361
> Project: Hadoop Hive
>  Issue Type: Sub-task
>Affects Versions: 0.6.0
>Reporter: Ning Zhang
>Assignee: Ahmed M Aly
> Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch
>
>
> As a first step, we gather table-level stats for non-partitioned tables and 
> partition-level stats for partitioned tables. Future work could extend the 
> table-level stats to partitioned tables as well. 
> There are 3 major milestones in this subtask: 
>  1) extend the insert statement to gather table/partition level stats 
> on-the-fly.
>  2) extend metastore API to support storing and retrieving stats for a 
> particular table/partition. 
>  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
> existing tables/partitions. 
> The proposed stats are:
> Partition-level stats: 
>   - number of rows
>   - total size in bytes
>   - number of files
>   - max, min, average row sizes
>   - max, min, average file sizes
> Table-level stats in addition to partition level stats:
>   - number of partitions




[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910346#action_12910346
 ] 

John Sichi commented on HIVE-1264:
--

Hey Todd,

I think you mentioned at the contributor meetup the other day that this is 
ready for review.  If so, click the Submit Patch button and create a 
reviewboard entry.

I just did a quick check to verify that I could apply the patch and run ant 
package+test (without changing hadoop.version) and ivy was able to fetch the 
dependency from your snapshot successfully.

If we commit it as is, every hive developer is going to automatically start 
pulling from that snapshot by default.  Is that OK, or should I try to get a 
copy hosted at http://mirror.facebook.net/facebook/hive-deps?


> Make Hive work with Hadoop security
> ---
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Jeff Hammerbacher
>Assignee: Todd Lipcon
> Attachments: hive-1264.txt, HiveHadoop20S_patch.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1534) Join filters do not work correctly with outer joins

2010-09-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1534:
-

Status: Open  (was: Patch Available)

> Join filters do not work correctly with outer joins
> ---
>
> Key: HIVE-1534
> URL: https://issues.apache.org/jira/browse/HIVE-1534
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-1534-1.txt, patch-1534.txt
>
>
>  SELECT * FROM T1 LEFT OUTER JOIN T2 ON (T1.c1=T2.c2 AND T1.c1 < 10)
> and  SELECT * FROM T1 RIGHT OUTER JOIN T2 ON (T1.c1=T2.c2 AND T2.c1 < 10)
> do not give correct results.




Fw: Access to JobClient

2010-09-16 Thread gaurav jain
Hi,

Do I have access to Hadoop counters or the JobClient if I am running a Hive 
query, either through the Hive CLI or the Hive Java API?

   -- I looked into the Hive code base and it looks like there are no public 
interfaces. The ExecDriver* classes maintain and use that information locally. 

   -- I need that for my internal reporting purposes.

Please suggest a way to accomplish the above.

Regards,
Gaurav Jain


[jira] Commented: (HIVE-1342) Predicate push down get error result when sub-queries have the same alias name

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910339#action_12910339
 ] 

John Sichi commented on HIVE-1342:
--

For easy copy/paste into CLI, here are the three queries by themselves.

{noformat}
-- Q1
explain
SELECT a.foo as foo1, b.foo as foo2, b.bar
FROM pokes a LEFT OUTER JOIN pokes2 b
ON a.foo=b.foo
WHERE b.bar=3;

-- Q2
explain
SELECT * FROM
(SELECT a.foo as foo1, b.foo as foo2, b.bar
FROM pokes a LEFT OUTER JOIN pokes2 b
ON a.foo=b.foo) a
WHERE a.bar=3;

-- Q3
explain
SELECT * FROM
(SELECT a.foo as foo1, b.foo as foo2, a.bar
FROM pokes a JOIN pokes2 b
ON a.foo=b.foo) a
WHERE a.bar=3;
{noformat}


> Predicate push down get error result when sub-queries have the same alias 
> name 
> ---
>
> Key: HIVE-1342
> URL: https://issues.apache.org/jira/browse/HIVE-1342
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Ted Xu
>Assignee: Ted Xu
>Priority: Critical
> Fix For: 0.7.0
>
> Attachments: cmd.hql, explain, ppd_same_alias_1.patch, 
> ppd_same_alias_2.patch
>
>
> Query is over-optimized by PPD when sub-queries have the same alias name, see 
> the query:
> ---
> create table if not exists dm_fact_buyer_prd_info_d (
>   category_id string
>   ,gmv_trade_num  int
>   ,user_idint
>   )
> PARTITIONED BY (ds int);
> set hive.optimize.ppd=true;
> set hive.map.aggr=true;
> explain select category_id1,category_id2,assoc_idx
> from (
>   select 
>   category_id1
>   , category_id2
>   , count(distinct user_id) as assoc_idx
>   from (
>   select 
>   t1.category_id as category_id1
>   , t2.category_id as category_id2
>   , t1.user_id
>   from (
>   select category_id, user_id
>   from dm_fact_buyer_prd_info_d
>   group by category_id, user_id ) t1
>   join (
>   select category_id, user_id
>   from dm_fact_buyer_prd_info_d
>   group by category_id, user_id ) t2 on 
> t1.user_id=t2.user_id 
>   ) t1
>   group by category_id1, category_id2 ) t_o
>   where category_id1 <> category_id2
>   and assoc_idx > 2;
> -
> The query above will fail when executed, throwing the exception: "can not cast 
> UDFOpNotEqual(Text, IntWritable) to UDFOpNotEqual(Text, Text)". 
> I ran EXPLAIN on the query, and the execution plan looks really weird (only 
> Stage-1; see the highlighted predicate):
> ---
> Stage: Stage-1
> Map Reduce
>   Alias -> Map Operator Tree:
> t_o:t1:t1:dm_fact_buyer_prd_info_d 
>   TableScan
> alias: dm_fact_buyer_prd_info_d
> Filter Operator
>   predicate:
>   expr: *(category_id <> user_id)*
>   type: boolean
>   Select Operator
> expressions:
>   expr: category_id
>   type: string
>   expr: user_id
>   type: bigint
> outputColumnNames: category_id, user_id
> Group By Operator
>   keys:
> expr: category_id
> type: string
> expr: user_id
> type: bigint
>   mode: hash
>   outputColumnNames: _col0, _col1
>   Reduce Output Operator
> key expressions:
>   expr: _col0
>   type: string
>   expr: _col1
>   type: bigint
> sort order: ++
> Map-reduce partition columns:
>   expr: _col0
>   type: string
>   expr: _col1
>   type: bigint
> tag: -1
>   Reduce Operator Tree:
> Group By Operator
>   keys:
> expr: KEY._col0
> type: string
> expr: KEY._col1
> type: bigint
>   mode: mergepartial
>   outputColumnNames: _col0, _col1
>   Select Operator
> expressions:
>   

[jira] Commented: (HIVE-1342) Predicate push down get error result when sub-queries have the same alias name

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910336#action_12910336
 ] 

John Sichi commented on HIVE-1342:
--

Finally got back to this one.  Let me provide some specific examples to better 
explain what I wrote.

First, latest trunk without any patch.

{noformat}
-- Q1.trunk:  Without a nested select, the plan is correct for this query.
-- (we're not allowed to push filter down into null-generating side of outer 
join)
hive> explain
> SELECT a.foo as foo1, b.foo as foo2, b.bar
> FROM pokes a LEFT OUTER JOIN pokes2 b
> ON a.foo=b.foo
> WHERE b.bar=3;
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN (TOK_TABREF pokes a) (TOK_TABREF 
pokes2 b) (= (. (TOK_TABLE_OR_COL a) foo) (. (TOK_TABLE_OR_COL b) foo 
(TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR 
(. (TOK_TABLE_OR_COL a) foo) foo1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) foo) 
foo2) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) bar))) (TOK_WHERE (= (. 
(TOK_TABLE_OR_COL b) bar) 3

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Alias -> Map Operator Tree:
a 
  TableScan
alias: a
Reduce Output Operator
  key expressions:
expr: foo
type: int
  sort order: +
  Map-reduce partition columns:
expr: foo
type: int
  tag: 0
  value expressions:
expr: foo
type: int
b 
  TableScan
alias: b
Reduce Output Operator
  key expressions:
expr: foo
type: int
  sort order: +
  Map-reduce partition columns:
expr: foo
type: int
  tag: 1
  value expressions:
expr: foo
type: int
expr: bar
type: string
  Reduce Operator Tree:
Join Operator
  condition map:
   Left Outer Join0 to 1
  condition expressions:
0 {VALUE._col0}
1 {VALUE._col0} {VALUE._col1}
  handleSkewJoin: false
  outputColumnNames: _col0, _col4, _col5
  Filter Operator
predicate:
expr: (_col5 = 3)
type: boolean
Select Operator
  expressions:
expr: _col0
type: int
expr: _col4
type: int
expr: _col5
type: string
  outputColumnNames: _col0, _col1, _col2
  File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

  Stage: Stage-0
Fetch Operator
  limit: -1

-- Q2.trunk:  For this equivalent query written using a nested select, the plan 
is incorrect.
-- (filter got pushed down when it shouldn't; note that in the wrapping select, 
a.bar should resolve to b.bar in the nested select)
hive> explain
> SELECT * FROM
> (SELECT a.foo as foo1, b.foo as foo2, b.bar
> FROM pokes a LEFT OUTER JOIN pokes2 b
> ON a.foo=b.foo) a
> WHERE a.bar=3;
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN 
(TOK_TABREF pokes a) (TOK_TABREF pokes2 b) (= (. (TOK_TABLE_OR_COL a) foo) (. 
(TOK_TABLE_OR_COL b) foo (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) foo) foo1) 
(TOK_SELEXPR (. (TOK_TABLE_OR_COL b) foo) foo2) (TOK_SELEXPR (. 
(TOK_TABLE_OR_COL b) bar) a)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (. 
(TOK_TABLE_OR_COL a) bar) 3

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Alias -> Map Operator Tree:
a:a 
  TableScan
alias: a
Reduce Output Operator
  key expressions:
expr: foo
type: int
  sort order: +
  Map-reduce partition columns:
expr: foo
type: int
  tag: 0
  value expressions:
expr: foo
type: int
a:b 
  TableScan
alias: b
Filter Operator
  predicate:
  expr: (bar = 3)
  type: boolean
 

[jira] Commented: (HIVE-1646) Hive 0.5 Build Crashing

2010-09-16 Thread Stephen Watt (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910335#action_12910335
 ] 

Stephen Watt commented on HIVE-1646:


@Steven Many thanks for taking the time to reproduce the error! I greatly 
appreciate it.

I ran the ant targets individually and, like yourself, the "package" and "tar" 
targets work just fine for me also. I am also seeing the failure occurring in 
the "test" target. I did a bit more digging and ran the dependency targets for 
"test" (clean-test and jar), and they completed successfully as well. This 
leads me to believe that the build failure is occurring in the command below:

  


  

> Hive 0.5 Build Crashing
> ---
>
> Key: HIVE-1646
> URL: https://issues.apache.org/jira/browse/HIVE-1646
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.5.0
> Environment: SLES 10 SP2, SLES 11, RHEL 5.4, ANT 1.8.1 and ANT 1.7.1, 
> SUN JDK 1.6.14
>Reporter: Stephen Watt
> Fix For: 0.5.1
>
>
> I've tried this on a variety of configurations. Operating Systems SLES 10 
> SP2, SLES 11, RHEL 5.4 on a variety of machines using both ANT 1.8.1 and ANT 
> 1.7.1 and SUN JDK 1.6.14. I've tried building this by going to the Hive 
> Release page and download hive-0.5.0-src and using that. I've tried building 
> by obtaining the branch tag release using svn checkout 
> http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0/ 
> hive-0.5.0-dev. Always the same thing:
> When I run the Hive 0.5 build it runs for just under 2 hours and then crashes 
> with the following message (tail end of ant.log):
> - - -
> [junit] diff 
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build/ql/test/logs/negative/wrong_distinct2.q.out
>  
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/ql/src/test/results/compiler/errors/wrong_distinct2.q.out
> [junit] Done query: wrong_distinct2.q
> [junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.974 sec
> [junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
> [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
> BUILD FAILED
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!
> Total time: 94 minutes 43 seconds
> - - -
> My build script is very simplistic :
> #!/bin/sh
> # Set Build Dependencies
> set PATH=$PATH:/home/hive/Java-Versions/jdk1.6.0_14/bin/
> export JAVA_HOME=/home/hive/Java-Versions/jdk1.6.0_14
> export BUILD_DIR=/home/hive/hive-0.5.0-build
> export ANT_HOME=$BUILD_DIR/apache-ant-1.8.1
> export HIVE_INSTALL=$BUILD_DIR/hive-0.5.0-dev/
> export PATH=$PATH:$ANT_HOME/bin
> # Run Build and Unit Test
> cd $HIVE_INSTALL
> ant clean test tar -logfile ant.log

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1617) ScriptOperator's AutoProgressor can lead to an infinite loop

2010-09-16 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1617:


Attachment: HIVE-1617.1.patch

This patch can be tested with the following commands:

set hive.auto.progress.timeout=1200;
set hive.exec.script.allow.partial.consumption=true;
set hive.exec.mode.local.auto=false;
set hive.script.auto.progress=true;
select transform(*) using 'sleep 3600' from src;

The logs should show reporter messages at regular intervals, up to the 20 
minute mark. Then the task should time out.

> ScriptOperator's AutoProgressor can lead to an infinite loop
> 
>
> Key: HIVE-1617
> URL: https://issues.apache.org/jira/browse/HIVE-1617
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1617.1.patch
>
>
> In the default settings, the auto progressor can result in an infinite loop.
> There should be another configurable parameter which stops the auto progress 
> if the script has not made any progress.
> The default can be an hour or so - this way we will not get stuck indefinitely

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1617) ScriptOperator's AutoProgressor can lead to an infinite loop

2010-09-16 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1617:


Status: Patch Available  (was: Open)

> ScriptOperator's AutoProgressor can lead to an infinite loop
> 
>
> Key: HIVE-1617
> URL: https://issues.apache.org/jira/browse/HIVE-1617
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1617.1.patch
>
>
> In the default settings, the auto progressor can result in an infinite loop.
> There should be another configurable parameter which stops the auto progress 
> if the script has not made any progress.
> The default can be an hour or so - this way we will not get stuck indefinitely

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1634) Allow access to Primitive types stored in binary format in HBase

2010-09-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910312#action_12910312
 ] 

HBase Review Board commented on HIVE-1634:
--

Message from: "John Sichi" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/826/#review1247
---



trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java


We keep adding new List data members.  Probably time to move to a single 
List of a new class ColumnMapping, with fields for familyName, 
familyNameBytes, qualifierName, qualifierNameBytes, familyBinary, and 
qualifierBinary.  That will be a lot cleaner and will also let you avoid the 
boolean[] here, which is a little clumsy.
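The refactoring the review comment suggests could look roughly like this; the field names come from the comment, while everything else (visibility, the demo in `main`) is an assumption:

```java
// Sketch of the single-list refactoring suggested above: one
// List<ColumnMapping> replaces the parallel List data members. Field
// names follow the review comment; everything else is an assumption.
public class ColumnMapping {
    String familyName;
    byte[] familyNameBytes;
    String qualifierName;
    byte[] qualifierNameBytes;
    boolean familyBinary;
    boolean qualifierBinary;

    public static void main(String[] args) {
        ColumnMapping m = new ColumnMapping();
        m.familyName = "cf";
        m.qualifierName = "cq";
        m.qualifierBinary = true;  // 'b' storage for the qualifier
        System.out.println(m.familyName + ":" + m.qualifierName
                + " binary=" + m.qualifierBinary);
    }
}
```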



trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java


Doesn't this error message need to change?



trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java


I don't understand these TODO's.



trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java


Why is this assertion commented out?


- John





> Allow access to Primitive types stored in binary format in HBase
> 
>
> Key: HIVE-1634
> URL: https://issues.apache.org/jira/browse/HIVE-1634
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.7.0
>Reporter: Basab Maulik
>Assignee: Basab Maulik
> Attachments: HIVE-1634.0.patch, TestHiveHBaseExternalTable.java
>
>
> This addresses HIVE-1245 in part, for atomic or primitive types.
> The serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" is a 
> specification of the storage option for the corresponding column in the serde 
> property "hbase.columns.mapping". Allowed values are '-' for table default, 
> 's' for standard string storage, and 'b' for binary storage as would be 
> obtained from o.a.h.hbase.utils.Bytes. Map types for HBase column families 
> use a colon separated pair such as 's:b' for the key and value part 
> specifiers respectively. See the test cases and queries for HBase handler for 
> additional examples.
> There is also a table property "hbase.table.default.storage.type" = "string" 
> to specify a table level default storage type. The other valid specification 
> is "binary". The table level default is overridden by a column level 
> specification.
> This control is available for the boolean, tinyint, smallint, int, bigint, 
> float, and double primitive types. The attached patch also relaxes the 
> mapping of map types to HBase column families to allow any primitive type to 
> be the map key.
> Attached is a program for creating a table and populating it in HBase. The 
> external table in Hive can access the data as shown in the example below.
> hive> create external table TestHiveHBaseExternalTable
> > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> >  c_int int, c_long bigint, c_string string, c_float float, c_double 
> double)
> >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> >  with serdeproperties ("hbase.columns.mapping" = 
> ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double")
> >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
> OK
> Time taken: 0.691 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1 NULL NULL NULL NULL NULL Test-String NULL NULL
> Time taken: 0.346 seconds
> hive> drop table TestHiveHBaseExternalTable;
> OK
> Time taken: 0.139 seconds
> hive> create external table TestHiveHBaseExternalTable
> > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> >  c_int int, c_long bigint, c_string string, c_float float, c_double 
> double)
> >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> >  with serdeproperties (
> >  "hbase.columns.mapping" = 
> ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
> >  "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" )
> >  tblproperties (
> >  "hbase.table.name" = "TestHiveHBaseExternalTable",
> >  "hbase.table.default.storage.type" = "string");
> OK
> Time taken: 0.139 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1 true -128 -32768 -2147483648 -9223372036854775808 
> Test-String -2.1793132E-11 2.01345E291
> Time taken: 0.151 seconds
> hive> drop table TestHiveHBaseExternalTabl

[jira] Commented: (HIVE-1634) Allow access to Primitive types stored in binary format in HBase

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910305#action_12910305
 ] 

John Sichi commented on HIVE-1634:
--

Hey Basab,

This is a great start.  Beyond the review comments I added, I do have some 
higher-level suggestions:

* For the column mapping, the reason I suggested "a:b:string" in the original 
JIRA description is that it's a pain to keep everything lined up by column 
position.  It's already less than ideal that we do the column name mapping by 
position, so I don't think we should make it worse by having a separate 
property for type.  Using the s/b shorthand is fine, and if you think that we 
shouldn't overload the colon, we can use a different separator, e.g. "cf:cq#s". 
 Since the existing property name is hbase.columns.mapping, I don't think it 
will be confusing to roll in the (optional) type info as well.

* I'm wondering whether we can just use the existing classes like 
LazyBinaryByte in package org.apache.hadoop.hive.serde2.lazybinary instead of 
creating new ones.  Or are these not compatible with hbase.utils.Bytes?

* For the tests, I noticed that you have attached TestHiveHBaseExternalTable.  
I think it would be a good idea if you can create and populate such a fixture 
table in HBaseTestSetup; that way it can be available (treated as read-only) to 
all of the HBase .q tests.  Otherwise, it's hard to verify that we're 
compatible with a table created directly through HBase API's rather than Hive.

* Also for the tests, it would be good if you can filter it down to only a 
small number of representative rows when pulling the initial test data set from 
the Hive src table.  That way, we can keep the .q.out files smaller.

* Once we get this one committed, be sure to update the wiki.
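The "cf:cq#s" shorthand proposed above could be parsed roughly as follows; this is a sketch of the suggestion in the comment, not committed Hive code:

```java
// Illustrative parser for the proposed "cf:cq#s" mapping entry: an HBase
// column family, an optional qualifier, and an optional '#'-suffixed
// storage specifier ('s' = string, 'b' = binary, '-' = table default).
// This sketches the suggestion in the comment, not committed Hive code.
public class ColumnSpecDemo {
    static String[] parse(String spec) {
        String storage = "-";  // default when no '#' suffix is present
        int hash = spec.indexOf('#');
        if (hash >= 0) {
            storage = spec.substring(hash + 1);
            spec = spec.substring(0, hash);
        }
        String[] parts = spec.split(":", 2);
        String family = parts[0];
        String qualifier = parts.length > 1 ? parts[1] : "";
        return new String[] {family, qualifier, storage};
    }

    public static void main(String[] args) {
        String[] r = parse("cf:cq#s");
        System.out.println(r[0] + " / " + r[1] + " / " + r[2]);  // cf / cq / s
    }
}
```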


> Allow access to Primitive types stored in binary format in HBase
> 
>
> Key: HIVE-1634
> URL: https://issues.apache.org/jira/browse/HIVE-1634
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.7.0
>Reporter: Basab Maulik
>Assignee: Basab Maulik
> Attachments: HIVE-1634.0.patch, TestHiveHBaseExternalTable.java
>
>
> This addresses HIVE-1245 in part, for atomic or primitive types.
> The serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" is a 
> specification of the storage option for the corresponding column in the serde 
> property "hbase.columns.mapping". Allowed values are '-' for table default, 
> 's' for standard string storage, and 'b' for binary storage as would be 
> obtained from o.a.h.hbase.utils.Bytes. Map types for HBase column families 
> use a colon separated pair such as 's:b' for the key and value part 
> specifiers respectively. See the test cases and queries for HBase handler for 
> additional examples.
> There is also a table property "hbase.table.default.storage.type" = "string" 
> to specify a table level default storage type. The other valid specification 
> is "binary". The table level default is overridden by a column level 
> specification.
> This control is available for the boolean, tinyint, smallint, int, bigint, 
> float, and double primitive types. The attached patch also relaxes the 
> mapping of map types to HBase column families to allow any primitive type to 
> be the map key.
> Attached is a program for creating a table and populating it in HBase. The 
> external table in Hive can access the data as shown in the example below.
> hive> create external table TestHiveHBaseExternalTable
> > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> >  c_int int, c_long bigint, c_string string, c_float float, c_double 
> double)
> >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> >  with serdeproperties ("hbase.columns.mapping" = 
> ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double")
> >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
> OK
> Time taken: 0.691 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1 NULL NULL NULL NULL NULL Test-String NULL NULL
> Time taken: 0.346 seconds
> hive> drop table TestHiveHBaseExternalTable;
> OK
> Time taken: 0.139 seconds
> hive> create external table TestHiveHBaseExternalTable
> > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> >  c_int int, c_long bigint, c_string string, c_float float, c_double 
> double)
> >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> >  with serdeproperties (
> >  "hbase.columns.mapping" = 
> ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
> >  "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" )
> >  tblproperties (
> >  "hbase.tabl

[jira] Commented: (HIVE-1646) Hive 0.5 Build Crashing

2010-09-16 Thread Steven Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910285#action_12910285
 ] 

Steven Wong commented on HIVE-1646:
---

I ran ant package and ant test as 2 separate commands. The former succeeded, 
the latter failed. The last few output lines of my ant test are:

[junit] Done query: wrong_distinct2.q
[junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 340.872 sec
[junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.498 sec
[junit] Running org.apache.hadoop.hive.ql.udf.TestUDFDateAdd
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.264 sec
[junit] Running org.apache.hadoop.hive.ql.udf.TestUDFDateDiff
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.214 sec
[junit] Running org.apache.hadoop.hive.ql.udf.TestUDFDateSub
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.147 sec

BUILD FAILED
/Users/swong/dev/svn/asf/hadoop/hive/trunk/build.xml:168: The following error 
occurred while executing this line:
/Users/swong/dev/svn/asf/hadoop/hive/trunk/build.xml:105: The following error 
occurred while executing this line:
/Users/swong/dev/svn/asf/hadoop/hive/trunk/build-common.xml:446: Tests failed!

Total time: 190 minutes 53 seconds


> Hive 0.5 Build Crashing
> ---
>
> Key: HIVE-1646
> URL: https://issues.apache.org/jira/browse/HIVE-1646
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.5.0
> Environment: SLES 10 SP2, SLES 11, RHEL 5.4, ANT 1.8.1 and ANT 1.7.1, 
> SUN JDK 1.6.14
>Reporter: Stephen Watt
> Fix For: 0.5.1
>
>
> I've tried this on a variety of configurations. Operating Systems SLES 10 
> SP2, SLES 11, RHEL 5.4 on a variety of machines using both ANT 1.8.1 and ANT 
> 1.7.1 and SUN JDK 1.6.14. I've tried building this by going to the Hive 
> Release page and download hive-0.5.0-src and using that. I've tried building 
> by obtaining the branch tag release using svn checkout 
> http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0/ 
> hive-0.5.0-dev. Always the same thing:
> When I run the Hive 0.5 build it runs for just under 2 hours and then crashes 
> with the following message (tail end of ant.log):
> - - -
> [junit] diff 
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build/ql/test/logs/negative/wrong_distinct2.q.out
>  
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/ql/src/test/results/compiler/errors/wrong_distinct2.q.out
> [junit] Done query: wrong_distinct2.q
> [junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.974 sec
> [junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
> [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
> BUILD FAILED
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!
> Total time: 94 minutes 43 seconds
> - - -
> My build script is very simplistic :
> #!/bin/sh
> # Set Build Dependencies
> set PATH=$PATH:/home/hive/Java-Versions/jdk1.6.0_14/bin/
> export JAVA_HOME=/home/hive/Java-Versions/jdk1.6.0_14
> export BUILD_DIR=/home/hive/hive-0.5.0-build
> export ANT_HOME=$BUILD_DIR/apache-ant-1.8.1
> export HIVE_INSTALL=$BUILD_DIR/hive-0.5.0-dev/
> export PATH=$PATH:$ANT_HOME/bin
> # Run Build and Unit Test
> cd $HIVE_INSTALL
> ant clean test tar -logfile ant.log

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1646) Hive 0.5 Build Crashing

2010-09-16 Thread Stephen Watt (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910278#action_12910278
 ] 

Stephen Watt commented on HIVE-1646:


@Steven thanks for the comment. When a build is successful, I would expect it 
to complete and then provide a report of which test cases failed. You can then 
use that to debug each test case individually in Eclipse; this is at least the 
way it works with the Hadoop build. With this Hive build, the actual build 
appears to be crashing, as it never gets to the point where the build 
completes and provides the ant build report. For an official Hive release, I 
would expect the build to at least complete without crashing, even if there 
are one or two test cases that fail. 

Does your build on OS X actually complete and show the build report? That is, 
you are not seeing this failure message?
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
occurred while executing this line:
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
occurred while executing this line:
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!

> Hive 0.5 Build Crashing
> ---
>
> Key: HIVE-1646
> URL: https://issues.apache.org/jira/browse/HIVE-1646
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.5.0
> Environment: SLES 10 SP2, SLES 11, RHEL 5.4, ANT 1.8.1 and ANT 1.7.1, 
> SUN JDK 1.6.14
>Reporter: Stephen Watt
> Fix For: 0.5.1
>
>
> I've tried this on a variety of configurations. Operating Systems SLES 10 
> SP2, SLES 11, RHEL 5.4 on a variety of machines using both ANT 1.8.1 and ANT 
> 1.7.1 and SUN JDK 1.6.14. I've tried building this by going to the Hive 
> Release page and download hive-0.5.0-src and using that. I've tried building 
> by obtaining the branch tag release using svn checkout 
> http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0/ 
> hive-0.5.0-dev. Always the same thing:
> When I run the Hive 0.5 build it runs for just under 2 hours and then crashes 
> with the following message (tail end of ant.log):
> - - -
> [junit] diff 
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build/ql/test/logs/negative/wrong_distinct2.q.out
>  
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/ql/src/test/results/compiler/errors/wrong_distinct2.q.out
> [junit] Done query: wrong_distinct2.q
> [junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.974 sec
> [junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
> [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
> BUILD FAILED
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!
> Total time: 94 minutes 43 seconds
> - - -
> My build script is very simplistic :
> #!/bin/sh
> # Set Build Dependencies
> set PATH=$PATH:/home/hive/Java-Versions/jdk1.6.0_14/bin/
> export JAVA_HOME=/home/hive/Java-Versions/jdk1.6.0_14
> export BUILD_DIR=/home/hive/hive-0.5.0-build
> export ANT_HOME=$BUILD_DIR/apache-ant-1.8.1
> export HIVE_INSTALL=$BUILD_DIR/hive-0.5.0-dev/
> export PATH=$PATH:$ANT_HOME/bin
> # Run Build and Unit Test
> cd $HIVE_INSTALL
> ant clean test tar -logfile ant.log

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1646) Hive 0.5 Build Crashing

2010-09-16 Thread Steven Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910266#action_12910266
 ] 

Steven Wong commented on HIVE-1646:
---

A suggestion: Search through your ant test output to look for nonzero values of 
"Failures:" or "Errors:". That will tell you which test class(es) failed. 
Suppose you find that TestCliDriver failed. Then look at the file 
TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml to identify which specific 
test case(s) failed. (I don't remember the exact location of the file, but you 
can find it somewhere under build.)
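The log scan described above can be sketched in Java; the sample summary lines below are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the scan suggested above: flag junit summary lines whose
// Failures or Errors count is nonzero. The sample log lines are made up.
public class FailScan {
    private static final Pattern SUMMARY =
            Pattern.compile("Failures: (\\d+), Errors: (\\d+)");

    static boolean hasFailures(String line) {
        Matcher m = SUMMARY.matcher(line);
        return m.find()
                && (!m.group(1).equals("0") || !m.group(2).equals("0"));
    }

    public static void main(String[] args) {
        String[] log = {
            "[junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.9 sec",
            "[junit] Tests run: 5, Failures: 2, Errors: 0, Time elapsed: 0.5 sec",
        };
        for (String line : log) {
            if (hasFailures(line)) {
                System.out.println(line);  // only lines with failing tests
            }
        }
    }
}
```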

Not sure if this is related, but when I ran ant test on trunk on my Mac OS X, 2 
test cases failed because they called sed, and OS X's sed behaved differently 
than Linux's sed. I had to change the test cases a little for them to pass (not 
committed yet).


> Hive 0.5 Build Crashing
> ---
>
> Key: HIVE-1646
> URL: https://issues.apache.org/jira/browse/HIVE-1646
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.5.0
> Environment: SLES 10 SP2, SLES 11, RHEL 5.4, ANT 1.8.1 and ANT 1.7.1, 
> SUN JDK 1.6.14
>Reporter: Stephen Watt
> Fix For: 0.5.1
>
>
> I've tried this on a variety of configurations. Operating Systems SLES 10 
> SP2, SLES 11, RHEL 5.4 on a variety of machines using both ANT 1.8.1 and ANT 
> 1.7.1 and SUN JDK 1.6.14. I've tried building this by going to the Hive 
> Release page and download hive-0.5.0-src and using that. I've tried building 
> by obtaining the branch tag release using svn checkout 
> http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0/ 
> hive-0.5.0-dev. Always the same thing:
> When I run the Hive 0.5 build it runs for just under 2 hours and then crashes 
> with the following message (tail end of ant.log):
> - - -
> [junit] diff 
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build/ql/test/logs/negative/wrong_distinct2.q.out
>  
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/ql/src/test/results/compiler/errors/wrong_distinct2.q.out
> [junit] Done query: wrong_distinct2.q
> [junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.974 sec
> [junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
> [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
> BUILD FAILED
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
> occurred while executing this line:
> /home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!
> Total time: 94 minutes 43 seconds
> - - -
> My build script is very simplistic :
> #!/bin/sh
> # Set Build Dependencies
> set PATH=$PATH:/home/hive/Java-Versions/jdk1.6.0_14/bin/
> export JAVA_HOME=/home/hive/Java-Versions/jdk1.6.0_14
> export BUILD_DIR=/home/hive/hive-0.5.0-build
> export ANT_HOME=$BUILD_DIR/apache-ant-1.8.1
> export HIVE_INSTALL=$BUILD_DIR/hive-0.5.0-dev/
> export PATH=$PATH:$ANT_HOME/bin
> # Run Build and Unit Test
> cd $HIVE_INSTALL
> ant clean test tar -logfile ant.log

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1625) Added implementation to HivePreparedStatement, HiveBaseResultSet and HiveQueryResultSet.

2010-09-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910263#action_12910263
 ] 

HBase Review Board commented on HIVE-1625:
--

Message from: "John Sichi" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/827/#review1246
---



http://svn.apache.org/repos/asf/hadoop/hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveParameterValue.java


These data members should be private.



http://svn.apache.org/repos/asf/hadoop/hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HivePreparedStatement.java


What is the purpose of typeMatches?


- John





> Added implementation to HivePreparedStatement, HiveBaseResultSet and 
> HiveQueryResultSet.
> 
>
> Key: HIVE-1625
> URL: https://issues.apache.org/jira/browse/HIVE-1625
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Sean Flatley
>Assignee: Sean Flatley
> Attachments: changelog.txt, HIVE-1625.patch, testJdbcDriver.log
>
>
> We implemented several of the HivePreparedStatement set methods, such as 
> setString(int, String) and the means to substitute place holders in the SQL 
> with the values set.  
> HiveQueryResultSet and HiveBaseResultSet were enhanced so that getStatement() 
> could be implemented.
> See attached change log for details.
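The placeholder substitution described could work roughly like this; class and method names here are hypothetical, not the patch's actual API:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the placeholder substitution the patch
// describes: each '?' in the SQL text is replaced by the corresponding
// set parameter. Class and method names are illustrative, not the
// patch's actual API.
public class PlaceholderDemo {
    static String substitute(String sql, List<String> params) {
        StringBuilder sb = new StringBuilder();
        int p = 0;
        for (char c : sql.toCharArray()) {
            if (c == '?' && p < params.size()) {
                // String parameters are quoted, as setString would require.
                sb.append('\'').append(params.get(p++)).append('\'');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: SELECT * FROM pokes WHERE bar = '3'
        System.out.println(substitute("SELECT * FROM pokes WHERE bar = ?",
                Arrays.asList("3")));
    }
}
```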

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)

2010-09-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-537:


Status: Open  (was: Patch Available)

> Hive TypeInfo/ObjectInspector to support union (besides struct, array, and 
> map)
> ---
>
> Key: HIVE-537
> URL: https://issues.apache.org/jira/browse/HIVE-537
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Zheng Shao
>Assignee: Amareshwari Sriramadasu
> Attachments: HIVE-537.1.patch, patch-537-1.txt, patch-537.txt
>
>
> There are already some cases inside the code that we use heterogeneous data: 
> JoinOperator, and UnionOperator (in the sense that different parents can pass 
> in records with different ObjectInspectors).
> We currently use Operator's parentID to distinguish that. However that 
> approach does not extend to more complex plans that might be needed in the 
> future.
> We will support the union type like this:
> {code}
> TypeDefinition:
>   type: primitivetype | structtype | arraytype | maptype | uniontype
>   uniontype: "union" "<" tag ":" type ("," tag ":" type)* ">"
> Example:
>   union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>
> Example of serialized data format:
>   We will first store the tag byte before we serialize the object. On 
> deserialization, we will first read out the tag byte, then we know what is 
> the current type of the following object, so we can deserialize it 
> successfully.
> Interface for ObjectInspector:
> interface UnionObjectInspector {
>   /** Returns the array of OIs that are for each of the tags
>*/
>   ObjectInspector[] getObjectInspectors();
>   /** Return the tag of the object.
>*/
>   byte getTag(Object o);
>   /** Return the field based on the tag value associated with the Object.
>*/
>   Object getField(Object o);
> };
> An example serialization format (using a delimited format, with ' ' as the 
> first-level delimiter and '=' as the second-level delimiter):
> userid:int,log:union<0:struct<touserid:int,message:string>,1:string>
> 123 1=login
> 123 0=243=helloworld
> 123 1=logout
> {code}
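The tag-byte scheme described above (write a one-byte tag, then the payload for that variant; read the tag first on deserialization) can be sketched in plain Java. This is a hedged illustration only, not Hive's actual serializer; the class and method names are invented for the demo, and only an int/String union is shown:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Toy illustration of tag-byte union encoding: the tag is written before
// the value, so the reader always knows which variant follows.
class UnionTagDemo {
    // Serialize either an int (tag 0) or a String (tag 1).
    static byte[] write(int tag, Object value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeByte(tag);                     // tag first
        if (tag == 0) out.writeInt((Integer) value);
        else out.writeUTF((String) value);
        return bos.toByteArray();
    }

    // Deserialize: read the tag, then dispatch on it.
    static Object read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int tag = in.readUnsignedByte();
        return (tag == 0) ? (Object) in.readInt() : in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read(write(0, 243)));       // prints 243
        System.out.println(read(write(1, "login")));   // prints login
    }
}
```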

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HIVE-1647) Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )

2010-09-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi reassigned HIVE-1647:


Assignee: Raman Grover

> Incorrect initialization of thread local variable inside IOContext ( 
> implementation is not threadsafe ) 
> 
>
> Key: HIVE-1647
> URL: https://issues.apache.org/jira/browse/HIVE-1647
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Server Infrastructure
>Affects Versions: 0.5.0
>Reporter: Raman Grover
>Assignee: Raman Grover
> Fix For: 0.3.1
>
>   Original Estimate: 0.17h
>  Remaining Estimate: 0.17h
>
> Bug in org.apache.hadoop.hive.ql.io.IOContext
> in relation to initialization of thread local variable.
>  
> public class IOContext {
>  
>   private static ThreadLocal<IOContext> threadLocal = new 
> ThreadLocal<IOContext>(){ };
>  
>   static {
> if (threadLocal.get() == null) {
>   threadLocal.set(new IOContext());
> }
>   }
>  
> In a multi-threaded environment, the thread that gets to load the class first 
> for the JVM (assuming threads share the classloader),
> gets to initialize itself correctly by executing the code in the static 
> block. Once the class is loaded, 
> any subsequent threads would  have their respective threadlocal variable as 
> null.  Since IOContext
> is set during initialization of HiveRecordReader, In a scenario where 
> multiple threads get to acquire
> an instance of HiveRecordReader, it would result in an NPE for all but the 
> first thread that gets to load the class in the VM.
>  
> Is the above scenario of multiple threads initializing HiveRecordReader a 
> typical one? Or could we just provide the following fix:
>  
>   private static ThreadLocal<IOContext> threadLocal = new 
> ThreadLocal<IOContext>(){
> protected synchronized IOContext initialValue() {
>   return new IOContext();
> }  
>   };
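The proposed fix works because ThreadLocal's initialValue() hook runs once per thread, on that thread's first get(), so every thread lazily receives its own instance and none can observe null. A minimal self-contained sketch of the pattern (using a plain StringBuilder payload in place of Hive's IOContext, purely for illustration):

```java
// Each thread that calls CONTEXT.get() gets its own lazily created value.
// Under the buggy static-block scheme, only the class-loading thread's
// slot was populated; every other thread saw null.
class ThreadLocalDemo {
    static final ThreadLocal<StringBuilder> CONTEXT = new ThreadLocal<StringBuilder>() {
        @Override protected StringBuilder initialValue() {
            return new StringBuilder("ctx-" + Thread.currentThread().getName());
        }
    };

    public static void main(String[] args) throws InterruptedException {
        Runnable check = () -> {
            // Fails only if per-thread initialization were broken.
            if (CONTEXT.get() == null) throw new AssertionError("missing context");
            System.out.println(CONTEXT.get());
        };
        Thread t1 = new Thread(check, "t1");
        Thread t2 = new Thread(check, "t2");
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
```

Note that the `synchronized` on initialValue() in the proposed patch is arguably unnecessary: the hook is invoked at most once per thread, and each invocation constructs an independent instance.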

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1647) Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )

2010-09-16 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910259#action_12910259
 ] 

He Yongqiang commented on HIVE-1647:


It's good to have the fix you proposed. Can you post a patch?

> Incorrect initialization of thread local variable inside IOContext ( 
> implementation is not threadsafe ) 
> 
>
> Key: HIVE-1647
> URL: https://issues.apache.org/jira/browse/HIVE-1647
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Server Infrastructure
>Affects Versions: 0.5.0
>Reporter: Raman Grover
> Fix For: 0.3.1
>
>   Original Estimate: 0.17h
>  Remaining Estimate: 0.17h
>
> Bug in org.apache.hadoop.hive.ql.io.IOContext
> in relation to initialization of thread local variable.
>  
> public class IOContext {
>  
>   private static ThreadLocal<IOContext> threadLocal = new 
> ThreadLocal<IOContext>(){ };
>  
>   static {
> if (threadLocal.get() == null) {
>   threadLocal.set(new IOContext());
> }
>   }
>  
> In a multi-threaded environment, the thread that gets to load the class first 
> for the JVM (assuming threads share the classloader),
> gets to initialize itself correctly by executing the code in the static 
> block. Once the class is loaded, 
> any subsequent threads would  have their respective threadlocal variable as 
> null.  Since IOContext
> is set during initialization of HiveRecordReader, In a scenario where 
> multiple threads get to acquire
> an instance of HiveRecordReader, it would result in an NPE for all but the 
> first thread that gets to load the class in the VM.
>  
> Is the above scenario of multiple threads initializing HiveRecordReader a 
> typical one? Or could we just provide the following fix:
>  
>   private static ThreadLocal<IOContext> threadLocal = new 
> ThreadLocal<IOContext>(){
> protected synchronized IOContext initialValue() {
>   return new IOContext();
> }  
>   };

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1647) Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )

2010-09-16 Thread Raman Grover (JIRA)
Incorrect initialization of thread local variable inside IOContext ( 
implementation is not threadsafe ) 


 Key: HIVE-1647
 URL: https://issues.apache.org/jira/browse/HIVE-1647
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Server Infrastructure
Affects Versions: 0.5.0
Reporter: Raman Grover
 Fix For: 0.3.1


Bug in org.apache.hadoop.hive.ql.io.IOContext
in relation to initialization of thread local variable.
 
public class IOContext {
 
  private static ThreadLocal<IOContext> threadLocal = new 
ThreadLocal<IOContext>(){ };
 
  static {
if (threadLocal.get() == null) {
  threadLocal.set(new IOContext());
}
  }
 
In a multi-threaded environment, the thread that gets to load the class first 
for the JVM (assuming threads share the classloader),
gets to initialize itself correctly by executing the code in the static block. 
Once the class is loaded, 
any subsequent threads would  have their respective threadlocal variable as 
null.  Since IOContext
is set during initialization of HiveRecordReader, In a scenario where multiple 
threads get to acquire
an instance of HiveRecordReader, it would result in an NPE for all but the 
first thread that gets to load the class in the VM.
 
Is the above scenario of multiple threads initializing HiveRecordReader a 
typical one? Or could we just provide the following fix:
 
  private static ThreadLocal<IOContext> threadLocal = new 
ThreadLocal<IOContext>(){
protected synchronized IOContext initialValue() {
  return new IOContext();
}  
  };


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1633) CombineHiveInputFormat fails with "cannot find dir for emptyFile"

2010-09-16 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910255#action_12910255
 ] 

He Yongqiang commented on HIVE-1633:


Can you search 
hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1 
(replacing xxx with actual file/host names)?

It should appear once in pathToPartitionInfo and a second time as 
"hdfs://xxx/.../hive_2010-09-07_12-15-00_299_4877141498303008976/-mr-10002/1/emptyFile".


> CombineHiveInputFormat fails with "cannot find dir for emptyFile"
> -
>
> Key: HIVE-1633
> URL: https://issues.apache.org/jira/browse/HIVE-1633
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Clients
>Reporter: Amareshwari Sriramadasu
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1625) Added implementation to HivePreparedStatement, HiveBaseResultSet and HiveQueryResultSet.

2010-09-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1625:
-

Status: Open  (was: Patch Available)

> Added implementation to HivePreparedStatement, HiveBaseResultSet and 
> HiveQueryResultSet.
> 
>
> Key: HIVE-1625
> URL: https://issues.apache.org/jira/browse/HIVE-1625
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Sean Flatley
>Assignee: Sean Flatley
> Attachments: changelog.txt, HIVE-1625.patch, testJdbcDriver.log
>
>
> We implemented several of the HivePreparedStatement set methods, such as 
> setString(int, String), and the means to substitute placeholders in the SQL 
> with the values set.  
> HiveQueryResultSet and HiveBaseResultSet were enhanced so that getStatement() 
> could be implemented.
> See attached change log for details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HIVE-1645) ability to specify parent directory for zookeeper lock manager

2010-09-16 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang resolved HIVE-1645.


Resolution: Fixed

I just committed! Thanks Namit!

> ability to specify parent directory for zookeeper lock manager
> --
>
> Key: HIVE-1645
> URL: https://issues.apache.org/jira/browse/HIVE-1645
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.1645.1.patch
>
>
> For concurrency support, it would be desirable if all the locks were created 
> under a common parent, so that zookeeper can be used
> for different purposes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1609) Support partition filtering in metastore

2010-09-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910253#action_12910253
 ] 

John Sichi commented on HIVE-1609:
--

@Ajay:  thanks for the explanations; I'm fine with those choices.

> Support partition filtering in metastore
> 
>
> Key: HIVE-1609
> URL: https://issues.apache.org/jira/browse/HIVE-1609
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Ajay Kidave
>Assignee: Ajay Kidave
> Fix For: 0.7.0
>
> Attachments: hive_1609.patch, hive_1609_2.patch, hive_1609_3.patch
>
>
> The metastore needs to have support for returning a list of partitions based 
> on user specified filter conditions. This will be useful for tools which need 
> to do partition pruning. Howl is one such use case. The way partition pruning 
> is done during hive query execution need not be changed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1646) Hive 0.5 Build Crashing

2010-09-16 Thread Stephen Watt (JIRA)
Hive 0.5 Build Crashing
---

 Key: HIVE-1646
 URL: https://issues.apache.org/jira/browse/HIVE-1646
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.5.0
 Environment: SLES 10 SP2, SLES 11, RHEL 5.4, ANT 1.8.1 and ANT 1.7.1, 
SUN JDK 1.6.14
Reporter: Stephen Watt
 Fix For: 0.5.1


I've tried this on a variety of configurations. Operating Systems SLES 10 SP2, 
SLES 11, RHEL 5.4 on a variety of machines using both ANT 1.8.1 and ANT 1.7.1 
and SUN JDK 1.6.14. I've tried building this by going to the Hive Release page 
and downloading hive-0.5.0-src and using that. I've tried building by obtaining 
the branch tag release using svn checkout 
http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0/ hive-0.5.0-dev. 
Always the same thing:

When I run the Hive 0.5 build it runs for just under 2 hours and then crashes 
with the following message (tail end of ant.log):
- - -
[junit] diff 
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build/ql/test/logs/negative/wrong_distinct2.q.out
 
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/ql/src/test/results/compiler/errors/wrong_distinct2.q.out
[junit] Done query: wrong_distinct2.q
[junit] Tests run: 31, Failures: 0, Errors: 0, Time elapsed: 90.974 sec
[junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.525 sec

BUILD FAILED
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:151: The following error 
occurred while executing this line:
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build.xml:91: The following error 
occurred while executing this line:
/home/hive/hive-0.5.0-build/hive-0.5.0-dev/build-common.xml:327: Tests failed!

Total time: 94 minutes 43 seconds
- - -

My build script is very simple:

#!/bin/sh

# Set Build Dependencies
set PATH=$PATH:/home/hive/Java-Versions/jdk1.6.0_14/bin/

export JAVA_HOME=/home/hive/Java-Versions/jdk1.6.0_14
export BUILD_DIR=/home/hive/hive-0.5.0-build
export ANT_HOME=$BUILD_DIR/apache-ant-1.8.1
export HIVE_INSTALL=$BUILD_DIR/hive-0.5.0-dev/
export PATH=$PATH:$ANT_HOME/bin

# Run Build and Unit Test
cd $HIVE_INSTALL

ant clean test tar -logfile ant.log

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-675) add database/schema support Hive QL

2010-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-675:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed in 0.6. Thanks Carl

> add database/schema support Hive QL
> ---
>
> Key: HIVE-675
> URL: https://issues.apache.org/jira/browse/HIVE-675
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor
>Reporter: Prasad Chakka
>Assignee: Carl Steinbach
> Fix For: 0.6.0, 0.7.0
>
> Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, 
> hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, 
> hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, 
> HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, 
> HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, 
> HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, 
> HIVE-675.13.patch.txt
>
>
> Currently all Hive tables reside in single namespace (default). Hive should 
> support multiple namespaces (databases or schemas) such that users can create 
> tables in their specific namespaces. These name spaces can have different 
> warehouse directories (with a default naming scheme) and possibly different 
> properties.
> There is already some support for this in metastore but Hive query parser 
> should have this feature as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1639) ExecDriver.addInputPaths() error if partition name contains a comma

2010-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1639:
-

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Committed. Thanks Ning

> ExecDriver.addInputPaths() error if partition name contains a comma
> ---
>
> Key: HIVE-1639
> URL: https://issues.apache.org/jira/browse/HIVE-1639
> Project: Hadoop Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1639.2.patch, HIVE-1639.patch
>
>
> The ExecDriver.addInputPaths() calls FileInputFormat.addPaths(), which takes 
> a comma-separated string representing a set of paths. If the path name of an 
> input file contains a comma, this code throws an exception: 
> java.lang.IllegalArgumentException: Can not create a Path from an empty 
> string.
> Instead of calling FileInputFormat.addPaths(), ExecDriver.addInputPaths 
> should iterate all paths and call FileInputFormat.addInputPath. 
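The failure mode is easy to reproduce with plain string handling: once paths are joined on commas, a comma inside any single path can no longer be distinguished from a separator. A standalone sketch (not Hive code; the path values are made up for the demo):

```java
import java.util.Arrays;

// Joining paths with commas is lossy when a path itself contains a comma:
// the round trip splits that path into extra (possibly empty) entries,
// which is how an "empty string" Path error can arise downstream.
class CommaPathDemo {
    static String[] roundTrip(String[] paths) {
        return String.join(",", paths).split(",");
    }

    public static void main(String[] args) {
        String[] paths = {
            "/warehouse/t/ds=2010-09-16",
            "/warehouse/t/key=a,b"   // hypothetical partition value with a comma
        };
        System.out.println(roundTrip(paths).length);  // 3, not 2
        System.out.println(Arrays.toString(roundTrip(paths)));
    }
}
```

Iterating the paths and adding each one individually, as the description suggests, sidesteps the lossy join entirely.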

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1637) String partitions that are equal to "" or start with spaces or contains periods or asterisks cause errors

2010-09-16 Thread Raviv M-G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raviv M-G updated HIVE-1637:


Summary: String partitions that are equal to "" or start with spaces or 
contains periods or asterisks cause errors  (was: String partitions that are 
equal to "" or start with spaces or numbers cause errors)
Description: 
When a string partition is equal to "" (the empty string), starts with a 
space, or contains periods or asterisks, joins on that table or selecting one 
field from that table will cause errors.  You can, however, still perform 
partition-specific queries without a problem.


> select UT from researchaddress limit 10;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Cannot run job locally: Input Size (= 134217728) is larger than 
hive.exec.mode.local.auto.inputbytes.max (= 134217728)
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
at org.apache.hadoop.fs.Path.(Path.java:90)
at 
org.apache.hadoop.mapred.FileInputFormat.addInputPaths(FileInputFormat.java:296)
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.addInputPaths(ExecDriver.java:1182)
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:569)
at 
org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:120)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
Job Submission failed with exception 'java.lang.IllegalArgumentException(Can 
not create a Path from an empty string)'


And attempting to rename a table with such a partition causes:

hive> alter table researchaddress rename to researchaddress2;
Invalid alter operation: Old partition location  is invalid. 
(hdfs://ourwebsite.org:54310/user/hive/warehouse/researchaddress/nu=0 DENMARK)



  was:
When a string partition is equal to "" (the empty string) or starts with a 
space or with a number joins on that table or selecting one field from that 
table will cause errors.  You can, however, still perform partition specific 
queries without a problem.


> select UT from researchaddress limit 10;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Cannot run job locally: Input Size (= 134217728) is larger than 
hive.exec.mode.local.auto.inputbytes.max (= 134217728)
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
at org.apache.hadoop.fs.Path.(Path.java:90)
at 
org.apache.hadoop.mapred.FileInputFormat.addInputPaths(FileInputFormat.java:296)
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.addInputPaths(ExecDriver.java:1182)
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:569)
at 
org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:120)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
Job Submission failed with exception 'java.lang.IllegalArgumentException(Can 
not create a Path from an empty string)'


And attempting to rename a table with such a partition causes:

hive> alter table researchaddress rename to researchaddress2;
Invalid alter operation: Old partition location  is invalid. 
(hdfs://ourwebsite.org:54310/user/hive/warehouse/researchaddress/nu=0 DENMARK)




> String partitions that are equal to "" or start with spaces or contains 
> periods or asterisks cause errors
> -
>
> Key: HIVE-1637
> URL: https://issues.apache.org/jira/browse/HIVE-1637
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
> Environment: java 1.6
>Reporter: Raviv M-G
>Priority: Minor
>
> When a string partition is equal to "" (the empty string), starts with a 
> space, or contains periods or asterisks, joins on that table or selecting one 
> field from that table will cause errors.  You can, however, still perform 
> partition-specific queries without a problem.
> > select UT from researchaddress limit 10;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Cannot run job locally: Input Size (= 134217728) is larger than 
> hive.exec.mode.local.auto.inputbytes.max (= 134217728)
> java.lang.IllegalArgume