[jira] Created: (HIVE-1679) MetaStore does not detect and rollback failed transactions
MetaStore does not detect and rollback failed transactions
----------------------------------------------------------

                 Key: HIVE-1679
                 URL: https://issues.apache.org/jira/browse/HIVE-1679
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Carl Steinbach


Most of the methods in HiveMetaStore and ObjectStore adhere to the following idiom when interacting with the ObjectStore:

{code}
boolean success = false;
try {
  ms.openTransaction();
  /* do some stuff */
  success = ms.commitTransaction();
} finally {
  if (!success) {
    ms.rollbackTransaction();
  }
}
{code}

The problem with this is that ObjectStore.commitTransaction() always returns TRUE:

{code}
public boolean commitTransaction() {
  assert (openTrasactionCalls >= 1);
  if (!currentTransaction.isActive()) {
    throw new RuntimeException("Commit is called, but transaction is not active. Either there are"
        + " mismatching open and close calls or rollback was called in the same trasaction");
  }
  openTrasactionCalls--;
  if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
    transactionStatus = TXN_STATUS.COMMITED;
    currentTransaction.commit();
  }
  return true;
}
{code}

Consequently, the transaction appears to always succeed and ObjectStore is never directed to rollback transactions that have actually failed.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
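A minimal sketch (not Hive's actual patch) of a commitTransaction() that reports real success instead of unconditionally returning true. The TXN_STATUS values and the openTrasactionCalls spelling mirror the snippet above; the JDO currentTransaction object is reduced to a plain boolean stub:

```java
public class CommitSketch {
    enum TXN_STATUS { OPEN, COMMITED, ROLLBACK }

    static TXN_STATUS transactionStatus = TXN_STATUS.OPEN;
    static int openTrasactionCalls = 1;  // spelling kept from the original field
    static boolean txnActive = true;     // stand-in for currentTransaction.isActive()

    static boolean commitTransaction() {
        // Sketched guard: a transaction that was already rolled back cannot
        // be committed, so report failure instead of pretending success.
        if (transactionStatus == TXN_STATUS.ROLLBACK) {
            return false;
        }
        if (!txnActive) {
            throw new RuntimeException("Commit is called, but transaction is not active.");
        }
        openTrasactionCalls--;
        if (openTrasactionCalls == 0 && txnActive) {
            transactionStatus = TXN_STATUS.COMMITED;
            txnActive = false;           // stand-in for currentTransaction.commit()
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(commitTransaction());   // prints true: normal commit
        transactionStatus = TXN_STATUS.ROLLBACK;
        System.out.println(commitTransaction());   // prints false: rolled back earlier
    }
}
```

With a return value that can actually be false, the `if (!success) ms.rollbackTransaction();` idiom in the callers starts doing useful work.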
[jira] Created: (HIVE-1681) ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
--------------------------------------------------------------------------------------------------------

                 Key: HIVE-1681
                 URL: https://issues.apache.org/jira/browse/HIVE-1681
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach


Here's the code for ObjectStore.commitTransaction() and ObjectStore.rollbackTransaction():

{code}
public boolean commitTransaction() {
  assert (openTrasactionCalls >= 1);
  if (!currentTransaction.isActive()) {
    throw new RuntimeException("Commit is called, but transaction is not active. Either there are"
        + " mismatching open and close calls or rollback was called in the same trasaction");
  }
  openTrasactionCalls--;
  if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
    transactionStatus = TXN_STATUS.COMMITED;
    currentTransaction.commit();
  }
  return true;
}

public void rollbackTransaction() {
  if (openTrasactionCalls < 1) {
    return;
  }
  openTrasactionCalls = 0;
  if (currentTransaction.isActive() && transactionStatus != TXN_STATUS.ROLLBACK) {
    transactionStatus = TXN_STATUS.ROLLBACK;
    // could already be rolled back
    currentTransaction.rollback();
  }
}
{code}

Now suppose a nested transaction throws an exception which results in the nested pseudo-transaction calling rollbackTransaction(). This causes rollbackTransaction() to rollback the actual transaction, as well as to set openTransactionCalls=0 and transactionStatus = TXN_STATUS.ROLLBACK. Suppose also that this nested transaction squelches the original exception. In this case the stack will unwind and the caller will eventually try to commit the transaction by calling commitTransaction(), which will see that currentTransaction.isActive() returns FALSE and will throw a RuntimeException.

The fix for this problem is that commitTransaction() needs to first check transactionStatus and return immediately if transactionStatus == TXN_STATUS.ROLLBACK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
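The failure mode described above can be sketched end to end. A nested pseudo-transaction rolls back, and with the proposed transactionStatus check in place the outer commitTransaction() returns false instead of throwing a RuntimeException. Names follow the quoted snippet; the JDO Transaction object is reduced to a boolean, so this is an illustration, not the committed patch:

```java
public class NestedTxnSketch {
    enum TXN_STATUS { NO_STATE, OPEN, COMMITED, ROLLBACK }

    static TXN_STATUS transactionStatus = TXN_STATUS.NO_STATE;
    static int openTrasactionCalls = 0;
    static boolean txnActive = false;

    static boolean openTransaction() {
        openTrasactionCalls++;
        if (openTrasactionCalls == 1) {
            txnActive = true;                 // currentTransaction.begin()
            transactionStatus = TXN_STATUS.OPEN;
        }
        return txnActive;
    }

    static void rollbackTransaction() {
        if (openTrasactionCalls < 1) {
            return;
        }
        openTrasactionCalls = 0;
        if (txnActive && transactionStatus != TXN_STATUS.ROLLBACK) {
            transactionStatus = TXN_STATUS.ROLLBACK;
            txnActive = false;                // currentTransaction.rollback()
        }
    }

    static boolean commitTransaction() {
        if (transactionStatus == TXN_STATUS.ROLLBACK) {
            return false;                     // the proposed early return
        }
        if (!txnActive) {
            throw new RuntimeException("Commit is called, but transaction is not active.");
        }
        openTrasactionCalls--;
        if (openTrasactionCalls == 0) {
            transactionStatus = TXN_STATUS.COMMITED;
            txnActive = false;                // currentTransaction.commit()
        }
        return true;
    }

    public static void main(String[] args) {
        openTransaction();        // outer caller
        openTransaction();        // nested pseudo-transaction
        rollbackTransaction();    // nested work fails and rolls everything back
        System.out.println(commitTransaction());  // prints false, no exception
    }
}
```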
[jira] Updated: (HIVE-1681) ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
[ https://issues.apache.org/jira/browse/HIVE-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1681:
---------------------------------

    Attachment: HIVE-1681.1.patch.txt

ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
--------------------------------------------------------------------------------------------------------

                 Key: HIVE-1681
                 URL: https://issues.apache.org/jira/browse/HIVE-1681
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
         Attachments: HIVE-1681.1.patch.txt

Here's the code for ObjectStore.commitTransaction() and ObjectStore.rollbackTransaction():

{code}
public boolean commitTransaction() {
  assert (openTrasactionCalls >= 1);
  if (!currentTransaction.isActive()) {
    throw new RuntimeException("Commit is called, but transaction is not active. Either there are"
        + " mismatching open and close calls or rollback was called in the same trasaction");
  }
  openTrasactionCalls--;
  if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
    transactionStatus = TXN_STATUS.COMMITED;
    currentTransaction.commit();
  }
  return true;
}

public void rollbackTransaction() {
  if (openTrasactionCalls < 1) {
    return;
  }
  openTrasactionCalls = 0;
  if (currentTransaction.isActive() && transactionStatus != TXN_STATUS.ROLLBACK) {
    transactionStatus = TXN_STATUS.ROLLBACK;
    // could already be rolled back
    currentTransaction.rollback();
  }
}
{code}

Now suppose a nested transaction throws an exception which results in the nested pseudo-transaction calling rollbackTransaction(). This causes rollbackTransaction() to rollback the actual transaction, as well as to set openTransactionCalls=0 and transactionStatus = TXN_STATUS.ROLLBACK. Suppose also that this nested transaction squelches the original exception. In this case the stack will unwind and the caller will eventually try to commit the transaction by calling commitTransaction(), which will see that currentTransaction.isActive() returns FALSE and will throw a RuntimeException.

The fix for this problem is that commitTransaction() needs to first check transactionStatus and return immediately if transactionStatus == TXN_STATUS.ROLLBACK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-1157.
----------------------------------

    Resolution: Duplicate

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

                 Key: HIVE-1157
                 URL: https://issues.apache.org/jira/browse/HIVE-1157
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Philip Zeyliger
            Priority: Minor
         Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.patch.v6.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
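The approach agreed on above can be sketched in a few lines: when "add jar" is handed a remote URI such as hdfs://..., stage the jar into a local temp file before putting it on the classpath. The actual HDFS copy (Hadoop's FileSystem.copyToLocalFile) is elided to a comment here, and the localizeJar() name is illustrative, not Hive's API:

```java
import java.io.File;
import java.net.URI;

public class AddJarSketch {
    static File localizeJar(String jarPath) throws Exception {
        URI uri = URI.create(jarPath);
        String scheme = uri.getScheme();
        if (scheme == null || scheme.equals("file")) {
            return new File(jarPath);   // already local, use as-is
        }
        // Remote jar: stage it into a local temp file, then hand that file
        // to the URLClassLoader behind "add jar".
        File local = File.createTempFile("hive-addjar-", ".jar");
        // fs.copyToLocalFile(new Path(jarPath), new Path(local.getPath()));
        return local;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(localizeJar("/tmp/FooTest.jar").getPath());
        System.out.println(localizeJar("hdfs://localhost/FooTest.jar").getName());
    }
}
```

Downloading, as opposed to teaching URLClassLoader to read HDFS directly, keeps the classloading path untouched, which is the trade-off Philip and Zheng settle on in the thread.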
[jira] Updated: (HIVE-1427) Provide metastore schema migration scripts (0.5 - 0.6)
[ https://issues.apache.org/jira/browse/HIVE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1427:
---------------------------------

    Attachment: HIVE-1427.1.patch.txt

HIVE-1427.1.patch.txt:
* Upgrade scripts for derby and mysql.
* Includes all schema changes between 0.5.0 and branch-0.6, along with proposed changes in HIVE-1364.

I'm in the process of running upgrade tests on Derby and MySQL.

Provide metastore schema migration scripts (0.5 - 0.6)
------------------------------------------------------

                 Key: HIVE-1427
                 URL: https://issues.apache.org/jira/browse/HIVE-1427
             Project: Hadoop Hive
          Issue Type: Task
          Components: Metastore
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.6.0
         Attachments: HIVE-1427.1.patch.txt

At a minimum this ticket covers packaging up example MySQL migration scripts (cumulative across all schema changes from 0.5 to 0.6) and explaining what to do with them in the release notes. This is also probably a good point at which to decide and clearly state which Metastore DBs we officially support in production, e.g. do we need to provide migration scripts for Derby?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1427) Provide metastore schema migration scripts (0.5 - 0.6)
[ https://issues.apache.org/jira/browse/HIVE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1427: - Status: Patch Available (was: Open) Provide metastore schema migration scripts (0.5 - 0.6) --- Key: HIVE-1427 URL: https://issues.apache.org/jira/browse/HIVE-1427 Project: Hadoop Hive Issue Type: Task Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0 Attachments: HIVE-1427.1.patch.txt At a minimum this ticket covers packaging up example MySQL migration scripts (cumulative across all schema changes from 0.5 to 0.6) and explaining what to do with them in the release notes. This is also probably a good point at which to decide and clearly state which Metastore DBs we officially support in production, e.g. do we need to provide migration scripts for Derby? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-1676) show table extended like does not work well with wildcards
[ https://issues.apache.org/jira/browse/HIVE-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-1676.
----------------------------------

    Resolution: Duplicate

show table extended like does not work well with wildcards
----------------------------------------------------------

                 Key: HIVE-1676
                 URL: https://issues.apache.org/jira/browse/HIVE-1676
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Pradeep Kamath
            Priority: Minor

As evident from the output below, though there are tables that match the wildcard, the output from show table extended like does not contain the matches.

{noformat}
bin/hive -e "show tables 'foo*'"
Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201009301037_568707409.txt
OK
foo
foo2
Time taken: 3.417 seconds

bin/hive -e "show table extended like 'foo*'"
Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201009301037_410056681.txt
OK
Time taken: 2.948 seconds
{noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1427) Provide metastore schema migration scripts (0.5 - 0.6)
[ https://issues.apache.org/jira/browse/HIVE-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916750#action_12916750 ]

Carl Steinbach commented on HIVE-1427:
--------------------------------------

@Ning: Will do. Can someone please review and commit HIVE-1364, since this ticket depends on it?

Provide metastore schema migration scripts (0.5 - 0.6)
------------------------------------------------------

                 Key: HIVE-1427
                 URL: https://issues.apache.org/jira/browse/HIVE-1427
             Project: Hadoop Hive
          Issue Type: Task
          Components: Metastore
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.6.0

At a minimum this ticket covers packaging up example MySQL migration scripts (cumulative across all schema changes from 0.5 to 0.6) and explaining what to do with them in the release notes. This is also probably a good point at which to decide and clearly state which Metastore DBs we officially support in production, e.g. do we need to provide migration scripts for Derby?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1526: - Fix Version/s: (was: 0.6.0) Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.7.0 Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916751#action_12916751 ]

Carl Steinbach commented on HIVE-1526:
--------------------------------------

@Ning: I removed the 0.6 tag. Can you please review this change? Thanks.

Hive should depend on a release version of Thrift
-------------------------------------------------

                 Key: HIVE-1526
                 URL: https://issues.apache.org/jira/browse/HIVE-1526
             Project: Hadoop Hive
          Issue Type: Task
          Components: Build Infrastructure, Clients
            Reporter: Carl Steinbach
            Assignee: Todd Lipcon
             Fix For: 0.7.0
         Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar

Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-1524) parallel execution failed if mapred.job.name is set
[ https://issues.apache.org/jira/browse/HIVE-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-1524: -- Reopening for backport to 0.6 parallel execution failed if mapred.job.name is set --- Key: HIVE-1524 URL: https://issues.apache.org/jira/browse/HIVE-1524 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0 Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1524-for-Hive-0.6.patch, HIVE-1524.2.patch, HIVE-1524.patch The plan file name was generated based on mapred.job.name. If the user specify mapred.job.name before the query, two parallel queries will have conflict plan file name. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1524) parallel execution failed if mapred.job.name is set
[ https://issues.apache.org/jira/browse/HIVE-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1524: - Fix Version/s: 0.6.0 Component/s: Query Processor parallel execution failed if mapred.job.name is set --- Key: HIVE-1524 URL: https://issues.apache.org/jira/browse/HIVE-1524 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0 Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1524-for-Hive-0.6.patch, HIVE-1524.2.patch, HIVE-1524.patch The plan file name was generated based on mapred.job.name. If the user specify mapred.job.name before the query, two parallel queries will have conflict plan file name. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915961#action_12915961 ]

Carl Steinbach commented on HIVE-675:
-------------------------------------

Hi Ning, I'm looking into it.

add database/schema support Hive QL
-----------------------------------

                 Key: HIVE-675
                 URL: https://issues.apache.org/jira/browse/HIVE-675
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Metastore, Query Processor
            Reporter: Prasad Chakka
            Assignee: Carl Steinbach
             Fix For: 0.6.0, 0.7.0
         Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt

Currently all Hive tables reside in a single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These namespaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in the metastore, but the Hive query parser should have this feature as well.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915986#action_12915986 ]

Carl Steinbach commented on HIVE-675:
-------------------------------------

@Ning: Can you please delete metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java? In the backport patch this file is deleted and replaced with TestRemoteHiveMetaStore.java, but it looks like for some reason this file was not actually deleted when the patch was applied.

add database/schema support Hive QL
-----------------------------------

                 Key: HIVE-675
                 URL: https://issues.apache.org/jira/browse/HIVE-675
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Metastore, Query Processor
            Reporter: Prasad Chakka
            Assignee: Carl Steinbach
             Fix For: 0.6.0, 0.7.0
         Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt

Currently all Hive tables reside in a single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These namespaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in the metastore, but the Hive query parser should have this feature as well.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1157:
---------------------------------

    Attachment: HIVE-1157.patch.v5.txt

Attaching an updated version of Phil's patch that applies cleanly with -p0.

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

                 Key: HIVE-1157
                 URL: https://issues.apache.org/jira/browse/HIVE-1157
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Philip Zeyliger
            Priority: Minor
         Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1157:
---------------------------------

    Status: Patch Available  (was: Open)

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

                 Key: HIVE-1157
                 URL: https://issues.apache.org/jira/browse/HIVE-1157
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Philip Zeyliger
            Priority: Minor
         Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1526: - Status: Patch Available (was: Open) Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1526:
---------------------------------

    Attachment: HIVE-1526.2.patch.txt

HIVE-1526.2.patch.txt:
* Manage slf4j dependencies with Ivy.
* Added slf4j dependencies to eclipse classpath.
* Added thriftif macro to ${hive.root}/build.xml which triggers recompilation of all thrift stubs.
* Modified odbc/Makefile to use Thrift libs and headers in THRIFT_HOME instead of the ones that were checked into service/include.
* Modified odbc/Makefile to build thrift-generated cpp artifacts in ql/src.
* Removed thrift headers/code from service/include (HIVE-1527).
* Added some missing #includes to the hiveclient source files in odbc/src/cpp.

Testing:
* Tested eclipse launch configurations.
* Built CPP hiveclient lib and tested against HiveServer using the HiveClientTestC program.

Hive should depend on a release version of Thrift
-------------------------------------------------

                 Key: HIVE-1526
                 URL: https://issues.apache.org/jira/browse/HIVE-1526
             Project: Hadoop Hive
          Issue Type: Task
          Components: Build Infrastructure, Clients
            Reporter: Carl Steinbach
            Assignee: Todd Lipcon
             Fix For: 0.6.0, 0.7.0
         Attachments: HIVE-1526.2.patch.txt, hive-1526.txt, libfb303.jar, libthrift.jar

Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR
[ https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1530:
---------------------------------

           Status: Patch Available  (was: Open)
         Assignee: Carl Steinbach
    Fix Version/s: 0.7.0

Include hive-default.xml and hive-log4j.properties in hive-common JAR
---------------------------------------------------------------------

                 Key: HIVE-1530
                 URL: https://issues.apache.org/jira/browse/HIVE-1530
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Configuration
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.7.0
         Attachments: HIVE-1530.1.patch.txt

hive-common-*.jar should include hive-default.xml and hive-log4j.properties, and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The hive-default.xml file that currently sits in the conf/ directory should be removed.

Motivations for this change:
* We explicitly tell users that they should never modify hive-default.xml, yet give them the opportunity to do so by placing the file in the conf dir.
* Many users are familiar with the Hadoop configuration mechanism, which does not require *-default.xml files to be present in HADOOP_CONF_DIR, and assume that the same is true for HIVE_CONF_DIR.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
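Why bundling defaults inside the jar works: Hadoop-style configuration loaders resolve *-default.xml as classpath resources, so a file packaged in hive-common-*.jar is found the same way as one sitting in a conf/ directory. A generic sketch of that lookup (the resource name is from the issue; the surrounding class is illustrative, not Hive code):

```java
import java.io.InputStream;
import java.net.URL;

public class DefaultConfLookup {
    public static void main(String[] args) throws Exception {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        // Returns the first hive-default.xml visible on the classpath,
        // whether it sits in a directory or inside a jar; null if absent.
        URL res = cl.getResource("hive-default.xml");
        System.out.println(res == null ? "not on classpath" : res.toExternalForm());
        if (res != null) {
            try (InputStream in = res.openStream()) {
                System.out.println("readable: " + (in.read() >= 0));
            }
        }
    }
}
```

Because lookup is by classpath rather than by path under HIVE_CONF_DIR, removing hive-default.xml from conf/ costs users nothing, which is the second motivation listed above.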
[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913933#action_12913933 ]

Carl Steinbach commented on HIVE-1526:
--------------------------------------

@Todd: Can you please regenerate this patch? Both 'patch -p0' and 'git apply -p0' fail. Thanks.

Hive should depend on a release version of Thrift
-------------------------------------------------

                 Key: HIVE-1526
                 URL: https://issues.apache.org/jira/browse/HIVE-1526
             Project: Hadoop Hive
          Issue Type: Task
          Components: Build Infrastructure, Clients
            Reporter: Carl Steinbach
            Assignee: Todd Lipcon
             Fix For: 0.6.0, 0.7.0
         Attachments: hive-1526.txt, libfb303.jar, libthrift.jar

Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1526: - Fix Version/s: 0.6.0 0.7.0 Component/s: Clients Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.6.0, 0.7.0 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
[ https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1364:
---------------------------------

    Attachment: HIVE-1364.3.patch.txt
                HIVE-1364.3.backport-060.patch.txt

Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
--------------------------------------------------------------------------------

                 Key: HIVE-1364
                 URL: https://issues.apache.org/jira/browse/HIVE-1364
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.5.0
            Reporter: Carl Steinbach
            Assignee: Carl Steinbach
             Fix For: 0.6.0, 0.7.0
         Attachments: HIVE-1364.2.patch.txt, HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch

The value component of a SERDEPROPERTIES key/value pair is currently limited to a maximum length of 767 characters. I believe that the motivation for limiting the length to 767 characters is that this value is the maximum allowed length of an index in a MySQL database running on the InnoDB engine: http://bugs.mysql.com/bug.php?id=13315

* The Metastore OR mapping currently limits many fields (including SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite the fact that these fields are not indexed.
* The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
* We can expect many users to hit the 767 character limit on SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping serdeproperty to map a table that has many columns.

I propose increasing the maximum allowed length of SERDEPROPERTIES.PARAM_VALUE to 8192.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
[ https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1364: - Status: Patch Available (was: Open) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters) Key: HIVE-1364 URL: https://issues.apache.org/jira/browse/HIVE-1364 Project: Hadoop Hive Issue Type: Bug Components: Metastore Affects Versions: 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1364.2.patch.txt, HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch The value component of a SERDEPROPERTIES key/value pair is currently limited to a maximum length of 767 characters. I believe that the motivation for limiting the length to 767 characters is that this value is the maximum allowed length of an index in a MySQL database running on the InnoDB engine: http://bugs.mysql.com/bug.php?id=13315 * The Metastore OR mapping currently limits many fields (including SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite the fact that these fields are not indexed. * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535. * We can expect many users to hit the 767 character limit on SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping serdeproperty to map a table that has many columns. I propose increasing the maximum allowed length of SERDEPROPERTIES.PARAM_VALUE to 8192. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)
[ https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1364: - Attachment: HIVE-1364.4.patch.txt HIVE-1364.4.backport-060.patch.txt Updated version of the patch with changes requested by John. Increase the maximum length of SERDEPROPERTIES values (currently 767 characters) Key: HIVE-1364 URL: https://issues.apache.org/jira/browse/HIVE-1364 Project: Hadoop Hive Issue Type: Bug Components: Metastore Affects Versions: 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1364.2.patch.txt, HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.4.backport-060.patch.txt, HIVE-1364.4.patch.txt, HIVE-1364.patch The value component of a SERDEPROPERTIES key/value pair is currently limited to a maximum length of 767 characters. I believe that the motivation for limiting the length to 767 characters is that this value is the maximum allowed length of an index in a MySQL database running on the InnoDB engine: http://bugs.mysql.com/bug.php?id=13315 * The Metastore OR mapping currently limits many fields (including SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite the fact that these fields are not indexed. * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535. * We can expect many users to hit the 767 character limit on SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping serdeproperty to map a table that has many columns. I propose increasing the maximum allowed length of SERDEPROPERTIES.PARAM_VALUE to 8192. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift
[ https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914255#action_12914255 ] Carl Steinbach commented on HIVE-1526: -- Sorry, that was a false alarm about the patch. Turns out the github Hive mirror lags the main repo by about a week. @Todd: This patch introduces unsatisfied dependencies on slf4j-api and slf4j-log4j12. Can you please update the patch to pull these dependencies down with Ivy? Hive should depend on a release version of Thrift - Key: HIVE-1526 URL: https://issues.apache.org/jira/browse/HIVE-1526 Project: Hadoop Hive Issue Type: Task Components: Build Infrastructure, Clients Reporter: Carl Steinbach Assignee: Todd Lipcon Fix For: 0.6.0, 0.7.0 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar Hive should depend on a release version of Thrift, and ideally it should use Ivy to resolve this dependency. The Thrift folks are working on adding Thrift artifacts to a maven repository here: https://issues.apache.org/jira/browse/THRIFT-363 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1517) ability to select across a database
[ https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1517: - Fix Version/s: (was: 0.6.0) @John: Yes, it's a checkpoint patch. Moving this to 0.7.0. ability to select across a database --- Key: HIVE-1517 URL: https://issues.apache.org/jira/browse/HIVE-1517 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: HIVE-1517.1.patch.txt After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be able to select across a database for this feature to be useful. For eg: use db1 create table foo(); use db2 select .. from db1.foo. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1517) ability to select across a database
[ https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1517: - Attachment: HIVE-1517.1.patch.txt ability to select across a database --- Key: HIVE-1517 URL: https://issues.apache.org/jira/browse/HIVE-1517 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1517.1.patch.txt After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be able to select across a database for this feature to be useful. For eg: use db1 create table foo(); use db2 select .. from db1.foo. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods
[ https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1615: - Affects Version/s: 0.7.0 Web Interface JSP needs Refactoring for removed meta store methods -- Key: HIVE-1615 URL: https://issues.apache.org/jira/browse/HIVE-1615 Project: Hadoop Hive Issue Type: Bug Components: Web UI Affects Versions: 0.6.0, 0.7.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.7.0 Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt Some meta store methods being called from JSP have been removed. Really should prioritize compiling jsp into servlet code again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1615) Web Interface JSP needs Refactoring for removed meta store methods
[ https://issues.apache.org/jira/browse/HIVE-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12910375#action_12910375 ] Carl Steinbach commented on HIVE-1615: -- This needs to be backported to 0.6. I verified that hive-1615.patch.2.txt applies cleanly to the 0.6 branch. Web Interface JSP needs Refactoring for removed meta store methods -- Key: HIVE-1615 URL: https://issues.apache.org/jira/browse/HIVE-1615 Project: Hadoop Hive Issue Type: Bug Components: Web UI Affects Versions: 0.6.0, 0.7.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.7.0 Attachments: hive-1615.patch.2.txt, hive-1615.patch.txt Some meta store methods being called from JSP have been removed. Really should prioritize compiling jsp into servlet code again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-33) [Hive]: Add ability to compute statistics on hive tables
[ https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-33: --- Issue Type: New Feature (was: Bug) [Hive]: Add ability to compute statistics on hive tables Key: HIVE-33 URL: https://issues.apache.org/jira/browse/HIVE-33 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Ashish Thusoo Assignee: Ahmed M Aly Add commands to collect partition and column level statistics in hive. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1361) table/partition level statistics
[ https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1361: - Fix Version/s: 0.7.0 Affects Version/s: (was: 0.6.0) Component/s: Query Processor table/partition level statistics Key: HIVE-1361 URL: https://issues.apache.org/jira/browse/HIVE-1361 Project: Hadoop Hive Issue Type: Sub-task Components: Query Processor Reporter: Ning Zhang Assignee: Ahmed M Aly Fix For: 0.7.0 Attachments: HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch At the first step, we gather table-level stats for non-partitioned table and partition-level stats for partitioned table. Future work could extend the table level stats to partitioned table as well. There are 3 major milestones in this subtask: 1) extend the insert statement to gather table/partition level stats on-the-fly. 2) extend metastore API to support storing and retrieving stats for a particular table/partition. 3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for existing tables/partitions. The proposed stats are: Partition-level stats: - number of rows - total size in bytes - number of files - max, min, average row sizes - max, min, average file sizes Table-level stats in addition to partition level stats: - number of partitions -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1649) Ability to update counters and status from TRANSFORM scripts
Ability to update counters and status from TRANSFORM scripts Key: HIVE-1649 URL: https://issues.apache.org/jira/browse/HIVE-1649 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Hadoop Streaming supports the ability to update counters and status by writing specially coded messages to the script's stderr stream. A streaming process can use the stderr to emit counter information. {{reporter:counter:group,counter,amount}} should be sent to stderr to update the counter. A streaming process can use the stderr to emit status information. To set a status, {{reporter:status:message}} should be sent to stderr. Hive should support these same features with its TRANSFORM mechanism. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
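The stderr protocol described above is simple enough to sketch as a small transform program. The class and method names below are illustrative only — this is not part of any Hive or Hadoop API, just a demonstration of the `reporter:counter:` and `reporter:status:` line formats a streaming/TRANSFORM process would emit:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Sketch of a TRANSFORM-style script that passes rows through on stdout
// and reports counters/status on stderr, in the format Hadoop Streaming
// recognizes. Names here are illustrative, not an existing API.
public class CounterReporter {

    // Build a counter-update line: reporter:counter:group,counter,amount
    static String counterLine(String group, String counter, long amount) {
        return "reporter:counter:" + group + "," + counter + "," + amount;
    }

    // Build a status-update line: reporter:status:message
    static String statusLine(String message) {
        return "reporter:status:" + message;
    }

    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        long rows = 0;
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line); // pass each row through unchanged
            rows++;
        }
        // Specially coded messages on stderr update counters and status.
        System.err.println(counterLine("MyTransform", "ROWS_SEEN", rows));
        System.err.println(statusLine("finished after " + rows + " rows"));
    }
}
```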
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675-backport-v6.2.patch.txt HIVE-675-backport-v6.2.patch.txt includes HIVE-1607. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Status: Patch Available (was: Reopened) add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675-backport-v6.2.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675-backport-v6.1.patch.txt Backport for 0.6.0. Should I squash HIVE-1607 into this or backport it separately? add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675-backport-v6.1.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1636) Implement SHOW TABLES {FROM | IN} db_name
Implement SHOW TABLES {FROM | IN} db_name --- Key: HIVE-1636 URL: https://issues.apache.org/jira/browse/HIVE-1636 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Make it possible to list the tables in a specific database using the following syntax borrowed from MySQL: {noformat} SHOW TABLES [{FROM|IN} db_name] {noformat} See http://dev.mysql.com/doc/refman/5.0/en/show-tables.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-802) Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
[ https://issues.apache.org/jira/browse/HIVE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-802: Attachment: datanucleus-core-1.1.2-patched.jar Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it - Key: HIVE-802 URL: https://issues.apache.org/jira/browse/HIVE-802 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.5.0 Reporter: Todd Lipcon Assignee: Arvind Prabhakar Attachments: datanucleus-core-1.1.2-patched.jar There's a bug in DataNucleus that causes this issue: http://www.jpox.org/servlet/jira/browse/NUCCORE-371 To reproduce, simply put your hive source tree in a directory that contains a '+' character. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1613) hive --service jar looks for hadoop version but was not defined
[ https://issues.apache.org/jira/browse/HIVE-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1613: - Fix Version/s: 0.7.0 hive --service jar looks for hadoop version but was not defined --- Key: HIVE-1613 URL: https://issues.apache.org/jira/browse/HIVE-1613 Project: Hadoop Hive Issue Type: Bug Components: Clients Affects Versions: 0.5.1 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.6.0, 0.7.0 Attachments: hive-1613.patch.txt hive --service jar fails. I have to open another ticket to clean up the scripts and unify functions like version detection. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1613) hive --service jar looks for hadoop version but was not defined
[ https://issues.apache.org/jira/browse/HIVE-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907402#action_12907402 ] Carl Steinbach commented on HIVE-1613: -- +1 Looks good. hive --service jar looks for hadoop version but was not defined --- Key: HIVE-1613 URL: https://issues.apache.org/jira/browse/HIVE-1613 Project: Hadoop Hive Issue Type: Bug Components: Clients Affects Versions: 0.5.1 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Blocker Fix For: 0.6.0, 0.7.0 Attachments: hive-1613.patch.txt hive --service jar fails. I have to open another ticket to clean up the scripts and unify functions like version detection. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-1607: -- Backport to 0.6.0 Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: HIVE-1607.1.patch.txt, HIVE-1607.2.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
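The reinstate-and-deprecate pattern the issue calls for typically looks like the sketch below: the old single-argument methods return as thin `@Deprecated` wrappers that delegate to the database-qualified versions added in HIVE-675. The class, the `DEFAULT_DATABASE_NAME` constant, and the delegate signatures here are assumptions for illustration, not the actual HIVE-1607 patch:

```java
// Sketch of reinstating a removed method as a deprecated wrapper.
// All names below are illustrative stand-ins, not Hive's real API.
public class LegacyMetaStoreClient {
    static final String DEFAULT_DATABASE_NAME = "default";

    // New, database-qualified API introduced by HIVE-675.
    public String getTable(String dbName, String tableName) {
        return dbName + "." + tableName; // stand-in for a real Table lookup
    }

    /** @deprecated Use the database-qualified {@link #getTable(String, String)}. */
    @Deprecated
    public String getTable(String tableName) {
        // Old callers implicitly operated on the default database.
        return getTable(DEFAULT_DATABASE_NAME, tableName);
    }
}
```

Compiling against the wrapper produces a deprecation warning, so existing callers keep working while being nudged toward the new signatures.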
[jira] Created: (HIVE-1623) Factor out Hadoop version check logic in bin/hive scripts
Factor out Hadoop version check logic in bin/hive scripts - Key: HIVE-1623 URL: https://issues.apache.org/jira/browse/HIVE-1623 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Carl Steinbach The same Hadoop version check logic is repeated in each of the following files: bin/ext/hiveserver.sh bin/ext/hwi.sh bin/ext/metastore.sh bin/ext/util/execHiveCmd.sh This code should be refactored into a version check function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Fix Version/s: 0.6.0 Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: HIVE-1607.1.patch.txt, HIVE-1607.2.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907048#action_12907048 ] Carl Steinbach commented on HIVE-675: - @Paul: No, not yet, but I think the following script should work (note that DESC is a reserved word in MySQL, so it must be quoted with backticks): {code} ALTER TABLE DBS MODIFY `DESC` VARCHAR(4000); ALTER TABLE DBS ADD COLUMN DB_LOCATION_URI VARCHAR(4000); {code} add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1517) ability to select across a database
[ https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1517: - Fix Version/s: 0.6.0 0.7.0 ability to select across a database --- Key: HIVE-1517 URL: https://issues.apache.org/jira/browse/HIVE-1517 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be able to select across a database for this feature to be useful. For eg: use db1 create table foo(); use db2 select .. from db1.foo. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-675: - Working on a backport for 0.6.0 add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Fix Version/s: 0.6.0 add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1446) Move Hive Documentation from the wiki to version control
[ https://issues.apache.org/jira/browse/HIVE-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1446: - Fix Version/s: (was: 0.6.0) Postponing this work until 0.7.0 Move Hive Documentation from the wiki to version control Key: HIVE-1446 URL: https://issues.apache.org/jira/browse/HIVE-1446 Project: Hadoop Hive Issue Type: Task Components: Documentation Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-1446-part-1.diff, hive-1446.diff, hive-logo-wide.png Move the Hive Language Manual (and possibly some other documents) from the Hive wiki to version control. This work needs to be coordinated with the hive-dev and hive-user community in order to avoid missing any edits as well as to avoid or limit unavailability of the docs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1476) Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc.
[ https://issues.apache.org/jira/browse/HIVE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905622#action_12905622 ] Carl Steinbach commented on HIVE-1476: -- @Venkatesh: THRIFT-814 covers adding SPNEGO support to Thrift. Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc. Key: HIVE-1476 URL: https://issues.apache.org/jira/browse/HIVE-1476 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: Pradeep Kamath Attachments: HIVE-1476.patch, HIVE-1476.patch.2 If the thrift metastore service is running as the user hive then all table directories as a result of create table are created as that user rather than the user who actually issued the create table command. This is different semantically from non-thrift mode (i.e. local mode) when clients directly connect to the metastore. In the latter case, directories are created as the real user. The thrift mode should do the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1476) Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc.
[ https://issues.apache.org/jira/browse/HIVE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905623#action_12905623 ] Carl Steinbach commented on HIVE-1476: -- Edit: I mean THRIFT-889. Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc. Key: HIVE-1476 URL: https://issues.apache.org/jira/browse/HIVE-1476 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: Pradeep Kamath Attachments: HIVE-1476.patch, HIVE-1476.patch.2 If the thrift metastore service is running as the user hive then all table directories as a result of create table are created as that user rather than the user who actually issued the create table command. This is different semantically from non-thrift mode (i.e. local mode) when clients directly connect to the metastore. In the latter case, directories are created as the real user. The thrift mode should do the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905686#action_12905686 ] Carl Steinbach commented on HIVE-1546: -- bq. we've agreed at the high level on the approach of creating Howl as a wrapper around Hive I thought Howl was supposed to be a wrapper around (and replacement for) the Hive metastore, not all of Hive. I think there are clear advantages to Hive and Howl sharing the same metastore code as long as they access this facility through the public API, but can't say the same for the two projects using the same CLI code if it means allowing external projects to depend on a loosely defined set of internal APIs. What benefits are we hoping to achieve by having Howl and Hive share the same CLI code, especially if Howl is only interested in a small part of it? What are the drawbacks of instead encouraging the Howl project to copy the CLI code and maintain their own version? Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905692#action_12905692 ] Carl Steinbach commented on HIVE-1546: -- What do you think of this option: we check the Howl SemanticAnalyzer into the Hive source tree and provide a config option that optionally enables it? This gives Howl the features they need without making the SemanticAnalyzer API public. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905707#action_12905707 ] Carl Steinbach commented on HIVE-1546: -- I'm +1 on the approach outlined by John. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-849) database_name.table_name.column_name not supported
[ https://issues.apache.org/jira/browse/HIVE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905719#action_12905719 ] Carl Steinbach commented on HIVE-849: - @Namit: Correct, but this issue is also covered by HIVE-1517, and the comments in that ticket provide more details, so I decided to resolve this ticket as a duplicate of HIVE-1517. database_name.table_name.column_name not supported Key: HIVE-849 URL: https://issues.apache.org/jira/browse/HIVE-849 Project: Hadoop Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Carl Steinbach -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905757#action_12905757 ] Carl Steinbach commented on HIVE-1546: -- I gather from Ashutosh's latest patch that you want to do the following: * Provide your own implementation of HiveSemanticAnalyzerFactory. * Subclass SemanticAnalyzer * Subclass DDLSemanticAnalyzer I looked at the public and protected members in these classes and think that at a minimum we would have to mark the following classes as limited private and evolving: * HiveSemanticAnalyzerFactory * BaseSemanticAnalyzer * SemanticAnalyzer * DDLSemanticAnalyzer * ASTNode * HiveParser (i.e. Hive's ANTLR grammar) * Context (org.apache.hadoop.hive.ql.Context) * Task and FetchTask * QB * QBParseInfo * QBMetaData * QBJoinTree * CreateTableDesc So anytime we touch one of these classes we would need to coordinate with the Howl folks to make sure we aren't breaking one of their plugins? I don't think this is a good tradeoff if the main benefit we can expect is a simpler build and release process for Howl. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1609) Support partition filtering in metastore
[ https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905769#action_12905769 ] Carl Steinbach commented on HIVE-1609: -- DynamicSerDe is the component that has a JavaCC dependency. I think DynamicSerDe (and TCTLSeparatedProtocol) were deprecated a long time ago. Should we try to remove this code? Support partition filtering in metastore Key: HIVE-1609 URL: https://issues.apache.org/jira/browse/HIVE-1609 Project: Hadoop Hive Issue Type: New Feature Components: Metastore Reporter: Ajay Kidave Fix For: 0.7.0 Attachments: hive_1609.patch, hive_1609_2.patch The metastore needs to have support for returning a list of partitions based on user specified filter conditions. This will be useful for tools which need to do partition pruning. Howl is one such use case. The way partition pruning is done during hive query execution need not be changed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1609) Support partition filtering in metastore
[ https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1609: - Fix Version/s: 0.7.0 (was: 0.6.0) Affects Version/s: (was: 0.5.0) Support partition filtering in metastore Key: HIVE-1609 URL: https://issues.apache.org/jira/browse/HIVE-1609 Project: Hadoop Hive Issue Type: New Feature Components: Metastore Reporter: Ajay Kidave Fix For: 0.7.0 Attachments: hive_1609.patch The metastore needs to have support for returning a list of partitions based on user specified filter conditions. This will be useful for tools which need to do partition pruning. Howl is one such use case. The way partition pruning is done during hive query execution need not be changed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1609) Support partition filtering in metastore
[ https://issues.apache.org/jira/browse/HIVE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1609: - Status: Open (was: Patch Available) Support partition filtering in metastore Key: HIVE-1609 URL: https://issues.apache.org/jira/browse/HIVE-1609 Project: Hadoop Hive Issue Type: New Feature Components: Metastore Reporter: Ajay Kidave Fix For: 0.7.0 Attachments: hive_1609.patch The metastore needs to have support for returning a list of partitions based on user specified filter conditions. This will be useful for tools which need to do partition pruning. Howl is one such use case. The way partition pruning is done during hive query execution need not be changed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905390#action_12905390 ] Carl Steinbach commented on HIVE-1546: -- @Ashutosh: Can you provide some background on what you hope to accomplish with this? What is the motivating use case, i.e. what custom SemanticAnalyzer do you plan to write? Also, how are the new INPUTDRIVER and OUTPUTDRIVER properties used? By adding these to the Hive grammar it seems like we may be providing a mechanism for defining tables in the MetaStore that Hive can't read or write to. If that's the case what are your plans for adding this support to Hive? Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905406#action_12905406 ] Carl Steinbach commented on HIVE-1546: -- If the main motivation for this ticket is the ability to produce a crippled version of the HiveCLI that is only capable of executing DDL, then I think we should consider simpler approaches that don't involve making SemanticAnalyzer a public API. SemanticAnalyzer is in serious need of refactoring. Making this API public will severely restrict our ability to do this work in the future. Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1546) Ability to plug custom Semantic Analyzers for Hive Grammar
[ https://issues.apache.org/jira/browse/HIVE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905411#action_12905411 ] Carl Steinbach commented on HIVE-1546: -- @John: Good to know, but what's the motivation for this change? Was it covered in the back-channel discussions you mentioned above? And is making SemanticAnalyzer a public API really a good idea? Ability to plug custom Semantic Analyzers for Hive Grammar -- Key: HIVE-1546 URL: https://issues.apache.org/jira/browse/HIVE-1546 Project: Hadoop Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.7.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.7.0 Attachments: hive-1546-3.patch, hive-1546-4.patch, hive-1546.patch, hive-1546_2.patch It will be useful if Semantic Analysis phase is made pluggable such that other projects can do custom analysis of hive queries before doing metastore operations on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1545: - Component/s: UDF Add a bunch of UDFs and UDAFs - Key: HIVE-1545 URL: https://issues.apache.org/jira/browse/HIVE-1545 Project: Hadoop Hive Issue Type: New Feature Components: UDF Reporter: Jonathan Chang Assignee: Jonathan Chang Priority: Minor Attachments: udfs.tar.gz Here are some UD(A)Fs which can be incorporated into the Hive distribution: UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1. UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...), will return the smallest i such that x > b_{i} but x <= b_{i+1}. Returns 0 if x is smaller than all the buckets. UDFFindInArray - Finds the 1-index of the first element in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0. UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees). UDFLDA - Performs LDA inference on a vector given fixed topics. UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes. UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5. UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array. UDFUnescape - Returns the string unescaped (using C/Java style unescaping). UDFWhich - Given a boolean array, return the indices which are TRUE. UDFJaccard UDAFCollect - Takes all the values associated with a row and converts it into a list. Make sure to have: set hive.map.aggr = false; UDAFCollectMap - Like collect except that it takes tuples and generates a map. UDAFEntropy - Compute the entropy of a column. UDAFPearson (BROKEN!!!)
- Computes the Pearson correlation between two columns. UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL. UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL. UDAFHistogram -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
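The BUCKET semantics described above (smallest i such that x > b_{i} but x <= b_{i+1}, with 0 when x falls below every boundary) can be sketched in plain Java. This is an illustrative helper written for this digest, not the UDFBucket source from udfs.tar.gz; the class and method names are made up.

```java
// Hypothetical sketch of the BUCKET semantics, not the actual UDFBucket code.
public class BucketSketch {

    // Returns the smallest i such that x > boundaries[i-1] but
    // x <= boundaries[i] (1-indexed over the boundary list), and 0
    // when x is smaller than or equal to the first boundary.
    public static int bucket(double x, double... boundaries) {
        int i = 0;
        while (i < boundaries.length && x > boundaries[i]) {
            i++;
        }
        return i;
    }
}
```

For example, with boundaries (1, 5, 10): a value of 4 lands in bucket 1, a value of 0.5 in bucket 0, and a value of 20 past the last boundary in bucket 3.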
[jira] Commented: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904860#action_12904860 ] Carl Steinbach commented on HIVE-1016: -- @Namit: GenericUDF.initialize() is called both at compile-time and run-time. Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904364#action_12904364 ] Carl Steinbach commented on HIVE-675: - @John: Are you referring to the changes I made to create_database/get_database/get_databases/drop_database in hive_metastore.thrift? In that file I replaced {{list<string> get_databases() throws(1:MetaException o1)}} with {{list<string> get_databases(1:string pattern) throws(1:MetaException o1)}} I can easily revert this change, but want to know if there are other things you think I need to fix. I think the changes I made to create_database and drop_database should not be an issue since these calls weren't actually supported in previous versions. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1016: - Attachment: HIVE-1016.1.patch.txt Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1016: - Status: Patch Available (was: Open) Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904388#action_12904388 ] Carl Steinbach commented on HIVE-675: - I removed these methods since they implicitly target the {{default}} database. I can put them back with a deprecation warning, but I also want to point out that old code that depends on these methods is probably no longer correct now that Hive supports multiple databases. Removing these methods entirely may be the easiest way to help people find these errors. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904400#action_12904400 ] Carl Steinbach commented on HIVE-1016: -- @Namit: I initially preferred that approach too, and I think it would make sense if all of the UDF classes inherited from the same abstract base class. However, we have a bunch of unrelated UDF base classes (UDF, UDAF, GenericUDF, GenericUDAFEvaluator (which already has a runtime init() method), and GenericUDTF), and taking the approach you suggested would require modifications to all of these classes as well as the code that calls them. I also think it's likely that we'll want to make more runtime context available to UDFs in the future, and it's easier to proxy this through the UDFContext singleton than to keep adding methods to each of the different UDF base classes. Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. 
* Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
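The UDFContext singleton design mentioned in the comment above can be sketched as follows. The name UDFContext and the idea of proxying runtime context through a shared singleton come from the discussion; the Map-based stand-in for Hadoop's JobConf is an assumption made to keep the sketch self-contained, and the actual HIVE-1016 patch may look different.

```java
import java.util.Map;

// Minimal sketch of a UDFContext singleton: one shared access point for
// runtime configuration, so the unrelated UDF base classes (UDF, UDAF,
// GenericUDF, GenericUDTF) don't each need a new init method.
public final class UDFContext {
    private static final UDFContext INSTANCE = new UDFContext();

    // Stand-in for org.apache.hadoop.mapred.JobConf, assumed here
    // to keep the example free of Hadoop dependencies.
    private Map<String, String> conf;

    private UDFContext() {}

    public static UDFContext get() {
        return INSTANCE;
    }

    // Called once by the operator during runtime initialization.
    public synchronized void setConf(Map<String, String> conf) {
        this.conf = conf;
    }

    // UDF code reads the job configuration through the singleton.
    public synchronized Map<String, String> getConf() {
        return conf;
    }
}
```

The design choice being argued for: adding context later (counters, task IDs, distributed-cache paths) means extending this one class rather than touching every UDF base class and its call sites.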
[jira] Updated: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1016: - Status: Open (was: Patch Available) Ok, I'll rework the patch with your suggestions. Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1016.1.patch.txt There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
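One way to reinstate the removed methods with a deprecation warning, sketched below, is to make them thin overloads that delegate to the database-qualified variants introduced by HIVE-675. The method names follow the interface excerpt above, but the class name, the String return type standing in for the Table object, and the delegation details are illustrative assumptions, not the committed HIVE-1607 patch.

```java
// Hedged sketch: deprecated single-table-name overload delegating to the
// database-qualified API, preserving the pre-HIVE-675 behavior of
// implicitly targeting the default database.
public class MetaStoreClientSketch {
    public static final String DEFAULT_DATABASE_NAME = "default";

    // Database-qualified variant (the post-HIVE-675 API shape).
    public String getTable(String dbName, String tableName) {
        return dbName + "." + tableName;  // stand-in for the real metastore lookup
    }

    /** @deprecated As of HIVE-675, use {@link #getTable(String, String)}. */
    @Deprecated
    public String getTable(String tableName) {
        // Old callers implicitly meant the default database; make that explicit.
        return getTable(DEFAULT_DATABASE_NAME, tableName);
    }
}
```

Callers compiled against the old signature keep working (with a deprecation warning at compile time), while new code is steered toward the two-argument form.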
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904420#action_12904420 ] Carl Steinbach commented on HIVE-675: - @Ning: I filed HIVE-1607. Patch to follow shortly. Sorry for the inconvenience. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Attachment: HIVE-1607.1.patch.txt Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1607.1.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Status: Patch Available (was: Open) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1607.1.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1607) Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
[ https://issues.apache.org/jira/browse/HIVE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1607: - Attachment: HIVE-1607.2.patch.txt Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675 Key: HIVE-1607 URL: https://issues.apache.org/jira/browse/HIVE-1607 Project: Hadoop Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1607.1.patch.txt, HIVE-1607.2.patch.txt Several methods were removed from the IMetaStoreClient interface as part of HIVE-675: {code} /** * Drop the table. * * @param tableName * The table to drop * @param deleteData * Should we delete the underlying data * @throws MetaException * Could not drop table properly. * @throws UnknownTableException * The table wasn't found. * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * The table wasn't found. */ public void dropTable(String tableName, boolean deleteData) throws MetaException, UnknownTableException, TException, NoSuchObjectException; /** * Get a table object. * * @param tableName * Name of the table to fetch. * @return An object representing the table. * @throws MetaException * Could not fetch the table * @throws TException * A thrift communication error occurred * @throws NoSuchObjectException * In case the table wasn't found. */ public Table getTable(String tableName) throws MetaException, TException, NoSuchObjectException; public boolean tableExists(String databaseName, String tableName) throws MetaException, TException, UnknownDBException; {code} These methods should be reinstated with a deprecation warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1594) Typo of hive.merge.size.smallfiles.avgsize prevents change of value
[ https://issues.apache.org/jira/browse/HIVE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1594: - Fix Version/s: (was: 0.7.0) Affects Version/s: (was: 0.6.0) (was: 0.7.0) Typo of hive.merge.size.smallfiles.avgsize prevents change of value --- Key: HIVE-1594 URL: https://issues.apache.org/jira/browse/HIVE-1594 Project: Hadoop Hive Issue Type: Bug Components: Configuration Affects Versions: 0.5.0 Reporter: Yun Huang Yong Assignee: Yun Huang Yong Priority: Minor Fix For: 0.6.0 Attachments: HIVE-1594-0.5.patch, HIVE-1594.patch The setting is described as <name>hive.merge.size.smallfiles.avgsize</name>; however, common/src/java/org/apache/hadoop/hive/conf/HiveConf.java reads it as hive.merge.smallfiles.avgsize (note the missing '.size.'), so the user's setting has no effect and the value is stuck at the default of 16MB. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
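The failure mode above is easy to demonstrate in isolation: the user sets one key, the code looks up a slightly different key, and the default silently wins. A sketch using a plain java.util.Properties lookup as a simplified stand-in for HiveConf (key names follow the report; everything else is illustrative):

```java
import java.util.Properties;

// Demonstrates the HIVE-1594 bug pattern: a misspelled lookup key makes the
// user's setting invisible, so the hard-coded default is always returned.
public class ConfTypoSketch {
    static final long DEFAULT_AVG_SIZE = 16L * 1024 * 1024; // 16MB default

    static long readAvgSize(Properties conf, String key) {
        String v = conf.getProperty(key);
        return v == null ? DEFAULT_AVG_SIZE : Long.parseLong(v);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // The user sets the documented key...
        conf.setProperty("hive.merge.size.smallfiles.avgsize", "1000000");
        // ...but the code reads the misspelled key (missing ".size."),
        // so the setting has no effect and the 16MB default wins.
        System.out.println(readAvgSize(conf, "hive.merge.smallfiles.avgsize"));   // 16777216
        System.out.println(readAvgSize(conf, "hive.merge.size.smallfiles.avgsize")); // 1000000
    }
}
```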
[jira] Updated: (HIVE-1401) Web Interface can only browse default
[ https://issues.apache.org/jira/browse/HIVE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1401: - Fix Version/s: (was: 0.7.0) Web Interface can only browse default Key: HIVE-1401 URL: https://issues.apache.org/jira/browse/HIVE-1401 Project: Hadoop Hive Issue Type: New Feature Components: Web UI Affects Versions: 0.5.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Fix For: 0.6.0 Attachments: HIVE-1401-1-patch.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1307) More generic and efficient merge method
[ https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1307: - Fix Version/s: 0.6.0 (was: 0.7.0) Affects Version/s: (was: 0.6.0) Component/s: Query Processor More generic and efficient merge method --- Key: HIVE-1307 URL: https://issues.apache.org/jira/browse/HIVE-1307 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0 Attachments: HIVE-1307.0.patch, HIVE-1307.2.patch, HIVE-1307.3.patch, HIVE-1307.3_java.patch, HIVE-1307.4.patch, HIVE-1307.5.patch, HIVE-1307.6.patch, HIVE-1307.7.patch, HIVE-1307.8.patch, HIVE-1307.9.patch, HIVE-1307.patch, HIVE-1307_2_branch_0.6.patch, HIVE-1307_branch_0.6.patch, HIVE-1307_java_only.patch Currently if hive.merge.mapfiles/mapredfiles=true, a new mapreduce job is created to read the input files and output to one reducer for merging. This MR job is created at compile time, one MR job per partition. In the dynamic partition case, multiple partitions could be created at execution time, so generating the merge MR job at compile time is impossible. We should generalize the merge framework to allow multiple partitions; most of the time a map-only job should be sufficient if we use CombineHiveInputFormat. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
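The execution-time decision this issue proposes can be sketched as a small selection step: after the job runs and the dynamic partitions are known, pick the partitions whose files are small and cover all of them with a single map-only merge. The threshold logic and names below are illustrative, not the actual merge framework:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch: given the file sizes discovered per partition at execution time,
// select the partitions a single map-only merge job should cover.
public class DynamicMergeSketch {
    static List<String> partitionsToMerge(Map<String, long[]> filesByPartition,
                                          long avgSizeThreshold) {
        List<String> toMerge = new ArrayList<>();
        for (Map.Entry<String, long[]> e : filesByPartition.entrySet()) {
            long total = 0;
            for (long s : e.getValue()) total += s;
            long avg = total / e.getValue().length;
            // Merge only partitions with multiple small files; all selected
            // partitions are handled by one map-only job, not one job each.
            if (e.getValue().length > 1 && avg < avgSizeThreshold) {
                toMerge.add(e.getKey());
            }
        }
        Collections.sort(toMerge);
        return toMerge;
    }
}
```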
[jira] Updated: (HIVE-1543) set abort in ExecMapper when Hive's record reader got an IOException
[ https://issues.apache.org/jira/browse/HIVE-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1543: - Component/s: Query Processor set abort in ExecMapper when Hive's record reader got an IOException Key: HIVE-1543 URL: https://issues.apache.org/jira/browse/HIVE-1543 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0 Attachments: HIVE-1543.1.patch, HIVE-1543.2_branch0.6.patch, HIVE-1543.patch, HIVE-1543_branch0.6.patch When the RecordReader gets an IOException, ExecMapper does not know about it and will close the operators as if there were no error. We should catch this exception and avoid writing partial results to HDFS, which would be removed later anyway. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
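The intent of the fix can be sketched as an abort flag: if the reader throws mid-stream, remember the failure so close() skips the normal commit path instead of flushing partial output. The RecordReader interface and class names here are simplified stand-ins, not the real ExecMapper code:

```java
import java.io.IOException;

// Sketch: record a read failure so close() drops partial results.
public class AbortOnReadErrorSketch {
    interface RecordReader {
        String next() throws IOException; // returns null at end of input
    }

    private boolean abort = false;
    private boolean committed = false;

    public void run(RecordReader reader) {
        try {
            for (String rec = reader.next(); rec != null; rec = reader.next()) {
                // process one record here
            }
        } catch (IOException e) {
            abort = true; // remember the read failure for close()
        }
        close();
    }

    private void close() {
        // Without the flag, a read error would still reach the commit path
        // and write partial results; with it, they are dropped.
        if (!abort) {
            committed = true; // stand-in for flushing/committing output
        }
    }

    public boolean aborted() { return abort; }
    public boolean committed() { return committed; }
}
```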
[jira] Updated: (HIVE-1531) Make Hive build work with Ivy versions < 2.1.0
[ https://issues.apache.org/jira/browse/HIVE-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1531: - Fix Version/s: (was: 0.7.0) Make Hive build work with Ivy versions < 2.1.0 -- Key: HIVE-1531 URL: https://issues.apache.org/jira/browse/HIVE-1531 Project: Hadoop Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0 Attachments: HIVE-1531.patch.txt Many projects in the Hadoop ecosystem still use Ivy 2.0.0 (including Hadoop and Pig), yet Hive requires version 2.1.0. Ordinarily this would not be a problem, but many users have a copy of an older version of Ivy in their $ANT_HOME directory, and this copy will always get picked up in preference to what the Hive build downloads for itself. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1411) DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH
[ https://issues.apache.org/jira/browse/HIVE-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1411: - Fix Version/s: (was: 0.7.0) DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH - Key: HIVE-1411 URL: https://issues.apache.org/jira/browse/HIVE-1411 Project: Hadoop Hive Issue Type: Bug Components: Metastore Affects Versions: 0.4.0, 0.4.1, 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.6.0 Attachments: HIVE-1411.patch.txt DataNucleus barfs when the core-3.1.1 JAR file appears more than once on the CLASSPATH: {code} 2010-03-06 12:33:25,565 ERROR exec.DDLTask (SessionState.java:printError(279)) - FAILED: Error in metadata: javax.jdo.JDOFatalInternalException: Unexpected exception caught. NestedThrowables: java.lang.reflect.InvocationTargetException org.apache.hadoop.hive.ql.metadata.HiveException: javax.jdo.JDOFatalInternalException: Unexpected exception caught. NestedThrowables: java.lang.reflect.InvocationTargetException at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:258) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:879) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:103) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:379) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:285) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: javax.jdo.JDOFatalInternalException: Unexpected exception caught. 
NestedThrowables: java.lang.reflect.InvocationTargetException at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1186) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:803) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:698) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:164) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:181) at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:125) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:104) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:130) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:146) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:118) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:100) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:74) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:783) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:794) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:252) ... 
12 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1956) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1951) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1159) ... 28 more Caused by: org.datanucleus.exceptions.NucleusException: Plugin (Bundle) org.eclipse.jdt.core is already registered. Ensure you do not have multiple JAR versions of the same plugin in the classpath. The URL file:/Users/hadop/hadoop-0.20.1+152/build/ivy/lib/Hadoop/common/core-3.1.1.jar is already registered, and you are trying to register an identical plugin located at URL file:/Users/hadop/hadoop-0.20.1+152/lib/core-3.1.1.jar. at
[jira] Updated: (HIVE-1492) FileSinkOperator should remove duplicated files from the same task based on file sizes
[ https://issues.apache.org/jira/browse/HIVE-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1492: - Fix Version/s: (was: 0.7.0) Affects Version/s: (was: 0.7.0) Component/s: Query Processor FileSinkOperator should remove duplicated files from the same task based on file sizes -- Key: HIVE-1492 URL: https://issues.apache.org/jira/browse/HIVE-1492 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.6.0 Attachments: HIVE-1492.patch, HIVE-1492_branch-0.6.patch FileSinkOperator.jobClose() calls Utilities.removeTempOrDuplicateFiles() to retain only one file for each task. A task could produce multiple files due to failed attempts or speculative runs. The largest file should be retained rather than the first file for each task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
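The retention rule above (group a task's duplicate files, keep the largest instead of the first seen) can be sketched with plain collections. The file-name format and the taskId extraction below are simplified stand-ins for what Utilities.removeTempOrDuplicateFiles() works with:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: for each task, retain only its largest output file.
public class KeepLargestSketch {
    // files: file name (e.g. "000001_0" = task 000001, attempt 0) -> size in bytes
    static List<String> retain(Map<String, Long> files) {
        Map<String, String> bestByTask = new HashMap<>();
        for (Map.Entry<String, Long> e : files.entrySet()) {
            String taskId = e.getKey().split("_")[0];
            String cur = bestByTask.get(taskId);
            // Keep the larger file, not merely the first one encountered.
            if (cur == null || files.get(cur) < e.getValue()) {
                bestByTask.put(taskId, e.getKey());
            }
        }
        List<String> kept = new ArrayList<>(bestByTask.values());
        Collections.sort(kept);
        return kept;
    }
}
```

With files {000001_0: 100, 000001_1: 500, 000002_0: 300}, the 500-byte speculative output wins over the 100-byte first attempt.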
[jira] Commented: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903748#action_12903748 ] Carl Steinbach commented on HIVE-1016: -- Yes, I'm working on it. I'll have a patch ready for review by Monday. (Reassigned this back to myself). Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
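The proposed compile_init/exec_init split can be sketched as follows. The base class, method names, and the Map standing in for JobConf are hypothetical, taken from the proposal in this issue rather than the current GenericUDF API; "mapred.cache.localFiles" is used purely as an example key:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of two-phase UDF initialization: compile_init runs at compile time
// (no job context); exec_init runs at operator initialization with the
// runtime configuration, which is what DistributedCache access needs.
public class UdfInitSketch {
    static abstract class TwoPhaseUdf {
        void compile_init() {}                      // compile-time setup
        void exec_init(Map<String, String> conf) {} // runtime setup with the job conf
    }

    static class CacheAwareUdf extends TwoPhaseUdf {
        String localCacheFiles;

        @Override
        void exec_init(Map<String, String> conf) {
            // With the runtime conf in hand, a real UDF could call the static
            // DistributedCache methods that require a JobConf parameter.
            localCacheFiles = conf.get("mapred.cache.localFiles");
        }
    }

    public static void main(String[] args) {
        CacheAwareUdf udf = new CacheAwareUdf();
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.cache.localFiles", "/local/cache/lookup.dat");
        udf.exec_init(conf); // Operator initialization would invoke this at runtime
        System.out.println(udf.localCacheFiles);
    }
}
```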
[jira] Assigned: (HIVE-1016) Ability to access DistributedCache from UDFs
[ https://issues.apache.org/jira/browse/HIVE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-1016: Assignee: Carl Steinbach (was: Namit Jain) Ability to access DistributedCache from UDFs Key: HIVE-1016 URL: https://issues.apache.org/jira/browse/HIVE-1016 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach There have been several requests on the mailing list for information about how to access the DistributedCache from UDFs, e.g.: http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01650.html http://www.mail-archive.com/hive-u...@hadoop.apache.org/msg01926.html While responses to these emails suggested several workarounds, the only correct way of accessing the distributed cache is via the static methods of Hadoop's DistributedCache class, and all of these methods require that the JobConf be passed in as a parameter. Hence, giving UDFs access to the distributed cache reduces to giving UDFs access to the JobConf. I propose the following changes to GenericUDF/UDAF/UDTF: * Add an exec_init(Configuration conf) method that is called during Operator initialization at runtime. * Change the name of the initialize method to compile_init to make it clear that this method is called at compile-time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903082#action_12903082 ] Carl Steinbach commented on HIVE-675: - Cited the wrong lines from the patch. Here are the correct ones: {noformat} ... diff --git metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java deleted file mode 100644 index bc950b9..000 --- metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreRemote.java +++ /dev/null ... {noformat} add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1593) udtf_explode.q is an empty file
[ https://issues.apache.org/jira/browse/HIVE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-1593: Assignee: Carl Steinbach (was: Paul Yang) udtf_explode.q is an empty file --- Key: HIVE-1593 URL: https://issues.apache.org/jira/browse/HIVE-1593 Project: Hadoop Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Carl Steinbach Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1593.1.patch.txt jsichi-mac:clientpositive jsichi$ pwd /Users/jsichi/open/hive-trunk/ql/src/test/queries/clientpositive jsichi-mac:clientpositive jsichi$ cat udtf_explode.q jsichi-mac:clientpositive jsichi$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1593) udtf_explode.q is an empty file
[ https://issues.apache.org/jira/browse/HIVE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1593: - Status: Patch Available (was: Open) * Reverted the contents of udtf_explode.q to its pre-HIVE-1031 state. * Updated udtf_explode.q.out udtf_explode.q is an empty file --- Key: HIVE-1593 URL: https://issues.apache.org/jira/browse/HIVE-1593 Project: Hadoop Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Carl Steinbach Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1593.1.patch.txt jsichi-mac:clientpositive jsichi$ pwd /Users/jsichi/open/hive-trunk/ql/src/test/queries/clientpositive jsichi-mac:clientpositive jsichi$ cat udtf_explode.q jsichi-mac:clientpositive jsichi$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675.13.patch.txt add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Status: Patch Available (was: Open) @Namit: Looks like I was missing some tab characters in the test outputs. I verified that all tests pass with HIVE-675.13.patch.txt add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1369) LazySimpleSerDe should be able to read classes that support some form of toString()
[ https://issues.apache.org/jira/browse/HIVE-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902589#action_12902589 ] Carl Steinbach commented on HIVE-1369: -- @Namit: Will do. LazySimpleSerDe should be able to read classes that support some form of toString() --- Key: HIVE-1369 URL: https://issues.apache.org/jira/browse/HIVE-1369 Project: Hadoop Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.5.0 Reporter: Alex Kozlov Assignee: Alex Kozlov Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1369.patch, HIVE-1369.svn.patch Original Estimate: 2h Remaining Estimate: 2h Currently LazySimpleSerDe is able to deserialize only BytesWritable or Text objects. It should be pretty easy to extend the class to read any object that implements toString() method. Ideas or concerns? Alex K -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902636#action_12902636 ] Carl Steinbach commented on HIVE-675: - That's correct. I will start work on the backport patch after this gets committed to trunk. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt, HIVE-675.13.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902041#action_12902041 ] Carl Steinbach commented on HIVE-675: - Namit, will do. add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-675: Attachment: HIVE-675.12.patch.txt add database/schema support Hive QL --- Key: HIVE-675 URL: https://issues.apache.org/jira/browse/HIVE-675 Project: Hadoop Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Prasad Chakka Assignee: Carl Steinbach Fix For: 0.6.0, 0.7.0 Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt Currently all Hive tables reside in single namespace (default). Hive should support multiple namespaces (databases or schemas) such that users can create tables in their specific namespaces. These name spaces can have different warehouse directories (with a default naming scheme) and possibly different properties. There is already some support for this in metastore but Hive query parser should have this feature as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1589) Add HBase/ZK JARs to Eclipse classpath
Add HBase/ZK JARs to Eclipse classpath -- Key: HIVE-1589 URL: https://issues.apache.org/jira/browse/HIVE-1589 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach The eclipse configuration was broken by the addition of HBase and ZK JARs in HIVE-1293. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1589) Add HBase/ZK JARs to Eclipse classpath
[ https://issues.apache.org/jira/browse/HIVE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1589: - Attachment: HIVE-1589.1.patch.txt Add HBase/ZK JARs to Eclipse classpath -- Key: HIVE-1589 URL: https://issues.apache.org/jira/browse/HIVE-1589 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1589.1.patch.txt The eclipse configuration was broken by the addition of HBase and ZK JARs in HIVE-1293. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1589) Add HBase/ZK JARs to Eclipse classpath
[ https://issues.apache.org/jira/browse/HIVE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1589: - Status: Patch Available (was: Open) Add HBase/ZK JARs to Eclipse classpath -- Key: HIVE-1589 URL: https://issues.apache.org/jira/browse/HIVE-1589 Project: Hadoop Hive Issue Type: Bug Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-1589.1.patch.txt The eclipse configuration was broken by the addition of HBase and ZK JARs in HIVE-1293. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1590) DDLTask.dropDatabase() needs to populate inputs/outputs
DDLTask.dropDatabase() needs to populate inputs/outputs --- Key: HIVE-1590 URL: https://issues.apache.org/jira/browse/HIVE-1590 Project: Hadoop Hive Issue Type: Bug Components: Metastore, Query Processor Reporter: Carl Steinbach Assignee: Carl Steinbach From Namit: bq. Also, inputs and outputs need to be populated for 'drop database ..' It should consist of all the tables/partitions in that database. Note that for the time being this does not make a difference since we don't allow you to drop a database that contains tables. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
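The entity population Namit asks for can be sketched as a simple expansion: DROP DATABASE lists the database plus every contained table in the task's outputs, so hooks and authorization checks see everything the command would remove. The string entities below are stand-ins for Hive's WriteEntity objects:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: build the output-entity list for DROP DATABASE.
public class DropDatabaseOutputsSketch {
    static List<String> outputsFor(String db, List<String> tablesInDb) {
        List<String> outputs = new ArrayList<>();
        outputs.add("database:" + db); // the database itself
        for (String table : tablesInDb) {
            // Each table in the dropped database (a fuller sketch would also
            // add each table's partitions).
            outputs.add("table:" + db + "." + table);
        }
        return outputs;
    }
}
```

Note that, as the issue says, this is a no-op today because a database containing tables cannot be dropped, so tablesInDb would currently always be empty.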
[jira] Updated: (HIVE-675) add database/schema support Hive QL
[ https://issues.apache.org/jira/browse/HIVE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-675:
--------------------------------
Status: Patch Available (was: Open)

add database/schema support Hive QL
-----------------------------------
Key: HIVE-675
URL: https://issues.apache.org/jira/browse/HIVE-675
Project: Hadoop Hive
Issue Type: New Feature
Components: Metastore, Query Processor
Reporter: Prasad Chakka
Assignee: Carl Steinbach
Fix For: 0.6.0, 0.7.0
Attachments: hive-675-2009-9-16.patch, hive-675-2009-9-19.patch, hive-675-2009-9-21.patch, hive-675-2009-9-23.patch, hive-675-2009-9-7.patch, hive-675-2009-9-8.patch, HIVE-675-2010-08-16.patch.txt, HIVE-675-2010-7-16.patch.txt, HIVE-675-2010-8-4.patch.txt, HIVE-675.10.patch.txt, HIVE-675.11.patch.txt, HIVE-675.12.patch.txt

Currently all Hive tables reside in a single namespace (default). Hive should support multiple namespaces (databases or schemas) so that users can create tables in their own namespaces. These namespaces can have different warehouse directories (with a default naming scheme) and possibly different properties. The metastore already has some support for this, but the Hive query parser should support it as well.
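The "default naming scheme" for per-database warehouse directories mentioned above can be sketched as below. This is an assumption-laden illustration, not Hive's exact implementation: the `.db` suffix and the layout (default-database tables directly under the warehouse root, other databases in their own subdirectory) are modeled here for clarity.

```java
// Illustrative sketch of a default warehouse-directory naming scheme for
// databases; the ".db" suffix and layout are assumptions for illustration.
public class WarehousePaths {
    static final String DEFAULT_DB = "default";

    // Tables in the default database live directly under the warehouse
    // root; every other database gets its own "<name>.db" subdirectory.
    static String databasePath(String warehouseRoot, String dbName) {
        if (DEFAULT_DB.equals(dbName)) {
            return warehouseRoot;
        }
        return warehouseRoot + "/" + dbName + ".db";
    }

    static String tablePath(String warehouseRoot, String dbName, String tableName) {
        return databasePath(warehouseRoot, dbName) + "/" + tableName;
    }

    public static void main(String[] args) {
        System.out.println(tablePath("/user/hive/warehouse", "default", "src"));
        // /user/hive/warehouse/src
        System.out.println(tablePath("/user/hive/warehouse", "mydb", "src"));
        // /user/hive/warehouse/mydb.db/src
    }
}
```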
[jira] Updated: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1211:
---------------------------------
Attachment: HIVE-1211.3.patch.txt

Tapping logs from child processes
---------------------------------
Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hadoop Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: bc Wong
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
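The attribution problem described above can be sketched as follows: tag each line of a child's output with a query identifier as it is pumped to the parent's log. This is an illustrative sketch, not the actual plumbing in the HIVE-1211 patches; the `pump` helper and `query_42` id are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Illustrative sketch: prefix each line of a child process's output with a
// query id so interleaved logs can be attributed to the right query.
public class ChildLogTapper {
    // Read the stream line by line, writing "[queryId] line" to the sink.
    static void pump(InputStream in, String queryId, Appendable sink) throws IOException {
        BufferedReader r = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = r.readLine()) != null) {
            sink.append("[").append(queryId).append("] ").append(line).append("\n");
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a MapRedTask child process.
        Process child = new ProcessBuilder("echo", "hello").start();
        Thread tapper = new Thread(() -> {
            try {
                pump(child.getInputStream(), "query_42", System.out);
            } catch (IOException ignored) { }
        });
        tapper.start();
        tapper.join();
        child.waitFor();
        // prints: [query_42] hello
    }
}
```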
[jira] Commented: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902128#action_12902128 ]

Carl Steinbach commented on HIVE-1211:
--------------------------------------
Rebased the patch to trunk. +1

Tapping logs from child processes
---------------------------------
Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hadoop Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: bc Wong
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
[jira] Updated: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1211:
---------------------------------
Attachment: HIVE-1211.4.patch.txt

Tapping logs from child processes
---------------------------------
Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hadoop Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: bc Wong
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt, HIVE-1211.4.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902208#action_12902208 ]

Carl Steinbach commented on HIVE-1157:
--------------------------------------
Hi Philip, please rebase the patch and I will take a look. Thanks.

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------
Key: HIVE-1157
URL: https://issues.apache.org/jira/browse/HIVE-1157
Project: Hadoop Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Philip Zeyliger
Priority: Minor
Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs from jars that are on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem? Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$ bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,
-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
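The preferred approach (download remote jars locally before loading them) can be sketched as follows. This is an illustrative simplification under stated assumptions: the `localize` helper is hypothetical, and the HDFS download is faked with a plain file copy — a real implementation would use Hadoop's FileSystem API to copy the jar before handing the local path to a URLClassLoader.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the proposed 'add jar' fix: if the jar URI is non-local, copy
// it to a local temp file first, then load the local copy. (Illustrative;
// the remote fetch is faked with a local file copy standing in for HDFS.)
public class AddJarHelper {
    // No scheme or "file" means the jar is already on the local filesystem.
    static boolean isLocal(URI uri) {
        String scheme = uri.getScheme();
        return scheme == null || scheme.equals("file");
    }

    // Return a local path for the jar, "downloading" it first if needed.
    static Path localize(URI jarUri, Path remoteStandIn) throws IOException {
        if (isLocal(jarUri)) {
            return Path.of(jarUri.getPath());
        }
        Path local = Files.createTempFile("addjar-", ".jar");
        // Real code would copy from HDFS here instead of a local file.
        Files.copy(remoteStandIn, local, StandardCopyOption.REPLACE_EXISTING);
        return local; // this path can now go onto a URLClassLoader
    }

    public static void main(String[] args) throws IOException {
        Path fakeRemoteJar = Files.createTempFile("FooTest", ".jar");
        Path local = localize(URI.create("hdfs://localhost/FooTest.jar"), fakeRemoteJar);
        System.out.println(Files.exists(local)); // true
    }
}
```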