[jira] Commented: (HIVE-1071) Making RCFile concatenatable to reduce the number of files of the output

2010-01-20 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803025#action_12803025
 ] 

dhruba borthakur commented on HIVE-1071:


we could create an API in HDFS that concatenates a set of files into one file. 
The partial last block of each file would be zero-filled; this is required 
because all the blocks (except the last block) in a single HDFS file must 
have the same size.

Once we have the above-mentioned HDFS API, we can merge a bunch of RC 
files into one single file without doing much physical I/O. The RCFile format 
has to be such that it can safely ignore zero-filled areas in the middle of the 
file. Can it do this?
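As a minimal sketch of what "safely ignore zero-filled areas" could mean at the byte level (this is illustrative only, not actual RCFile code, and the class and method names are made up): a reader would skip runs of zero bytes left by block padding and resume at the next nonzero byte, which a real reader would then validate against the format's sync marker.

```java
/**
 * Hypothetical sketch, not actual RCFile code: skipping zero-filled
 * padding between concatenated file segments.
 */
public class ZeroGapSkipper {

    /**
     * Returns the index of the first nonzero byte at or after 'from',
     * or -1 if the rest of the buffer is all zero-fill.
     */
    public static int skipZeroFill(byte[] buf, int from) {
        for (int i = from; i < buf.length; i++) {
            if (buf[i] != 0) {
                return i; // candidate start of the next sync marker / record
            }
        }
        return -1; // nothing but padding until end of buffer
    }

    public static void main(String[] args) {
        // Simulate: one payload byte, a zero-filled gap, then the next segment.
        byte[] data = {7, 0, 0, 0, 9, 3};
        System.out.println(skipZeroFill(data, 1)); // resumes at index 4
    }
}
```

A real implementation would additionally confirm that the bytes at the resume position match the file's sync marker before treating them as a record boundary.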

 Making RCFile concatenatable to reduce the number of files of the output
 --

 Key: HIVE-1071
 URL: https://issues.apache.org/jira/browse/HIVE-1071
 Project: Hadoop Hive
  Issue Type: Improvement
Reporter: Zheng Shao

 Hive automatically determines the number of reducers most of the time.
 Sometimes, we create a lot of small files.
 Hive has an option to merge those small files through a map-reduce job.
 Dhruba has an idea that can fix this even faster:
 if we can make RCFile concatenatable, then we can simply tell the namenode to 
 merge these files.
 Pros: This approach does not do any I/O, so it's faster.
 Cons: We have to zero-fill the files to make sure they can be concatenated 
 (all blocks except the last have to be full HDFS blocks).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-892) Hive command line unable to kill jobs when the command line is interrupted

2009-10-20 Thread dhruba borthakur (JIRA)
Hive command line unable to kill jobs when the command line is interrupted 
---

 Key: HIVE-892
 URL: https://issues.apache.org/jira/browse/HIVE-892
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur


The Hadoop 0.20 version of the JT insists that a kill command submitted to the 
JT be via a POST command. The Hive ExecDriver submits the kill command via an 
HTTP GET command. This means that the Hive client is unable to kill Hadoop 0.20 
jobs.




[jira] Updated: (HIVE-892) Hive command line unable to kill jobs when the command line is interrupted

2009-10-20 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-892:
--

Attachment: killJobs.txt

Use the HTTP POST command to kill jobs. This should work for all versions 
of Hadoop.
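A sketch of the change described above: issue the kill request with the POST method instead of GET. The JobTracker URL path and parameter names below are illustrative assumptions, not the exact endpoint used by the actual patch.

```java
import java.net.HttpURLConnection;
import java.net.URL;

/**
 * Illustrative sketch only: killing a job over HTTP with POST.
 * The URL shape is a hypothetical stand-in for the real JT endpoint.
 */
public class JobKiller {

    /** Builds a hypothetical kill URL for a job id. */
    public static String killUrl(String trackerBase, String jobId) {
        return trackerBase + "/jobdetails.jsp?action=kill&jobid=" + jobId;
    }

    /** Sends the kill request with POST, as Hadoop 0.20's JT requires. */
    public static int postKill(String trackerBase, String jobId) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(killUrl(trackerBase, jobId)).openConnection();
        conn.setRequestMethod("POST"); // a GET here is rejected by the 0.20 JT
        conn.setDoOutput(true);
        return conn.getResponseCode();
    }
}
```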

 Hive command line unable to kill jobs when the command line is interrupted 
 ---

 Key: HIVE-892
 URL: https://issues.apache.org/jira/browse/HIVE-892
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: killJobs.txt


 The Hadoop 0.20 version of the JT insists that a kill command submitted to the 
 JT be via a POST command. The Hive ExecDriver submits the kill command via an 
 HTTP GET command. This means that the Hive client is unable to kill Hadoop 
 0.20 jobs.




[jira] Updated: (HIVE-892) Hive command line unable to kill jobs when the command line is interrupted

2009-10-20 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-892:
--

Status: Patch Available  (was: Open)

 Hive command line unable to kill jobs when the command line is interrupted 
 ---

 Key: HIVE-892
 URL: https://issues.apache.org/jira/browse/HIVE-892
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: killJobs.txt


 The Hadoop 0.20 version of the JT insists that a kill command submitted to the 
 JT be via a POST command. The Hive ExecDriver submits the kill command via an 
 HTTP GET command. This means that the Hive client is unable to kill Hadoop 
 0.20 jobs.




[jira] Assigned: (HIVE-682) add UDF concat_ws

2009-09-30 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur reassigned HIVE-682:
-

Assignee: Jonathan Chang

 add UDF concat_ws
 -

 Key: HIVE-682
 URL: https://issues.apache.org/jira/browse/HIVE-682
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Jonathan Chang
 Attachments: concat_ws.patch


 add UDF concat_ws
 look at 
 http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html
 for details
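A standalone sketch of the concat_ws semantics referenced above (separator-joined concatenation, following the MySQL behavior linked in the description): NULL arguments are skipped rather than making the whole result NULL, and a NULL separator yields NULL. This is an illustrative method, not the actual Hive UDF implementation.

```java
/**
 * Illustrative sketch of concat_ws semantics (MySQL-style), not the
 * actual Hive UDF class.
 */
public class ConcatWs {

    public static String concatWs(String sep, String... parts) {
        if (sep == null) {
            return null; // MySQL semantics: NULL separator yields NULL
        }
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (p == null) {
                continue; // NULL arguments are skipped, not propagated
            }
            if (sb.length() > 0) {
                sb.append(sep);
            }
            sb.append(p);
        }
        return sb.toString();
    }
}
```

For example, concatWs(",", "a", null, "b") yields "a,b" rather than NULL, which is what distinguishes concat_ws from plain concat.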




[jira] Created: (HIVE-653) Limit job hive queries to conform to underlying hadoop capacities

2009-07-17 Thread dhruba borthakur (JIRA)
Limit job hive queries to conform to underlying hadoop capacities
--

 Key: HIVE-653
 URL: https://issues.apache.org/jira/browse/HIVE-653
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Clients
Reporter: dhruba borthakur


The Hive client should match underlying cluster capacity against cost estimates 
for a newly submitted job. A newly submitted job should be rejected if its 
estimated cost exceeds the capacity of the cluster.




[jira] Commented: (HIVE-653) Limit job hive queries to conform to underlying hadoop capacities

2009-07-17 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732720#action_12732720
 ] 

dhruba borthakur commented on HIVE-653:
---

The Hadoop JobClient class could be extended to check the following:

1. The total number of mappers/reducers that a job can use
2. The total size of data that the job will process

These are not strictly related to Hive but might be useful for most 
warehouse-type applications.

Another solution would be to use the Hive pre-query hook to check these 
constraints.
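The two checks above can be sketched as a simple admission test run before submission. The method, thresholds, and estimate parameters are hypothetical names for illustration; neither Hive nor Hadoop exposes exactly this API.

```java
/**
 * Hypothetical sketch of the pre-submission capacity check suggested
 * above; the names and thresholds are illustrative, not a real API.
 */
public class CapacityGuard {

    public static boolean withinCapacity(int estimatedTasks, long estimatedInputBytes,
                                         int maxTasks, long maxInputBytes) {
        // Check 1: total number of mappers/reducers the job may use
        if (estimatedTasks > maxTasks) {
            return false;
        }
        // Check 2: total size of data the job will process
        return estimatedInputBytes <= maxInputBytes;
    }
}
```

In a pre-query hook, the job would simply be rejected with an error message when this returns false.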

 Limit job hive queries to conform to underlying hadoop capacities
 --

 Key: HIVE-653
 URL: https://issues.apache.org/jira/browse/HIVE-653
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Clients
Reporter: dhruba borthakur

 The Hive client should match underlying cluster capacity against cost estimates 
 for a newly submitted job. A newly submitted job should be rejected if its 
 estimated cost exceeds the capacity of the cluster.




[jira] Updated: (HIVE-74) Hive can use CombineFileInputFormat when the input consists of many small files

2009-04-20 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-74:
-

Attachment: hiveCombineSplit2.patch

This combines multiple blocks from files into a single split.
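A simplified sketch of the packing idea behind the patch: group many small per-file blocks into fewer splits, each capped at a target size. The real CombineFileInputFormat (HADOOP-4565) also honors node/rack locality and per-pool grouping (one pool per table, per the comment below on this issue); this toy version shows only the size-based packing, and the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy sketch of combining small blocks into larger splits; not the
 * actual CombineFileInputFormat logic.
 */
public class SplitPacker {

    /** Groups block sizes into splits whose totals stay at or under maxSplitBytes. */
    public static List<Long> packBlocks(long[] blockSizes, long maxSplitBytes) {
        List<Long> splits = new ArrayList<>();
        long current = 0;
        for (long size : blockSizes) {
            if (current > 0 && current + size > maxSplitBytes) {
                splits.add(current); // close the current split, start a new one
                current = 0;
            }
            current += size;
        }
        if (current > 0) {
            splits.add(current); // flush the final partial split
        }
        return splits;
    }
}
```

With thousands of small files this turns one-mapper-per-file into one mapper per packed split, which is where the spawning overhead is saved.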

 Hive can use CombineFileInputFormat when the input consists of many small files
 ---

 Key: HIVE-74
 URL: https://issues.apache.org/jira/browse/HIVE-74
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.4.0

 Attachments: hiveCombineSplit.patch, hiveCombineSplit.patch, 
 hiveCombineSplit2.patch


 There are cases when the input to a Hive job is thousands of small files. In 
 this case, there is a mapper for each file. Most of the overhead of spawning 
 all these mappers can be avoided if Hive used the CombineFileInputFormat 
 introduced in HADOOP-4565.




[jira] Issue Comment Edited: (HIVE-74) Hive can use CombineFileInputFormat when the input consists of many small files

2009-04-20 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701036#action_12701036
 ] 

dhruba borthakur edited comment on HIVE-74 at 4/20/09 8:43 PM:
---

This combines multiple blocks from files into a single split. All files of the 
same table are part of a single pool.

  was (Author: dhruba):
This combines multiple blocks from files into a single split.
  
 Hive can use CombineFileInputFormat when the input consists of many small files
 ---

 Key: HIVE-74
 URL: https://issues.apache.org/jira/browse/HIVE-74
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.4.0

 Attachments: hiveCombineSplit.patch, hiveCombineSplit.patch, 
 hiveCombineSplit2.patch


 There are cases when the input to a Hive job is thousands of small files. In 
 this case, there is a mapper for each file. Most of the overhead of spawning 
 all these mappers can be avoided if Hive used the CombineFileInputFormat 
 introduced in HADOOP-4565.




[jira] Commented: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.

2009-02-22 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675768#action_12675768
 ] 

dhruba borthakur commented on HIVE-79:
--

This should go into trunk and not into any branch, right?

 Print number of rows inserted to table(s) when the query is finished.
 --

 Key: HIVE-79
 URL: https://issues.apache.org/jira/browse/HIVE-79
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Logging
Reporter: Suresh Antony
Assignee: Suresh Antony
Priority: Minor
 Fix For: 0.2.0

 Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt


 It would be good to print the number of rows inserted into each table at the 
 end of a query. 
 insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
 This query can print something like:
 tab1 rows=100




[jira] Commented: (HIVE-83) Set up a continuous build of Hive with Hudson

2009-02-11 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672739#action_12672739
 ] 

dhruba borthakur commented on HIVE-83:
--

I have a Hudson account named dhruba. However, this account will be used 
by Johan to set up the Hive-Hudson builds. 

 Set up a continuous build of Hive with Hudson
 -

 Key: HIVE-83
 URL: https://issues.apache.org/jira/browse/HIVE-83
 Project: Hadoop Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Jeff Hammerbacher

 Other projects like Zookeeper and HBase are leveraging Apache's hosted Hudson 
 server (http://hudson.zones.apache.org/hudson/view/HBase). Perhaps Hive 
 should as well?




[jira] Updated: (HIVE-232) metastore.warehouse configuration should use inherited hadoop configuration

2009-01-16 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-232:
--

Hadoop Flags: [Reviewed]
  Status: Patch Available  (was: Open)

 metastore.warehouse configuration should use inherited hadoop configuration
 ---

 Key: HIVE-232
 URL: https://issues.apache.org/jira/browse/HIVE-232
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Josh Ferguson
Assignee: Prasad Chakka
 Attachments: hive-232.2.patch, hive-232.2.patch, hive-232.patch


 The hive.metastore.warehouse.dir configuration property in hive-*.xml needs 
 to use the protocol, host, and port when it is inherited from the fs.name 
 property in hadoop-site.xml.
 When it doesn't, and no protocol is found, a broad range of Move 
 operations where the source and target are both in the DFS will fail.
 Currently this can be worked around by prepending the protocol, host, and port 
 of the hadoop namenode to the value of the hive.metastore.warehouse.dir 
 property.
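The workaround described above amounts to qualifying a scheme-less warehouse path with the protocol, host, and port of the default filesystem. A minimal sketch, assuming illustrative names (the real fix lives inside Hive's metastore code, not in a helper like this):

```java
import java.net.URI;

/**
 * Hypothetical sketch: qualify a scheme-less warehouse dir with the
 * scheme and authority inherited from the Hadoop default filesystem.
 */
public class WarehousePath {

    public static String qualify(String warehouseDir, String fsDefaultName) {
        URI dir = URI.create(warehouseDir);
        if (dir.getScheme() != null) {
            return warehouseDir; // already fully qualified, leave it alone
        }
        URI fs = URI.create(fsDefaultName);
        // prepend protocol://host:port from the default filesystem
        return fs.getScheme() + "://" + fs.getAuthority() + warehouseDir;
    }
}
```

So "/user/hive/warehouse" plus a default filesystem of "hdfs://namenode:8020" becomes "hdfs://namenode:8020/user/hive/warehouse", and Move operations see matching filesystems on both sides.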




[jira] Updated: (HIVE-232) metastore.warehouse configuration should use inherited hadoop configuration

2009-01-16 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-232:
--

   Resolution: Fixed
Fix Version/s: 0.2.0
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Prasad!

 metastore.warehouse configuration should use inherited hadoop configuration
 ---

 Key: HIVE-232
 URL: https://issues.apache.org/jira/browse/HIVE-232
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Josh Ferguson
Assignee: Prasad Chakka
 Fix For: 0.2.0

 Attachments: hive-232.2.patch, hive-232.2.patch, hive-232.patch


 The hive.metastore.warehouse.dir configuration property in hive-*.xml needs 
 to use the protocol, host, and port when it is inherited from the fs.name 
 property in hadoop-site.xml.
 When it doesn't, and no protocol is found, a broad range of Move 
 operations where the source and target are both in the DFS will fail.
 Currently this can be worked around by prepending the protocol, host, and port 
 of the hadoop namenode to the value of the hive.metastore.warehouse.dir 
 property.




[jira] Updated: (HIVE-230) Loading a table from a query that returns empty data results in a null pointer exception

2009-01-14 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-230:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Prasad.

 Loading a table from a query that returns empty data results in a null 
 pointer exception 
 -

 Key: HIVE-230
 URL: https://issues.apache.org/jira/browse/HIVE-230
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Prasad Chakka
Assignee: Prasad Chakka
 Fix For: 0.2.0

 Attachments: hive-230.2.patch, hive-230.patch, hive-230.patch


 If the select query returns zero rows, the insert will fail with a null 
 pointer exception:
 INSERT OVERWRITE TABLE test_pc SELECT a.userid, a.ip FROM test_pc2 a WHERE 
 (userid=595058415);
 2009-01-13 10:16:21,396 ERROR exec.MoveTask 
 (SessionState.java:printError(254)) - Failed with exception null
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:127)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:174)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:207)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:305)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:166)
   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)




[jira] Updated: (HIVE-220) incorrect log directory in TestMTQueries causing null pointer exception

2009-01-09 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-220:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Prasad!

 incorrect log directory in TestMTQueries causing null pointer exception
 ---

 Key: HIVE-220
 URL: https://issues.apache.org/jira/browse/HIVE-220
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Prasad Chakka
Assignee: Prasad Chakka
Priority: Critical
 Fix For: 0.2.0

 Attachments: hive-220.patch, hive-220.patch, hive-220.patch


 A mistyped ')' on line 38 of TestMTQueries.java causes a null pointer exception.




[jira] Updated: (HIVE-84) MetaStore Client is not thread safe

2009-01-06 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-84:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Prasad!

 MetaStore Client is not thread safe
 ---

 Key: HIVE-84
 URL: https://issues.apache.org/jira/browse/HIVE-84
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.2.0
 Environment: with patch for hive-77 - run:
 ant -lib ./testlibs -Dtestcase=TestMTQueries test
Reporter: Joydeep Sen Sarma
Assignee: Prasad Chakka
 Fix For: 0.2.0

 Attachments: hive-84.patch


 When running DDL Tasks in concurrent threads, the following exception trace 
 is observed:
 java.sql.SQLIntegrityConstraintViolationException: The statement was aborted 
 because it would have caused a duplicate key value in a unique or primary 
 key constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:207)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:209)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:174)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:210)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:390)
   at org.apache.hadoop.hive.ql.QTestUtil$QTRunner.run(QTestUtil.java:681)
   at java.lang.Thread.run(Thread.java:619)
 Caused by: javax.jdo.JDODataStoreException: Insert of object 
 org.apache.hadoop.hive.metastore.model.mta...@3bc8d400 using statement 
 INSERT INTO TBLS 
 (TBL_ID,CREATE_TIME,DB_ID,RETENTION,TBL_NAME,SD_ID,OWNER,LAST_ACCESS_TIME) 
 VALUES (?,?,?,?,?,?,?,?) failed : The statement was aborted because it 
 would have caused a duplicate key value in a unique or primary key 
 constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
 NestedThrowables:
 java.sql.SQLIntegrityConstraintViolationException: The statement was aborted 
 because it would have caused a duplicate key value in a unique or primary 
 key constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
   at 
 org.jpox.jdo.JPOXJDOHelper.getJDOExceptionForJPOXException(JPOXJDOHelper.java:291)
   at 
 org.jpox.jdo.AbstractPersistenceManager.jdoMakePersistent(AbstractPersistenceManager.java:671)
   at 
 org.jpox.jdo.AbstractPersistenceManager.makePersistent(AbstractPersistenceManager.java:691)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:479)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:292)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:252)
   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:205)
   ... 7 more
 Caused by: java.sql.SQLIntegrityConstraintViolationException: The statement 
 was aborted because it would have caused a duplicate key value in a unique 
 or primary key constraint or unique index identified by 'UNIQUETABLE' defined 
 on 'TBLS'.
   at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
   at 
 org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
 Source)
   at 
 org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
 Source)
   at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown 
 Source)
   at 
 org.jpox.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:396)
   at 
 org.jpox.store.rdbms.request.InsertRequest.execute(InsertRequest.java:370)
   at 
 org.jpox.store.rdbms.RDBMSPersistenceHandler.insertTable(RDBMSPersistenceHandler.java:157)
   at 
 org.jpox.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:136)
   at 
 org.jpox.state.JDOStateManagerImpl.internalMakePersistent(JDOStateManagerImpl.java:3082)
   at 
 org.jpox.state.JDOStateManagerImpl.makePersistent(JDOStateManagerImpl.java:3062)
   at 
 org.jpox.ObjectManagerImpl.persistObjectInternal(ObjectManagerImpl.java:1231)
   at org.jpox.ObjectManagerImpl.persistObject(ObjectManagerImpl.java:1077)
   at 
 org.jpox.jdo.AbstractPersistenceManager.jdoMakePersistent(AbstractPersistenceManager.java:666)
   

[jira] Updated: (HIVE-48) Support JDBC connections for interoperability between Hive and RDBMS

2009-01-05 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-48:
-

Status: Open  (was: Patch Available)

I get compilation problems:

core-compile:
[javac] Compiling 10 source files to 
/mnt/vol/devrs004.snc1/dhruba/commithive/build/jdbc/classes
[javac] 
/mnt/vol/devrs004.snc1/dhruba/commithive/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java:452:
 unreported exception java.sql.SQLException; must be caught or declared to be 
thrown
[javac] throw new SQLException("Method not supported");
[javac] ^
[javac] 
/mnt/vol/devrs004.snc1/dhruba/commithive/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java:462:
 unreported exception java.sql.SQLException; must be caught or declared to be 
thrown
[javac] throw new SQLException("Method not supported");
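The javac errors quoted above are about checked exceptions: SQLException is checked, so any method that throws it must declare it (or the caller must catch it). A minimal illustration with made-up names, not the actual HiveConnection code:

```java
import java.sql.SQLException;

/**
 * Minimal illustration of the compile error above: a checked exception
 * must be declared with "throws". Class/method names are illustrative.
 */
public class CheckedThrowDemo {

    // Without the "throws SQLException" clause, javac reports exactly the
    // "unreported exception ... must be caught or declared" error quoted above.
    public static void unsupported() throws SQLException {
        throw new SQLException("Method not supported");
    }

    public static boolean demonstrate() {
        try {
            unsupported();
            return false; // unreachable: unsupported() always throws
        } catch (SQLException e) {
            return "Method not supported".equals(e.getMessage());
        }
    }
}
```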


 Support JDBC connections for interoperability between Hive and RDBMS
 

 Key: HIVE-48
 URL: https://issues.apache.org/jira/browse/HIVE-48
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Clients
Reporter: YoungWoo Kim
Assignee: Raghotham Murthy
Priority: Minor
 Attachments: hadoop-4101.1.patch, hadoop-4101.2.patch, 
 hadoop-4101.3.patch, hadoop-4101.4.patch, hive-48.5.patch, hive-48.6.patch, 
 hive-48.7.patch


 In many DW and BI systems, the data are currently stored in an RDBMS such as 
 Oracle, MySQL, or PostgreSQL for reporting, charting, etc.
 It would be useful to be able to import data from an RDBMS and export data to 
 an RDBMS using JDBC connections.
 If Hive supported JDBC connections, it would be much easier to use 3rd-party 
 DW/BI tools.




[jira] Updated: (HIVE-48) Support JDBC connections for interoperability between Hive and RDBMS

2009-01-05 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-48:
-

Status: Patch Available  (was: Open)

 Support JDBC connections for interoperability between Hive and RDBMS
 

 Key: HIVE-48
 URL: https://issues.apache.org/jira/browse/HIVE-48
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Clients
Reporter: YoungWoo Kim
Assignee: Raghotham Murthy
Priority: Minor
 Attachments: hadoop-4101.1.patch, hadoop-4101.2.patch, 
 hadoop-4101.3.patch, hadoop-4101.4.patch, hive-48.5.patch, hive-48.6.patch, 
 hive-48.7.patch, hive-48.8.patch


 In many DW and BI systems, the data are currently stored in an RDBMS such as 
 Oracle, MySQL, or PostgreSQL for reporting, charting, etc.
 It would be useful to be able to import data from an RDBMS and export data to 
 an RDBMS using JDBC connections.
 If Hive supported JDBC connections, it would be much easier to use 3rd-party 
 DW/BI tools.




[jira] Updated: (HIVE-48) Support JDBC connections for interoperability between Hive and RDBMS

2009-01-05 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-48:
-

   Resolution: Fixed
Fix Version/s: 0.2.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Raghu and Michi.

 Support JDBC connections for interoperability between Hive and RDBMS
 

 Key: HIVE-48
 URL: https://issues.apache.org/jira/browse/HIVE-48
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Clients
Reporter: YoungWoo Kim
Assignee: Raghotham Murthy
Priority: Minor
 Fix For: 0.2.0

 Attachments: hadoop-4101.1.patch, hadoop-4101.2.patch, 
 hadoop-4101.3.patch, hadoop-4101.4.patch, hive-48.5.patch, hive-48.6.patch, 
 hive-48.7.patch, hive-48.8.patch


 In many DW and BI systems, the data are currently stored in an RDBMS such as 
 Oracle, MySQL, or PostgreSQL for reporting, charting, etc.
 It would be useful to be able to import data from an RDBMS and export data to 
 an RDBMS using JDBC connections.
 If Hive supported JDBC connections, it would be much easier to use 3rd-party 
 DW/BI tools.




[jira] Updated: (HIVE-202) LINEAGE is not working for join queries

2008-12-31 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-202:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Suresh!

 LINEAGE is not working for join queries
 ---

 Key: HIVE-202
 URL: https://issues.apache.org/jira/browse/HIVE-202
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 0.2.0
 Environment: lineage is not working for join queries
Reporter: Suresh Antony
Assignee: Suresh Antony
Priority: Minor
 Fix For: 0.2.0

 Attachments: patch_202.txt, patch_202.txt


 Lineage is not giving input tables in the case of join queries.




[jira] Updated: (HIVE-196) Failure when doing 2 tests with the same user on the same machine

2008-12-30 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-196:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Ashish!

 Failure when doing 2 tests with the same user on the same machine
 -

 Key: HIVE-196
 URL: https://issues.apache.org/jira/browse/HIVE-196
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Zheng Shao
Assignee: Ashish Thusoo
 Attachments: patch-196.txt


 org.apache.hadoop.util.Shell$ExitCodeException: chmod: cannot access 
 `/tmp/zshao/kv1.txt': No such file or directory
 We should make a unique directory for each of the test runs, instead of 
 sharing /tmp/${username}.




[jira] Updated: (HIVE-76) Column number mismatch between query and destination tables when alias.* expressions are present in the select list of a join

2008-11-23 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-76:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Ashish!

 Column number mismatch between query and destination tables when alias.* 
 expressions are present in the select list of a join
 -

 Key: HIVE-76
 URL: https://issues.apache.org/jira/browse/HIVE-76
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.20.0
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo
 Fix For: 0.20.0

 Attachments: patch-76.txt, patch-76_1.txt


 Column number mismatch between query and destination tables when alias.* 
 expressions are present in the select list of a join. The reason is due to a 
 bug in how the row resolver is constructed in SemanticAnalyzer.




[jira] Updated: (HIVE-72) wrong results if partition pruning not strict and no map-reduce job needed

2008-11-20 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HIVE-72:
-

   Resolution: Fixed
Fix Version/s: 0.20.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Namit!

 wrong results if partition pruning not strict and no map-reduce job needed
 --

 Key: HIVE-72
 URL: https://issues.apache.org/jira/browse/HIVE-72
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.20.0

 Attachments: patch1.mapred.txt


 Suppose T is a table partitioned on ds, where ds is a string column. The 
 following queries:
  SELECT a.* FROM T a WHERE a.ds=2008-09-08 LIMIT 1;
  SELECT a.* FROM T a WHERE a.ds=2008-11-10 LIMIT 1;
 both return the first row from the first partition.
 This is because of the typecast to double:
 for a.ds=2008-01-01 (or anything, e.g. a.ds=1),
 evaluate(Double, Double) is invoked at partition pruning.
 Since '2008-11-01' is not a valid double, it is converted to a null, and 
 therefore the result of pruning returns null (unknown) - not FALSE.
 All unknowns are also accepted, therefore all partitions are accepted, which 
 explains this behavior.
 The filter is not invoked since it is a select * query and no map-reduce job 
 is started.
 We just turn off this optimization if pruning indicates that there can be 
 unknown partitions. 
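The mechanism described above can be illustrated with a toy three-valued-logic model: a string partition value that fails the implicit cast to double becomes NULL, the equality becomes unknown, and a non-strict pruner keeps the partition. The class and method names are illustrative, not Hive code.

```java
/**
 * Toy illustration of the pruning bug: unparseable string -> null cast
 * -> unknown comparison -> partition kept. Not actual Hive code.
 */
public class PruningDemo {

    /** Mimics the implicit string-to-double cast: null when not a valid double. */
    public static Double toDouble(String s) {
        try {
            return Double.valueOf(s);
        } catch (NumberFormatException e) {
            return null; // e.g. a partition value like "2008-09-08"
        }
    }

    /** Three-valued equality: null (unknown) when either side failed the cast. */
    public static Boolean eq(Double a, Double b) {
        return (a == null || b == null) ? null : (a.doubleValue() == b.doubleValue());
    }

    /** A non-strict pruner keeps a partition unless the predicate is definitely FALSE. */
    public static boolean keepPartition(Boolean predicate) {
        return predicate == null || predicate; // unknown is kept - hence the bug
    }
}
```

Since every partition's predicate evaluates to unknown, every partition is kept, and with no filter stage running (no map-reduce job), the wrong rows are returned directly.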
