[jira] Updated: (HIVE-187) ODBC driver

2009-08-15 Thread Eric Hwang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hwang updated HIVE-187:


Attachment: HIVE-187.3.patch

Uploaded a new patch with minor changes:
- Combined hiveclient.hpp with hiveclient.h to prevent duplication of header info.
- Fixed some formatting issues.
- Did some cleanup on the test cases.
- Changed the Thrift-generated namespaces to be more specific.


> ODBC driver
> ---
>
> Key: HIVE-187
> URL: https://issues.apache.org/jira/browse/HIVE-187
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Clients
>Affects Versions: 0.2.0
>Reporter: Raghotham Murthy
>Assignee: Eric Hwang
> Fix For: 0.4.0
>
> Attachments: HIVE-187.1.patch, HIVE-187.2.patch, HIVE-187.3.patch, 
> thrift_home_linux_32.tgz, thrift_home_linux_64.tgz, unixODBC-2.2.14-1.tgz, 
> unixODBC-2.2.14-2.tgz, unixODBC-2.2.14-3.tgz, unixODBC-2.2.14.tgz
>
>
> We need to provide a small number of functions to support basic query
> execution and retrieval of results. This is based on the tutorial provided
> here: http://www.easysoft.com/developer/languages/c/odbc_tutorial.html
>  
> The minimum set of ODBC functions required is:
> SQLAllocHandle - for environment, connection, statement
> SQLSetEnvAttr
> SQLDriverConnect
> SQLExecDirect
> SQLNumResultCols
> SQLFetch
> SQLGetData
> SQLDisconnect
> SQLFreeHandle
>  
> If required, the plan would be to do the following:
> 1. Generate C++ client stubs for the Thrift server.
> 2. Implement the required functions in C++ by calling the C++ client.
> 3. Make the C++ functions in (2) extern "C" and then use those in the ODBC
> SQL* functions.
> 4. Provide a .so (on Linux) which can be used by ODBC clients.
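
For illustration, the sequence such a driver wraps is just the Thrift client
API. Below is a minimal sketch using the Thrift-generated Java client rather
than the C++ stubs from step 1 (the call pattern is the same); the class names,
host, and port are assumptions, not code from the attached patches.

{code}
import java.util.List;

import org.apache.hadoop.hive.service.ThriftHive;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

// Hypothetical sketch: the execute/fetch sequence the ODBC shim wraps,
// shown with the Thrift-generated Java client.
public class HiveThriftClientSketch {
  public static void main(String[] args) throws Exception {
    TSocket transport = new TSocket("localhost", 10000); // assumed HiveServer endpoint
    ThriftHive.Client client = new ThriftHive.Client(new TBinaryProtocol(transport));
    transport.open();
    try {
      client.execute("SELECT * FROM src");   // roughly what SQLExecDirect maps to
      List<String> rows = client.fetchAll(); // roughly SQLFetch/SQLGetData
      for (String row : rows) {
        System.out.println(row);
      }
    } finally {
      transport.close();                     // roughly SQLDisconnect/SQLFreeHandle
    }
  }
}
{code}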

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-187) ODBC driver

2009-08-15 Thread Eric Hwang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hwang updated HIVE-187:


Attachment: unixODBC-2.2.14-3.tgz

Added some minor modifications to the unixODBC API wrapper that allow for
proper handling of result sets without any rows.

> ODBC driver
> ---
>
> Key: HIVE-187
> URL: https://issues.apache.org/jira/browse/HIVE-187
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Clients
>Affects Versions: 0.2.0
>Reporter: Raghotham Murthy
>Assignee: Eric Hwang
> Fix For: 0.4.0
>
> Attachments: HIVE-187.1.patch, HIVE-187.2.patch, 
> thrift_home_linux_32.tgz, thrift_home_linux_64.tgz, unixODBC-2.2.14-1.tgz, 
> unixODBC-2.2.14-2.tgz, unixODBC-2.2.14-3.tgz, unixODBC-2.2.14.tgz
>
>
> We need to provide a small number of functions to support basic query
> execution and retrieval of results. This is based on the tutorial provided
> here: http://www.easysoft.com/developer/languages/c/odbc_tutorial.html
>  
> The minimum set of ODBC functions required is:
> SQLAllocHandle - for environment, connection, statement
> SQLSetEnvAttr
> SQLDriverConnect
> SQLExecDirect
> SQLNumResultCols
> SQLFetch
> SQLGetData
> SQLDisconnect
> SQLFreeHandle
>  
> If required, the plan would be to do the following:
> 1. Generate C++ client stubs for the Thrift server.
> 2. Implement the required functions in C++ by calling the C++ client.
> 3. Make the C++ functions in (2) extern "C" and then use those in the ODBC
> SQL* functions.
> 4. Provide a .so (on Linux) which can be used by ODBC clients.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-645) A UDF that can export data to JDBC databases.

2009-08-15 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-645:
-

Attachment: hive-645-3.patch

Moved the gen-udf to the contrib package. Also added the test cases that
Namit suggested. The first test case uses static values. The second selects
keys and values from the source table to be "dboutputed".
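
For reference, below is a minimal sketch of what a JDBC-export GenericUDF along
these lines can look like. It is a hedged illustration, not the code in the
attached patches; the class name and argument handling are hypothetical, and
only the GenericUDF API itself is Hive's.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;

// Hypothetical sketch of dboutput(url, user, password, sql, args...):
// runs the given DML against a JDBC database, returning 0 on success, 1 on failure.
public class GenericUDFDBOutputSketch extends GenericUDF {
  private final IntWritable result = new IntWritable();

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
    if (arguments.length < 4) {
      throw new UDFArgumentException("dboutput requires at least 4 arguments");
    }
    return PrimitiveObjectInspectorFactory.writableIntObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    String url = arguments[0].get().toString();
    String user = arguments[1].get().toString();
    String pass = arguments[2].get().toString();
    String sql = arguments[3].get().toString();
    try (Connection conn = DriverManager.getConnection(url, user, pass);
         PreparedStatement ps = conn.prepareStatement(sql)) {
      // Bind the remaining UDF arguments as statement parameters.
      for (int i = 4; i < arguments.length; i++) {
        ps.setObject(i - 3, arguments[i].get().toString());
      }
      ps.executeUpdate();
      result.set(0);
    } catch (Exception e) {
      result.set(1); // report failure to the query instead of failing the task
    }
    return result;
  }

  @Override
  public String getDisplayString(String[] children) {
    return "dboutput(...)";
  }
}
{code}

A query would then call it along the lines of
SELECT dboutput('jdbc:...', 'user', 'pass', 'INSERT INTO t VALUES (?,?)', key, value) FROM src
(invocation shown for illustration; see the test cases in the patch for the actual form).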

> A UDF that can export data to JDBC databases.
> -
>
> Key: HIVE-645
> URL: https://issues.apache.org/jira/browse/HIVE-645
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
> Attachments: hive-645-2.patch, hive-645-3.patch, hive-645.patch
>
>
> A UDF that can export data to JDBC databases.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-756) performance improvement for RCFile and ColumnarSerDe in Hive

2009-08-15 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743792#action_12743792
 ] 

Zheng Shao commented on HIVE-756:
-

@hive-756.patch, line 121:
Can we distinguish between the following 2 cases?
1. Column information is provided but empty: we ignore all columns.
2. Column information is not provided: we read all columns.
This way, if the caller (some non-Hive application) does not know the RCFile
column information settings, it can still read in all columns.
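
A minimal sketch of that distinction (an illustration only, not the patch's
code; the property name below is assumed):

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch: treat an unset column-id list as "read all columns"
// and an explicitly empty list as "read no columns".
public class ColumnPruningSketch {
  static final String READ_COLUMN_IDS = "hive.io.file.readcolumn.ids"; // assumed key

  /** Returns null when unset (read everything), an empty list when set but empty. */
  public static List<Integer> readColumnIds(Configuration conf) {
    String ids = conf.get(READ_COLUMN_IDS);
    if (ids == null) {
      return null; // case 2: not provided -- caller reads all columns
    }
    if (ids.isEmpty()) {
      return Collections.emptyList(); // case 1: provided but empty -- ignore all columns
    }
    List<Integer> result = new ArrayList<Integer>();
    for (String id : ids.split(",")) { // e.g. "0,2,5"
      result.add(Integer.valueOf(id.trim()));
    }
    return result;
  }
}
{code}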

> performance improvement for RCFile and ColumnarSerDe in Hive
> 
>
> Key: HIVE-756
> URL: https://issues.apache.org/jira/browse/HIVE-756
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: hive-756.patch
>
>
> There are some easy performance improvements in the columnar storage in Hive 
> I found during Hackathon. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HIVE-756) performance improvement for RCFile and ColumnarSerDe in Hive

2009-08-15 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao reassigned HIVE-756:
---

Assignee: Ning Zhang

> performance improvement for RCFile and ColumnarSerDe in Hive
> 
>
> Key: HIVE-756
> URL: https://issues.apache.org/jira/browse/HIVE-756
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: hive-756.patch
>
>
> There are some easy performance improvements in the columnar storage in Hive 
> I found during Hackathon. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-759) add hive.intermediate.compression.codec option

2009-08-15 Thread Zheng Shao (JIRA)
add hive.intermediate.compression.codec option
--

 Key: HIVE-759
 URL: https://issues.apache.org/jira/browse/HIVE-759
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Zheng Shao


Hive uses the JobConf compression codec settings for all map-reduce jobs. This
includes both mapred.map.output.compression.codec and mapred.output.compression.codec.

In some cases, we want to distinguish between the codec used for intermediate
map-reduce jobs (those that produce intermediate data between jobs) and the one
used for final map-reduce jobs (those that produce data stored in tables).

For intermediate data, LZO might be a better fit because it's much faster; for
final data, gzip might be a better fit because it saves disk space.

We should introduce two new options:
{code}
hive.intermediate.compression.codec=org.apache.hadoop.io.compress.LzoCodec
hive.intermediate.compression.type=BLOCK
{code}
and use these two options to override mapred.output.compression.* in the
FileSinkOperator that produces intermediate data.

Note that it's possible for a single map-reduce job to have two
FileSinkOperators: one that produces intermediate data and one that produces
final data. So we need to add a flag to fileSinkDesc to distinguish them.
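
A minimal sketch of the proposed override (an illustration only; the boolean
flag stands in for the hypothetical fileSinkDesc flag mentioned above):

{code}
import org.apache.hadoop.mapred.JobConf;

// Hypothetical sketch: when a file sink produces intermediate (between-jobs)
// data, override the job's output compression with the intermediate settings.
public class IntermediateCompressionSketch {
  public static void configure(JobConf job, boolean isIntermediateSink) {
    if (!isIntermediateSink) {
      return; // final outputs keep mapred.output.compression.* as configured
    }
    String codec = job.get("hive.intermediate.compression.codec");
    String type = job.get("hive.intermediate.compression.type");
    if (codec != null) {
      job.set("mapred.output.compression.codec", codec);
    }
    if (type != null) {
      job.set("mapred.output.compression.type", type);
    }
  }
}
{code}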


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-737) bin/hive doesn't start the shell when using a src build of hadoop

2009-08-15 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-737:
---

   Resolution: Fixed
Fix Version/s: 0.4.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed. Thanks, Johan!

> bin/hive doesn't start the shell when using a src build of hadoop
> -
>
> Key: HIVE-737
> URL: https://issues.apache.org/jira/browse/HIVE-737
> Project: Hadoop Hive
>  Issue Type: Improvement
>Affects Versions: 0.4.0
>Reporter: Johan Oskarsson
>Assignee: Johan Oskarsson
>Priority: Blocker
> Fix For: 0.4.0
>
> Attachments: HIVE-737-3.patch, HIVE-737.patch, HIVE-737.patch
>
>
> After HIVE-487, "bin/hive" doesn't start the shell on our setup. This is
> because we use a source build of Hadoop, where the jar files are in
> HADOOP_HOME/build/*.jar instead of in lib or in the root HADOOP_HOME.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.