[jira] [Commented] (HIVE-12049) HiveServer2: Provide an option to write serialized thrift objects in final tasks

2016-04-22 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254590#comment-15254590
 ] 

Rohit Dholakia commented on HIVE-12049:
---

Welcome, it's been great working on the patch and related work! Thanks 
[~vgumashta] [~thejas] for all the help! 

> HiveServer2: Provide an option to write serialized thrift objects in final 
> tasks
> 
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.26.patch, HIVE-12049.3.patch, 
> HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch, 
> HIVE-12049.7.patch, HIVE-12049.9.patch, new-driver-profiles.png, 
> old-driver-profiles.png
>
>
> For each fetch request to HiveServer2, we pay the penalty of deserializing 
> the row objects and translating them into a different representation suitable 
> for the RPC transfer. In moderate- to high-concurrency scenarios, this can 
> waste significant CPU and memory. By having each task write the 
> appropriate thrift objects to the output files, HiveServer2 can simply stream 
> a batch of rows on the wire without incurring any of the additional cost of 
> deserialization and translation. 
> This can be implemented by writing a new SerDe, which the FileSinkOperator 
> can use to write thrift-formatted row batches to the output file. Because 
> {{hive.query.result.fileformat}} is pluggable, we can set it to 
> SequenceFile and write each batch of thrift-formatted rows as a value blob. 
> The FetchTask can then simply read the blob and send it over the wire. On the 
> client side, the *DBC driver can read the blob and, since it is already 
> formatted in the way the driver expects, continue building the ResultSet the 
> way it does in the current implementation.
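
The flow described above (tasks serialize row batches once, the server forwards the bytes untouched, the client decodes them) can be illustrated with a minimal, self-contained sketch. This is not the patch's code: `RowBatchSketch`, `BatchWriter`, and `decode` are hypothetical names, plain `DataOutputStream` encoding stands in for Thrift serialization, and an in-memory list stands in for the SequenceFile values. It assumes every cell is a non-null string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class RowBatchSketch {

    /** Task side (hypothetical): buffers rows and emits one serialized blob per batch. */
    static final class BatchWriter {
        private final int batchSize;
        private final List<String[]> buffer = new ArrayList<>();
        // Stand-in for the values written to the output file.
        private final List<byte[]> blobs = new ArrayList<>();

        BatchWriter(int batchSize) {
            this.batchSize = batchSize;
        }

        void write(String[] row) {
            buffer.add(row);
            if (buffer.size() == batchSize) {
                flush();
            }
        }

        /** Serializes the buffered rows into a single blob; a no-op when the buffer is empty. */
        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(bytes);
                out.writeInt(buffer.size());          // row count
                out.writeInt(buffer.get(0).length);   // column count
                for (String[] row : buffer) {
                    for (String cell : row) {
                        out.writeUTF(cell);
                    }
                }
                out.flush();
                blobs.add(bytes.toByteArray());
                buffer.clear();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        List<byte[]> blobs() {
            return blobs;
        }
    }

    /** Client side (hypothetical): decodes one blob back into rows. */
    static List<String[]> decode(byte[] blob) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob));
            int rows = in.readInt();
            int cols = in.readInt();
            List<String[]> result = new ArrayList<>(rows);
            for (int r = 0; r < rows; r++) {
                String[] row = new String[cols];
                for (int c = 0; c < cols; c++) {
                    row[c] = in.readUTF();
                }
                result.add(row);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BatchWriter writer = new BatchWriter(2);
        writer.write(new String[] {"1", "alice"});
        writer.write(new String[] {"2", "bob"});
        writer.write(new String[] {"3", "carol"});
        writer.flush(); // emit the final, partially filled batch

        // The "server" step is deliberately absent: blobs are passed through as-is.
        for (byte[] blob : writer.blobs()) {
            for (String[] row : decode(blob)) {
                System.out.println(String.join(",", row));
            }
        }
    }
}
```

The point of the sketch is that the only encode step happens where the rows are produced and the only decode step happens in the client; the component in the middle handles opaque byte arrays.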



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.25.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Status: Patch Available  (was: Open)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.25.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Status: Open  (was: Patch Available)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch, 
> new-driver-profiles.png, old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-15 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.19.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch, 
> new-driver-profiles.png, old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-07 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.18.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.18.patch, HIVE-12049.2.patch, HIVE-12049.3.patch, 
> HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch, 
> HIVE-12049.7.patch, HIVE-12049.9.patch, new-driver-profiles.png, 
> old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-01 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.17.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, 
> HIVE-12049.9.patch, new-driver-profiles.png, old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-01 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.16.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch, 
> new-driver-profiles.png, old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-04-01 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.15.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.15.patch, HIVE-12049.2.patch, HIVE-12049.3.patch, 
> HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch, 
> HIVE-12049.7.patch, HIVE-12049.9.patch, new-driver-profiles.png, 
> old-driver-profiles.png
>
>


[jira] [Updated] (HIVE-13359) NoClassFoundError hadoop configuration with jdbc-standalone JAR

2016-03-24 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-13359:
--
Attachment: HIVE-13359.1.patch

> NoClassFoundError hadoop configuration with jdbc-standalone JAR
> ---
>
> Key: HIVE-13359
> URL: https://issues.apache.org/jira/browse/HIVE-13359
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-13359.1.patch
>
>
> When the hive-jdbc-SNAPSHOT-standalone.jar is used to run queries, it leads 
> to a NoClassDefFoundError for org/apache/hadoop/conf/Configuration. This 
> patch resolves it by updating the jdbc/pom.xml file so that the Maven Shade 
> plugin configuration no longer excludes commons-configuration and 
> org.apache.hadoop:*.
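
A quick way to confirm whether a given JAR exposes the class named in the error is to try loading it by name. The sketch below is only a diagnostic aid under stated assumptions, not part of the patch: `ClasspathCheck` and `isLoadable` are hypothetical names, and the result depends entirely on what is on the classpath when it runs.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by this class's loader.
     * The class is not initialized, so no static initializers run.
     */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised for missing dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        String hadoopConf = "org.apache.hadoop.conf.Configuration";
        System.out.println(hadoopConf + " loadable: " + isLoadable(hadoopConf));
    }
}
```

Running it with only the standalone JAR on the classpath would show whether the class is present, before any JDBC connection is attempted.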





[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-15 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.14.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-15 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.13.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-07 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.12.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.12.patch, HIVE-12049.2.patch, HIVE-12049.3.patch, 
> HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch, 
> HIVE-12049.7.patch, HIVE-12049.9.patch
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-07 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Status: Open  (was: Patch Available)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-07 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Status: Patch Available  (was: Open)

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>


[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-03-03 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.11.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, 
> HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, 
> HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch
>
>





[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-02-19 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154602#comment-15154602
 ] 

Rohit Dholakia commented on HIVE-12049:
---

Uploaded a new version: 

* Removed stray comments. 
* Fixed several places so that CLI queries don't use the new ThriftJDBC SerDe. 


> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch
>
>





[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-02-19 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.7.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, 
> HIVE-12049.6.patch, HIVE-12049.7.patch
>
>





[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-02-16 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149452#comment-15149452
 ] 

Rohit Dholakia commented on HIVE-12049:
---

Uploaded a new version of the end-to-end patch. It has some bug fixes and some 
changes to the FileSinkOperator and ThriftJDBCSerDe. 

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch
>
>





[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks

2016-02-16 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12049:
--
Attachment: HIVE-12049.6.patch

> Provide an option to write serialized thrift objects in final tasks
> ---
>
> Key: HIVE-12049
> URL: https://issues.apache.org/jira/browse/HIVE-12049
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
> Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch, 
> HIVE-12049.3.patch, HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch
>
>





[jira] [Commented] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2016-01-11 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092258#comment-15092258
 ] 

Rohit Dholakia commented on HIVE-12442:
---

Thanks! 

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Fix For: 2.1.0
>
> Attachments: hive-12442.1.patch, hive-12442.2.patch, 
> hive-12442.3.patch, hive-12442.4.patch, hive-12442.5.patch
>
>
> For implementing HIVE-12427, the tasks will need to have knowledge of thrift 
> types from HS2's thrift API. This jira will look at the least invasive way to 
> do that.





[jira] [Commented] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2016-01-09 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090821#comment-15090821
 ] 

Rohit Dholakia commented on HIVE-12442:
---

Updated the Review Board request with the most recent patch. 

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Fix For: 2.1.0
>
> Attachments: hive-12442.1.patch, hive-12442.2.patch, 
> hive-12442.3.patch, hive-12442.4.patch, hive-12442.5.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2016-01-08 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Attachment: hive-12442.5.patch

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Fix For: 2.1.0
>
> Attachments: hive-12442.1.patch, hive-12442.2.patch, 
> hive-12442.3.patch, hive-12442.4.patch, hive-12442.5.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2016-01-08 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Attachment: HIVE-12442.5.patch

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Fix For: 2.1.0
>
> Attachments: HIVE-12442.5.patch, hive-12442.1.patch, 
> hive-12442.2.patch, hive-12442.3.patch, hive-12442.4.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2016-01-08 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Attachment: (was: HIVE-12442.5.patch)

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Fix For: 2.1.0
>
> Attachments: hive-12442.1.patch, hive-12442.2.patch, 
> hive-12442.3.patch, hive-12442.4.patch, hive-12442.5.patch
>
>





[jira] [Commented] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2016-01-08 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089736#comment-15089736
 ] 

Rohit Dholakia commented on HIVE-12442:
---

OK, sure. Uploaded a new patch generated using 0.9.3. Thanks. 

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Fix For: 2.1.0
>
> Attachments: hive-12442.1.patch, hive-12442.2.patch, 
> hive-12442.3.patch, hive-12442.4.patch, hive-12442.5.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2015-12-20 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Attachment: hive-12442.4.patch

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: HiveServer2, RPC, Thrift
> Attachments: hive-12442.1.patch, hive-12442.2.patch, 
> hive-12442.3.patch, hive-12442.4.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2015-12-18 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Attachment: hive-12442.2.patch

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: hiveserver, thrift
> Attachments: hive-12442.1.patch, hive-12442.2.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2015-12-17 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Attachment: hive-12442.1.patch

wip patch. 

> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>  Labels: hiveserver, thrift
> Attachments: hive-12442.1.patch
>
>





[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2015-12-15 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-12442:
--
Description: 
For implementing HIVE-12427, the tasks will need to have knowledge of thrift 
types from HS2's thrift API. This jira will look at the least invasive way to 
do that.

https://reviews.apache.org/r/41379


  was:For implementing HIVE-12427, the tasks will need to have knowledge of 
thrift types from HS2's thrift API. This jira will look at the least invasive 
way to do that.


> Refactor/repackage HiveServer2's Thrift code so that it can be used in the 
> tasks
> 
>
> Key: HIVE-12442
> URL: https://issues.apache.org/jira/browse/HIVE-12442
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vaibhav Gumashta
>Assignee: Rohit Dholakia
>





[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-30 Thread Rohit Dholakia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609043#comment-14609043
 ] 

Rohit Dholakia commented on HIVE-10438:
---

1. We see your point, but we believe that ResultSet compression using type 
information delivers better compression ratios. For instance, the integer 
plugin attached with the patch achieves a 10% higher compression ratio than 
Snappy (results also attached). I think we can incorporate your suggestion by 
adding a switch that specifies whether to use type-sensitive compression 
(different compressors for different column types) or type-insensitive 
compression (e.g., the same compressor for all column types). For this, the 
ColumnCompressor interface will only need to be extended by one method. 

3. We have now updated the patch. 

4. We have added a few tests to the src/test folder of hive-service, which use 
Snappy for compression and decompression with a few default values. 
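
To illustrate the switch proposed in point 1, here is a rough sketch of what "extending the interface by one method" could look like. Everything in it is an assumption for illustration: ColumnCompressorSketch, isTypeSensitive, and the two implementations are hypothetical names, the real ColumnCompressor interface in the patch may look quite different, and DEFLATE from java.util.zip stands in for Snappy, which is not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical shape; the patch's real ColumnCompressor may differ.
interface ColumnCompressorSketch {
    byte[] compress(int[] column);
    int[] decompress(byte[] data);
    // The one added method: does this compressor rely on the column type?
    boolean isTypeSensitive();
}

// Type-sensitive: uses integer structure (delta + zigzag varint encoding).
class DeltaVarintCompressor implements ColumnCompressorSketch {
    public byte[] compress(int[] column) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, column.length);
        int prev = 0;
        for (int v : column) {
            int delta = v - prev;
            writeVarint(out, (delta << 1) ^ (delta >> 31)); // zigzag encode
            prev = v;
        }
        return out.toByteArray();
    }

    public int[] decompress(byte[] data) {
        int[] pos = {0}; // read cursor shared with readVarint
        int[] column = new int[readVarint(data, pos)];
        int prev = 0;
        for (int i = 0; i < column.length; i++) {
            int zz = readVarint(data, pos);
            prev += (zz >>> 1) ^ -(zz & 1); // zigzag decode, then undo delta
            column[i] = prev;
        }
        return column;
    }

    public boolean isTypeSensitive() { return true; }

    private static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    private static int readVarint(byte[] data, int[] pos) {
        int result = 0;
        int shift = 0;
        while (true) {
            int b = data[pos[0]++] & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }
}

// Type-insensitive: treats the column as opaque bytes and applies DEFLATE.
class DeflateCompressor implements ColumnCompressorSketch {
    public byte[] compress(int[] column) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new DeflaterOutputStream(buf))) {
            out.writeInt(column.length);
            for (int v : column) out.writeInt(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public int[] decompress(byte[] data) {
        try (DataInputStream in = new DataInputStream(
                new InflaterInputStream(new ByteArrayInputStream(data)))) {
            int[] column = new int[in.readInt()];
            for (int i = 0; i < column.length; i++) column[i] = in.readInt();
            return column;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public boolean isTypeSensitive() { return false; }
}

class CompressorSwitchSketch {
    // The switch: pick a compressor according to the requested mode.
    static ColumnCompressorSketch choose(boolean typeSensitive) {
        return typeSensitive ? new DeltaVarintCompressor() : new DeflateCompressor();
    }

    public static void main(String[] args) {
        int[] column = {100, 101, 103, 103, 90, -5};
        for (boolean mode : new boolean[] {true, false}) {
            ColumnCompressorSketch c = choose(mode);
            int[] back = c.decompress(c.compress(column));
            System.out.println(c.isTypeSensitive() + " " + Arrays.equals(column, back));
        }
    }
}
```

The sketch makes no claim about compression ratios; it only shows that a single boolean method is enough for a caller to tell the two kinds of compressor apart.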

Thanks. 

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
 Proposal-rscompressor.pdf, README.txt, 
 Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
 hs2driver-master.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attaching a design document explaining the changes, an experimental 
 results document, and a PDF explaining how to set up the Docker container to 
 observe end-to-end functionality of ResultSet compression. 
 Review Board link: https://reviews.apache.org/r/35792/ 





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-25 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: HIVE-10438-1.patch

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
 Proposal-rscompressor.pdf, Results_Snappy_protobuf_TBinary_TCompact.pdf, 
 hs2driver-master.zip, hs2resultSetcompressor.zip, readme.txt







[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-25 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: README.txt

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
 Proposal-rscompressor.pdf, README.txt, 
 Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
 hs2driver-master.zip







[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-25 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: (was: readme.txt)

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
 Proposal-rscompressor.pdf, README.txt, 
 Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
 hs2driver-master.zip







[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-25 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: (was: hs2resultSetcompressor.zip)

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
 Proposal-rscompressor.pdf, Results_Snappy_protobuf_TBinary_TCompact.pdf, 
 hs2driver-master.zip, readme.txt







[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Description: 
This JIRA proposes an architecture for enabling ResultSet compression which 
uses an external plugin. 

The patch has three aspects to it: 
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality 
2. A container to allow everyone to write and test ResultSet compressors with a 
query submitter (https://github.com/xiaom/hs2driver) 

Also attaching a design document explaining the changes, experimental results 
document, and a pdf explaining how to setup the docker container to observe 
end-to-end functionality of ResultSet compression. 

https://reviews.apache.org/r/35792/ Review board link. 

  was:
This JIRA proposes an architecture for enabling ResultSet compression which 
uses an external plugin. 

The patch has three aspects to it: 
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality 
2. A container to allow everyone to write and test ResultSet compressors with a 
query submitter (https://github.com/xiaom/hs2driver) 

Also attaching a design document explaining the changes, experimental results 
document, and a pdf explaining how to setup the docker container to observe 
end-to-end functionality of ResultSet compression. 




 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438.patch, Proposal-rscompressor.pdf, 
 Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt







[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-17 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: (was: CompressorProtocolHS2.patch)

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: HIVE-10438.patch, Proposal-rscompressor.pdf, 
 Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-17 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: HIVE-10438.patch

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, HIVE-10438.patch, 
 Proposal-rscompressor.pdf, Results_Snappy_protobuf_TBinary_TCompact.pdf, 
 hs2driver-master.zip, hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-05-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: Results_Snappy_protobuf_TBinary_TCompact.pdf

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-05-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: (was: TestingIntegerCompression.pdf)

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 hs2driver-master.zip, hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-05-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: (was: CompressorProtocolHS2.patch)

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-05-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: CompressorProtocolHS2.patch

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Affects Version/s: (was: 1.1.0)
   1.2.0

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Description: 
This JIRA proposes an architecture for enabling ResultSet compression which 
uses an external plugin. 

The patch has three aspects to it: 
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality 
2. A container to allow everyone to write and test ResultSet compressors with a 
query submitter (https://github.com/xiaom/hs2driver) 

Also attached are a design document explaining the changes, an experimental
results document, and a PDF explaining how to set up the Docker container to
observe the end-to-end functionality of ResultSet compression.



  was:
This JIRA proposes an architecture for enabling ResultSet compression which 
uses an external plugin. 

The patch has three aspects to it: 
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality 
2. A container to allow everyone to write and test ResultSet compressors.

Also attached are a design document explaining the changes, an experimental
results document, and a PDF explaining how to set up the Docker container to
observe the end-to-end functionality of ResultSet compression.




 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: hs2driver-master.zip

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-23 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: readme.txt

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.2.0
Reporter: Rohit Dholakia
Assignee: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2driver-master.zip, 
 hs2resultSetcompressor.zip, readme.txt


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors with 
 a query submitter (https://github.com/xiaom/hs2driver) 
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: Proposal-rscompressor.pdf

Design document explaining the changes for ResultSet compressor architecture

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.1.0
Reporter: Rohit Dholakia
  Labels: patch
 Attachments: Proposal-rscompressor.pdf


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Resolved] (HIVE-10439) Architecture for ResultSet Compression via external plugin

2015-04-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia resolved HIVE-10439.
---
Resolution: Duplicate

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10439
 URL: https://issues.apache.org/jira/browse/HIVE-10439
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.1.0
Reporter: Rohit Dholakia
  Labels: patch

 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: TestingIntegerCompression.pdf

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.1.0
Reporter: Rohit Dholakia
  Labels: patch
 Attachments: Proposal-rscompressor.pdf, TestingIntegerCompression.pdf


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Resolved] (HIVE-10440) Architecture for ResultSet Compression via external plugin

2015-04-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia resolved HIVE-10440.
---
Resolution: Duplicate

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10440
 URL: https://issues.apache.org/jira/browse/HIVE-10440
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.1.0
Reporter: Rohit Dholakia
  Labels: patch

 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: CompressorProtocolHS2.patch

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.1.0
Reporter: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.





[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-04-22 Thread Rohit Dholakia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Dholakia updated HIVE-10438:
--
Attachment: hs2resultSetcompressor.zip

 Architecture for  ResultSet Compression via external plugin
 ---

 Key: HIVE-10438
 URL: https://issues.apache.org/jira/browse/HIVE-10438
 Project: Hive
  Issue Type: New Feature
  Components: Hive, Thrift API
Affects Versions: 1.1.0
Reporter: Rohit Dholakia
  Labels: patch
 Attachments: CompressorProtocolHS2.patch, Proposal-rscompressor.pdf, 
 TestingIntegerCompression.pdf, hs2resultSetcompressor.zip


 This JIRA proposes an architecture for enabling ResultSet compression which 
 uses an external plugin. 
 The patch has three aspects to it: 
 0. An architecture for enabling ResultSet compression with external plugins
 1. An example plugin to demonstrate end-to-end functionality 
 2. A container to allow everyone to write and test ResultSet compressors.
 Also attached are a design document explaining the changes, an experimental
 results document, and a PDF explaining how to set up the Docker container to
 observe the end-to-end functionality of ResultSet compression.


