[jira] [Commented] (HIVE-12049) HiveServer2: Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254590#comment-15254590 ] Rohit Dholakia commented on HIVE-12049: --- Welcome, it's been great working on the patch and related work! Thanks [~vgumashta] [~thejas] for all the help! > HiveServer2: Provide an option to write serialized thrift objects in final > tasks > > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC > Affects Versions: 2.0.0 > Reporter: Rohit Dholakia > Assignee: Rohit Dholakia > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, > HIVE-12049.18.patch, HIVE-12049.19.patch, HIVE-12049.2.patch, > HIVE-12049.25.patch, HIVE-12049.26.patch, HIVE-12049.3.patch, > HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch, > HIVE-12049.7.patch, HIVE-12049.9.patch, new-driver-profiles.png, > old-driver-profiles.png > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift-formatted row batches to the output file. Using the > pluggable property {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift-formatted rows as a value blob. > The FetchTask can then simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and, since it is already > formatted in the way it expects, continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
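The flow the description sketches (tasks serialize each row batch once, fetch streams the stored blobs unchanged, and only the client deserializes) can be illustrated outside Hive. A minimal Python sketch, using pickle as a stand-in for Thrift serialization; all names here are illustrative, not Hive APIs:

```python
import pickle  # stands in for Thrift serialization


def task_write(rows, batch_size=2):
    """Task side: serialize row batches once and store opaque blobs,
    as the new SerDe would do in the FileSinkOperator."""
    return [pickle.dumps(rows[i:i + batch_size])
            for i in range(0, len(rows), batch_size)]


def fetch(blobs):
    """Server side (FetchTask analogue): stream stored blobs as-is,
    with no deserialization or translation per fetch request."""
    for blob in blobs:
        yield blob  # sent over the wire untouched


def client_read(blobs):
    """Client side: the driver deserializes blobs it already understands."""
    rows = []
    for blob in blobs:
        rows.extend(pickle.loads(blob))
    return rows


blobs = task_write([(1, "a"), (2, "b"), (3, "c")])
assert client_read(fetch(blobs)) == [(1, "a"), (2, "b"), (3, "c")]
```

The point of the design is that the serialization cost is paid once at task time, so concurrent fetch requests become cheap pass-through reads.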
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.25.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.19.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.18.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.17.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.16.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.15.patch
[jira] [Updated] (HIVE-13359) NoClassFoundError hadoop configuration with jdbc-standalone JAR
[ https://issues.apache.org/jira/browse/HIVE-13359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-13359: -- Attachment: HIVE-13359.1.patch > NoClassFoundError hadoop configuration with jdbc-standalone JAR > --- > > Key: HIVE-13359 > URL: https://issues.apache.org/jira/browse/HIVE-13359 > Project: Hive > Issue Type: Bug > Components: JDBC > Reporter: Rohit Dholakia > Assignee: Rohit Dholakia > Attachments: HIVE-13359.1.patch > > > When the hive-jdbc-SNAPSHOT-standalone.jar is used to run queries, it leads > to a NoClassDefFoundError for org/apache/hadoop/conf/Configuration. This > patch will resolve it by updating the jdbc/pom.xml file to not exclude > commons-configuration and org.apache.hadoop:* as part of the maven shaded > plugin.
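The pom.xml change described above would amount to dropping two exclusions from the maven-shade-plugin's artifact set. A hypothetical fragment, reconstructed from the issue description rather than the actual patch (the surrounding plugin configuration is assumed):

```xml
<!-- Hedged sketch of jdbc/pom.xml: stop excluding Hadoop classes from the
     standalone jar so org.apache.hadoop.conf.Configuration gets bundled. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <excludes>
        <!-- previously listed here; removed per HIVE-13359:
             <exclude>org.apache.hadoop:*</exclude>
             <exclude>commons-configuration:commons-configuration</exclude> -->
      </excludes>
    </artifactSet>
  </configuration>
</plugin>
```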
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.14.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.13.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.12.patch
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.11.patch
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154602#comment-15154602 ] Rohit Dholakia commented on HIVE-12049: --- Uploaded a new version: stray comments are removed, and several places are fixed so that CLI queries don't use the new ThriftJDBC SerDe.
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.7.patch
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149452#comment-15149452 ] Rohit Dholakia commented on HIVE-12049: --- Uploaded a new version of the end-to-end patch; it has some bug fixes and some changes to the FileSinkOperator and ThriftJDBCSerDe.
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.6.patch
[jira] [Commented] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15092258#comment-15092258 ] Rohit Dholakia commented on HIVE-12442: --- Thanks! > Refactor/repackage HiveServer2's Thrift code so that it can be used in the > tasks > > > Key: HIVE-12442 > URL: https://issues.apache.org/jira/browse/HIVE-12442 > Project: Hive > Issue Type: Sub-task >Affects Versions: 1.2.1 >Reporter: Vaibhav Gumashta >Assignee: Rohit Dholakia > Labels: HiveServer2, RPC, Thrift > Fix For: 2.1.0 > > Attachments: hive-12442.1.patch, hive-12442.2.patch, > hive-12442.3.patch, hive-12442.4.patch, hive-12442.5.patch > > > For implementing HIVE-12427, the tasks will need to have knowledge of thrift > types from HS2's thrift API. This jira will look at the least invasive way to > do that.
[jira] [Commented] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090821#comment-15090821 ] Rohit Dholakia commented on HIVE-12442: --- Updated the Review Board request with the most recent patch.
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Attachment: hive-12442.5.patch
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Attachment: HIVE-12442.5.patch
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Attachment: (was: HIVE-12442.5.patch)
[jira] [Commented] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089736#comment-15089736 ] Rohit Dholakia commented on HIVE-12442: --- OK, sure. Uploaded a new patch generated using 0.9.3. Thanks.
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Attachment: hive-12442.4.patch
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Attachment: hive-12442.2.patch
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Attachment: hive-12442.1.patch WIP patch.
[jira] [Updated] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
[ https://issues.apache.org/jira/browse/HIVE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12442: -- Description: For implementing HIVE-12427, the tasks will need to have knowledge of thrift types from HS2's thrift API. This jira will look at the least invasive way to do that. https://reviews.apache.org/r/41379 was: For implementing HIVE-12427, the tasks will need to have knowledge of thrift types from HS2's thrift API. This jira will look at the least invasive way to do that.
[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14609043#comment-14609043 ] Rohit Dholakia commented on HIVE-10438: --- 1. We see your point, but we believe that ResultSet compression using type information delivers better compression ratios. For instance, the integer plugin attached to the patch achieves a 10% better compression ratio than Snappy (results also attached). I think we can incorporate your suggestion by adding a switch specifying whether to use type-sensitive compression (different compressors for different column types) or type-insensitive compression (e.g., the same compressor for all column types). For this, the ColumnCompressor interface only needs to be extended by one method. 3. We have now updated the patch. 4. We have added a few tests to the src/test folder of hive-service, which use Snappy for compression and decompression with a few default values. Thanks. Architecture for ResultSet Compression via external plugin --- Key: HIVE-10438 URL: https://issues.apache.org/jira/browse/HIVE-10438 Project: Hive Issue Type: New Feature Components: Hive, Thrift API Affects Versions: 1.2.0 Reporter: Rohit Dholakia Assignee: Rohit Dholakia Labels: patch Attachments: HIVE-10438-1.patch, HIVE-10438.patch, Proposal-rscompressor.pdf, README.txt, Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, hs2driver-master.zip This JIRA proposes an architecture for enabling ResultSet compression using an external plugin. The patch has three aspects to it: 0. An architecture for enabling ResultSet compression with external plugins 1. An example plugin to demonstrate end-to-end functionality 2. A container to allow everyone to write and test ResultSet compressors with a query submitter (https://github.com/xiaom/hs2driver) Also attaching a design document explaining the changes, an experimental results document, and a PDF explaining how to set up the docker container to observe end-to-end functionality of ResultSet compression. https://reviews.apache.org/r/35792/ Review board link.
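The proposed type-sensitive/type-insensitive switch can be illustrated with a small sketch. This is illustrative Python, not Hive's actual ColumnCompressor interface; the delta codec and all function names are invented for the example:

```python
import zlib

# Invented illustration of the proposed switch: in type-sensitive mode an
# integer column goes through a type-aware codec (delta-encode, then deflate);
# otherwise one general-purpose codec serves every column type.

def delta_compress(ints):
    """Toy type-aware codec: store the first value plus successive deltas.
    Clustered ids yield long runs of tiny deltas, which deflate very well."""
    deltas = [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]
    return zlib.compress(",".join(map(str, deltas)).encode("ascii"))

def delta_decompress(blob):
    """Invert delta_compress: inflate, then re-accumulate the deltas."""
    deltas = [int(d) for d in zlib.decompress(blob).decode("ascii").split(",")]
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

def compress_column(values, col_type, type_sensitive=True):
    if type_sensitive and col_type == "int":
        return delta_compress(values)
    # type-insensitive path: same codec regardless of column type
    return zlib.compress(",".join(map(str, values)).encode("ascii"))

ids = list(range(1000, 1100))  # clustered ids: deltas are all 1
typed = compress_column(ids, "int")
generic = compress_column(ids, "int", type_sensitive=False)
assert delta_decompress(typed) == ids  # lossless round trip
```

Extending the interface by one method, as suggested, would amount to letting each plugin report (or be told) which of the two paths to take per column.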
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: HIVE-10438-1.patch
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: README.txt
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: (was: readme.txt)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: (was: hs2resultSetcompressor.zip)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Description: This JIRA proposes an architecture for enabling ResultSet compression which uses an external plugin. The patch has three aspects to it: 0. An architecture for enabling ResultSet compression with external plugins 1. An example plugin to demonstrate end-to-end functionality 2. A container to allow everyone to write and test ResultSet compressors with a query submitter (https://github.com/xiaom/hs2driver) Also attaching a design document explaining the changes, experimental results document, and a pdf explaining how to setup the docker container to observe end-to-end functionality of ResultSet compression. https://reviews.apache.org/r/35792/ Review board link. was: This JIRA proposes an architecture for enabling ResultSet compression which uses an external plugin. The patch has three aspects to it: 0. An architecture for enabling ResultSet compression with external plugins 1. An example plugin to demonstrate end-to-end functionality 2. A container to allow everyone to write and test ResultSet compressors with a query submitter (https://github.com/xiaom/hs2driver) Also attaching a design document explaining the changes, experimental results document, and a pdf explaining how to setup the docker container to observe end-to-end functionality of ResultSet compression. 
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: (was: CompressorProtocolHS2.patch)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: HIVE-10438.patch
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: Results_Snappy_protobuf_TBinary_TCompact.pdf
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-10438: -- Attachment: (was: TestingIntegerCompression.pdf)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: (was: CompressorProtocolHS2.patch)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: CompressorProtocolHS2.patch
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Affects Version/s: (was: 1.1.0)
                       1.2.0
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Description: 
This JIRA proposes an architecture for enabling ResultSet compression using an external plugin. The patch has three aspects:
0. An architecture for enabling ResultSet compression with external plugins
1. An example plugin to demonstrate end-to-end functionality
2. A container to allow everyone to write and test ResultSet compressors with a query submitter (https://github.com/xiaom/hs2driver)
Also attaching a design document explaining the changes, an experimental results document, and a PDF explaining how to set up the Docker container to observe end-to-end functionality of ResultSet compression.

  was: the same description, with item 2 reading "A container to allow everyone to write and test ResultSet compressors." (no query-submitter link).
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: hs2driver-master.zip
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: Proposal-rscompressor.pdf

Design document explaining the changes for ResultSet compressor architecture
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: TestingIntegerCompression.pdf
[jira] [Resolved] (HIVE-10439) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia resolved HIVE-10439.
-----------------------------------
    Resolution: Duplicate

> Key: HIVE-10439
> URL: https://issues.apache.org/jira/browse/HIVE-10439
> (duplicate of HIVE-10438; same description)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: TestingIntegerCompression.pdf
[jira] [Resolved] (HIVE-10440) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia resolved HIVE-10440.
-----------------------------------
    Resolution: Duplicate

> Key: HIVE-10440
> URL: https://issues.apache.org/jira/browse/HIVE-10440
> (duplicate of HIVE-10438; same description)
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: CompressorProtocolHS2.patch
[jira] [Updated] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Dholakia updated HIVE-10438:
----------------------------------
    Attachment: hs2resultSetcompressor.zip