[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955856#comment-14955856 ]

Xuefu Zhang commented on HIVE-10438:

Some additional comments on RB.

> Architecture for ResultSet Compression via external plugin
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
> Issue Type: New Feature
> Components: Hive, Thrift API
> Affects Versions: 1.2.0
> Reporter: Rohit Dholakia
> Assignee: Rohit Dholakia
> Labels: patch
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch,
> Proposal-rscompressor.pdf, README.txt,
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip,
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which
> uses an external plugin.
> The patch has three aspects to it:
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality
> 2. A container to allow everyone to write and test ResultSet compressors with
> a query submitter (https://github.com/xiaom/hs2driver)
> Also attaching a design document explaining the changes, an experimental results
> document, and a PDF explaining how to set up the Docker container to observe
> end-to-end functionality of ResultSet compression.
> https://reviews.apache.org/r/35792/ Review board link.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681325#comment-14681325 ]

Vaibhav Gumashta commented on HIVE-10438:

[~rohitdholakia] Thanks for the patch, Rohit. I have left some comments on Review Board for whenever you get time.
[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609043#comment-14609043 ]

Rohit Dholakia commented on HIVE-10438:

1. We see your point. But we believe that ResultSet compression using type information delivers better compression ratios. For instance, the integer plugin attached with the patch achieves a compression ratio 10% higher than Snappy (results also attached). I think we can incorporate your suggestion by adding a switch to specify whether to use type-sensitive compression (different compressors for different column types) or type-insensitive compression (e.g., the same technology for all column types). For this, the interface ColumnCompressor will only need to be extended by one method.

3. We have now used the "update patch" option.

4. We have added a few tests to the src/test folder of hive-service, which use Snappy for compression and decompression with a few default values.

Thanks.
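
The switch described in point 1 of the comment above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the interface name ColumnCompressor comes from the thread, but the methods compress and isTypeSensitive, the ColumnType enum, the PassThrough stand-in, and the CompressorSelector class are all hypothetical and are not taken from the patch.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical column types; the real patch may model types differently.
enum ColumnType { INT, STRING }

// Interface name from the thread; both methods are assumptions.
interface ColumnCompressor {
    byte[] compress(byte[] raw);

    // The "one extra method": does this plugin target specific column types?
    boolean isTypeSensitive();
}

// Stand-in compressor that copies its input; it performs no real compression.
final class PassThrough implements ColumnCompressor {
    private final boolean typeSensitive;

    PassThrough(boolean typeSensitive) { this.typeSensitive = typeSensitive; }

    @Override public byte[] compress(byte[] raw) { return raw.clone(); }

    @Override public boolean isTypeSensitive() { return typeSensitive; }
}

// Hypothetical selector: picks a compressor per column according to the switch.
final class CompressorSelector {
    private final boolean typeSensitiveMode;
    private final ColumnCompressor general;
    private final Map<ColumnType, ColumnCompressor> perType = new EnumMap<>(ColumnType.class);

    CompressorSelector(boolean typeSensitiveMode, ColumnCompressor general) {
        this.typeSensitiveMode = typeSensitiveMode;
        this.general = general;
    }

    void register(ColumnType type, ColumnCompressor compressor) {
        if (!compressor.isTypeSensitive()) {
            throw new IllegalArgumentException("per-type slot needs a type-sensitive compressor");
        }
        perType.put(type, compressor);
    }

    ColumnCompressor select(ColumnType type) {
        // Type-insensitive mode: one compressor for every column.
        if (!typeSensitiveMode) {
            return general;
        }
        // Type-sensitive mode: fall back to the general compressor if none is registered.
        return perType.getOrDefault(type, general);
    }

    public static void main(String[] args) {
        ColumnCompressor general = new PassThrough(false);
        CompressorSelector selector = new CompressorSelector(true, general);
        selector.register(ColumnType.INT, new PassThrough(true));
        System.out.println(selector.select(ColumnType.INT).isTypeSensitive());
        System.out.println(selector.select(ColumnType.STRING).isTypeSensitive());
    }
}
```

With the switch off, every column would go through the single general compressor, which matches the simpler setup the reviewer asked for; with it on, integer columns could use a specialised plugin while other types fall back to the general one.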
[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604704#comment-14604704 ]

Xuefu Zhang commented on HIVE-10438:

Here are some of my high-level thoughts:

1. I don't think Hive needs to support multiple compressors at the same time. That is very unlikely in a real production scenario, though different users might choose different compression technologies (e.g., Snappy vs. LZO). For simplicity, we should start with just one. Thus, we need two flags on the server side: #1, enable/disable compression; #2, the class name (or some sort of identifier) of the compressor.

2. The JDBC client should be able to specify whether to use result set compression. This can be done via a hiveconf variable specified in the hiveConfs section of the JDBC connection string, shown below:
{code}
jdbc:hive2://host:port/dbName;sessionConfs?hiveConfs#hiveVars
{code}
An example name for this variable is hive.client.use.resultset.compression.

3. When updating the patch, please choose "update patch" instead of "add file" so that it is easy to see the diffs between the patches.
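
Points 1 and 2 of the comment above can be illustrated with a small Java sketch. It is a hedged illustration, not code from the patch: the variable name hive.client.use.resultset.compression is the example suggested in the comment, while the class Hs2UrlSketch, its methods, and the way the two server-side flags are combined with the client's request are assumptions made for this example. No Hive driver is loaded; the code only assembles the connection string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; none of these names come from Hive or from the patch.
final class Hs2UrlSketch {
    // Example variable name suggested in the comment, not an existing Hive setting.
    static final String USE_COMPRESSION = "hive.client.use.resultset.compression";

    // Builds jdbc:hive2://host:port/dbName?key=value;key=value (hiveConfs section only).
    static String buildUrl(String host, int port, String db, Map<String, String> hiveConfs) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port).append('/').append(db);
        if (!hiveConfs.isEmpty()) {
            StringJoiner confs = new StringJoiner(";", "?", "");
            hiveConfs.forEach((key, value) -> confs.add(key + "=" + value));
            url.append(confs);
        }
        return url.toString();
    }

    // Assumed server-side decision: both server flags and the client request must agree.
    static boolean shouldCompress(boolean serverEnabled, String compressorClass,
                                  boolean clientRequested) {
        return serverEnabled
                && clientRequested
                && compressorClass != null
                && !compressorClass.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> confs = new LinkedHashMap<>();
        confs.put(USE_COMPRESSION, "true");
        System.out.println(buildUrl("localhost", 10000, "default", confs));
        // "com.example.SnappyCompressor" is a placeholder class name.
        System.out.println(shouldCompress(true, "com.example.SnappyCompressor", true));
    }
}
```

Keeping the client request as an ordinary hiveconf entry means existing connection strings would stay valid, and a server with compression disabled could simply ignore the variable.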
[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597389#comment-14597389 ]

Hive QA commented on HIVE-10438:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741206/HIVE-10438.patch

{color:green}SUCCESS:{color} +1 9013 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4346/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4346/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4346/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741206 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin
[ https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590655#comment-14590655 ]

Hive QA commented on HIVE-10438:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740169/HIVE-10438.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4290/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4290/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4290/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4290/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   6675a73..cae4646  branch-1   -> origin/branch-1
   70108e1..caf4ecc  branch-1.2 -> origin/branch-1.2
   a3792b7..37e82ba  master     -> origin/master
+ git reset --hard HEAD
HEAD is now at a3792b7 HIVE-11026 : Make vector_outer_join* test more robust (Ashutosh Chauhan via Hari Sankar)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 37e82ba HIVE-11006 improve logging wrt ACID module (Eugene Koifman, reviewed by Alan Gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740169 - PreCommit-HIVE-TRUNK-Build