[jira] [Updated] (HIVE-17898) Explain plan output enhancement
[ https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17898:
-------------------------------
    Status: Patch Available (was: Open)

> Explain plan output enhancement
> -------------------------------
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17898.1.patch, HIVE-17898.2.patch, HIVE-17898.3.patch, HIVE-17898.4.patch, HIVE-17898.5.patch, HIVE-17898.6.patch, HIVE-17898.7.patch
>
> We would like to enhance the explain plan output to display additional information e.g.:
> TableScan operator should have following additional info
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17898) Explain plan output enhancement
[ https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17898:
-------------------------------
    Status: Open (was: Patch Available)

> Explain plan output enhancement
> -------------------------------
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17898.1.patch, HIVE-17898.2.patch, HIVE-17898.3.patch, HIVE-17898.4.patch, HIVE-17898.5.patch, HIVE-17898.6.patch, HIVE-17898.7.patch
>
> We would like to enhance the explain plan output to display additional information e.g.:
> TableScan operator should have following additional info
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned
[jira] [Updated] (HIVE-18043) Vectorization: Support List type in MapWork
[ https://issues.apache.org/jira/browse/HIVE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Ma updated HIVE-18043:
----------------------------
    Attachment: HIVE-18043.001.patch

The implementation is based on the patch for HIVE-16198.
[~Ferd], [~vihangk1], as discussed in HIVE-17931, you can get the q-tests from this patch.

> Vectorization: Support List type in MapWork
> -------------------------------------------
>
> Key: HIVE-18043
> URL: https://issues.apache.org/jira/browse/HIVE-18043
> Project: Hive
> Issue Type: Improvement
> Reporter: Colin Ma
> Assignee: Colin Ma
> Attachments: HIVE-18043.001.patch
>
> Support for complex types in vectorization was finished in HIVE-16589, but the List type is still not supported in MapWork. It should be supported to improve performance when vectorization is enabled.
[jira] [Updated] (HIVE-18043) Vectorization: Support List type in MapWork
[ https://issues.apache.org/jira/browse/HIVE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Ma updated HIVE-18043:
----------------------------
    Status: Patch Available (was: Open)

> Vectorization: Support List type in MapWork
> -------------------------------------------
>
> Key: HIVE-18043
> URL: https://issues.apache.org/jira/browse/HIVE-18043
> Project: Hive
> Issue Type: Improvement
> Reporter: Colin Ma
> Assignee: Colin Ma
> Attachments: HIVE-18043.001.patch
>
> Support for complex types in vectorization was finished in HIVE-16589, but the List type is still not supported in MapWork. It should be supported to improve performance when vectorization is enabled.
[jira] [Commented] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
[ https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258856#comment-16258856 ]

liyunzhang commented on HIVE-18080:
-----------------------------------

[~gopalv]: {{-prof perfasm}} depends on [PrintAssembly|https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly], while PrintAssembly depends on the [Kenai project|http://www.oracle.com/splash/kenai.com/decommissioning/index.html], which Oracle has closed. I downloaded hsdis.so from another source. I am not sure whether this outdated hsdis.so can print the assembly for AVX512 instructions.

> Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Attachments: log.logic.avx1.single.0, log_logic.avx1.part
>
> Use a Xeon(R) Platinum 8180 CPU to test the performance of [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, as JDK9 enables AVX512.
> Use the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> It seems performance improves (20%+) for the cases in {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}}, except {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}} and {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}. The data is as follows.
> When I use a Skylake CPU to evaluate the performance improvement of AVX512, I find the performance in VectorizedLogicBench is as follows:
> || ||AVX2 us/op||AVX512 us/op||(AVX2-AVX512)/AVX2||
> |ColAndColBench|122510|87014|28.9%|
> |IfExprLongColumnLongColumnBench|1325759|1436073|-8.3%|
> |IfExprLongColumnRepeatingLongColumnBench|1397447|1480450|-5.9%|
> |IfExprRepeatingLongColumnLongColumnBench|1401164|1483062|-5.9%|
> |NotColBench|77042.83|51513.28|33%|
> There is degradation in IfExprLongColumnLongColumnBench, IfExprLongColumnRepeatingLongColumnBench and IfExprRepeatingLongColumnLongColumnBench. I am very confused about why there is degradation in the IfExprLongColumnLongColumnBench cases.
> Here we use {{taskset -cp 1 $pid}} to run the benchmark on a single core to avoid the impact of dynamic CPU frequency scaling.
> My script:
> {code}
> export JAVA_HOME=/home/zly/jdk-9.0.1/
> export PATH=$JAVA_HOME/bin:$PATH
> export LD_LIBRARY_PATH=/home/zly/jdk-9.0.1/mylib
> for i in 0 1 2; do
> java -server -XX:UseAVX=3 -jar benchmarks.jar org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx3.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> for i in 0 1 2; do
> java -server -XX:UseAVX=2 -jar benchmarks.jar org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx2.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> {code}
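The last column of the table in the issue description is the relative change (AVX2 - AVX512) / AVX2, so a positive value means AVX512 is faster and a negative value means it is slower. A minimal sketch for recomputing that column from the reported us/op figures, assuming awk is available; the function name rel_change is hypothetical and not part of Hive or the benchmark tooling:

```shell
#!/bin/sh
# rel_change AVX2_US_PER_OP AVX512_US_PER_OP
# Prints (AVX2 - AVX512) / AVX2 as a percentage with one decimal place.
# Positive: AVX512 is faster. Negative: AVX512 is slower.
rel_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

rel_change 122510 87014      # ColAndColBench
rel_change 1325759 1436073   # IfExprLongColumnLongColumnBench
```

Because the sketch rounds to one decimal place, a result may differ in the last digit from a value that was truncated when the table was written.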
[jira] [Comment Edited] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
[ https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258853#comment-16258853 ]

liyunzhang edited comment on HIVE-18080 at 11/20/17 6:38 AM:
-------------------------------------------------------------

[~gopalv]: using the following command with {{-prof perfasm}} to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
i=0
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The [log.logic.avx1.single.0|https://issues.apache.org/jira/secure/attachment/12898421/log.logic.avx1.single.0] is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

was (Author: kellyzly):
[~gopal]: using the following command with {{-prof perfasm}} to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
i=0
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The output is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

> Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Attachments: log.logic.avx1.single.0, log_logic.avx1.part
>
> Use a Xeon(R) Platinum 8180 CPU to test the performance of [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, as JDK9 enables AVX512.
> Use the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> It seems performance improves (20%+) for the cases in {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}}, except {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}} and {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}. The data is as follows.
> When I use a Skylake CPU to evaluate the performance improvement of AVX512, I find the performance in VectorizedLogicBench is as follows:
> || ||AVX2
[jira] [Updated] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
[ https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang updated HIVE-18080:
------------------------------
    Attachment: log.logic.avx1.single.0

> Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Attachments: log.logic.avx1.single.0, log_logic.avx1.part
>
> Use a Xeon(R) Platinum 8180 CPU to test the performance of [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, as JDK9 enables AVX512.
> Use the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> It seems performance improves (20%+) for the cases in {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}}, except {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}} and {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}. The data is as follows.
> When I use a Skylake CPU to evaluate the performance improvement of AVX512, I find the performance in VectorizedLogicBench is as follows:
> || ||AVX2 us/op||AVX512 us/op||(AVX2-AVX512)/AVX2||
> |ColAndColBench|122510|87014|28.9%|
> |IfExprLongColumnLongColumnBench|1325759|1436073|-8.3%|
> |IfExprLongColumnRepeatingLongColumnBench|1397447|1480450|-5.9%|
> |IfExprRepeatingLongColumnLongColumnBench|1401164|1483062|-5.9%|
> |NotColBench|77042.83|51513.28|33%|
> There is degradation in IfExprLongColumnLongColumnBench, IfExprLongColumnRepeatingLongColumnBench and IfExprRepeatingLongColumnLongColumnBench. I am very confused about why there is degradation in the IfExprLongColumnLongColumnBench cases.
> Here we use {{taskset -cp 1 $pid}} to run the benchmark on a single core to avoid the impact of dynamic CPU frequency scaling.
> My script:
> {code}
> export JAVA_HOME=/home/zly/jdk-9.0.1/
> export PATH=$JAVA_HOME/bin:$PATH
> export LD_LIBRARY_PATH=/home/zly/jdk-9.0.1/mylib
> for i in 0 1 2; do
> java -server -XX:UseAVX=3 -jar benchmarks.jar org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx3.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> for i in 0 1 2; do
> java -server -XX:UseAVX=2 -jar benchmarks.jar org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx2.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> {code}
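The script in the description starts the JVM in the background and only afterwards moves it to core 1 with {{taskset -cp}}. Without the {{-a}} option, {{taskset -p}} changes the affinity of the given task only, so threads the JVM has already created may stay unpinned. A hedged alternative is to launch the command under {{taskset -c}} so the affinity is inherited from the start. The sketch below assumes util-linux taskset; run_pinned and DRY_RUN are hypothetical names used only for illustration. The bare {{*}} from the original command is left out of the usage comment because an unquoted {{*}} is expanded by the shell; whether the benchmark selection is unchanged without it is an assumption.

```shell
#!/bin/sh
# run_pinned CORE CMD [ARGS...]
# Runs CMD under taskset so the process, its threads and any forked
# children inherit the CPU affinity from the start.
# With DRY_RUN=1 the command line is printed instead of executed.
run_pinned() {
    core="$1"
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "taskset -c $core $*"
    else
        taskset -c "$core" "$@"
    fi
}

# Example (not executed here), mirroring one AVX512 run from the report:
#   run_pinned 1 java -server -XX:UseAVX=3 -jar benchmarks.jar \
#       org.apache.hive.benchmark.vectorization.VectorizedLogicBench \
#       -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx3.single.0
```

Setting DRY_RUN=1 first is a cheap way to check the resulting command line before starting a long benchmark run.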
[jira] [Comment Edited] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
[ https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258853#comment-16258853 ]

liyunzhang edited comment on HIVE-18080 at 11/20/17 6:35 AM:
-------------------------------------------------------------

[~gopal]: using the following command with {{-prof perfasm}} to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
i=0
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The output is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

was (Author: kellyzly):
[~gopal]: using the following command to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
i=0
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The output is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

> Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Attachments: log_logic.avx1.part
>
> Use a Xeon(R) Platinum 8180 CPU to test the performance of [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, as JDK9 enables AVX512.
> Use the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> It seems performance improves (20%+) for the cases in {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}}, except {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}} and {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}. The data is as follows.
> When I use a Skylake CPU to evaluate the performance improvement of AVX512, I find the performance in VectorizedLogicBench is as follows:
> || ||AVX2 us/op||AVX512 us/op||(AVX2-AVX512)/AVX2||
> |ColAndColBench|122510|87014|28.9%|
> |IfExprLongColumnLongColumnBench|1325759|1436073|-8.3%|
>
[jira] [Comment Edited] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
[ https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258853#comment-16258853 ]

liyunzhang edited comment on HIVE-18080 at 11/20/17 6:35 AM:
-------------------------------------------------------------

[~gopal]: using the following command to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
i=0
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The output is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

was (Author: kellyzly):
[~gopal]: using the following command to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The output is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

> Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Attachments: log_logic.avx1.part
>
> Use a Xeon(R) Platinum 8180 CPU to test the performance of [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, as JDK9 enables AVX512.
> Use the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> It seems performance improves (20%+) for the cases in {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}}, except {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}} and {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}. The data is as follows.
> When I use a Skylake CPU to evaluate the performance improvement of AVX512, I find the performance in VectorizedLogicBench is as follows:
> || ||AVX2 us/op||AVX512 us/op||(AVX2-AVX512)/AVX2||
> |ColAndColBench|122510|87014|28.9%|
> |IfExprLongColumnLongColumnBench|1325759|1436073|-8.3%|
>
[jira] [Commented] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
[ https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258853#comment-16258853 ]

liyunzhang commented on HIVE-18080:
-----------------------------------

[~gopal]: using the following command to run VectorizedLogicBench#IfExprLongColumnLongColumnBench with AVX1
{code}
export JAVA_HOME=/home/zly/sr601/jdk-9.0.1/
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=/home/zly/sr601/jdk-9.0.1/mylib
java -server -XX:UseAVX=1 -jar benchmarks.jar -prof perfasm org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx1.single.$i & export pid=$!
taskset -cp 1 $pid
wait $pid
{code}
The output is attached; it contains a warning:
{code}
PrintAssembly processed: 51105 total address lines.
Perf output processed (skipped 1.020 seconds):
Column 1: cycles (0 events)
Column 2: instructions (0 events)
[Hottest Regions]...
[Hottest Methods (after inlining)]..
[Distribution by Area]..
WARNING: The perf event count is suspiciously low (0). The performance data might be inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
{code}

> Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Attachments: log_logic.avx1.part
>
> Use a Xeon(R) Platinum 8180 CPU to test the performance of [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, as JDK9 enables AVX512.
> Use the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> It seems performance improves (20%+) for the cases in {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}}, except {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}} and {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}. The data is as follows.
> When I use a Skylake CPU to evaluate the performance improvement of AVX512, I find the performance in VectorizedLogicBench is as follows:
> || ||AVX2 us/op||AVX512 us/op||(AVX2-AVX512)/AVX2||
> |ColAndColBench|122510|87014|28.9%|
> |IfExprLongColumnLongColumnBench|1325759|1436073|-8.3%|
> |IfExprLongColumnRepeatingLongColumnBench|1397447|1480450|-5.9%|
> |IfExprRepeatingLongColumnLongColumnBench|1401164|1483062|-5.9%|
> |NotColBench|77042.83|51513.28|33%|
> There is degradation in IfExprLongColumnLongColumnBench, IfExprLongColumnRepeatingLongColumnBench and IfExprRepeatingLongColumnLongColumnBench. I am very confused about why there is degradation in the IfExprLongColumnLongColumnBench cases.
> Here we use {{taskset -cp 1 $pid}} to run the benchmark on a single core to avoid the impact of dynamic CPU frequency scaling.
> My script:
> {code}
> export JAVA_HOME=/home/zly/jdk-9.0.1/
> export PATH=$JAVA_HOME/bin:$PATH
> export LD_LIBRARY_PATH=/home/zly/jdk-9.0.1/mylib
> for i in 0 1 2; do
> java -server -XX:UseAVX=3 -jar benchmarks.jar org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx3.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> for i in 0 1 2; do
> java -server -XX:UseAVX=2 -jar benchmarks.jar org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 -f 1 -bm avgt -tu us >log.logic.avx2.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> {code}
[jira] [Updated] (HIVE-18104) Issue in HIVE Update Command for set columns
[ https://issues.apache.org/jira/browse/HIVE-18104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Ranjan updated HIVE-18104:
-------------------------------
    Description:
When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong, though no value gets updated.

hive> describe test_table;
OK
run_site varchar(50)
run_year int
run_month int
data_loaded_yn varchar(1)
run_date timestamp
message string
datetime timestamp
Time taken: 0.169 seconds, Fetched: 10 row(s)

hive> update test_table set abc='Y' where message='Processing';
Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1508354216914_35481)

VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
Map 1 .. SUCCEEDED 2 2 0 0 0 0
Reducer 2 .. SUCCEEDED 2 2 0 0 0 0
VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 9.52 s
Loading data to table test_table
Table test_table stats: [numFiles=39, numRows=3, totalSize=56417, rawDataSize=0]
OK
Time taken: 10.517 seconds

  was:
When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong, though no value gets updated.

hive> describe test_table;
OK
run_site varchar(50)
run_year int
run_month int
data_loaded_yn varchar(1)
run_date timestamp
message string
datetime timestamp
Time taken: 0.169 seconds, Fetched: 10 row(s)

hive> test_table set abc='Y' where message='Processing';
Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1508354216914_35481)

VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
Map 1 .. SUCCEEDED 2 2 0 0 0 0
Reducer 2 .. SUCCEEDED 2 2 0 0 0 0
VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 9.52 s
Loading data to table test_table
Table test_table stats: [numFiles=39, numRows=3, totalSize=56417, rawDataSize=0]
OK
Time taken: 10.517 seconds

> Issue in HIVE Update Command for set columns
> --------------------------------------------
>
> Key: HIVE-18104
> URL: https://issues.apache.org/jira/browse/HIVE-18104
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Reporter: Ravi Ranjan
> Priority: Critical
>
> When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong, though no value gets updated.
> hive> describe test_table;
> OK
> run_site varchar(50)
> run_year int
> run_month int
> data_loaded_yn varchar(1)
> run_date timestamp
> message string
> datetime timestamp
> Time taken: 0.169 seconds, Fetched: 10 row(s)
>
> hive> update test_table set abc='Y' where message='Processing';
> Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id application_1508354216914_35481)
>
> VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
> Map 1 .. SUCCEEDED 2 2 0 0 0 0
> Reducer 2 .. SUCCEEDED 2 2
[jira] [Updated] (HIVE-18104) Issue in HIVE Update Command for set columns
[ https://issues.apache.org/jira/browse/HIVE-18104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Ranjan updated HIVE-18104:
-------------------------------
    Description:
When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong, though no value gets updated.

hive> describe test_table;
OK
run_site varchar(50)
run_year int
run_month int
data_loaded_yn varchar(1)
run_date timestamp
message string
datetime timestamp
Time taken: 0.169 seconds, Fetched: 10 row(s)

hive> test_table set abc='Y' where message='Processing';
Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1508354216914_35481)

VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
Map 1 .. SUCCEEDED 2 2 0 0 0 0
Reducer 2 .. SUCCEEDED 2 2 0 0 0 0
VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 9.52 s
Loading data to table test_table
Table test_table stats: [numFiles=39, numRows=3, totalSize=56417, rawDataSize=0]
OK
Time taken: 10.517 seconds

  was:
When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong and no value gets updated.

hive> describe test_table;
OK
run_site varchar(50)
run_year int
run_month int
data_loaded_yn varchar(1)
run_date timestamp
message string
datetime timestamp
Time taken: 0.169 seconds, Fetched: 10 row(s)

hive> test_table set abc='Y' where message='Processing';
Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1508354216914_35481)

VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
Map 1 .. SUCCEEDED 2 2 0 0 0 0
Reducer 2 .. SUCCEEDED 2 2 0 0 0 0
VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 9.52 s
Loading data to table test_table
Table test_table stats: [numFiles=39, numRows=3, totalSize=56417, rawDataSize=0]
OK
Time taken: 10.517 seconds

> Issue in HIVE Update Command for set columns
> --------------------------------------------
>
> Key: HIVE-18104
> URL: https://issues.apache.org/jira/browse/HIVE-18104
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Reporter: Ravi Ranjan
> Priority: Critical
>
> When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong, though no value gets updated.
> hive> describe test_table;
> OK
> run_site varchar(50)
> run_year int
> run_month int
> data_loaded_yn varchar(1)
> run_date timestamp
> message string
> datetime timestamp
> Time taken: 0.169 seconds, Fetched: 10 row(s)
>
> hive> test_table set abc='Y' where message='Processing';
> Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id application_1508354216914_35481)
>
> VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
> Map 1 .. SUCCEEDED 2 2 0 0 0 0
> Reducer 2 .. SUCCEEDED 2 2 0 0
[jira] [Updated] (HIVE-18104) Issue in HIVE Update Command for set columns
[ https://issues.apache.org/jira/browse/HIVE-18104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Ranjan updated HIVE-18104: --- Description: When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong and no value gets updated. hive> describe test_table; OK run_sitevarchar(50) run_yearint run_month int data_loaded_yn varchar(1) run_datetimestamp message string datetimetimestamp Time taken: 0.169 seconds, Fetched: 10 row(s) hive> test_table set abc='Y' where message='Processing'; Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1508354216914_35481) VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED Map 1 .. SUCCEEDED 2 200 0 0 Reducer 2 .. SUCCEEDED 2 200 0 0 VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 9.52 s Loading data to table test_table Table test_table stats: [numFiles=39, numRows=3, totalSize=56417, rawDataSize=0] OK Time taken: 10.517 seconds was: When Updating a table, error comes in when a wrong column name is entered in where clause but Mapreduce executes successfully when column name in set clause is wrong and no value gets updated. hive> describe test_table;OKrun_site varchar(50)run_year intrun_month intdata_loaded_yn varchar(1)run_date timestampmessage stringdatetime timestampTime taken: 0.169 seconds, Fetched: 10 row(s)hive> test_table set abc='Y' where message='Processing';Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432Total jobs = 1Launching Job 1 out of 1Status: Running (Executing on YARN cluster with App id application_1508354216914_35481) VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLEDMap 1 .. SUCCEEDED 2 2 0 0 0 0Reducer 2 .. 
SUCCEEDED 2 2 0 0 0 0VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 9.52 sLoading data to table test_tableTable astir_mi_db.astir_hv_lt_scenario_run stats: [numFiles=39, numRows=3, totalSize=56417, rawDataSize=0]OKTime taken: 10.517 seconds > Issue in HIVE Update Command for set columns > > > Key: HIVE-18104 > URL: https://issues.apache.org/jira/browse/HIVE-18104 > Project: Hive > Issue Type: Bug > Components: CLI >Reporter: Ravi Ranjan >Priority: Critical > > When Updating a table, error comes in when a wrong column name is entered in > where clause but Mapreduce executes successfully when column name in set > clause is wrong and no value gets updated. > hive> describe test_table; > OK > run_site varchar(50) > run_year int > run_month int > data_loaded_yn varchar(1) > run_date timestamp > message string > datetime timestamp > Time taken: 0.169 seconds, Fetched: 10 row(s) > > hive> test_table set abc='Y' where message='Processing'; > Query ID = 20171120052859_d95524f8-a9d3-48ad-aa84-2932696d3432 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1508354216914_35481) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 2 2 0 0 0 > 0 > Reducer 2 .. SUCCEEDED 2 2 0 0
[jira] [Comment Edited] (HIVE-17902) add notions of default pool and start adding unmanaged mapping
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248783#comment-16248783 ] Lefty Leverenz edited comment on HIVE-17902 at 11/20/17 4:28 AM: - Doc note: This adds *hive.metastore.wm.default.pool.size* to HiveConf.java, so it needs to be documented in the wiki. (Perhaps the LLAP section of Configuration Properties will have a subsection for workload management.) * [Configuration Properties -- LLAP | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP] Added a TODOC3.0 label. Update 19/Nov/17: Also document the non-reserved keywords DEFAULT and POOL for 3.0.0 in the DDL doc. * [DDL -- Keywords | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Keywords,Non-reservedKeywordsandReservedKeywords] was (Author: le...@hortonworks.com): Doc note: This adds *hive.metastore.wm.default.pool.size* to HiveConf.java, so it needs to be documented in the wiki. (Perhaps the LLAP section of Configuration Properties will have a subsection for workload management.) * [Configuration Properties -- LLAP | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP] Added a TODOC3.0 label. > add notions of default pool and start adding unmanaged mapping > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, > HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.05.patch, > HIVE-17902.06.patch, HIVE-17902.07.patch, HIVE-17902.08.patch, > HIVE-17902.09.patch, HIVE-17902.10.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17932) Remove option to control partition level basic stats fetching
[ https://issues.apache.org/jira/browse/HIVE-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258798#comment-16258798 ] Lefty Leverenz commented on HIVE-17932: --- Thanks Zoltan, I've removed the TODOC3.0 label. > Remove option to control partition level basic stats fetching > - > > Key: HIVE-17932 > URL: https://issues.apache.org/jira/browse/HIVE-17932 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 3.0.0 > > Attachments: HIVE-17932.01.patch > > > Disabling the fetching of partition > stats ({{hive.stats.fetch.partition.stats}}) may cause problematic cases to > arise for partitioned tables... the user might just want to disable the CBO > instead of tweaking the fetching of partition stats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17965) Remove HIVELIMITTABLESCANPARTITION support
[ https://issues.apache.org/jira/browse/HIVE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258797#comment-16258797 ] Lefty Leverenz commented on HIVE-17965: --- Thanks Zoltan, I've removed the TODOC3.0 label. > Remove HIVELIMITTABLESCANPARTITION support > -- > > Key: HIVE-17965 > URL: https://issues.apache.org/jira/browse/HIVE-17965 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Trivial > Fix For: 3.0.0 > > Attachments: HIVE-17965.01.patch > > > HIVE-13884 marked it as deprecated -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17965) Remove HIVELIMITTABLESCANPARTITION support
[ https://issues.apache.org/jira/browse/HIVE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-17965: -- Labels: (was: TODOC3.0) > Remove HIVELIMITTABLESCANPARTITION support > -- > > Key: HIVE-17965 > URL: https://issues.apache.org/jira/browse/HIVE-17965 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Trivial > Fix For: 3.0.0 > > Attachments: HIVE-17965.01.patch > > > HIVE-13884 marked it as deprecated -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17932) Remove option to control partition level basic stats fetching
[ https://issues.apache.org/jira/browse/HIVE-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-17932: -- Labels: (was: TODOC3.0) > Remove option to control partition level basic stats fetching > - > > Key: HIVE-17932 > URL: https://issues.apache.org/jira/browse/HIVE-17932 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 3.0.0 > > Attachments: HIVE-17932.01.patch > > > Disabling the fetching of partition > stats ({{hive.stats.fetch.partition.stats}}) may cause problematic cases to > arise for partitioned tables... the user might just want to disable the CBO > instead of tweaking the fetching of partition stats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader
[ https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258782#comment-16258782 ] Hive QA commented on HIVE-17528: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12898408/HIVE-17528.5.patch {color:green}SUCCESS:{color} +1 due to 30 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11443 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=78) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=103) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=226) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7915/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7915/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7915/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12898408 - PreCommit-HIVE-Build > Add more q-tests for Hive-on-Spark with Parquet vectorized reader > - > > Key: HIVE-17528 > URL: https://issues.apache.org/jira/browse/HIVE-17528 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Ferdinand Xu > Attachments: HIVE-17528.1.patch, HIVE-17528.2.patch, > HIVE-17528.3.patch, HIVE-17528.4.patch, HIVE-17528.5.patch, HIVE-17528.patch > > > Most of the vectorization related q-tests operate on ORC tables using Tez. It > would be good to add more coverage on a different combination of engine and > file-format. We can model existing q-tests using parquet tables and run it > using TestSparkCliDriver -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement
[ https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258775#comment-16258775 ] Lefty Leverenz commented on HIVE-14495: --- Doc note: This needs to be documented in the wiki. * [DDL -- SHOW | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Show] Added a TODOC3.0 label. > Add SHOW MATERIALIZED VIEWS statement > - > > Key: HIVE-14495 > URL: https://issues.apache.org/jira/browse/HIVE-14495 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-14495.01.patch, HIVE-14495.patch > > > In the spirit of {{SHOW TABLES}}, we should support the following statement: > {code:sql} > SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards']; > {code} > In contrast to {{SHOW TABLES}}, this command would only list the materialized > views. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15018) ALTER rewriting flag in materialized view
[ https://issues.apache.org/jira/browse/HIVE-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258776#comment-16258776 ] Lefty Leverenz commented on HIVE-15018: --- Doc note: This needs to be documented in the wiki. * [DDL | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL] Added a TODOC3.0 label. > ALTER rewriting flag in materialized view > -- > > Key: HIVE-15018 > URL: https://issues.apache.org/jira/browse/HIVE-15018 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-15018.01.patch, HIVE-15018.patch > > > We should extend the ALTER statement in case we want to change the rewriting > behavior of the materialized view after we have created it. > {code:sql} > ALTER MATERIALIZED VIEW [db_name.]materialized_view_name DISABLE REWRITE; > {code} > {code:sql} > ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE REWRITE; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement
[ https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-14495: -- Labels: TODOC3.0 (was: ) > Add SHOW MATERIALIZED VIEWS statement > - > > Key: HIVE-14495 > URL: https://issues.apache.org/jira/browse/HIVE-14495 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-14495.01.patch, HIVE-14495.patch > > > In the spirit of {{SHOW TABLES}}, we should support the following statement: > {code:sql} > SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards']; > {code} > In contrast to {{SHOW TABLES}}, this command would only list the materialized > views. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15018) ALTER rewriting flag in materialized view
[ https://issues.apache.org/jira/browse/HIVE-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-15018: -- Labels: TODOC3.0 (was: ) > ALTER rewriting flag in materialized view > -- > > Key: HIVE-15018 > URL: https://issues.apache.org/jira/browse/HIVE-15018 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-15018.01.patch, HIVE-15018.patch > > > We should extend the ALTER statement in case we want to change the rewriting > behavior of the materialized view after we have created it. > {code:sql} > ALTER MATERIALIZED VIEW [db_name.]materialized_view_name DISABLE REWRITE; > {code} > {code:sql} > ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE REWRITE; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"
[ https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16756: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Patch merged to master and branch-2. Thanks for the review [~mmccline] > Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: > / by zero" > > > Key: HIVE-16756 > URL: https://issues.apache.org/jira/browse/HIVE-16756 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16756.01.patch, HIVE-16756.02.patch, > HIVE-16756.03.patch, HIVE-16756.05-branch-2.patch, > HIVE-16756.06-branch-2.patch > > > vectorization_div0.q needs to test the long data type testing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
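For context on the exception named in the summary: in Java, the `%` operator on `long` operands throws `java.lang.ArithmeticException: / by zero` when the divisor is 0, so a row-by-row modulo over two long columns has to guard each row. The following is a minimal sketch of such a guard, not the Hive template code or the merged patch; the class `LongModuloSketch`, its `Result` holder, and the policy of flagging zero-divisor rows as null are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Hive's LongColModuloLongColumn.
final class LongModuloSketch {

    /** Result of a column-wise modulo: values plus a per-row null flag. */
    static final class Result {
        final long[] values;
        final boolean[] isNull;

        Result(long[] values, boolean[] isNull) {
            this.values = values;
            this.isNull = isNull;
        }
    }

    /**
     * Computes left[i] % right[i] for every row. Rows whose divisor is 0 are
     * flagged as null (an assumed policy) instead of evaluating the
     * expression, which would throw ArithmeticException("/ by zero").
     */
    static Result modulo(long[] left, long[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("column lengths differ");
        }
        long[] values = new long[left.length];
        boolean[] isNull = new boolean[left.length];
        for (int i = 0; i < left.length; i++) {
            if (right[i] == 0) {
                isNull[i] = true; // skip the division entirely
            } else {
                values[i] = left[i] % right[i];
            }
        }
        return new Result(values, isNull);
    }

    public static void main(String[] args) {
        Result r = modulo(new long[] {7, 9, -5}, new long[] {2, 0, 3});
        System.out.println(Arrays.toString(r.values));
        System.out.println(Arrays.toString(r.isNull));
    }
}
```

The null flag is kept in a separate array so the value slot of a skipped row can hold any placeholder without being mistaken for a real result.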
[jira] [Commented] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"
[ https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258764#comment-16258764 ] Vihang Karajgaonkar commented on HIVE-16756: {{vectorized_ptf}} has been failing for a while on branch-2. Other failures are unrelated to this patch. > Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: > / by zero" > > > Key: HIVE-16756 > URL: https://issues.apache.org/jira/browse/HIVE-16756 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > Attachments: HIVE-16756.01.patch, HIVE-16756.02.patch, > HIVE-16756.03.patch, HIVE-16756.05-branch-2.patch, > HIVE-16756.06-branch-2.patch > > > vectorization_div0.q needs to test the long data type testing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session
[ https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258760#comment-16258760 ] Lefty Leverenz commented on HIVE-17964: --- Doc note: This adds *hive.spark.rsc.conf.list* to HiveConf.java, so it needs to be documented in the wiki. * [Configuration Properties -- Spark | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark] Added a TODOC3.0 label. > HoS: some spark configs doesn't require re-creating a session > - > > Key: HIVE-17964 > URL: https://issues.apache.org/jira/browse/HIVE-17964 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-17964.1.patch, HIVE-17964.2.patch, > HIVE-17964.3.patch > > > I guess the {{hive.spark.}} configs were initially intended for the RSC. > Therefore when they're changed, we'll re-create the session for them to take > effect. There're some configs not related to RSC that also start with > {{hive.spark.}}. We'd better rename them so that we don't unnecessarily > re-create sessions, which is usually time consuming. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
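The idea in the description, that only settings tied to the remote context should force a new session, could be sketched roughly as follows. This assumes the new *hive.spark.rsc.conf.list* property holds a comma-separated list of property names; that format, the class `SessionRestartSketch`, and the sample property names are illustrative assumptions and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the code from the HIVE-17964 patch.
final class SessionRestartSketch {

    /** Splits a comma-separated property value into a set of trimmed names. */
    static Set<String> parseList(String commaSeparated) {
        return Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toSet());
    }

    /** A changed key forces a new session only if it appears on the list. */
    static boolean requiresNewSession(String changedKey, Set<String> listedKeys) {
        return listedKeys.contains(changedKey);
    }

    public static void main(String[] args) {
        // Both property names below are made up for the example.
        Set<String> listed = parseList(
                "hive.spark.example.remote.timeout, hive.spark.example.remote.threads");
        System.out.println(requiresNewSession("hive.spark.example.remote.timeout", listed));
        System.out.println(requiresNewSession("hive.spark.example.unrelated", listed));
    }
}
```

An explicit list avoids relying on the {{hive.spark.}} prefix alone, which is the source of the unnecessary session re-creation the issue describes.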
[jira] [Commented] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258756#comment-16258756 ] Lefty Leverenz commented on HIVE-14560: --- Should this be documented in the wiki, or is it just a bug fix? > Support exchange partition between s3 and hdfs tables > - > > Key: HIVE-14560 > URL: https://issues.apache.org/jira/browse/HIVE-14560 > Project: Hive > Issue Type: Bug >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 3.0.0 > > Attachments: HIVE-14560.02.patch, HIVE-14560.patch > > > {code} > alter table s3_tbl exchange partition (country='USA', state='CA') with table > hdfs_tbl; > {code} > results in: > {code} > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got > exception: java.lang.IllegalArgumentException Wrong FS: > s3a://hive-on-s3/s3_tbl/country=USA/state=CA, expected: > hdfs://localhost:9000) (state=08S01,code=1) > {code} > because the check for whether the s3 destination table path exists occurs on > the hdfs filesystem. > Furthermore, exchanging between s3 to hdfs fails because the hdfs rename > operation is not supported across filesystems. Fix uses copy + deletion in > the case that the file systems differ. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
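The fix described above depends on detecting that the source and destination live on different file systems before choosing between a rename and a copy followed by a delete. A minimal sketch of that decision, assuming locations are compared by URI scheme and authority; `ExchangeStrategySketch`, its methods, and the sample HDFS path are hypothetical, and this is not the code from the patch:

```java
import java.net.URI;

// Hypothetical illustration only; not the code from the HIVE-14560 patch.
final class ExchangeStrategySketch {

    enum Strategy { RENAME, COPY_THEN_DELETE }

    /** True when both URIs share scheme and authority (assumed equality test). */
    static boolean sameFileSystem(URI source, URI destination) {
        return equalsIgnoreCase(source.getScheme(), destination.getScheme())
                && equalsIgnoreCase(source.getAuthority(), destination.getAuthority());
    }

    /** Rename within one file system; otherwise fall back to copy + delete. */
    static Strategy choose(URI source, URI destination) {
        return sameFileSystem(source, destination)
                ? Strategy.RENAME
                : Strategy.COPY_THEN_DELETE;
    }

    private static boolean equalsIgnoreCase(String a, String b) {
        return a == null ? b == null : a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        URI s3 = URI.create("s3a://hive-on-s3/s3_tbl/country=USA/state=CA");
        // The HDFS path below is made up for the example.
        URI hdfs = URI.create("hdfs://localhost:9000/warehouse/hdfs_tbl");
        System.out.println(choose(s3, hdfs));
        System.out.println(choose(hdfs, hdfs));
    }
}
```

Running the existence check and the move against the file system that actually owns each path is what avoids the "Wrong FS" error quoted in the description.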
[jira] [Commented] (HIVE-18056) CachedStore: Have a whitelist/blacklist config to allow selective caching of tables/partitions and allow read while prewarming
[ https://issues.apache.org/jira/browse/HIVE-18056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258730#comment-16258730 ] Lefty Leverenz commented on HIVE-18056: --- Doc note: This adds *hive.metastore.cached.rawstore.cached.object.whitelist* and *hive.metastore.cached.rawstore.cached.object.blacklist* to HiveConf.java, so they need to be documented in the wiki. * [Configuration Properties -- Metastore | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore] General documentation is also needed for CachedStore. Added a TODOC3.0 label. > CachedStore: Have a whitelist/blacklist config to allow selective caching of > tables/partitions and allow read while prewarming > -- > > Key: HIVE-18056 > URL: https://issues.apache.org/jira/browse/HIVE-18056 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Vaibhav Gumashta >Assignee: Daniel Dai > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-18056.1.patch, HIVE-18056.2.patch, > HIVE-18056.3.patch, HIVE-18056.4.patch, HIVE-18056.5.patch, > HIVE-18056.6.patch, HIVE-18056.7.patch, HIVE-18056.8.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader
[ https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-17528: Attachment: HIVE-17528.5.patch Rebase to the latest code. > Add more q-tests for Hive-on-Spark with Parquet vectorized reader > - > > Key: HIVE-17528 > URL: https://issues.apache.org/jira/browse/HIVE-17528 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Ferdinand Xu > Attachments: HIVE-17528.1.patch, HIVE-17528.2.patch, > HIVE-17528.3.patch, HIVE-17528.4.patch, HIVE-17528.5.patch, HIVE-17528.patch > > > Most of the vectorization related q-tests operate on ORC tables using Tez. It > would be good to add more coverage on a different combination of engine and > file-format. We can model existing q-tests using parquet tables and run it > using TestSparkCliDriver -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true" o ACID is enabled
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi updated HIVE-18102: -- Summary: Hive insertion for complex types not working when "transactional=true" o ACID is enabled (was: Hive insertion for complex types not working when "transactional=true") > Hive insertion for complex types not working when "transactional=true" o ACID > is enabled > > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi >Assignee: Hive QA > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT <idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine.* > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT <idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > {code} > hive> select * from default.struct_merge; > OK > {color:blue}1 > [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] > {color} > Time taken: 0.125 seconds, Fetched: 1 row(s) > *With transactional = true, behaviour is erratic, null values are populated > as values of nested Structs.* > Eg: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT <idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC > TBLPROPERTIES 
('transactional'='true'); > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > //this one gives null values > {code} > hive> select * from default.struct_merge1; > OK > {color:red}1 > [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] > {color} > Time taken: 0.608 seconds, Fetched: 1 row(s) > *Can this behaviour be explained? I need the transaction property since I am > merging into a common table on daily data.* -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"
[ https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258641#comment-16258641 ] Hive QA commented on HIVE-16756: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12898390/HIVE-16756.06-branch-2.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10657 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[table_nonprintable] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=153) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[merge_negative_5] (batchId=88) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[explaindenpendencydiffengs] (batchId=115) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] (batchId=125) org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=176) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7914/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7914/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7914/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} 
This message is automatically generated. ATTACHMENT ID: 12898390 - PreCommit-HIVE-Build > Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: > / by zero" > > > Key: HIVE-16756 > URL: https://issues.apache.org/jira/browse/HIVE-16756 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > Attachments: HIVE-16756.01.patch, HIVE-16756.02.patch, > HIVE-16756.03.patch, HIVE-16756.05-branch-2.patch, > HIVE-16756.06-branch-2.patch > > > vectorization_div0.q needs to test the long data type testing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-18103) TestSparkCliDriver.testCliDriver[vectorized_ptf] failing on branch-2
[ https://issues.apache.org/jira/browse/HIVE-18103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-18103: -- > TestSparkCliDriver.testCliDriver[vectorized_ptf] failing on branch-2 > > > Key: HIVE-18103 > URL: https://issues.apache.org/jira/browse/HIVE-18103 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > TestSparkCliDriver.testCliDriver[vectorized_ptf.q] and > TestSparkCliDriver.testCliDriver[vectorization_7.q] are failing on branch-2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"
[ https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258611#comment-16258611 ] Vihang Karajgaonkar commented on HIVE-16756: Some of the vectorization test failures are related. It looks like differences in the template file between master and branch-2 are causing this. I regenerated {{LongColModuloLongColumn.java}} from the template, then applied the fix on top of it to resolve these test failures. Some of the vectorization tests show diff failures even without the patch. I will create a separate JIRA to fix them on branch-2. > Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: > / by zero" > > > Key: HIVE-16756 > URL: https://issues.apache.org/jira/browse/HIVE-16756 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > Attachments: HIVE-16756.01.patch, HIVE-16756.02.patch, > HIVE-16756.03.patch, HIVE-16756.05-branch-2.patch, > HIVE-16756.06-branch-2.patch > > > vectorization_div0.q needs to test the long data type testing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"
[ https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16756: --- Attachment: HIVE-16756.06-branch-2.patch > Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: > / by zero" > > > Key: HIVE-16756 > URL: https://issues.apache.org/jira/browse/HIVE-16756 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > Attachments: HIVE-16756.01.patch, HIVE-16756.02.patch, > HIVE-16756.03.patch, HIVE-16756.05-branch-2.patch, > HIVE-16756.06-branch-2.patch > > > vectorization_div0.q needs to test the long data type testing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18102: -- Component/s: Transactions > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi >Assignee: Hive QA > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT <idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine.* > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT <idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > {code} > hive> select * from default.struct_merge; > OK > {color:blue}1 > [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] > {color} > Time taken: 0.125 seconds, Fetched: 1 row(s) > *With transactional = true, behaviour is erratic, null values are populated > as values of nested Structs.* > Eg: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT <idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > 
NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > //this one gives null values > {code} > hive> select * from default.struct_merge1; > OK > {color:red}1 > [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] > {color} > Time taken: 0.608 seconds, Fetched: 1 row(s) > *Can this behaviour be explained? I need the transaction property since I am > merging into a common table on daily data.* -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi reassigned HIVE-18102: - Assignee: Hive QA (was: Kiet Ly) > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi >Assignee: Hive QA > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine.* > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > {code} > hive> select * from default.struct_merge; > OK > {color:blue}1 > [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] > {color} > Time taken: 0.125 seconds, Fetched: 1 row(s) > *With transactional = true, behaviour is erratic, null values are populated > as values of nested Structs.* > Eg: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > 
NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > //this one gives null values > {code} > hive> select * from default.struct_merge1; > OK > {color:red}1 > [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] > {color} > Time taken: 0.608 seconds, Fetched: 1 row(s) > *Can this behaviour be explained? I need the transaction property since I am > merging into a common table on daily data.*
[jira] [Assigned] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi reassigned HIVE-18102: - Assignee: Kiet Ly > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi >Assignee: Kiet Ly > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine.* > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > {code} > hive> select * from default.struct_merge; > OK > {color:blue}1 > [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] > {color} > Time taken: 0.125 seconds, Fetched: 1 row(s) > *With transactional = true, behaviour is erratic, null values are populated > as values of nested Structs.* > Eg: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > 
NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select * from struct_merge; > //this one gives null values > {code} > hive> select * from default.struct_merge1; > OK > {color:red}1 > [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] > {color} > Time taken: 0.608 seconds, Fetched: 1 row(s) > *Can this behaviour be explained? I need the transaction property since I am > merging into a common table on daily data.*
[jira] [Updated] (HIVE-13198) Authorization issues with cascading views
[ https://issues.apache.org/jira/browse/HIVE-13198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang Haihua updated HIVE-13198: --- Description: dHere is a use case. They have a base table t1, from which they create a view v1. They further create a view v2 from v1 by applying a filter. User has access to only view v2, not view v1 or table t1. When user tries to access v2, they are denied access. Steps to recreate: There is a base table t1 that exists in the default database with primary key id and some employee data (name, ssn etc) Create view v1 - “create view v1 as select * from default.t1;” Created v2 - “create view v2 as select * from v1 where id =1;” Permissions provided for user to select all columns from view v2. When user runs select * from v2, hive throws an error “user does not have permissions to select view v1". Apparently Hive is converting the query to underlying views. SELECT * FROM v2 LIMIT 100 To select `v1`.`id`, `v1`.`name`, `v1`.`ssn`, `v1`.`join_date`, `v1`.`location` from `hr`.`v1` where `v1`.`id`=1 Hive should only check for permissions for the view being run in the query, not any parent views. (This is consistent with ORACLE). was: Here is a use case. They have a base table t1, from which they create a view v1. They further create a view v2 from v1 by applying a filter. User has access to only view v2, not view v1 or table t1. When user tries to access v2, they are denied access. Steps to recreate: There is a base table t1 that exists in the default database with primary key id and some employee data (name, ssn etc) Create view v1 - “create view v1 as select * from default.t1;” Created v2 - “create view v2 as select * from v1 where id =1;” Permissions provided for user to select all columns from view v2. When user runs select * from v2, hive throws an error “user does not have permissions to select view v1". Apparently Hive is converting the query to underlying views. 
SELECT * FROM v2 LIMIT 100 To select `v1`.`id`, `v1`.`name`, `v1`.`ssn`, `v1`.`join_date`, `v1`.`location` from `hr`.`v1` where `v1`.`id`=1 Hive should only check for permissions for the view being run in the query, not any parent views. (This is consistent with ORACLE). > Authorization issues with cascading views > - > > Key: HIVE-13198 > URL: https://issues.apache.org/jira/browse/HIVE-13198 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-13198.01.patch, HIVE-13198.02.patch > > > dHere is a use case. They have a base table t1, from which they create a view > v1. They further create a view v2 from v1 by applying a filter. User has > access to only view v2, not view v1 or table t1. When user tries to access > v2, they are denied access. > Steps to recreate: > There is a base table t1 that exists in the default database with primary key > id and some employee data (name, ssn etc) > Create view v1 - “create view v1 as select * from default.t1;” > Created v2 - “create view v2 as select * from v1 where id =1;” > Permissions provided for user to select all columns from view v2. When user > runs select * from v2, hive throws an error “user does not have permissions > to select view v1". > Apparently Hive is converting the query to underlying views. > SELECT * FROM v2 LIMIT 100 > To > select `v1`.`id`, `v1`.`name`, `v1`.`ssn`, `v1`.`join_date`, `v1`.`location` > from `hr`.`v1` where `v1`.`id`=1 > Hive should only check for permissions for the view being run in the query, > not any parent views. (This is consistent with ORACLE). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
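Editorial sketch, not part of the original report: the "Steps to recreate" above can be written out as HiveQL roughly as follows. It assumes SQL standard based authorization is enabled and that the base table `default.t1` already exists; the user name `analyst` is hypothetical.

```sql
-- Run as the owner of t1 (assumes default.t1 exists with an id column).
CREATE VIEW v1 AS SELECT * FROM default.t1;
CREATE VIEW v2 AS SELECT * FROM v1 WHERE id = 1;

-- Grant access to the outer view only; `analyst` is a placeholder user name.
GRANT SELECT ON TABLE v2 TO USER analyst;

-- Run as analyst. Per the report, this is rejected because the
-- authorization check also demands SELECT on the parent view v1.
SELECT * FROM v2;
```

The behaviour the issue asks for is that the final query succeeds with a grant on `v2` alone.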
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi updated HIVE-18102: -- Description: I am merging into a table daily which has a column type as an array of structs : {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine.* Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color} Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs.* Eg: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values {code} hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color} Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* was: I am merging into a table daily which has a column type as an array of structs : {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color} Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values {code} hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color} Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine.* > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), >
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi updated HIVE-18102: -- Description: I am merging into a table daily which has a column type as an array of structs : {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values {code} hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* was: I am merging into a table daily which has a column type as an array of structs : {color:#205081}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine. > * > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), >
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi updated HIVE-18102: -- Description: I am merging into a table daily which has a column type as an array of structs : {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color} Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values {code} hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color} Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* was: I am merging into a table daily which has a column type as an array of structs : {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values {code} hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi > > I am merging into a table daily which has a column type as an array of > structs : > {color:blue}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine. > * > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), >
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi updated HIVE-18102: -- Description: I am merging into a table daily which has a column type as an array of structs : {color:#205081}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: {code:sql} drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; {code} hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* was: I am merging into a table daily which has a column type as an array of structs : {color:#205081}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: ??author?? drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi > > I am merging into a table daily which has a column type as an array of > structs : > {color:#205081}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine. > * > Example snippet: > {code:sql} > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), >
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nillohit Nandi updated HIVE-18102: -- Description: I am merging into a table daily which has a column type as an array of structs : {color:#205081}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> {color} *When table is created without transactional=true, behaviour is fine. * Example snippet: ??author?? drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* was: I am merging into a table daily which has a column type as an array of structs : segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> *When table is created without transactional=true, behaviour is fine. * Example snippet: drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC; INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; hive> select * from default.struct_merge; OK {color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}] {color}Time taken: 0.125 seconds, Fetched: 1 row(s) *With transactional = true, behaviour is erratic, null values are populated as values of nested Structs. * Eg: drop table struct_merge; CREATE TABLE struct_merge ( lr_id STRING, segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> ) CLUSTERED BY(lr_id) INTO 1 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional'='true'); INSERT INTO TABLE struct_merge Select 1 AS lr_id , ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) AS segment_info; select * from struct_merge; //this one gives null values hive> select * from default.struct_merge1; OK {color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}] {color}Time taken: 0.608 seconds, Fetched: 1 row(s) *Can this behaviour be explained? 
I need the transaction property since I am merging into a common table on daily data.* > Hive insertion for complex types not working when "transactional=true" > -- > > Key: HIVE-18102 > URL: https://issues.apache.org/jira/browse/HIVE-18102 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.1 > Environment: Running EMR cluster on AWS, with : > Master: Running 1 m3.xlarge > Core: Running 4 m3.xlarge >Reporter: Nillohit Nandi > > I am merging into a table daily which has a column type as an array of > structs : > {color:#205081}segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > {color} > *When table is created without transactional=true, behaviour is fine. > * > Example snippet: > ??author?? > drop table struct_merge; > CREATE TABLE struct_merge ( > lr_id STRING, > segment_info ARRAY < STRUCT < idlpSegmentName: STRING, idlpSegmentValue: STRING >> > ) > CLUSTERED BY(lr_id) > INTO 1 BUCKETS > STORED AS ORC; > INSERT INTO TABLE struct_merge >Select 1 AS lr_id , > > ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'), > NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3')) > AS segment_info; > select *
[jira] [Updated] (HIVE-18102) Hive insertion for complex types not working when "transactional=true"
[ https://issues.apache.org/jira/browse/HIVE-18102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nillohit Nandi updated HIVE-18102:
--
Description:
I am merging into a table daily which has a column type as an array of structs :
segment_info ARRAY < STRUCT >

*When table is created without transactional=true, behaviour is fine.*

Example snippet:
drop table struct_merge;
CREATE TABLE struct_merge (
  lr_id STRING,
  segment_info ARRAY < STRUCT >
)
CLUSTERED BY(lr_id)
INTO 1 BUCKETS
STORED AS ORC;

INSERT INTO TABLE struct_merge
Select 1 AS lr_id ,
  ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'),
        NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3'))
  AS segment_info;

select * from struct_merge;

hive> select * from default.struct_merge;
OK
{color:blue}1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}]
{color}Time taken: 0.125 seconds, Fetched: 1 row(s)

*With transactional = true, behaviour is erratic, null values are populated as values of nested Structs.*

Eg:
drop table struct_merge;
CREATE TABLE struct_merge (
  lr_id STRING,
  segment_info ARRAY < STRUCT >
)
CLUSTERED BY(lr_id)
INTO 1 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO TABLE struct_merge
Select 1 AS lr_id ,
  ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'),
        NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3'))
  AS segment_info;

select * from struct_merge; //this one gives null values

hive> select * from default.struct_merge1;
OK
{color:red}1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}]
{color}Time taken: 0.608 seconds, Fetched: 1 row(s)

*Can this behaviour be explained?
I need the transaction property since I am merging into a common table on daily data.*

was:
I am merging into a table daily which has a column type as an array of structs :
segment_info ARRAY < STRUCT >

*When table is created without transactional=true, behaviour is fine.*

Example snippet:
drop table struct_merge;
CREATE TABLE struct_merge (
  lr_id STRING,
  segment_info ARRAY < STRUCT >
)
CLUSTERED BY(lr_id)
INTO 1 BUCKETS
STORED AS ORC;

INSERT INTO TABLE struct_merge
Select 1 AS lr_id ,
  ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'),
        NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3'))
  AS segment_info;

select * from struct_merge;

hive> select * from default.struct_merge;
OK
1 [{"idlpSegmentName":"viant","idlpSegmentValue":"z"},{"idlpSegmentName":"instyle","idlpSegmentValue":"3"}]
Time taken: 0.125 seconds, Fetched: 1 row(s)

*With transactional = true, behaviour is erratic, null values are populated as values of nested Structs.*

Eg:
drop table struct_merge;
CREATE TABLE struct_merge (
  lr_id STRING,
  segment_info ARRAY < STRUCT >
)
CLUSTERED BY(lr_id)
INTO 1 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO TABLE struct_merge
Select 1 AS lr_id ,
  ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'),
        NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3'))
  AS segment_info;

select * from struct_merge; //this one gives null values

hive> select * from default.struct_merge1;
OK
1 [{"idlpSegmentName":null,"idlpSegmentValue":null},{"idlpSegmentName":null,"idlpSegmentValue":null}]
Time taken: 0.608 seconds, Fetched: 1 row(s)

*Can this behaviour be explained?
I need the transaction property since I am merging into a common table on daily data.*

> Hive insertion for complex types not working when "transactional=true"
> --
>
> Key: HIVE-18102
> URL: https://issues.apache.org/jira/browse/HIVE-18102
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.1
> Environment: Running EMR cluster on AWS, with :
> Master: Running1m3.xlarge
> Core: Running4m3.xlarge
> Reporter: Nillohit Nandi
>
> I am merging into a table daily which has a column type as an array of
> structs :
> segment_info ARRAY < STRUCT STRING >>
> *When table is created without transactional=true, behaviour is fine.*
> Example snippet:
> drop table struct_merge;
> CREATE TABLE struct_merge (
>   lr_id STRING,
>   segment_info ARRAY < STRUCT STRING >>
> )
> CLUSTERED BY(lr_id)
> INTO 1 BUCKETS
> STORED AS ORC;
> INSERT INTO TABLE struct_merge
> Select 1 AS lr_id ,
>
> ARRAY(NAMED_STRUCT('idlpSegmentName','viant','idlpSegmentValue','z'),
> NAMED_STRUCT('idlpSegmentName','instyle','idlpSegmentValue','3'))
> AS segment_info;
> select * from struct_merge;
> hive> select * from default.struct_merge;
> OK
> {color:blue}1
>