[jira] [Commented] (HIVE-21886) REPL - With table list - Handle rename events during replace policy
[ https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875379#comment-16875379 ] Hive QA commented on HIVE-21886:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12973227/HIVE-21886.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:green}SUCCESS:{color} +1 due to 16359 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17796/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17796/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12973227 - PreCommit-HIVE-Build

> REPL - With table list - Handle rename events during replace policy
> -------------------------------------------------------------------
>
>                 Key: HIVE-21886
>                 URL: https://issues.apache.org/jira/browse/HIVE-21886
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: DR, Replication, pull-request-available
>         Attachments: HIVE-21886.01.patch, HIVE-21886.02.patch, HIVE-21886.03.patch, HIVE-21886.04.patch, HIVE-21886.04.patch, HIVE-21886.05.patch
>
>          Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> If rename events are dumped and replayed while a replace policy is being executed, the handler needs to check each table name against both the old and the new policy (a minimal sketch of this bookkeeping follows this message).
> 1. Create a list of tables to be bootstrapped.
> 2. During handling of alter table, if the alter type is rename:
>     1. If the old table name is present in the list of tables to be bootstrapped, remove it.
>     2. If the new table name matches the new policy, add it to the list of tables to be bootstrapped.
>     3. If the old table name does not match the old policy, drop the table, even if it is not present at the target.
> 3. During handling of drop table:
>     1. If the table is in the list of tables to be bootstrapped, remove it and ignore the event.
> 4. During other event handling:
>     1. If the table is in the list of tables to be bootstrapped, ignore the event.
>     2. If the new policy does not match the table name, ignore the event.
>
> Rename handling during replace policy
> # Old name not matching old policy – the old table will not be present at the target cluster and will not be returned by get-all-table.
> ## Old name not matching new policy
> ### New name not matching old policy
> #### New name not matching new policy
> * Ignore the event; nothing needs to be done.
> #### New name matching new policy
> * The table will be returned by get-all-table. The replace policy handler will bootstrap this table, as it matches the new policy and not the old one.
> * All future events on it will be ignored by the check added for replace policy handling.
> * All events with the old table name will be ignored anyway, as the old name does not match the new policy.
> ### New name matching old policy
> #### New name not matching new policy
> * As the new name does not match the new policy, the table need not be replicated.
> * As the old name does not match the new policy, the rename events will be ignored.
> * So nothing needs to be done for this scenario.
> #### New name matching new policy
> * As the new name matches both the old and the new policy, the replace handler will not bootstrap the table.
> * Add the table to the list of tables to be bootstrapped.
> * Ignore all events with the new name.
> * If there is a drop event for the table (with the new name), remove the table from the list of tables to be bootstrapped.
> * In case of a rename event (double rename):
> ** If the new name matches the table pattern, add the new name to the list of tables to be bootstrapped and remove the old name from it.
> ** If the new name does not match, just remove the old name from the list of tables to be bootstrapped.
> ## Old name matching new policy – the replace policy handler, which checks based on the old table name, would bootstrap the table and ignore the event, but the rename handler should decide based on the new name. The old table name will not be returned by get-all-table, so the replace handler will not do anything for the old table.
> ### New name not matching old policy
> #### New name not matching new policy
> * The old table is not present at the target and the new name does not match the new policy, so ignore the event.
> * No need to add the table to the list of tables to be bootstrapped.
> * All subsequent events will be ignored, as the new name does not match the new policy.
> #### New name matching new policy
> * As the new name does not match the old policy but matches the new policy, the table will be bootstrapped by the replace policy handler, so the rename event need not add it to the list of tables to be bootstrapped.
> * All future events will be ignored by the replace policy handler.
> * For a rename event (double rename):
> ** If there is a rename, the table (with intermittent new
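To make the bookkeeping above concrete, here is a minimal, self-contained sketch of the bootstrap-list maintenance during a replace policy. The class and method names (RenameEventSketch, handleRename, emitDrop) and the regex-based policy matching are illustrative assumptions, not the actual Hive replication classes.

{code:java}
// Minimal sketch of the bootstrap-list bookkeeping described above.
// All names here are illustrative, not the real Hive REPL classes.
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class RenameEventSketch {
    private final Pattern oldPolicy;   // table-name pattern of the previous REPL policy
    private final Pattern newPolicy;   // table-name pattern of the replacing policy
    private final Set<String> tablesForBootstrap = new HashSet<>();

    public RenameEventSketch(String oldRegex, String newRegex) {
        this.oldPolicy = Pattern.compile(oldRegex);
        this.newPolicy = Pattern.compile(newRegex);
    }

    // Step 2: alter-table events of type RENAME.
    public void handleRename(String oldName, String newName) {
        // 2.1: a pending bootstrap under the old name is no longer valid.
        tablesForBootstrap.remove(oldName);
        // 2.2: the new name falls under the replacing policy -> bootstrap it.
        if (newPolicy.matcher(newName).matches()) {
            tablesForBootstrap.add(newName);
        }
        // 2.3: old name never matched the old policy -> drop it at the target,
        // even though the table may not exist there (idempotent drop).
        if (!oldPolicy.matcher(oldName).matches()) {
            emitDrop(oldName);
        }
    }

    // Step 3: drop-table events. Returns true if the event should be ignored.
    public boolean handleDrop(String table) {
        // If the table was queued for bootstrap, unqueue it and swallow the event.
        return tablesForBootstrap.remove(table);
    }

    // Step 4: any other event on a table. Returns true if it should be ignored.
    public boolean shouldIgnore(String table) {
        return tablesForBootstrap.contains(table)
            || !newPolicy.matcher(table).matches();
    }

    private void emitDrop(String table) {
        // Stand-in for replaying a drop on the target cluster.
        System.out.println("DROP TABLE IF EXISTS " + table);
    }
}
{code}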
[jira] [Commented] (HIVE-21886) REPL - With table list - Handle rename events during replace policy
[ https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875369#comment-16875369 ] Hive QA commented on HIVE-21886:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 21s{color} | {color:blue} testutils/ptest2 in master has 24 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 56s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17796/dev-support/hive-personality.sh |
| git revision | master / b2a265a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-17796/yetus/patch-asflicense-problems.txt |
| modules | C: ql itests/hive-unit testutils/ptest2 U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17796/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Updated] (HIVE-21886) REPL - With table list - Handle rename events during replace policy
[ https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21886:
---
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-21886) REPL - With table list - Handle rename events during replace policy
[ https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21886:
---
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-21886) REPL - With table list - Handle rename events during replace policy
[ https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21886:
---
    Attachment: HIVE-21886.05.patch
[jira] [Commented] (HIVE-21637) Synchronized metastore cache
[ https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875340#comment-16875340 ] Hive QA commented on HIVE-21637:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 59s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 30s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 26s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 34s{color} | {color:blue} standalone-metastore/metastore-common in master has 31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 12s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 3s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 29s{color} | {color:blue} beeline in master has 44 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 29s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 32s{color} | {color:blue} standalone-metastore/metastore-tools/metastore-benchmarks in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 51s{color} | {color:blue} itests/util in master has 44 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 11s{color} | {color:red} storage-api: The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s{color} | {color:red} standalone-metastore/metastore-common: The patch generated 9 new + 498 unchanged - 2 fixed = 507 total (was 500) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 46s{color} | {color:red} standalone-metastore/metastore-server: The patch generated 160 new + 2193 unchanged - 65 fixed = 2353 total (was 2258) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 53s{color} | {color:red} ql: The patch generated 25 new + 962 unchanged - 10 fixed = 987 total (was 972) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} standalone-metastore/metastore-tools/tools-common: The patch generated 5 new + 31 unchanged - 0 fixed = 36 total (was 31) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 11s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 24 unchanged - 3 fixed = 26 total (was 27) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s{color} | {color:red} itests/hive-unit: The patch generated 3 new + 163 unchanged - 1 fixed = 166 total (was 164) {color} |
| {color:red}-1{color} | {color:red} checkstyle
[jira] [Updated] (HIVE-21637) Synchronized metastore cache
[ https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-21637:
---
    Attachment: HIVE-21637.13.patch

> Synchronized metastore cache
> ----------------------------
>
>                 Key: HIVE-21637
>                 URL: https://issues.apache.org/jira/browse/HIVE-21637
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>         Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.13.patch, HIVE-21637.2.patch, HIVE-21637.3.patch, HIVE-21637.4.patch, HIVE-21637.5.patch, HIVE-21637.6.patch, HIVE-21637.7.patch, HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is updated asynchronously, so in an HMS HA setting we can only get eventual consistency. In this Jira, we try to make it synchronized.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
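As a rough illustration of the consistency problem described in the issue, here is a minimal sketch of a read-through cache that only serves an entry when it is provably as fresh as the store's latest notification id. Everything here (BackingStore, currentEventId, the per-entry event-id tag) is an assumed design for illustration, not the CachedStore implementation or the approach the patches take.

{code:java}
// Illustrative only: one way to trade eventual for read consistency in a
// metastore-style cache, by tagging entries with the last-applied
// notification id and reading through when the entry may be stale.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SynchronizedCacheSketch<K, V> {
    interface BackingStore<K, V> {
        long currentEventId();  // latest committed notification id (assumed API)
        V load(K key);          // authoritative read from the database
    }

    private static final class Entry<V> {
        final V value;
        final long eventId;     // cache state is current as of this id
        Entry(V value, long eventId) { this.value = value; this.eventId = eventId; }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final BackingStore<K, V> store;

    public SynchronizedCacheSketch(BackingStore<K, V> store) { this.store = store; }

    public V get(K key) {
        long latest = store.currentEventId();
        Entry<V> e = cache.get(key);
        if (e != null && e.eventId >= latest) {
            return e.value;     // cache is provably up to date
        }
        V fresh = store.load(key);  // otherwise read through and refresh
        cache.put(key, new Entry<>(fresh, latest));
        return fresh;
    }
}
{code}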
[jira] [Updated] (HIVE-21637) Synchronized metastore cache
[ https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-21637:
---
    Attachment: (was: HIVE-21637.13.patch)
[jira] [Updated] (HIVE-21637) Synchronized metastore cache
[ https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-21637:
---
    Attachment: HIVE-21637.13.patch
[jira] [Updated] (HIVE-21927) HiveServer Web UI: Setting the HttpOnly option in the cookies
[ https://issues.apache.org/jira/browse/HIVE-21927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-21927:
---
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 4.0.0
           Status: Resolved  (was: Patch Available)

> HiveServer Web UI: Setting the HttpOnly option in the cookies
> --------------------------------------------------------------
>
>                 Key: HIVE-21927
>                 URL: https://issues.apache.org/jira/browse/HIVE-21927
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.1
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21927.01.patch, HIVE-21927.patch
>
>
> The intent of this JIRA is to introduce the HttpOnly option in the cookie.
> Cookie before the change:
> {code:java}
> hdp32b	FALSE	/	FALSE	0	JSESSIONID	8dkibwayfnrc4y4hvpu3vh74
> {code}
> Cookie after the change:
> {code:java}
> #HttpOnly_hdp32b	FALSE	/	FALSE	0	JSESSIONID	e1npdkbo3inj1xnd6gdc6ihws
> {code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
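For reference, a minimal sketch of how an embedded-Jetty web UI can mark its session cookie HttpOnly. The server setup below is generic Jetty; the port and handler wiring are assumptions for illustration, not necessarily how the HiveServer2 patch is wired.

{code:java}
// Illustrative sketch: marking the JSESSIONID cookie HttpOnly in an
// embedded Jetty server (Jetty 9.4 API), which is what turns the
// "JSESSIONID ..." cookie above into "#HttpOnly_..." in the cookie jar.
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;

public class HttpOnlyCookieSketch {
    public static void main(String[] args) throws Exception {
        Server server = new Server(10002);  // assumed Web UI port
        ServletContextHandler context =
            new ServletContextHandler(ServletContextHandler.SESSIONS);
        context.setContextPath("/");
        // HttpOnly keeps the session cookie out of reach of page JavaScript.
        context.getSessionHandler().setHttpOnly(true);
        server.setHandler(context);
        server.start();
        server.join();
    }
}
{code}

Equivalently, any Servlet 3.0 container can set this during context initialization via javax.servlet.SessionCookieConfig.setHttpOnly(true).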
[jira] [Commented] (HIVE-21927) HiveServer Web UI: Setting the HttpOnly option in the cookies
[ https://issues.apache.org/jira/browse/HIVE-21927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875257#comment-16875257 ] Daniel Dai commented on HIVE-21927:
---
+1. Patch pushed to master. Thanks Rajkumar!
[jira] [Comment Edited] (HIVE-21935) Hive Vectorization : degraded performance with vectorize UDF
[ https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875255#comment-16875255 ] Gopal V edited comment on HIVE-21935 at 6/28/19 9:53 PM:
---
Actually, the execution is buffering 1024 rows - the index is not evaluated until the 1024 split calls are queued up.

was (Author: gopalv): The loop looks like it is constant folding the value and executing the UDF to do that.

> Hive Vectorization : degraded performance with vectorize UDF
> -------------------------------------------------------------
>
>                 Key: HIVE-21935
>                 URL: https://issues.apache.org/jira/browse/HIVE-21935
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 3.1.1
>         Environment: Hive-3, JDK-8
>            Reporter: Rajkumar Singh
>            Priority: Major
>              Labels: performance
>         Attachments: CustomSplit-1.0-SNAPSHOT.jar
>
>
> With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we were seeing severe performance degradation. Looking at the task jstacks, the tasks appear to be running the code that vectorizes the UDF and to be stuck in a loop:
> {code:java}
> jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 runnable [0x7f1547581000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
> 	at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
> 	at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
> 	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
> 	at org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
> 	at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
> "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 runnable [0x7f1547581000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
> 	at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
> 	at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
> 	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
> 	at org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
> 	at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
> 	at org.apache.hadoop.hive.ql
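A schematic (not Hive source) of why this path is hot: vectorized operators work on batches of rows (1024 by default), and the VectorUDFAdaptor evaluates the wrapped UDF once per row to materialize the whole list column before ListIndexColScalar extracts a single element per row, matching the comment above that the index is not applied until the 1024 split calls have been queued. All names below are illustrative.

{code:java}
// Schematic of the batch-at-a-time adaptor pattern seen in the stack above.
public class AdaptorBatchSketch {
    static final int BATCH_SIZE = 1024;  // vectorized row-batch default size

    interface RowUdf { String[] evaluate(String input); }  // e.g. a custom split UDF

    static String[] processBatch(String[] inputCol, RowUdf udf, int index) {
        String[][] listCol = new String[BATCH_SIZE][];
        for (int r = 0; r < BATCH_SIZE; r++) {
            listCol[r] = udf.evaluate(inputCol[r]);  // row-at-a-time inside the batch
        }
        String[] out = new String[BATCH_SIZE];
        for (int r = 0; r < BATCH_SIZE; r++) {
            out[r] = listCol[r][index];  // the index is applied only after the full batch
        }
        return out;
    }

    public static void main(String[] args) {
        String[] col = new String[BATCH_SIZE];
        java.util.Arrays.fill(col, "a,b,c");
        RowUdf split = s -> s.split(",");            // stand-in for the custom split UDF
        System.out.println(processBatch(col, split, 1)[0]);  // prints "b"
    }
}
{code}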
[jira] [Commented] (HIVE-21935) Hive Vectorization : degraded performance with vectorize UDF
[ https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875255#comment-16875255 ] Gopal V commented on HIVE-21935:
---
The loop looks like it is constant folding the value and executing the UDF to do that.
[jira] [Updated] (HIVE-21935) Hive Vectorization : degraded performance issue with vectorize UDF
[ https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated HIVE-21935:
---
    Summary: Hive Vectorization : degraded performance issue with vectorize UDF  (was: Hive Vectorization : Server performance issue with vectorize UDF)
[jira] [Updated] (HIVE-21935) Hive Vectorization : degraded performance with vectorize UDF
[ https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated HIVE-21935:
---
    Summary: Hive Vectorization : degraded performance with vectorize UDF  (was: Hive Vectorization : degraded performance issue with vectorize UDF)
[jira] [Updated] (HIVE-21935) Hive Vectorization : Server performance issue with vectorize UDF
[ https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated HIVE-21935:
---
    Attachment: CustomSplit-1.0-SNAPSHOT.jar
[jira] [Updated] (HIVE-21935) Hive Vectorization : Server performance issue with vectorize UDF
[ https://issues.apache.org/jira/browse/HIVE-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated HIVE-21935: -- Labels: performance (was: ) > Hive Vectorization : Server performance issue with vectorize UDF > -- > > Key: HIVE-21935 > URL: https://issues.apache.org/jira/browse/HIVE-21935 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 3.1.1 > Environment: Hive-3, JDK-8 >Reporter: Rajkumar Singh >Priority: Major > Labels: performance > Attachments: CustomSplit-1.0-SNAPSHOT.jar > > > with vectorization turned on and hive.vectorized.adaptor.usage.mode=all we > were seeing severe performance degradation. looking at the task jstacks it > seems that it is running the code which vectorizes UDF and stuck in some loop. > {code:java} > jstack -l 14954 | grep 0x3af0 -A20 > "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 > runnable [0x7f1547581000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573) > at > org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350) > at > org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205) > at > org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20 > "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 > runnable [0x7f1547581000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554) > at > org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570) > at > org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350) > at > org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205) > at > org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150) > at > 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) > at > org.
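Both stack samples sit in the VectorUDFAdaptor path. A hedged mitigation while the root cause is investigated, assuming the regression is confined to the adaptor (this is a workaround sketch, not a confirmed fix):

{code}
-- Mitigation sketch, assuming the hot loop is confined to the
-- VectorUDFAdaptor path seen in the jstacks above. Only adapt UDFs that
-- have purpose-built vectorized support:
set hive.vectorized.adaptor.usage.mode=chosen;
-- or keep vectorization on but bypass the adaptor entirely:
set hive.vectorized.adaptor.usage.mode=none;
{code}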
[jira] [Comment Edited] (HIVE-21848) Table property name definition between ORC and Parquet encrytion
[ https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875247#comment-16875247 ] Xinli Shang edited comment on HIVE-21848 at 6/28/19 9:39 PM: - Hi [~owen.omalley], yes, I looked at the HadoopShims.java earlier. I still remember you had a super smart workaround to avoid two round trips to generate/encrypt a working key from the KMS; it cut the traffic in half. For the nested column questions above, I generally agree that makes sense. There are only a few corner cases that we need to discuss. For the example above "name: struct", if we see that the table properties have the following entry, "encrypt.columns" = "pii:name;other_category:name.first", what do we do? Should we throw an exception? Or do we just ignore "other_category:name.first" and let the parent setting override it? Do we allow excluding some leaf columns from encryption if their parent is specified to be encrypted? I guess people will raise such feature requests once this is rolled out. With that said, I am not objecting to the proposal; these are just some thoughts on corner cases. was (Author: sha...@uber.com): Hi [~owen.omalley], yes, I looked at the HadoopShims.java earlier. I still remember you had a super smart workaround to avoid two round trips to get generate/encrypt a working key from KMS. It reduced half of the traffic. For the nested column questions above, I generally agree that makes sense. There are only a few corner cases that we need to discuss. For the example above "name: struct", if we see the table properties have the following entry, "encrypt.columns" = "pii:name;other_category:name.first", what do we do? Should we through exception? Or we just ignore "other_category:name.first" to let parent to override it? Do we allow exclusion of some leaf columns not to be encrypted, if their parent is specified to be encrypted? I guess people will raise the feature request later when it is roll out. With that said, I am not objecting the proposal but just some thoughts on corner cases. > Table property name definition between ORC and Parquet encrytion > > > Key: HIVE-21848 > URL: https://issues.apache.org/jira/browse/HIVE-21848 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Xinli Shang >Assignee: Xinli Shang >Priority: Major > Fix For: 3.0.0 > > > The goal of this Jira is to define a superset of unified table property names > that can be used for both Parquet and ORC column encryption. There is no code > change needed for this Jira. > *Background:* > ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. To > configure the encryption (e.g., which columns are sensitive, which master key > to use, the algorithm, etc.), table properties can be used. It is important > that both Parquet and ORC use unified names. > According to the slide > [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692], > ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The > Parquet community is still discussing several approaches; using table > properties is one of the options, but there is no detailed design of the > table property names yet. > So it is a good time for the two communities to agree on a unified superset > of property names. > *Proposal:* > There are several encryption properties that need to be specified for a > table. Here is the list. This is the superset of Parquet and ORC. Some of > them might not apply to both. > # PII columns, including nested columns > # Column key metadata, master key metadata > # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR. > ORC might support AES_CTR. > # Encryption footer - Parquet allows the footer to be encrypted or plaintext > # Footer key metadata > Here is the table property proposal. > |*Table Property Name*|*Value*|*Notes*| > |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.| > |encrypt_footer_plaintext|true, false|Parquet supports plaintext and encrypted > footers. By default, the footer is encrypted.| > |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to > the KMS to define what key metadata is. The metadata should have enough > information for the KMS to figure out the corresponding key. | > |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column > name, for example, ‘address.zipcode’. > > It is up to the KMS to define what key metadata is. The metadata should have > enough information for the KMS to figure out the corresponding key.| > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21848) Table property name definition between ORC and Parquet encrytion
[ https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875247#comment-16875247 ] Xinli Shang commented on HIVE-21848: Hi [~owen.omalley], yes, I looked at the HadoopShims.java earlier. I still remember you had a super smart workaround to avoid two round trips to get generate/encrypt a working key from the KMS; it cut the traffic in half. For the nested column questions above, I generally agree that makes sense. There are only a few corner cases that we need to discuss. For the example above "name: struct", if we see that the table properties have the following entry, "encrypt.columns" = "pii:name;other_category:name.first", what do we do? Should we throw an exception? Or do we just ignore "other_category:name.first" and let the parent setting override it? Do we allow excluding some leaf columns from encryption if their parent is specified to be encrypted? I guess people will raise such feature requests once this is rolled out. With that said, I am not objecting to the proposal; these are just some thoughts on corner cases. > Table property name definition between ORC and Parquet encrytion > > > Key: HIVE-21848 > URL: https://issues.apache.org/jira/browse/HIVE-21848 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Xinli Shang >Assignee: Xinli Shang >Priority: Major > Fix For: 3.0.0 > > > The goal of this Jira is to define a superset of unified table property names > that can be used for both Parquet and ORC column encryption. There is no code > change needed for this Jira. > *Background:* > ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. To > configure the encryption (e.g., which columns are sensitive, which master key > to use, the algorithm, etc.), table properties can be used. It is important > that both Parquet and ORC use unified names. > According to the slide > [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692], > ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The > Parquet community is still discussing several approaches; using table > properties is one of the options, but there is no detailed design of the > table property names yet. > So it is a good time for the two communities to agree on a unified superset > of property names. > *Proposal:* > There are several encryption properties that need to be specified for a > table. Here is the list. This is the superset of Parquet and ORC. Some of > them might not apply to both. > # PII columns, including nested columns > # Column key metadata, master key metadata > # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR. > ORC might support AES_CTR. > # Encryption footer - Parquet allows the footer to be encrypted or plaintext > # Footer key metadata > Here is the table property proposal. > |*Table Property Name*|*Value*|*Notes*| > |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.| > |encrypt_footer_plaintext|true, false|Parquet supports plaintext and encrypted > footers. By default, the footer is encrypted.| > |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to > the KMS to define what key metadata is. The metadata should have enough > information for the KMS to figure out the corresponding key. | > |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column > name, for example, ‘address.zipcode’. > > It is up to the KMS to define what key metadata is. The metadata should have > enough information for the KMS to figure out the corresponding key.| > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
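To make the proposal concrete, here is a hypothetical DDL sketch using the proposed property names from the table above. The schema and the base64 placeholders are illustrative, and none of these properties is implemented yet:

{code}
-- Hypothetical DDL using the proposed (not yet implemented) property names.
-- The schema and the <base64-...> values are placeholders; real key metadata
-- is defined by the KMS.
CREATE TABLE customer (
  name STRUCT<first:STRING, last:STRING>,
  address STRUCT<street:STRING, zipcode:STRING>
)
STORED AS PARQUET
TBLPROPERTIES (
  'encrypt_algorithm' = 'aes_gcm',
  'encrypt_footer_plaintext' = 'false',
  'encrypt_footer_key_metadata' = '<base64-footer-key-metadata>',
  'encrypt_col_address.zipcode' = '<base64-column-key-metadata>'
);
{code}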
[jira] [Updated] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-21867: --- Affects Version/s: 4.0.0 Status: In Progress (was: Patch Available) Pushed to master, thanks for reviewing [~vgarg]! > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Affects Versions: 4.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?focusedWorklogId=269511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269511 ] ASF GitHub Bot logged work on HIVE-21867: - Author: ASF GitHub Bot Created on: 28/Jun/19 20:27 Start Date: 28/Jun/19 20:27 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #687: HIVE-21867 URL: https://github.com/apache/hive/pull/687 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269511) Time Spent: 1h 40m (was: 1.5h) > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Affects Versions: 4.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-21867. Resolution: Fixed Fix Version/s: 4.0.0 > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Affects Versions: 4.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
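In sketch form, the optimization described above amounts to ordering the semijoin conditions so that cheap, selective ones run first. The class, its fields, and the example numbers below are illustrative stand-ins, not Hive's actual operator code:

{code:java}
// Illustrative sketch only -- `Cond`, its fields, and the numbers are
// hypothetical, not Hive's actual classes. The ranking heuristic puts cheap,
// selective conditions first so later ones evaluate on fewer rows.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class Cond {
  final String name;
  final double selectivity; // estimated fraction of rows that pass
  final double costPerRow;  // estimated cost of evaluating one row
  Cond(String name, double selectivity, double costPerRow) {
    this.name = name; this.selectivity = selectivity; this.costPerRow = costPerRow;
  }
}

public class SemijoinConditionSort {
  public static void main(String[] args) {
    List<Cond> conds = new ArrayList<>();
    conds.add(new Cond("in_bloom_filter(lo_custkey)", 0.05, 2.0));
    conds.add(new Cond("lo_orderdate BETWEEN min AND max", 0.50, 1.0));
    // One simple heuristic: ascending selectivity * cost.
    conds.sort(Comparator.comparingDouble((Cond c) -> c.selectivity * c.costPerRow));
    conds.forEach(c -> System.out.println(c.name)); // bloom filter first
  }
}
{code}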
[jira] [Updated] (HIVE-21743) day( ) gives wrong day from the date in Apache Hive 3.1 server
[ https://issues.apache.org/jira/browse/HIVE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adarshdeep Cheema updated HIVE-21743: - Due Date: (was: 23/May/19) > day( ) gives wrong day from the date in Apache Hive 3.1 server > -- > > Key: HIVE-21743 > URL: https://issues.apache.org/jira/browse/HIVE-21743 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.1 > Environment: Server: Apache Hive 3.1 > Driver hive-jdbc-3.1.0.3.1.0.0-78 >Reporter: Adarshdeep Cheema >Priority: Critical > > Using the Apache Hive 3.1 server, run the following SQL and you will get 3 instead of 1: > SELECT > (day( DATE '0001-01-01')) > FROM > `table` > Please note this does NOT happen with the Apache Hive 2.1 server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21921) Support for correlated quantified predicates
[ https://issues.apache.org/jira/browse/HIVE-21921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875186#comment-16875186 ] Vineet Garg commented on HIVE-21921: [~jcamachorodriguez] [~ashutoshc] Can you take a look? https://github.com/apache/hive/pull/693 > Support for correlated quantified predicates > > > Key: HIVE-21921 > URL: https://issues.apache.org/jira/browse/HIVE-21921 > Project: Hive > Issue Type: New Feature > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21921.1.patch, HIVE-21921.2.patch, > HIVE-21921.3.patch, HIVE-21921.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21921) Support for correlated quantified predicates
[ https://issues.apache.org/jira/browse/HIVE-21921?focusedWorklogId=269485&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269485 ] ASF GitHub Bot logged work on HIVE-21921: - Author: ASF GitHub Bot Created on: 28/Jun/19 19:42 Start Date: 28/Jun/19 19:42 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on pull request #693: HIVE-21921: Support for correlated quantified predicates URL: https://github.com/apache/hive/pull/693 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269485) Time Spent: 10m Remaining Estimate: 0h > Support for correlated quantified predicates > > > Key: HIVE-21921 > URL: https://issues.apache.org/jira/browse/HIVE-21921 > Project: Hive > Issue Type: New Feature > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21921.1.patch, HIVE-21921.2.patch, > HIVE-21921.3.patch, HIVE-21921.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21921) Support for correlated quantified predicates
[ https://issues.apache.org/jira/browse/HIVE-21921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-21921: -- Labels: pull-request-available (was: ) > Support for correlated quantified predicates > > > Key: HIVE-21921 > URL: https://issues.apache.org/jira/browse/HIVE-21921 > Project: Hive > Issue Type: New Feature > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21921.1.patch, HIVE-21921.2.patch, > HIVE-21921.3.patch, HIVE-21921.4.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875183#comment-16875183 ] Hive QA commented on HIVE-21867: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973179/HIVE-21867.05.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 16357 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17793/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17793/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17793/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12973179 - PreCommit-HIVE-Build > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21934) Materialized view on top of Druid not pushing everything
[ https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-21934: --- Component/s: Materialized views Druid integration > Materialized view on top of Druid not pushing everything > > > Key: HIVE-21934 > URL: https://issues.apache.org/jira/browse/HIVE-21934 > Project: Hive > Issue Type: Improvement > Components: Druid integration, Materialized views >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez >Priority: Major > > The title is not very informative, but examples hopefully are. > this is the plan with the view > {code} > explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`, > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`dates_n1`.`__time`) AS `yr___time_ok` > FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0` > JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON > (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`) > JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON > (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`) > JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON > (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`) > JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON > (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`) > GROUP BY MONTH(`dates_n1`.`__time`), > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`dates_n1`.`__time`) > INFO : Starting task [Stage-3:EXPLAIN] in serial mode > INFO : Completed executing > command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977); > Time taken: 0.005 seconds > INFO : OK > ++ > | Explain | > ++ > | Plan optimized by CBO. | > | | > | Vertex dependency in root stage | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Stage-1 | > | Reducer 2 vectorized, llap | > | File Output Operator [FS_13] | > | Select Operator [SEL_12] (rows=300018951 width=38) | > | Output:["_col0","_col1","_col2","_col3"] | > | Group By Operator [GBY_11] (rows=300018951 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, > KEY._col1, KEY._col2 | > | <-Map 1 [SIMPLE_EDGE] vectorized, llap | > | SHUFFLE [RS_10] | > | PartitionCols:_col0, _col1, _col2 | > | Group By Operator [GBY_9] (rows=600037902 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, > _col1, _col2 | > | Select Operator [SEL_8] (rows=600037902 width=38) | > | Output:["_col0","_col1","_col2"] | > | TableScan [TS_0] (rows=600037902 width=38) | > | > mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"} > | > | | > ++ > > {code} > if i use a simple druid table without MV > {code} > explain SELECT MONTH(`__time`) AS `mn___time_ok`, > CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`__time`) AS `yr___time_ok` > FROM `druid_ssb.ssb_druid_100` > GROUP BY MONTH(`__time`), > 
CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`__time`); > {code} > {code} > ++ > | Explain | > ++ > | Plan optimized by CBO. | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Select Operator [SEL_1] | > | Output:["_col0","_col1","_col2","_col3"] | > | TableScan [TS_0] | > | > Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\"
[jira] [Updated] (HIVE-21934) Materialized view on top of Druid not pushing everything
[ https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-21934: -- Summary: Materialized view on top of Druid not pushing everything (was: Materialized view on top of Druid not pushing every thing) > Materialized view on top of Druid not pushing everything > > > Key: HIVE-21934 > URL: https://issues.apache.org/jira/browse/HIVE-21934 > Project: Hive > Issue Type: Improvement >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez >Priority: Major > > The title is not very informative, but examples hopefully are. > this is the plan with the view > {code} > explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`, > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`dates_n1`.`__time`) AS `yr___time_ok` > FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0` > JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON > (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`) > JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON > (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`) > JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON > (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`) > JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON > (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`) > GROUP BY MONTH(`dates_n1`.`__time`), > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`dates_n1`.`__time`) > INFO : Starting task [Stage-3:EXPLAIN] in serial mode > INFO : Completed executing > command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977); > Time taken: 0.005 seconds > INFO : OK > ++ > | Explain | > ++ > | Plan optimized by CBO. 
| > | | > | Vertex dependency in root stage | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Stage-1 | > | Reducer 2 vectorized, llap | > | File Output Operator [FS_13] | > | Select Operator [SEL_12] (rows=300018951 width=38) | > | Output:["_col0","_col1","_col2","_col3"] | > | Group By Operator [GBY_11] (rows=300018951 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, > KEY._col1, KEY._col2 | > | <-Map 1 [SIMPLE_EDGE] vectorized, llap | > | SHUFFLE [RS_10] | > | PartitionCols:_col0, _col1, _col2 | > | Group By Operator [GBY_9] (rows=600037902 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, > _col1, _col2 | > | Select Operator [SEL_8] (rows=600037902 width=38) | > | Output:["_col0","_col1","_col2"] | > | TableScan [TS_0] (rows=600037902 width=38) | > | > mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"} > | > | | > ++ > > {code} > if i use a simple druid table without MV > {code} > explain SELECT MONTH(`__time`) AS `mn___time_ok`, > CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`__time`) AS `yr___time_ok` > FROM `druid_ssb.ssb_druid_100` > GROUP BY MONTH(`__time`), > CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`__time`); > {code} > {code} > ++ > | Explain | > ++ > | Plan optimized by CBO. | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Select Operator [SEL_1] | > | Output:["_col0","_col1","_col2","_col3"] | > | TableScan [TS_0] | > | > Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_ye
[jira] [Commented] (HIVE-21934) Materialized view on top of Druid not pushing every thing
[ https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875141#comment-16875141 ] slim bouguerra commented on HIVE-21934: --- source tables {code} CREATE TABLE `lineorder_n0`( `lo_orderkey` bigint, `lo_linenumber` int, `lo_custkey` bigint not null disable rely, `lo_partkey` bigint not null disable rely, `lo_suppkey` bigint not null disable rely, `lo_orderdate` bigint not null disable rely, `lo_ordpriority` string, `lo_shippriority` string, `lo_quantity` double, `lo_extendedprice` double, `lo_ordtotalprice` double, `lo_discount` double, `lo_revenue` double, `lo_supplycost` double, `lo_tax` double, `lo_commitdate` bigint, `lo_shipmode` string, primary key (`lo_orderkey`) disable rely, constraint fk21 foreign key (`lo_custkey`) references `customer_n1`(`c_custkey`) disable rely, constraint fk22 foreign key (`lo_orderdate`) references `dates_n1`(`d_datekey`) disable rely, constraint fk23 foreign key (`lo_partkey`) references `ssb_part_n0`(`p_partkey`) disable rely, constraint fk24 foreign key (`lo_suppkey`) references `supplier_n0`(`s_suppkey`) disable rely) STORED AS ORC TBLPROPERTIES ('transactional'='true'); {code} {code} CREATE TABLE `dates_n1`( `d_datekey` bigint, `__time` timestamp, `d_date` string, `d_dayofweek` string, `d_month` string, `d_year` int, `d_yearmonthnum` int, `d_yearmonth` string, `d_daynuminweek` int, `d_daynuminmonth` int, `d_daynuminyear` int, `d_monthnuminyear` int, `d_weeknuminyear` int, `d_sellingseason` string, `d_lastdayinweekfl` int, `d_lastdayinmonthfl` int, `d_holidayfl` int , `d_weekdayfl`int, primary key (`d_datekey`) disable rely ) STORED AS ORC TBLPROPERTIES ('transactional'='true'); {code} > Materialized view on top of Druid not pushing every thing > - > > Key: HIVE-21934 > URL: https://issues.apache.org/jira/browse/HIVE-21934 > Project: Hive > Issue Type: Improvement >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez >Priority: Major > > The title is not very informative, but examples hopefully are. > this is the plan with the view > {code} > explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`, > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`dates_n1`.`__time`) AS `yr___time_ok` > FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0` > JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON > (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`) > JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON > (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`) > JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON > (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`) > JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON > (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`) > GROUP BY MONTH(`dates_n1`.`__time`), > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`dates_n1`.`__time`) > INFO : Starting task [Stage-3:EXPLAIN] in serial mode > INFO : Completed executing > command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977); > Time taken: 0.005 seconds > INFO : OK > ++ > | Explain | > ++ > | Plan optimized by CBO. 
| > | | > | Vertex dependency in root stage | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Stage-1 | > | Reducer 2 vectorized, llap | > | File Output Operator [FS_13] | > | Select Operator [SEL_12] (rows=300018951 width=38) | > | Output:["_col0","_col1","_col2","_col3"] | > | Group By Operator [GBY_11] (rows=300018951 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, > KEY._col1, KEY._col2 | > | <-Map 1 [SIMPLE_EDGE] vectorized, llap | > | SHUFFLE [RS_10] | > | PartitionCols:_col0, _col1, _col2 | > | Group By Operator [GBY_9] (rows=600037902 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, > _col1, _col2 | > | Select Operator [SEL_8] (rows=600037902 width=38) | > | Output:["_col0","_col1","_col2"] | > | TableScan [TS_0] (rows=600037902 width=38) | > | > mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"
[jira] [Commented] (HIVE-21934) Materialized view on top of Druid not pushing every thing
[ https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875140#comment-16875140 ] slim bouguerra commented on HIVE-21934: --- view definition {code} CREATE MATERIALIZED VIEW `ssb_mv_druid_100` STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' TBLPROPERTIES ( "druid.segment.granularity" = "MONTH", "druid.query.granularity" = "DAY") AS SELECT `__time` as `__time` , cast(c_city as string) c_city, cast(c_nation as string) c_nation, cast(c_region as string) c_region, c_mktsegment as c_mktsegment, cast(d_weeknuminyear as string) d_weeknuminyear, cast(d_year as string) d_year, cast(d_yearmonth as string) d_yearmonth, cast(d_yearmonthnum as string) d_yearmonthnum, cast(p_brand1 as string) p_brand1, cast(p_category as string) p_category, cast(p_mfgr as string) p_mfgr, p_type, s_name, cast(s_city as string) s_city, cast(s_nation as string) s_nation, cast(s_region as string) s_region, cast(`lo_ordpriority` as string) lo_ordpriority, cast(`lo_shippriority` as string) lo_shippriority, `d_sellingseason` `lo_shipmode`, lo_revenue, lo_supplycost , lo_discount , `lo_quantity`, `lo_extendedprice`, `lo_ordtotalprice`, lo_extendedprice * lo_discount discounted_price, lo_revenue - lo_supplycost net_revenue FROM customer_n1, dates_n1, lineorder_n1, ssb_part_n0, supplier_n0 where lo_orderdate = d_datekey and lo_partkey = p_partkey and lo_suppkey = s_suppkey and lo_custkey = c_custkey; {code} > Materialized view on top of Druid not pushing every thing > - > > Key: HIVE-21934 > URL: https://issues.apache.org/jira/browse/HIVE-21934 > Project: Hive > Issue Type: Improvement >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez >Priority: Major > > The title is not very informative, but examples hopefully are. > this is the plan with the view > {code} > explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`, > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`dates_n1`.`__time`) AS `yr___time_ok` > FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0` > JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON > (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`) > JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON > (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`) > JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON > (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`) > JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON > (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`) > GROUP BY MONTH(`dates_n1`.`__time`), > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`dates_n1`.`__time`) > INFO : Starting task [Stage-3:EXPLAIN] in serial mode > INFO : Completed executing > command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977); > Time taken: 0.005 seconds > INFO : OK > ++ > | Explain | > ++ > | Plan optimized by CBO. 
| > | | > | Vertex dependency in root stage | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Stage-1 | > | Reducer 2 vectorized, llap | > | File Output Operator [FS_13] | > | Select Operator [SEL_12] (rows=300018951 width=38) | > | Output:["_col0","_col1","_col2","_col3"] | > | Group By Operator [GBY_11] (rows=300018951 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, > KEY._col1, KEY._col2 | > | <-Map 1 [SIMPLE_EDGE] vectorized, llap | > | SHUFFLE [RS_10] | > | PartitionCols:_col0, _col1, _col2 | > | Group By Operator [GBY_9] (rows=600037902 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, > _col1, _col2 | > | Select Operator [SEL_8] (rows=600037902 width=38) | > | Output:["_col0","_col1","_col2"] | > | TableScan [TS_0] (rows=600037902 width=38) | > | > mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"} > | > | | > ++ > > {code} > if i use a simple druid table without MV > {code} > explain SELECT MONTH(`__time`) AS `mn___time_ok`, > CAST((MONTH(`__time`
[jira] [Assigned] (HIVE-21934) Materialized view on top of Druid not pushing every thing
[ https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra reassigned HIVE-21934: - > Materialized view on top of Druid not pushing every thing > - > > Key: HIVE-21934 > URL: https://issues.apache.org/jira/browse/HIVE-21934 > Project: Hive > Issue Type: Improvement >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez >Priority: Major > > The title is not very informative, but examples hopefully are. > this is the plan with the view > {code} > explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`, > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`dates_n1`.`__time`) AS `yr___time_ok` > FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0` > JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON > (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`) > JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON > (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`) > JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON > (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`) > JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON > (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`) > GROUP BY MONTH(`dates_n1`.`__time`), > CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`dates_n1`.`__time`) > INFO : Starting task [Stage-3:EXPLAIN] in serial mode > INFO : Completed executing > command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977); > Time taken: 0.005 seconds > INFO : OK > ++ > | Explain | > ++ > | Plan optimized by CBO. | > | | > | Vertex dependency in root stage | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Stage-1 | > | Reducer 2 vectorized, llap | > | File Output Operator [FS_13] | > | Select Operator [SEL_12] (rows=300018951 width=38) | > | Output:["_col0","_col1","_col2","_col3"] | > | Group By Operator [GBY_11] (rows=300018951 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, > KEY._col1, KEY._col2 | > | <-Map 1 [SIMPLE_EDGE] vectorized, llap | > | SHUFFLE [RS_10] | > | PartitionCols:_col0, _col1, _col2 | > | Group By Operator [GBY_9] (rows=600037902 width=38) | > | > Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, > _col1, _col2 | > | Select Operator [SEL_8] (rows=600037902 width=38) | > | Output:["_col0","_col1","_col2"] | > | TableScan [TS_0] (rows=600037902 width=38) | > | > mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"} > | > | | > ++ > > {code} > if i use a simple druid table without MV > {code} > explain SELECT MONTH(`__time`) AS `mn___time_ok`, > CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`, > SUM(1) AS `sum_number_of_records_ok`, > YEAR(`__time`) AS `yr___time_ok` > FROM `druid_ssb.ssb_druid_100` > GROUP BY MONTH(`__time`), > CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT), > YEAR(`__time`); > {code} > {code} > ++ > | Explain | > ++ > | 
Plan optimized by CBO. | > | | > | Stage-0 | > | Fetch Operator | > | limit:-1 | > | Select Operator [SEL_1] | > | Output:["_col0","_col1","_col2","_col3"] | > | TableScan [TS_0] | > | > Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_year\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}}],\"v
[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875139#comment-16875139 ] Hive QA commented on HIVE-21867: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 1 new + 155 unchanged - 0 fixed = 156 total (was 155) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 46s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17793/dev-support/hive-personality.sh | | git revision | master / 21177ef | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17793/yetus/diff-checkstyle-ql.txt | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17793/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
> Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator
[ https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21932: --- Resolution: Fixed Fix Version/s: 3.2.0 4.0.0 Status: Resolved (was: Patch Available) Patch merged into master and branch-3 > IndexOutOfRangeException in FileChksumIterator > -- > > Key: HIVE-21932 > URL: https://issues.apache.org/jira/browse/HIVE-21932 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Fix For: 4.0.0, 3.2.0 > > Attachments: HIVE-21932.01.patch > > > According to the definition of {{InsertEventRequestData}} in > {{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field, but > the FileChksumIterator does not handle it correctly when a client fires an > insert event that does not have file checksums. The issue is that the > {{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, so > the following check will never come into play > {noformat} > result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? > chksums.get(i) : null, > subDirs != null ? subDirs.get(i) : null); > {noformat} > The chksums null check above should also include a {{!chksums.isEmpty()}} check > on that line. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
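A minimal, self-contained demonstration of the fix described above; {{encodeFileUri}} here is a simplified stand-in for {{ReplChangeManager.encodeFileUri}}, not the real method:

{code:java}
// Minimal demo of the described fix. encodeFileUri is a simplified stand-in
// for ReplChangeManager.encodeFileUri; the point is the guard: an empty
// checksum list must be treated like a null one, otherwise chksums.get(i)
// throws IndexOutOfBoundsException when the client sends no checksums.
import java.util.Collections;
import java.util.List;

public class ChksumGuardDemo {
  static String encodeFileUri(String file, String chksum, String subDir) {
    return file + "#" + chksum + "#" + subDir;
  }

  public static void main(String[] args) {
    List<String> files = Collections.singletonList("/warehouse/t/delta/part-0");
    List<String> chksums = Collections.emptyList(); // insert event without checksums
    int i = 0;
    String result = encodeFileUri(files.get(i),
        chksums != null && !chksums.isEmpty() ? chksums.get(i) : null,
        null);
    System.out.println(result); // prints /warehouse/t/delta/part-0#null#null
  }
}
{code}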
[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875115#comment-16875115 ] Hive QA commented on HIVE-21578: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973172/HIVE-21578.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 16359 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17792/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17792/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17792/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12973172 - PreCommit-HIVE-Build > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
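For orientation, a sketch of what the three additions mean in SQL:2016 terms. The pattern strings are illustrative only; the authoritative definitions and the exact Hive surface syntax are in the linked document:

{code}
-- Illustrative pattern strings only; see the linked design doc for the
-- authoritative syntax.
-- FX: exact matching -- separators and token widths must match exactly.
--     'FXYYYY-MM-DD'            matches '2019-01-01' but not '2019-1-1'
-- FM: relaxes strict matching for the single token that follows it.
--     'FXYYYY-FMMM-DD'          matches '2019-1-01'
-- "text": a nested literal that must appear verbatim in the input.
--     'YYYY-MM-DD"T"HH24:MI:SS' matches '2019-01-01T12:34:56'
{code}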
[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875099#comment-16875099 ] Vineet Garg commented on HIVE-21867: +1 LGTM > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator
[ https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21932: --- Description: According to definition of {{InsertEventRequestData}} in {{hive_metastore.thrift}} the {{filesAddedChecksum}} is a optional field. But the FileChksumIterator does not handle it correctly when a client fires a insert event which does not have file checksums. The issue is that {{InsertEvent}} class initializes fileChecksums list to a empty arrayList so the following check will never come into play {noformat} result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? chksums.get(i) : null, subDirs != null ? subDirs.get(i) : null); {noformat} The chksums check above should include a {{!chksums.isEmpty()}} check as well in the above line. was: According to definition of {{InsertEventRequestData}} in {{hive_metastore.thrift}} the {{filesAddedChecksum}} is a optional field. But the FileChksumIterator does not handle it correctly when a client fires a insert event which does not have file checksums. The issue is that {{InsertEvent}} class initializes fileChecksums list to a empty arrayList to the following check will never come into play {noformat} result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? chksums.get(i) : null, subDirs != null ? subDirs.get(i) : null); {noformat} The chksums check above should include a {{!chksums.isEmpty()}} check as well in the above line. > IndexOutOfRangeException in FileChksumIterator > -- > > Key: HIVE-21932 > URL: https://issues.apache.org/jira/browse/HIVE-21932 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21932.01.patch > > > According to definition of {{InsertEventRequestData}} in > {{hive_metastore.thrift}} the {{filesAddedChecksum}} is a optional field. But > the FileChksumIterator does not handle it correctly when a client fires a > insert event which does not have file checksums. The issue is that > {{InsertEvent}} class initializes fileChecksums list to a empty arrayList so > the following check will never come into play > {noformat} > result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? > chksums.get(i) : null, > subDirs != null ? subDirs.get(i) : null); > {noformat} > The chksums check above should include a {{!chksums.isEmpty()}} check as well > in the above line. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator
[ https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21932: --- Summary: IndexOutOfRangeException in FileChksumIterator (was: IndexOutOfRangeExeption in FileChksumIterator) > IndexOutOfRangeException in FileChksumIterator > -- > > Key: HIVE-21932 > URL: https://issues.apache.org/jira/browse/HIVE-21932 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21932.01.patch > > > According to definition of {{InsertEventRequestData}} in > {{hive_metastore.thrift}} the {{filesAddedChecksum}} is a optional field. But > the FileChksumIterator does not handle it correctly when a client fires a > insert event which does not have file checksums. The issue is that > {{InsertEvent}} class initializes fileChecksums list to a empty arrayList to > the following check will never come into play > {noformat} > result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? > chksums.get(i) : null, > subDirs != null ? subDirs.get(i) : null); > {noformat} > The chksums check above should include a {{!chksums.isEmpty()}} check as well > in the above line. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875076#comment-16875076 ] Hive QA commented on HIVE-21578: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 44s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 34s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 1s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} common: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17792/dev-support/hive-personality.sh | | git revision | master / 21177ef | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17792/yetus/diff-checkstyle-common.txt | | modules | C: common ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17792/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21888) Set hive.parquet.timestamp.skip.conversion default to true
[ https://issues.apache.org/jira/browse/HIVE-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-21888: --- Resolution: Fixed Fix Version/s: 3.2.0 4.0.0 Status: Resolved (was: Patch Available) Pushed to master, branch-3. Thanks [~klcopp] > Set hive.parquet.timestamp.skip.conversion default to true > -- > > Key: HIVE-21888 > URL: https://issues.apache.org/jira/browse/HIVE-21888 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Fix For: 4.0.0, 3.2.0 > > Attachments: HIVE-21888.02.patch, HIVE-21888.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875053#comment-16875053 ] Hive QA commented on HIVE-21910: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973170/HIVE-21910.2.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 16346 tests executed *Failed tests:* {noformat} TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=156) [intersect_all.q,unionDistinct_1.q,table_nonprintable.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,results_cache_diff_fs.q,cttl.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q] org.apache.hadoop.hive.ql.TestWarehouseExternalDir.org.apache.hadoop.hive.ql.TestWarehouseExternalDir (batchId=255) org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testExternalDefaultPaths (batchId=255) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17791/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17791/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17791/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12973170 - PreCommit-HIVE-Build > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.2.patch, HIVE-21910.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875012#comment-16875012 ] Hive QA commented on HIVE-21910: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 45s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 11s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 38s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 34s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 26s{color} | {color:blue} llap-tez in master has 17 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 3s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 5 new + 41 unchanged - 1 fixed = 46 total (was 42) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 12s{color} | {color:red} ql generated 1 new + 2252 unchanged - 1 fixed = 2253 total (was 2253) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 31m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Null passed for non-null parameter of new java.util.HashSet(Collection) in new org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider(List, boolean, int) Method invoked at HostAffinitySplitLocationProvider.java:of new java.util.HashSet(Collection) in new org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider(List, boolean, int) Method invoked at HostAffinitySplitLocationProvider.java:[line 68] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17791/dev-support/hive-personality.sh | | git revision | master / 5b46790 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17791/yetus/diff-checkstyle-ql.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-17791/yetus/new-findbugs-ql.html | | modules | C: common llap-tez ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17791/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: H
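For context on the new FindBugs warning in the report above: HashSet's copy constructor dereferences its argument, so passing a null collection throws a NullPointerException. Below is a rough illustration of the kind of null guard that satisfies the checker; the names are hypothetical and the actual patch may resolve the warning differently.
{code:java}
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: new HashSet<>(null) throws NullPointerException, which
// is what FindBugs flags; defaulting to an empty list avoids the null argument.
public class NullCollectionGuard {
  static Set<String> toLocationSet(List<String> knownLocations) {
    List<String> safe = (knownLocations == null) ? Collections.emptyList() : knownLocations;
    return new HashSet<>(safe);
  }

  public static void main(String[] args) {
    System.out.println(toLocationSet(null)); // prints [] instead of throwing
  }
}
{code}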
[jira] [Commented] (HIVE-21933) Remove unused methods from Utilities
[ https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874988#comment-16874988 ] Hive QA commented on HIVE-21933: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973167/HIVE-21933.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 16357 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17790/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17790/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17790/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12973167 - PreCommit-HIVE-Build > Remove unused methods from Utilities > > > Key: HIVE-21933 > URL: https://issues.apache.org/jira/browse/HIVE-21933 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ivan Suller >Assignee: Ivan Suller >Priority: Trivial > Attachments: HIVE-21933.1.patch > > > Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected > many methods which are not used anymore. Removing them is the right thing to > do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-21867: --- Attachment: HIVE-21867.05.patch > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.05.patch, > HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21933) Remove unused methods from Utilities
[ https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874960#comment-16874960 ] Hive QA commented on HIVE-21933: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} ql: The patch generated 0 new + 130 unchanged - 4 fixed = 130 total (was 134) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 48s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17790/dev-support/hive-personality.sh | | git revision | master / 5b46790 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17790/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
> Remove unused methods from Utilities > > > Key: HIVE-21933 > URL: https://issues.apache.org/jira/browse/HIVE-21933 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ivan Suller >Assignee: Ivan Suller >Priority: Trivial > Attachments: HIVE-21933.1.patch > > > Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected > many methods which are not used anymore. Removing them is the right thing to > do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21578: - Status: Open (was: Patch Available) > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21578: - Attachment: HIVE-21578.02.patch Status: Patch Available (was: Open) > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch, HIVE-21578.02.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874925#comment-16874925 ] Hive QA commented on HIVE-21578: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973164/HIVE-21578.01.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 16359 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17789/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17789/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17789/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12973164 - PreCommit-HIVE-Build > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18735) Create table like loses transactional attribute
[ https://issues.apache.org/jira/browse/HIVE-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marta Kuczora updated HIVE-18735: - Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) > Create table like loses transactional attribute > --- > > Key: HIVE-18735 > URL: https://issues.apache.org/jira/browse/HIVE-18735 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Eugene Koifman >Assignee: Laszlo Pinter >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-18735.01.patch, HIVE-18735.02.patch, > HIVE-18735.03.patch, HIVE-18735.04.patch, HIVE-18735.05.patch, > HIVE-18735.06.patch > > > {noformat} > create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES ('transactional'='true')"; > create table T like T1; > show create table T ; > CREATE TABLE `T`( > `a` int, > `b` int) > CLUSTERED BY ( > a) > INTO 2 BUCKETS > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > LOCATION > > 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518813536099/warehouse/t' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1518813564') > {noformat} > Specifying props explicitly does work > {noformat} > create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES ('transactional'='true')"; > create table T like T1 TBLPROPERTIES ('transactional'='true'); > show create table T ; > CREATE TABLE `T`( > `a` int, > `b` int) > CLUSTERED BY ( > a) > INTO 2 BUCKETS > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > LOCATION > > 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518814098564/warehouse/t' > TBLPROPERTIES ( > 'transactional'='true', > 'transactional_properties'='default', > 'transient_lastDdlTime'='1518814111') > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
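As a hedged sketch of what the fix has to achieve, the snippet below carries the ACID-related TBLPROPERTIES shown in the description over from the source table's parameter map to the LIKE-created table. The class and helper names are hypothetical; this is not the actual Hive patch.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: propagate the transactional TBLPROPERTIES on
// CREATE TABLE LIKE instead of dropping them.
public class CreateTableLikeSketch {
  static Map<String, String> likeTableProperties(Map<String, String> sourceProps) {
    Map<String, String> targetProps = new HashMap<>();
    for (String key : new String[] {"transactional", "transactional_properties"}) {
      if (sourceProps.containsKey(key)) {
        targetProps.put(key, sourceProps.get(key));
      }
    }
    return targetProps; // transient_lastDdlTime is intentionally regenerated
  }

  public static void main(String[] args) {
    Map<String, String> src = new HashMap<>();
    src.put("transactional", "true");
    src.put("transactional_properties", "default");
    src.put("transient_lastDdlTime", "1518813564");
    System.out.println(likeTableProperties(src)); // keeps only the transactional flags
  }
}
{code}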
[jira] [Commented] (HIVE-18735) Create table like loses transactional attribute
[ https://issues.apache.org/jira/browse/HIVE-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874923#comment-16874923 ] Marta Kuczora commented on HIVE-18735: -- Pushed to master. (Got +1 on Review Board on Wednesday.) Thanks a lot [~lpinter] for the patch. > Create table like loses transactional attribute > --- > > Key: HIVE-18735 > URL: https://issues.apache.org/jira/browse/HIVE-18735 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Eugene Koifman >Assignee: Laszlo Pinter >Priority: Major > Attachments: HIVE-18735.01.patch, HIVE-18735.02.patch, > HIVE-18735.03.patch, HIVE-18735.04.patch, HIVE-18735.05.patch, > HIVE-18735.06.patch > > > {noformat} > create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES ('transactional'='true')"; > create table T like T1; > show create table T ; > CREATE TABLE `T`( > `a` int, > `b` int) > CLUSTERED BY ( > a) > INTO 2 BUCKETS > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > LOCATION > > 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518813536099/warehouse/t' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1518813564') > {noformat} > Specifying props explicitly does work > {noformat} > create table T1(a int, b int) clustered by (a) into 2 buckets stored as orc > TBLPROPERTIES ('transactional'='true')"; > create table T like T1 TBLPROPERTIES ('transactional'='true'); > show create table T ; > CREATE TABLE `T`( > `a` int, > `b` int) > CLUSTERED BY ( > a) > INTO 2 BUCKETS > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > LOCATION > > 'file:/Users/ekoifman/IdeaProjects/hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands-1518814098564/warehouse/t' > TBLPROPERTIES ( > 'transactional'='true', > 'transactional_properties'='default', > 'transient_lastDdlTime'='1518814111') > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-21910: -- Attachment: HIVE-21910.2.patch > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.2.patch, HIVE-21910.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269233 ] ASF GitHub Bot logged work on HIVE-21910: - Author: ASF GitHub Bot Created on: 28/Jun/19 12:48 Start Date: 28/Jun/19 12:48 Worklog Time Spent: 10m Work Description: pvary commented on pull request #690: HIVE-21910: Multiple target location generation in HostAffinitySplitLocationProvider URL: https://github.com/apache/hive/pull/690#discussion_r298578751 ## File path: llap-tez/src/test/org/apache/hadoop/hive/llap/tezplugins/TestLlapTaskSchedulerService.java ## @@ -946,6 +946,161 @@ public void testForcedLocalityUnknownHost() throws IOException, InterruptedExcep } } + @Test(timeout = 1) Review comment: All of the tests in this file have a 10s timeout, so I decided to keep this for the sake of consistency This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269233) Time Spent: 1h 40m (was: 1.5h) > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21933) Remove unused methods from Utilities
[ https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Suller updated HIVE-21933: --- Attachment: HIVE-21933.1.patch > Remove unused methods from Utilities > > > Key: HIVE-21933 > URL: https://issues.apache.org/jira/browse/HIVE-21933 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ivan Suller >Priority: Trivial > Attachments: HIVE-21933.1.patch > > > Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected > many methods which are not used anymore. Removing them is the right thing to > do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21933) Remove unused methods from Utilities
[ https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Suller updated HIVE-21933: --- Status: Patch Available (was: Open) > Remove unused methods from Utilities > > > Key: HIVE-21933 > URL: https://issues.apache.org/jira/browse/HIVE-21933 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ivan Suller >Assignee: Ivan Suller >Priority: Trivial > Attachments: HIVE-21933.1.patch > > > Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected > many methods which are not used anymore. Removing them is the right thing to > do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-21933) Remove unused methods from Utilities
[ https://issues.apache.org/jira/browse/HIVE-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Suller reassigned HIVE-21933: -- Assignee: Ivan Suller > Remove unused methods from Utilities > > > Key: HIVE-21933 > URL: https://issues.apache.org/jira/browse/HIVE-21933 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ivan Suller >Assignee: Ivan Suller >Priority: Trivial > Attachments: HIVE-21933.1.patch > > > Over the years it seems org.apache.hadoop.hive.ql.exec.Utilities collected > many methods which are not used anymore. Removing them is the right thing to > do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874898#comment-16874898 ] Hive QA commented on HIVE-21578: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 54s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 58s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} common: The patch generated 14 new + 0 unchanged - 0 fixed = 14 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 29m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17789/dev-support/hive-personality.sh | | git revision | master / 57c4217 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17789/yetus/diff-checkstyle-common.txt | | modules | C: common ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17789/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269225 ] ASF GitHub Bot logged work on HIVE-21910: - Author: ASF GitHub Bot Created on: 28/Jun/19 12:31 Start Date: 28/Jun/19 12:31 Worklog Time Spent: 10m Work Description: pvary commented on pull request #690: HIVE-21910: Multiple target location generation in HostAffinitySplitLocationProvider URL: https://github.com/apache/hive/pull/690#discussion_r298573326 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java ##
{code:java}
@@ -72,11 +78,17 @@ public HostAffinitySplitLocationProvider(List<String> knownLocations) {
     FileSplit fsplit = (FileSplit) split;
     String splitDesc = "Split at " + fsplit.getPath() + " with offset= " + fsplit.getStart()
         + ", length=" + fsplit.getLength();
-    List<String> preferredLocations = preferLocations(fsplit);
-    String location =
-        preferredLocations.get(determineLocation(preferredLocations, fsplit.getPath().toString(),
-            fsplit.getStart(), splitDesc));
-    return (location != null) ? new String[] { location } : null;
+    List<String> preferredLocations = new ArrayList<>(preferLocations(fsplit));
+    List<String> finalLocations = new ArrayList<>(numberOfLocations);
+    // Generate new preferred locations until we need more, or we do not have any preferred
+    // location left
+    while (finalLocations.size() < numberOfLocations && preferredLocations.size() > 0) {
+      String nextLocation = preferredLocations.get(determineLocation(preferredLocations,
+          fsplit.getPath().toString(), fsplit.getStart(), splitDesc));
+      finalLocations.add(nextLocation);
+      preferredLocations.remove(nextLocation);
{code}
Review comment: I did some measurements for the split generation with this code:
{code:java}
@Test(timeout = 500)
public void testOrcSplitsBasic() throws IOException {
  HostAffinitySplitLocationProvider locationProvider =
      new HostAffinitySplitLocationProvider(executorLocations, true, 1);
  InputSplit os1 = createMockFileSplit(true, "path1", 0, 1000,
      new String[] {locations.get(0), locations.get(1), locations.get(2), locations.get(3)});
  long start = System.nanoTime();
  for (int i = 0; i < 10; i++) {
    locationProvider.getLocations(os1);
  }
  LOG.error("TIME: " + (System.nanoTime() - start) / 100);
}
{code}
I got the following results: Original code (~6100ms for 100k requests): 5859, 6511, 6813, 5721, 5663. New code with 1 location (~5823ms for 100k requests): 5877, 5621, 5613, 5883, 6120. New code with 2 locations (~6579ms for 100k requests): 6433, 6825, 6574, 6444, 6621. I do not see why the new code should be faster, so this probably means high variation in the data. Generating 2 locations instead of 1 seems like a 10% overhead. Since this is 0.006ms per request, this seems reasonable to me. What is your opinion? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269225) Time Spent: 1.5h (was: 1h 20m) > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21930) WINDOW COUNT DISTINCT return wrong value with PARTITION BY
[ https://issues.apache.org/jira/browse/HIVE-21930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor updated HIVE-21930: Description: count(distinct a) over (partition by b) returns a wrong result. For example (T is a CTE here):
{code:java}
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as days
FROM T
{code}
The WINDOW specification doesn't affect the results: the result is equally wrong with and without a window. count(1) and count(distinct day) return the same result. Count distinct is wrong. I've added size(collect_set(day) OVER (PARTITION BY phone)) as days2, which returns the correct result while count(distinct) does not. The following query returns a non-empty result:
{code:java}
select A.*, B.days, B. from (
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY p ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as days
, size(collect_set(day) OVER (PARTITION BY phone)) as days2
, dense_rank() over (partition by phone order by day) + dense_rank() over (partition by phone order by day desc) - 1 as days3
FROM T
) as A join (
select p, day, ts
, row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number
, count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as lines
, count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as days
FROM T
) as B on A.p=B.p and A.line_number=B.line_number
where A.days!=B.days
order by A.p, A.line_number
{code}
was: count(distinct a) over (partition by b) returns a wrong result. For example: {code:java} select p, day, ts , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number , count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as lines , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as days FROM T{code} The WINDOW specification doesn't affect the results: the result is equally wrong with and without a window. count(1) and count(distinct day) return the same result. Count distinct is wrong. I've added size(collect_set(day) OVER (PARTITION BY phone)) as days2, which returns the correct result while count(distinct) does not. The following query returns a non-empty result: {code:java} select A.*, B.days, B. from ( select p, day, ts , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number , count(1) OVER (PARTITION BY p ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as lines , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as days , size(collect_set(day) OVER (PARTITION BY phone)) as days2 , dense_rank() over (partition by phone order by day) + dense_rank() over (partition by phone order by day desc) - 1 as days3 FROM T ) as A join ( select p, day, ts , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number , count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as lines , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as days FROM T ) as B on A.p=B.p and A.line_number=B.line_number where A.days!=B.days order by A.p, A.line_number {code} > WINDOW COUNT DISTINCT return wrong value with PARTITION BY > -- > > Key: HIVE-21930 > URL: https://issues.apache.org/jira/browse/HIVE-21930 > Project: Hive > Issue Type: Bug > Components: PTF-Windowing >Affects Versions: 3.1.0 > Environment: Beeline version 3.1.0.3.0.1.0-187 by Apache Hive >Reporter: Igor >Priority: Major > Labels: distinct, window_funcion > > count(distinct a) over (partition by b) returns a wrong result. For example (T is > a CTE here): > {code:java} > select p, day, ts > , row_number() OVER (PARTITION BY phone ORDER BY ts ASC) as line_number > , count(1) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED > PRECEDING AND UNBOUNDED FOLLOWING) as lines > , count(distinct day) OVER (PARTITION BY phone ROWS BETWEEN UNBOUNDED > PRECEDING AND UNBOUNDED FOLLOWING) as days > FROM T{code} > The WINDOW specification doesn't affect the results: the result is equally wrong with > and without a window. > count(1) and count(distinct day) return the same result. Count distinct is > wrong. > > I've added size(collect_set(day) OVER (PARTITION BY phone)) as days2, which > returns the correct result while count(distinct) does not. > The following query returns a non-empty result: > {code:java} > select A.*, B.days, B. from
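The days3 workaround quoted above rests on a small identity: for any row, dense_rank ascending counts the distinct values less than or equal to the current one and dense_rank descending counts those greater than or equal to it, so their sum is always the distinct count plus one. A self-contained check of that identity on made-up sample data (illustration only, not Hive code):
{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Verifies dense_rank(asc) + dense_rank(desc) - 1 == count(distinct) per row.
public class DenseRankIdentity {
  public static void main(String[] args) {
    List<Integer> days = Arrays.asList(3, 1, 3, 2, 2, 5); // hypothetical 'day' column
    TreeSet<Integer> distinct = new TreeSet<>(days);
    for (int v : days) {
      int asc = distinct.headSet(v, true).size();   // dense_rank ordered ascending
      int desc = distinct.tailSet(v, true).size();  // dense_rank ordered descending
      System.out.println(v + " -> " + (asc + desc - 1)); // always 4 = |{1,2,3,5}|
    }
  }
}
{code}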
[jira] [Updated] (HIVE-21578) Introduce SQL:2016 formats FM, FX, and nested strings
[ https://issues.apache.org/jira/browse/HIVE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21578: - Attachment: HIVE-21578.01.patch Status: Patch Available (was: Open) > Introduce SQL:2016 formats FM, FX, and nested strings > - > > Key: HIVE-21578 > URL: https://issues.apache.org/jira/browse/HIVE-21578 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21578.01.patch > > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > * "text" (nested strings) > * FM > * FX > [Definitions > here|https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
[ https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874869#comment-16874869 ] Hive QA commented on HIVE-21880: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973146/HIVE-21880.03.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 16360 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.llap.cache.TestBuddyAllocator.testMTT[2] (batchId=350) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions (batchId=275) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll (batchId=275) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomNonExistent (batchId=275) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead (batchId=275) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=275) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime (batchId=275) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryExecutionTime (batchId=275) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17788/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17788/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17788/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12973146 - PreCommit-HIVE-Build > Enable flaky test > TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites. > --- > > Key: HIVE-21880 > URL: https://issues.apache.org/jira/browse/HIVE-21880 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21880.01.patch, HIVE-21880.02.patch, > HIVE-21880.03.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Need to enable > TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites, > which is disabled as it is flaky and randomly failing with the below error. > {code} > Error Message > Notification events are missing in the meta store. > Stacktrace > java.lang.IllegalStateException: Notification events are missing in the meta > store.
> at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) > at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) > at > org.apache.hadoop.hive.ql.reexec.Re
[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269203 ] ASF GitHub Bot logged work on HIVE-21910: - Author: ASF GitHub Bot Created on: 28/Jun/19 11:32 Start Date: 28/Jun/19 11:32 Worklog Time Spent: 10m Work Description: pvary commented on pull request #690: HIVE-21910: Multiple target location generation in HostAffinitySplitLocationProvider URL: https://github.com/apache/hive/pull/690#discussion_r298557620 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java ##
{code:java}
@@ -52,13 +52,19 @@
   private final List<String> locations;
   private final Set<String> locationSet;
+  private final int numberOfLocations;

-  public HostAffinitySplitLocationProvider(List<String> knownLocations) {
+  public HostAffinitySplitLocationProvider(List<String> knownLocations, int numberOfLocations) {
     Preconditions.checkState(knownLocations != null && !knownLocations.isEmpty(),
         HostAffinitySplitLocationProvider.class.getName() + " needs at least 1 location to function");
+    Preconditions.checkArgument(numberOfLocations >= 0,
{code}
Review comment: Yeah - remained from previous thoughts. Set to 1. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269203) Time Spent: 1h 20m (was: 1h 10m) > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269201&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269201 ] ASF GitHub Bot logged work on HIVE-21910: - Author: ASF GitHub Bot Created on: 28/Jun/19 11:27 Start Date: 28/Jun/19 11:27 Worklog Time Spent: 10m Work Description: pvary commented on pull request #690: HIVE-21910: Multiple target location generation in HostAffinitySplitLocationProvider URL: https://github.com/apache/hive/pull/690#discussion_r298556443 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ##
{code:java}
@@ -4440,6 +4440,12 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
         "preferring one of the locations provided by the split itself. If there is no llap daemon " +
         "running on any of those locations (or on the cloud), fall back to a cache affinity to" +
         " an LLAP node. This is effective only if hive.execution.mode is llap."),
+    LLAP_CLIENT_CONSISTENT_SPLITS_NUMBER("hive.llap.client.consistent.splits.number", 1,
+        "The number of the preferred locations to generate if hive.llap.client.consistent.splits\n" +
{code}
Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269201) Time Spent: 1h 10m (was: 1h) > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21910) Multiple target location generation in HostAffinitySplitLocationProvider
[ https://issues.apache.org/jira/browse/HIVE-21910?focusedWorklogId=269199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-269199 ] ASF GitHub Bot logged work on HIVE-21910: - Author: ASF GitHub Bot Created on: 28/Jun/19 11:26 Start Date: 28/Jun/19 11:26 Worklog Time Spent: 10m Work Description: pvary commented on pull request #690: HIVE-21910: Multiple target location generation in HostAffinitySplitLocationProvider URL: https://github.com/apache/hive/pull/690#discussion_r298556011 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java ##
{code:java}
@@ -72,11 +78,17 @@ public HostAffinitySplitLocationProvider(List<String> knownLocations) {
     FileSplit fsplit = (FileSplit) split;
     String splitDesc = "Split at " + fsplit.getPath() + " with offset= " + fsplit.getStart()
         + ", length=" + fsplit.getLength();
-    List<String> preferredLocations = preferLocations(fsplit);
-    String location =
-        preferredLocations.get(determineLocation(preferredLocations, fsplit.getPath().toString(),
-            fsplit.getStart(), splitDesc));
-    return (location != null) ? new String[] { location } : null;
+    List<String> preferredLocations = new ArrayList<>(preferLocations(fsplit));
{code}
Review comment: We might want to keep it configurable. Until the cluster reaches the threshold where the highest loaded node starts to struggle for resources, keeping the tasks aligned with the HDFS location still makes sense. What do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 269199) Time Spent: 1h (was: 50m) > Multiple target location generation in HostAffinitySplitLocationProvider > > > Key: HIVE-21910 > URL: https://issues.apache.org/jira/browse/HIVE-21910 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21910.patch > > Time Spent: 1h > Remaining Estimate: 0h > > We need to generate multiple target locations by > HostAffinitySplitLocationProvider, so we will have deterministic fallback > nodes in case the target node is disabled -- This message was sent by Atlassian JIRA (v7.6.3#76005)
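To make the loop in the hunk above easier to follow, here is a rough, self-contained sketch of the same idea: deterministically pick a location, remove it from the candidate list, and repeat until enough distinct fallback nodes are collected. The hashOf helper is a hypothetical stand-in for the consistent-hash step (determineLocation in the patch), and the node names are made up.
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of deterministic multi-location selection with distinct fallbacks.
public class FallbackLocationSketch {
  // Placeholder for a consistent-hash style index; deterministic per key.
  static int hashOf(String key, int buckets) {
    return Math.floorMod(key.hashCode(), buckets);
  }

  static List<String> pickLocations(List<String> preferred, String splitKey, int wanted) {
    List<String> candidates = new ArrayList<>(preferred);
    List<String> chosen = new ArrayList<>(wanted);
    while (chosen.size() < wanted && !candidates.isEmpty()) {
      String next = candidates.get(hashOf(splitKey, candidates.size()));
      chosen.add(next);
      candidates.remove(next); // guarantees the fallback nodes are distinct
    }
    return chosen;
  }

  public static void main(String[] args) {
    List<String> nodes = Arrays.asList("llap-1", "llap-2", "llap-3");
    // The same split key always yields the same primary and fallback nodes.
    System.out.println(pickLocations(nodes, "hdfs://warehouse/t/file1:0", 2));
  }
}
{code}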
[jira] [Commented] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
[ https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874862#comment-16874862 ] Hive QA commented on HIVE-21880: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 54s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 6s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 55s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 35s{color} | {color:blue} standalone-metastore/metastore-common in master has 31 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 12s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 3s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 27s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 7s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 30s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} itests/hcatalog-unit: The patch generated 81 new + 0 unchanged - 0 fixed = 81 total (was 0) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 32 unchanged - 0 fixed = 33 total (was 32) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17788/dev-support/hive-personality.sh | | git revision | master / 57c4217 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17788/yetus/diff-checkstyle-itests_hcatalog-unit.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17788/yetus/diff-checkstyle-itests_hive-unit.txt | | modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server ql hcatalog/server-extensions itests/hcatalog-unit itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17788/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Enable flaky test > TestRep
[jira] [Commented] (HIVE-21923) Vectorized MapJoin may miss results when only the join key is selected
[ https://issues.apache.org/jira/browse/HIVE-21923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874832#comment-16874832 ] Zoltan Haindrich commented on HIVE-21923: - This issue was introduced in HIVE-18908 > Vectorized MapJoin may miss results when only the join key is selected > -- > > Key: HIVE-21923 > URL: https://issues.apache.org/jira/browse/HIVE-21923 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-21923.01.patch > > > HIVE-21189 introduced some resultset changes > in ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out > https://github.com/apache/hive/commit/5799398450c17d06e8ef144ce835a8524f5abec9#diff-56b3ab96b6c90fdbebe2c4f84e8595afL500 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874830#comment-16874830 ] Hive QA commented on HIVE-21867: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12973143/HIVE-21867.05.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 16325 tests executed *Failed tests:* {noformat} TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed out) (batchId=232) TestObjectStore - did not produce a TEST-*.xml file (likely timed out) (batchId=232) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17787/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17787/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17787/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12973143 - PreCommit-HIVE-Build > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874802#comment-16874802 ] Hive QA commented on HIVE-21867: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 49s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 4s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 1 new + 155 unchanged - 0 fixed = 156 total (was 155) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 25m 9s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17787/dev-support/hive-personality.sh | | git revision | master / 57c4217 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-17787/yetus/diff-checkstyle-ql.txt | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17787/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
> Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
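The ordering idea behind HIVE-21867 can be illustrated with the classic predicate-ranking heuristic: run cheap, highly selective filters first, ranking each condition by (selectivity - 1) / cost-per-row (lower rank first). The sketch below is illustrative only; FilterStats and its fields are invented names and this is not Hive's actual operator or cost model.
{code}
import java.util.Comparator;
import java.util.List;

// Illustrative only: rank filters so cheap, selective ones run first.
class FilterStats {
  final String name;
  final double selectivity; // expected fraction of rows that pass (0..1)
  final double costPerRow;  // relative per-row evaluation cost (> 0)
  FilterStats(String name, double selectivity, double costPerRow) {
    this.name = name;
    this.selectivity = selectivity;
    this.costPerRow = costPerRow;
  }
}

class SemijoinOrderingSketch {
  // Classic rank metric: (selectivity - 1) / cost; more negative sorts first,
  // which favors filters that are both cheap and highly selective.
  static void sortForEvaluation(List<FilterStats> filters) {
    filters.sort(Comparator.comparingDouble(
        (FilterStats f) -> (f.selectivity - 1.0) / f.costPerRow));
  }
}
{code}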
[jira] [Updated] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
[ https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Bapat updated HIVE-21880: -- Attachment: HIVE-21880.03.patch Status: Patch Available (was: In Progress) The failed tests are passing for me locally. Re-submitting .02 patch as .03 to trigger ptests. > Enable flaky test > TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites. > --- > > Key: HIVE-21880 > URL: https://issues.apache.org/jira/browse/HIVE-21880 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21880.01.patch, HIVE-21880.02.patch, > HIVE-21880.03.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Need to enable > TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites > which is disabled as it is flaky and randomly failing with the below error. > {code} > Error Message > Notification events are missing in the meta store. > Stacktrace > java.lang.IllegalStateException: Notification events are missing in the meta > store. > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) > at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289) > at > org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) >
[jira] [Updated] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
[ https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Bapat updated HIVE-21880: -- Status: In Progress (was: Patch Available) > Enable flaky test > TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites. > --- > > Key: HIVE-21880 > URL: https://issues.apache.org/jira/browse/HIVE-21880 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21880.01.patch, HIVE-21880.02.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Need to enable > TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites > which is disabled as it is flaky and randomly failing with the below error. > {code} > Error Message > Notification events are missing in the meta store. > Stacktrace > java.lang.IllegalStateException: Notification events are missing in the meta > store. > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) > at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159) > at > org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265) > at > org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289) > at > org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runn
[jira] [Commented] (HIVE-21886) REPL - With table list - Handle rename events during replace policy
[ https://issues.apache.org/jira/browse/HIVE-21886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874779#comment-16874779 ] Sankar Hariappan commented on HIVE-21886: - +1, patch LGTM, pending tests. > REPL - With table list - Handle rename events during replace policy > --- > > Key: HIVE-21886 > URL: https://issues.apache.org/jira/browse/HIVE-21886 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: DR, Replication, pull-request-available > Attachments: HIVE-21886.01.patch, HIVE-21886.02.patch, > HIVE-21886.03.patch, HIVE-21886.04.patch, HIVE-21886.04.patch > > Time Spent: 11h 10m > Remaining Estimate: 0h > > If some rename events are found to be dumped and replayed while replace > policy is getting executed, it needs to take care of the policy inclusion in > both the policy for each table name. > 1. Create a list of tables to be bootstrapped. > 2. During handling of alter table, if the alter type is rename > 1. If the old table name is present in the list of table to be > bootstrapped, remove it. > 2. If the new table name, matches the new policy, add it to the list > of tables to be bootstrapped. > 3. If the old table does not match the old policy drop it, even if the > table is not present at target. > 3. During handling of drop table > 1. if the table is in the list of tables to be bootstrapped, then > remove it and ignore the event. > 4. During other event handling > 1. if the table is there in the list of tables to be bootstrapped, > then ignore the event. > 2. If the new policy does not match the table name, then ignore the > event. > > Rename handling during replace policy > # Old name not matching old policy – The old table will not be there at the > target cluster. The table will not be returned by get-all-table. > ## Old name is not matching new policy > ### New name not matching old policy > New name not matching new policy > * Ignore the event, no need to do anything. > New name matching new policy > * The table will be returned by get-all-table. Replace policy handler > will bootstrap this table as its matching new policy and not matching old > policy. > * All the future events will be ignored as part of check added by > replace policy handling. > * All the event with old table name will anyways be ignored as the old > name is not matching the new policy. > ### New name matching old policy > New name not matching new policy > * As the new name is not matching the new policy, the table need not be > replicated. > * As the old name is not matching the new policy, the rename events will > be ignored. > * So nothing to be done for this scenario. > New name matching new policy > * As the new name is matching both old and new policy, replace handler > will not bootstrap the table. > * Add the table to the list of tables to be bootstrapped. > * Ignore all the events with new name. > * If there is a drop event for the table (with new name), then remove > the table from the the list of table to be bootstrapped. > * In case of rename event (double rename) > ** If the new name satisfies the table pattern, then add the new name to > the list of tables to be bootstrapped and remove the old name from the list > of tables to be bootstrapped. > ** If the new name does not satisfies then just removed the table name > from the list of tables to be bootstrapped. 
> ## Old name is matching new policy – As per the replace policy handler, which > checks based on the old table, the table should be bootstrapped and the event should > be ignored. But the rename handler should decide based on the new name. The old table > name will not be returned by get-all-table, so the replace handler will not do > anything for the old table. > ### New name not matching old policy > New name not matching new policy > * As the old table is not there at the target and the new name is not matching the > new policy, ignore the event. > * No need to add the table to the list of tables to be bootstrapped. > * All the subsequent events will be ignored as the new name is not > matching the new policy. > New name matching new policy > * As the new name is not matching the old policy but matching the new policy, > the table will be bootstrapped by the replace policy handler. So the rename event > need not add this table to the list of tables to be bootstrapped. > * All the future events will be ignored by the replace policy handler. > * For rename event (double rename) > ** If there is a rename, the table
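The top-level rules above (steps 2 and 3 of the event handling) condense into a small sketch. All names here are illustrative stand-ins under assumed signatures, not the actual Hive replication classes, and the full old/new policy matrix spelled out in the scenarios above is omitted.
{code}
import java.util.Set;
import java.util.regex.Pattern;

// Illustrative sketch of rename/drop handling during a replace policy.
class RenameHandlingSketch {
  static void onRename(String oldName, String newName,
                       Pattern newPolicy, Set<String> bootstrapList) {
    // Step 2.1: the old name no longer needs a bootstrap of its own.
    bootstrapList.remove(oldName);
    // Step 2.2: a new name matching the new policy must be bootstrapped.
    if (newPolicy.matcher(newName).matches()) {
      bootstrapList.add(newName);
    }
  }

  static boolean onDrop(String table, Set<String> bootstrapList) {
    // Step 3.1: dropping a pending-bootstrap table cancels the bootstrap;
    // returns true when the drop event itself can be ignored.
    return bootstrapList.remove(table);
  }
}
{code}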
[jira] [Commented] (HIVE-21888) Set hive.parquet.timestamp.skip.conversion default to true
[ https://issues.apache.org/jira/browse/HIVE-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874771#comment-16874771 ] Jesus Camacho Rodriguez commented on HIVE-21888: +1 > Set hive.parquet.timestamp.skip.conversion default to true > -- > > Key: HIVE-21888 > URL: https://issues.apache.org/jira/browse/HIVE-21888 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-21888.02.patch, HIVE-21888.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21867) Sort semijoin conditions to accelerate query processing
[ https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-21867: --- Attachment: HIVE-21867.05.patch > Sort semijoin conditions to accelerate query processing > --- > > Key: HIVE-21867 > URL: https://issues.apache.org/jira/browse/HIVE-21867 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21867.02.patch, HIVE-21867.03.patch, > HIVE-21867.04.patch, HIVE-21867.05.patch, HIVE-21867.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem was tackled for CBO in HIVE-21857. Semijoin filters are > introduced later in the planning phase. Follow similar approach to sort them, > trying to accelerate filter evaluation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21637) Synchronized metastore cache
[ https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874736#comment-16874736 ] Hive QA commented on HIVE-21637: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 36s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 11s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 13s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 34s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 27s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 31s{color} | {color:blue} standalone-metastore/metastore-common in master has 31 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 13s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 12s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 32s{color} | {color:blue} beeline in master has 44 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 30s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 25s{color} | {color:blue} standalone-metastore/metastore-tools/metastore-benchmarks in master has 3 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 43s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 49s{color} | {color:blue} itests/util in master has 44 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 10s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} storage-api: The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s{color} | {color:red} standalone-metastore/metastore-common: The patch generated 9 new + 498 unchanged - 2 fixed = 507 total (was 500) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 48s{color} | {color:red} standalone-metastore/metastore-server: The patch generated 160 new + 2193 unchanged - 65 fixed = 2353 total (was 2258) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 54s{color} | {color:red} ql: The patch generated 10 new + 970 unchanged - 2 fixed = 980 total (was 972) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 11s{color} | {color:red} standalone-metastore/metastore-tools/tools-common: The patch generated 5 new + 31 unchanged - 0 fixed = 36 total (was 31) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 24 unchanged - 3 fixed = 26 total (was 27) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s{color} | {color:red} itests/hive-unit: The patch generated 3 new + 163 unchanged - 1 fixed = 166 total (was 164) {color} | | {color:red}-1{color} | {color:red} checkstyle
[jira] [Commented] (HIVE-21932) IndexOutOfRangeException in FileChksumIterator
[ https://issues.apache.org/jira/browse/HIVE-21932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874732#comment-16874732 ] anishek commented on HIVE-21932: +1 > IndexOutOfRangeException in FileChksumIterator > - > > Key: HIVE-21932 > URL: https://issues.apache.org/jira/browse/HIVE-21932 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21932.01.patch > > > According to the definition of {{InsertEventRequestData}} in > {{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But > the FileChksumIterator does not handle it correctly when a client fires an > insert event that does not have file checksums. The issue is that the > {{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, so the following check will never come into play > {noformat} > result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? > chksums.get(i) : null, > subDirs != null ? subDirs.get(i) : null); > {noformat} > The chksums check in the above line should include a {{!chksums.isEmpty()}} check as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
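A minimal sketch of the guard described above; the helper name is illustrative, and the real fix lives in FileChksumIterator rather than a standalone class.
{code}
import java.util.List;

// Illustrative: an empty list is as unsafe to index as a null one.
class ChecksumGuardSketch {
  static String checksumAt(List<String> chksums, int i) {
    // chksums.get(i) on an empty list throws IndexOutOfBoundsException,
    // so the null check alone is not sufficient.
    return (chksums != null && !chksums.isEmpty()) ? chksums.get(i) : null;
  }
}
{code}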
[jira] [Commented] (HIVE-21762) REPL DUMP to support new format for replication policy input to take included tables list.
[ https://issues.apache.org/jira/browse/HIVE-21762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874729#comment-16874729 ] Sankar Hariappan commented on HIVE-21762: - Updated the Apache Wiki page for this issue: https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development > REPL DUMP to support new format for replication policy input to take included > tables list. > -- > > Key: HIVE-21762 > URL: https://issues.apache.org/jira/browse/HIVE-21762 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: DR, Replication, pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-21762.01.patch, HIVE-21762.02.patch, > HIVE-21762.03.patch, HIVE-21762.04.patch, HIVE-21762.05.patch, > HIVE-21762.06.patch, HIVE-21762.07.patch > > Time Spent: 6h > Remaining Estimate: 0h > > - REPL DUMP syntax: > {code} > REPL DUMP <repl_policy> [FROM <event_id>] WITH <key_values>; > {code} > - The new format for the replication policy has 3 parts, all separated with a dot (.): > 1. The first part is the DB name. > 2. The second part is the included list: comma-separated table names/regexes within > square brackets []. If square brackets are not there, then it is treated as > single-table replication, which skips DB-level events. > 3. The third part is the excluded list: comma-separated table names/regexes within > square brackets []. > {code} > <db_name> -- Full DB replication, which is currently supported > <db_name>.['.*?'] -- Full DB replication > <db_name>.[] -- Replicate just functions and do not include any tables. > <db_name>.['t1', 't2'] -- DB replication with a static list of tables t1 and > t2 included. > <db_name>.['t1*', 't2', '*t3'].['t100', '5t3', 't4'] -- DB replication with > all tables having the prefix t1 or the suffix t3, plus table t2 included, and excluding > t100 (which has the prefix t1), 5t3 (which has the suffix t3), and t4. > {code} > - Need to support regular expressions of any format. > - A table is included in the dump only if it matches the regular expressions in the > included list and doesn't match the excluded list. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
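The inclusion rule in the last bullet above can be sketched as follows; the class and method names are illustrative assumptions, not Hive's actual policy parser.
{code}
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch of the include/exclude matching rule: a table is
// dumped only when it matches some included pattern and no excluded one.
class PolicyMatchSketch {
  static boolean isTableIncluded(String table,
                                 List<Pattern> included,
                                 List<Pattern> excluded) {
    boolean in = included.stream().anyMatch(p -> p.matcher(table).matches());
    boolean out = excluded != null
        && excluded.stream().anyMatch(p -> p.matcher(table).matches());
    return in && !out;
  }
}
{code}
For instance, with the example policy above, a table matching an included pattern but also listed in the excluded part (such as t100) would be rejected.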