[jira] [Updated] (PHOENIX-5145) GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
[ https://issues.apache.org/jira/browse/PHOENIX-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

MariaCarrie updated PHOENIX-5145:
---------------------------------
    Attachment: application_1548138380177_1787.txt

> GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
> ----------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5145
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5145
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>         Environment: HDP 3.0.0, Phoenix 5.0.0, HBase 2.0.0, Spark 2.3.1, Hadoop 3.0.1
>            Reporter: MariaCarrie
>            Priority: Major
>        Attachments: application_1548138380177_1772.txt, application_1548138380177_1787.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> I can read the data successfully in local mode. Here is my code:
>
>     val sqlContext: SQLContext = missionSession.app.ss.sqlContext
>     System.setProperty("sun.security.krb5.debug", "true")
>     System.setProperty("sun.security.spnego.debug", "true")
>     UserGroupInformation.loginUserFromKeytab("d...@devdip.org", "devdmp.keytab")
>     // Load as a DataFrame directly using a Configuration object
>     val df: DataFrame = sqlContext.phoenixTableAsDataFrame(missionSession.config.tableName,
>       Seq("ID"), zkUrl = Some(missionSession.config.zkUrl))
>     df.show(5)
>
> But when I submit the same job to YARN, an exception is thrown.
> The following is the exception information:
>
>     Tue Feb 19 13:07:53 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:07:53 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:07:53 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:07:54 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:07:54 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:07:55 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:07:57 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: java.io.IOException: Can not send request because relogin is in progress.
>     Tue Feb 19 13:08:01 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>     Tue Feb 19 13:08:11 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>     Tue Feb 19 13:08:21 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, pause=100, maxAttempts=36}, java.io.IOException: Call to test-dmp5.fengdai.org/10.200.162.26:16020 failed on local exception: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>     Tue Feb 19 13:08:31 CST 2019, RpcRetryingCaller{globalStartTime=1550552873361, p
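[Editor's note, not part of the thread: a keytab login performed only inside the driver, as in the snippet above, is not propagated to the YARN executors, which matches the "relogin is in progress" / "No valid Kerberos tgt" symptoms reported here. One commonly suggested approach is to let spark-submit own the Kerberos login and token renewal. This is a hedged sketch only; the principal, keytab path, and jar name below are placeholders, not values from this report.]

```shell
# Sketch: hand the keytab to spark-submit (--principal/--keytab are real
# spark-submit options) instead of calling UserGroupInformation.loginUserFromKeytab
# in driver code. All names below are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal user@EXAMPLE.ORG \
  --keytab /path/to/user.keytab \
  my-phoenix-job.jar
```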
[jira] [Assigned] (PHOENIX-2787) support IF EXISTS for ALTER TABLE SET options
[ https://issues.apache.org/jira/browse/PHOENIX-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyi Yan reassigned PHOENIX-2787:
----------------------------------
    Assignee: (was: Xinyi Yan)

> support IF EXISTS for ALTER TABLE SET options
> ---------------------------------------------
>
>                 Key: PHOENIX-2787
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2787
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.8.0
>            Reporter: Vincent Poon
>            Priority: Trivial
>
> A nice-to-have improvement to the grammar:
>
>     ALTER TABLE my_table IF EXISTS SET options
>
> Currently the 'IF EXISTS' only works for dropping/adding a column.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Assigned] (PHOENIX-2787) support IF EXISTS for ALTER TABLE SET options
[ https://issues.apache.org/jira/browse/PHOENIX-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyi Yan reassigned PHOENIX-2787:
----------------------------------
    Assignee: Xinyi Yan

> support IF EXISTS for ALTER TABLE SET options
[jira] [Updated] (PHOENIX-5145) GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
[ https://issues.apache.org/jira/browse/PHOENIX-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

MariaCarrie updated PHOENIX-5145:
---------------------------------
    Attachment: application_1548138380177_1772.txt

> GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
[jira] [Updated] (PHOENIX-5147) Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )
[ https://issues.apache.org/jira/browse/PHOENIX-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Cang updated PHOENIX-5147:
-----------------------------
    Attachment: PHOENIX-5147.4.x-HBase-1.3.002.patch

> Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )
> --------------------------------------------------------------------------
>
>                 Key: PHOENIX-5147
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5147
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.15.0
>            Reporter: Xu Cang
>            Assignee: Xu Cang
>            Priority: Major
>        Attachments: PHOENIX-5147.4.x-HBase-1.3.001.patch, PHOENIX-5147.4.x-HBase-1.3.002.patch
>
> We should add an option that allows a database admin to disable spooling from the server side, especially until PHOENIX-5135 is fixed.
[jira] [Assigned] (PHOENIX-5147) Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )
[ https://issues.apache.org/jira/browse/PHOENIX-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Cang reassigned PHOENIX-5147:
--------------------------------
    Assignee: Xu Cang

> Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )
[jira] [Updated] (PHOENIX-5147) Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )
[ https://issues.apache.org/jira/browse/PHOENIX-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xu Cang updated PHOENIX-5147:
-----------------------------
    Summary: Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )  (was: Add an option to disable spooling)

> Add an option to disable spooling ( SORT MERGE strategy in QueryCompiler )
[jira] [Created] (PHOENIX-5147) Add an option to disable spooling
Xu Cang created PHOENIX-5147:
-----------------------------
             Summary: Add an option to disable spooling
                 Key: PHOENIX-5147
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5147
             Project: Phoenix
          Issue Type: Improvement
    Affects Versions: 4.15.0
            Reporter: Xu Cang

We should add an option that allows a database admin to disable spooling from the server side, especially until PHOENIX-5135 is fixed.
[jira] [Closed] (PHOENIX-5144) C++ JDBC Driver
[ https://issues.apache.org/jira/browse/PHOENIX-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yinghua_zh closed PHOENIX-5144.
-------------------------------

> C++ JDBC Driver
> ---------------
>
>                 Key: PHOENIX-5144
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5144
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.14.1
>            Reporter: yinghua_zh
>            Priority: Major
>
> Can you provide a C++ version of the JDBC driver?
[jira] [Updated] (PHOENIX-5146) Phoenix missing class definition: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/apache/http/Consts
[ https://issues.apache.org/jira/browse/PHOENIX-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Elser updated PHOENIX-5146:
--------------------------------
    Description:

While running a SparkCompatibility check for Phoenix, we hit this issue:

{noformat}
2019-02-15 09:03:38,470|INFO|MainThread|machine.py:169 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|RUNNING: echo " import org.apache.spark.graphx._; import org.apache.phoenix.spark._; val rdd = sc.phoenixTableAsRDD(\"EMAIL_ENRON\", Seq(\"MAIL_FROM\", \"MAIL_TO\"), zkUrl=Some(\"huaycloud012.l42scl.hortonworks.com:2181:/hbase-secure\")); val rawEdges = rdd.map { e => (e(\"MAIL_FROM\").asInstanceOf[VertexId], e(\"MAIL_TO\").asInstanceOf[VertexId]) }; val graph = Graph.fromEdgeTuples(rawEdges, 1.0); val pr = graph.pageRank(0.001); pr.vertices.saveToPhoenix(\"EMAIL_ENRON_PAGERANK\", Seq(\"ID\", \"RANK\"), zkUrl = Some(\"huaycloud012.l42scl.hortonworks.com:2181:/hbase-secure\")); " | spark-shell --master yarn --jars /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.3.1.0.0-75.jar --properties-file /grid/0/log/cluster/run_phoenix_secure_ha_all_1/artifacts/spark_defaults.conf 2>&1 | tee /grid/0/log/cluster/run_phoenix_secure_ha_all_1/artifacts/Spark_clientLogs/phoenix-spark.txt
2019-02-15 09:03:38,488|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SPARK_MAJOR_VERSION is set to 2, using Spark2
2019-02-15 09:03:39,901|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: Class path contains multiple SLF4J bindings.
2019-02-15 09:03:39,902|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-75/phoenix/phoenix-5.0.0.3.1.0.0-75-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2019-02-15 09:03:39,902|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-75/spark2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2019-02-15 09:03:39,902|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2019-02-15 09:03:41,400|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|Setting default log level to "WARN".
2019-02-15 09:03:41,400|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2019-02-15 09:03:54,837|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/apache/http/Consts
2019-02-15 09:03:54,838|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.phoenix.shaded.org.apache.http.client.utils.URIBuilder.digestURI(URIBuilder.java:181)
2019-02-15 09:03:54,839|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.phoenix.shaded.org.apache.http.client.utils.URIBuilder.<init>(URIBuilder.java:82)
2019-02-15 09:03:54,839|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
2019-02-15 09:03:54,839|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1023)
2019-02-15 09:03:54,840|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:252)
2019-02-15 09:03:54,840|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:249)
2019-02-15 09:03:54,840|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:172)
2019-02-15 09:03:54,841|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.getDelegationToken(LoadBalancingKMSClientProvider.java:249)
2019-02-15 09:03:54,841|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
2019-02-15 09:03:54,841|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.security.token.DelegationTokenIssuer.collectDel
Re: Integration Testing Requirements
You can check whether HADOOP_HOME and JAVA_HOME are properly set in the environment.

On Tue, Feb 19, 2019 at 11:23 AM William Shen wrote:

> Hi everyone,
>
> I'm trying to set up a Jenkins job at work to build Phoenix and run the
> integration tests. However, I repeatedly hit failures in the hive module
> when I run mvn verify. Do the hive integration tests require any special
> setup to pass? The other modules pass integration testing successfully.
>
> Attaching below is a sample failure trace.
>
> Thanks!
>
> - Will
>
> [ERROR] Tests run: 6, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 48.302 s <<< FAILURE! - in org.apache.phoenix.hive.HiveTezIT
> [ERROR] simpleColumnMapTest(org.apache.phoenix.hive.HiveTezIT)  Time elapsed: 6.727 s  <<< FAILURE!
> junit.framework.AssertionFailedError: Unexpected exception java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1550371508120_0001 failed 2 times due to AM Container for appattempt_1550371508120_0001_02 exited with exitCode: 127
> For more detailed output, check the application tracking page: http://40a4dd0e8959:38843/cluster/app/application_1550371508120_0001 Then, click on links to logs of each attempt.
> Diagnostics: Exception from container-launch.
> Container id: container_1550371508120_0001_02_01
> Exit code: 127
> Stack trace: ExitCodeException exitCode=127:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
>     at org.apache.hadoop.util.Shell.run(Shell.java:456)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:748)
>
> Container exited with a non-zero exit code 127
> Failing this attempt. Failing the application.
>
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:535)
>     at org.apache.phoenix.hive.HiveTestUtil.cliInit(HiveTestUtil.java:637)
>     at org.apache.phoenix.hive.HiveTestUtil.cliInit(HiveTestUtil.java:590)
>     at org.apache.phoenix.hive.BaseHivePhoenixStoreIT.runTest(BaseHivePhoenixStoreIT.java:117)
>     at org.apache.phoenix.hive.HivePhoenixStoreIT.simpleColumnMapTest(HivePhoenixStoreIT.java:103)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.junit.runners.Suite.runChild(Suite.java:128)
>     at org.junit.runners.Suite.runChild(Suite.java:27)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunne
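[Editor's note: the environment check suggested at the top of the reply can be sketched as a one-off shell check, run before `mvn verify`. The variable list is just the two names mentioned in the reply.]

```shell
# Report whether the environment variables the Hive integration tests
# depend on are set; prints one line per variable either way.
for v in HADOOP_HOME JAVA_HOME; do
  if [ -n "$(printenv "$v")" ]; then
    echo "$v is set"
  else
    echo "$v is NOT set"
  fi
done
```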
Integration Testing Requirements
Hi everyone,

I'm trying to set up a Jenkins job at work to build Phoenix and run the integration tests. However, I repeatedly hit failures in the hive module when I run mvn verify. Do the hive integration tests require any special setup to pass? The other modules pass integration testing successfully.

Attaching below is a sample failure trace.

Thanks!

- Will

[ERROR] Tests run: 6, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 48.302 s <<< FAILURE! - in org.apache.phoenix.hive.HiveTezIT
[ERROR] simpleColumnMapTest(org.apache.phoenix.hive.HiveTezIT)  Time elapsed: 6.727 s  <<< FAILURE!
junit.framework.AssertionFailedError: Unexpected exception java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1550371508120_0001 failed 2 times due to AM Container for appattempt_1550371508120_0001_02 exited with exitCode: 127
For more detailed output, check the application tracking page: http://40a4dd0e8959:38843/cluster/app/application_1550371508120_0001 Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1550371508120_0001_02_01
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.

    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:535)
    at org.apache.phoenix.hive.HiveTestUtil.cliInit(HiveTestUtil.java:637)
    at org.apache.phoenix.hive.HiveTestUtil.cliInit(HiveTestUtil.java:590)
    at org.apache.phoenix.hive.BaseHivePhoenixStoreIT.runTest(BaseHivePhoenixStoreIT.java:117)
    at org.apache.phoenix.hive.HivePhoenixStoreIT.simpleColumnMapTest(HivePhoenixStoreIT.java:103)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runners.Suite.runChild(Suite.java:128)
    at org.junit.runners.Suite.runChild(Suite.java:27)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
    at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRe
[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps
[ https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir OZDEMIR updated PHOENIX-5018:
-----------------------------------
    Attachment: PHOENIX-5018.master.004.patch

> Index mutations created by UPSERT SELECT will have wrong timestamps
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-5018
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5018
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.0, 5.0.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>        Attachments: PHOENIX-5018.4.x-HBase-1.3.001.patch, PHOENIX-5018.4.x-HBase-1.3.002.patch, PHOENIX-5018.4.x-HBase-1.4.001.patch, PHOENIX-5018.4.x-HBase-1.4.002.patch, PHOENIX-5018.master.001.patch, PHOENIX-5018.master.002.patch, PHOENIX-5018.master.003.patch, PHOENIX-5018.master.004.patch
>
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> When doing a full rebuild (or initial async build) of a local or global index using IndexTool and PhoenixIndexImportDirectMapper, or doing a synchronous initial build of a global index using the index create DDL, we generate the index mutations with an UPSERT SELECT query from the base table into the index.
> The mutation timestamps follow the default HBase behavior, which is to use the current wall-clock time. However, the timestamp of an index KeyValue should match the timestamp of the originating KeyValue in the base table.
> Base-table and index timestamps that are out of sync can cause all sorts of odd side effects, such as the base table holding data past its TTL that has not yet expired in the index. Inserting old mutations with new timestamps may also overwrite data that was just written through the regular data path during the index build, leading to data loss and inconsistency.
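[Editor's note: the TTL side effect described in this report can be illustrated with plain arithmetic. This is a toy sketch, not Phoenix or HBase code; all timestamps are made-up seconds, and a cell is treated as expired when now - ts > ttl.]

```shell
# Toy illustration: an index cell stamped with wall-clock time at rebuild
# outlives the base-table cell it was derived from.
ttl=50        # cell time-to-live, seconds
base_ts=100   # base-table cell originally written at t=100
index_ts=155  # index rebuild stamps its cell with the wall clock, t=155
now=160       # a read happens at t=160

base_expired=$(( now - base_ts > ttl ))    # 60 > 50 -> 1: base cell expired
index_expired=$(( now - index_ts > ttl ))  # 5 > 50  -> 0: index cell still live

echo "base expired:  $base_expired"    # prints: base expired:  1
echo "index expired: $index_expired"   # prints: index expired: 0
# The index now returns a row the base table has already aged out; stamping
# the index cell with base_ts instead keeps the two in sync.
```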
[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps
[ https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir OZDEMIR updated PHOENIX-5018:
-----------------------------------
    Attachment: PHOENIX-5018.4.x-HBase-1.4.002.patch

> Index mutations created by UPSERT SELECT will have wrong timestamps
[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps
[ https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR updated PHOENIX-5018: --- Attachment: PHOENIX-5018.4.x-HBase-1.3.002.patch > Index mutations created by UPSERT SELECT will have wrong timestamps > --- > > Key: PHOENIX-5018 > URL: https://issues.apache.org/jira/browse/PHOENIX-5018 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
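The timestamp rule discussed in PHOENIX-5018 can be shown with a minimal, self-contained Java sketch. This is illustrative only — the class and method names below are hypothetical stand-ins, not Phoenix's actual rebuild code:

```java
// Illustrative sketch only -- these classes and method names are hypothetical,
// not Phoenix's actual rebuild code. It models the timestamp rule the issue
// describes: an index cell built from a base-table cell should carry the base
// cell's original timestamp, not the wall-clock time of the rebuild.
public class IndexTimestampSketch {

    // A cell: a value plus the timestamp it was written at.
    public static final class Cell {
        public final String value;
        public final long timestamp;
        public Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // The buggy behavior described above: the index mutation defaults to the
    // current wall clock, so table and index timestamps drift apart.
    public static Cell buildIndexCellWrong(Cell baseCell) {
        return new Cell(baseCell.value, System.currentTimeMillis());
    }

    // The intended behavior: propagate the base cell's timestamp, so TTL
    // expiry and version resolution agree between base table and index.
    public static Cell buildIndexCellRight(Cell baseCell) {
        return new Cell(baseCell.value, baseCell.timestamp);
    }

    public static void main(String[] args) {
        Cell base = new Cell("row-1", 1_000L); // written long ago
        System.out.println(buildIndexCellRight(base).timestamp == base.timestamp); // prints true
    }
}
```

With the second behavior, a TTL that has expired on the base cell is also expired on the derived index cell, which is the consistency property the issue asks for.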
[jira] [Resolved] (PHOENIX-5068) Autocommit off is not working as expected might be a bug!?
[ https://issues.apache.org/jira/browse/PHOENIX-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinyi Yan resolved PHOENIX-5068. Resolution: Duplicate > Autocommit off is not working as expected might be a bug!? > -- > > Key: PHOENIX-5068 > URL: https://issues.apache.org/jira/browse/PHOENIX-5068 > Project: Phoenix > Issue Type: Bug >Reporter: Amarnath Ramamoorthi >Assignee: Xinyi Yan >Priority: Minor > Attachments: test_foo_data.sql > > > Autocommit off behaves strangely; this might be a bug. > Here is what we found when using autocommit off. > A table has only 2 int columns, both part of the primary key, containing 100 > rows in total. > With *"autocommit off"*, when we try to upsert values into the same table, it > reports 200 rows affected. > It works fine when we run the same UPSERT command with fewer than 100 rows > using a WHERE clause, as you can see below. > Something is wrong with autocommit off for upserts of >= 100 rows. > {code:java} > 0: jdbc:phoenix:XXYYZZ> select count(*) from "FOO".DEMO; > +---+ > | COUNT(1) | > +---+ > | 100 | > +---+ > 1 row selected (0.025 seconds) > 0: jdbc:phoenix:XXYYZZ> SELECT * FROM "FOO".DEMO WHERE "id_x"=9741; > ++---+ > | id_x | id_y | > ++---+ > | 9741 | 63423770 | > ++---+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:XXYYZZ> !autocommit off > Autocommit status: false > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO; > 200 rows affected (0.023 seconds) > 0: jdbc:phoenix:XXYYZZ> > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO WHERE > "id_x"=9741; > 1 row affected (0.014 seconds) > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO WHERE > "id_x"!=9741; > 99 rows affected (0.045 seconds) > 0: jdbc:phoenix:XXYYZZ> > 0: jdbc:phoenix:XXYYZZ> !autocommit on > Autocommit status: true > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO; > 100 rows affected (0.065 seconds) > {code} > Tested once again, but now selecting from a different table: > {code:java} > 0: jdbc:phoenix:XXYYZZ> !autocommit off > Autocommit status: false > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".TEST limit > 100; > 200 rows affected (0.052 seconds) > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".TEST limit > 99; > 99 rows affected (0.029 seconds) > 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".TEST limit > 500; > 1,000 rows affected (0.041 seconds) > {code} > Still the same: it reports 1,000 rows affected even though the query is > limited to 500. The count keeps doubling. > It would be really helpful if someone could look into this. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info
[ https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bin Shi updated PHOENIX-4925: - Description: As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, has become much slower since Phoenix Stats were introduced. a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER BY pk1__c,pk2__c e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only primary key column. Because prefix encoding is used for guide post info, we have to decode and traverse guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScan(...) O(n), where n is the total count of guide posts. According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and over-the-wire encoding for guide post info. We can use a Segment Tree to address both the memory and performance concerns. The guide posts are partitioned into k chunks (k=1024?); each chunk is prefix-encoded, and the encoded data forms a leaf node of the tree. Each inner node contains summary info (row count, data size) for the subtree rooted at it. With this tree-like data structure, the size increase over the current data structure (mainly from the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity for "EXPLAIN" on queries a, b, and c can also be reduced to O(m), and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)). The tree can also integrate AVL and B+ characteristics to support partial load/unload when interacting with the stats client cache. was: As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, has become much slower since Phoenix Stats were introduced. a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER BY pk1__c,pk2__c e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only primary key column. Because prefix encoding is used for guide post info, we have to decode and traverse guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScan(...) O(n), where n is the total count of guide posts. According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and over-the-wire encoding for guide post info. We can use a Segment Tree to address both the memory and performance concerns. The guide posts are partitioned into k chunks (k=1024?); each chunk is prefix-encoded, and the encoded data forms a leaf node of the tree. Each inner node contains summary info (row count, data size) for the subtree rooted at it. With this tree-like data structure, the size increase over the current data structure (mainly from the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity for "EXPLAIN" on queries a, b, and c can also be reduced to O(m), and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)). The tree can also integrate AVL and B+ characteristics to support partial load/unload when interacting with the stats client cache. > Use Segment tree to organize Guide Post Info > > > Key: PHOENIX-4925 > URL: https://issues.apache.org/jira/browse/PHOENIX-4925 > Project: Phoenix > Issue Type: Improvement >Reporter: Bin Shi >Assignee: Bin Shi >Priority: Major > > As reported, query compilation (for the sample queries shown below), > especially deriving estimates and generating parallel scans from guide > posts,
[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info
[ https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bin Shi updated PHOENIX-4925: - Summary: Use Segment tree to organize Guide Post Info (was: Use Segment/SUM tree to organize Guide Post Info) > Use Segment tree to organize Guide Post Info > > > Key: PHOENIX-4925 > URL: https://issues.apache.org/jira/browse/PHOENIX-4925 > Project: Phoenix > Issue Type: Improvement >Reporter: Bin Shi >Assignee: Bin Shi >Priority: Major > > As reported, query compilation (for the sample queries shown below), > especially deriving estimates and generating parallel scans from guide > posts, has become much slower since Phoenix Stats were introduced. > a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c > b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY > Pk1__c > c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY > pk1__c,pk2__c > d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c > ORDER BY pk1__c,pk2__c > e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= > 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only > primary key column. > > Because prefix encoding is used for guide post info, we have to decode and > traverse guide posts sequentially, which makes the time complexity of > BaseResultIterators.getParallelScan(...) O(n), where n is the total count of > guide posts. > According to PHOENIX-2417, to reduce the footprint in the client cache and > over the wire, prefix encoding is used as both the in-memory and > over-the-wire encoding for guide post info. > We can use something like a Sum Tree (or even a Binary Indexed Tree) to > address both the memory and performance concerns. The guide posts are > partitioned into k chunks (k=1024?); each chunk is prefix-encoded, and the > encoded data forms a leaf node of the tree. > Each inner node contains summary info (row count, data size) for the subtree > rooted at it. > With this tree-like data structure, the size increase over the current data > structure (mainly from the n/k-1 inner nodes) is negligible. > The time complexity for queries a, b, and c can be reduced to O(m), where m > is the total count of regions; the time complexity for "EXPLAIN" on queries > a, b, and c can also be reduced to O(m), and if we support "EXPLAIN (ESTIMATE > ONLY)", it can even be reduced to O(1). For queries d and e, the time > complexity of finding the start of the target scan ranges can be reduced to > O(log(n/k)). > The tree can also integrate AVL and B+ characteristics to support partial > load/unload when interacting with the stats client cache. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info
[ https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bin Shi updated PHOENIX-4925: - Description: As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, has become much slower since Phoenix Stats were introduced. a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER BY pk1__c,pk2__c e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only primary key column. Because prefix encoding is used for guide post info, we have to decode and traverse guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScan(...) O(n), where n is the total count of guide posts. According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and over-the-wire encoding for guide post info. We can use a Segment Tree to address both the memory and performance concerns. The guide posts are partitioned into k chunks (k=1024?); each chunk is prefix-encoded, and the encoded data forms a leaf node of the tree. Each inner node contains summary info (row count, data size) for the subtree rooted at it. With this tree-like data structure, the size increase over the current data structure (mainly from the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity for "EXPLAIN" on queries a, b, and c can also be reduced to O(m), and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)). The tree can also integrate AVL and B+ characteristics to support partial load/unload when interacting with the stats client cache. was: As reported, query compilation (for the sample queries shown below), especially deriving estimates and generating parallel scans from guide posts, has become much slower since Phoenix Stats were introduced. a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER BY pk1__c,pk2__c e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only primary key column. Because prefix encoding is used for guide post info, we have to decode and traverse guide posts sequentially, which makes the time complexity of BaseResultIterators.getParallelScan(...) O(n), where n is the total count of guide posts. According to PHOENIX-2417, to reduce the footprint in the client cache and over the wire, prefix encoding is used as both the in-memory and over-the-wire encoding for guide post info. We can use something like a Sum Tree (or even a Binary Indexed Tree) to address both the memory and performance concerns. The guide posts are partitioned into k chunks (k=1024?); each chunk is prefix-encoded, and the encoded data forms a leaf node of the tree. Each inner node contains summary info (row count, data size) for the subtree rooted at it. With this tree-like data structure, the size increase over the current data structure (mainly from the n/k-1 inner nodes) is negligible. The time complexity for queries a, b, and c can be reduced to O(m), where m is the total count of regions; the time complexity for "EXPLAIN" on queries a, b, and c can also be reduced to O(m), and if we support "EXPLAIN (ESTIMATE ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity of finding the start of the target scan ranges can be reduced to O(log(n/k)). The tree can also integrate AVL and B+ characteristics to support partial load/unload when interacting with the stats client cache. > Use Segment tree to organize Guide Post Info > > > Key: PHOENIX-4925 > URL: https://issues.apache.org/jira/browse/PHOENIX-4925 > Project: Phoenix > Issue Type: Improvement >Reporter: Bin Shi >Assignee: Bin Shi >Priority: Major > > As reported, query compilation (for the sample queries shown below), > especially deriving estimates and generating
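The proposed leaf/inner-node layout can be sketched in a few lines of Java. This is an illustrative model, not the actual patch: plain (rows, bytes) pairs stand in for the prefix-encoded guide-post chunks:

```java
// Illustrative sketch of the proposed structure, not the actual patch: each
// leaf summarizes one prefix-encoded chunk of guide posts (row count, data
// size); each inner node stores the aggregate of its subtree. A range estimate
// then costs O(log(number of chunks)) instead of a full sequential decode.
public class GuidePostSegmentTree {

    public static final class Summary {
        public long rows;
        public long bytes;
        public Summary(long rows, long bytes) { this.rows = rows; this.bytes = bytes; }
    }

    private final Summary[] tree; // iterative (bottom-up) segment tree, 1-based
    private final int n;          // number of chunks

    public GuidePostSegmentTree(long[] rowCounts, long[] byteSizes) {
        n = rowCounts.length;
        tree = new Summary[2 * n];
        for (int i = 0; i < n; i++) {
            tree[n + i] = new Summary(rowCounts[i], byteSizes[i]); // leaves
        }
        for (int i = n - 1; i >= 1; i--) { // inner nodes aggregate their children
            tree[i] = new Summary(tree[2 * i].rows + tree[2 * i + 1].rows,
                                  tree[2 * i].bytes + tree[2 * i + 1].bytes);
        }
    }

    // Aggregate summary over the chunk range [lo, hi).
    public Summary query(int lo, int hi) {
        Summary acc = new Summary(0, 0);
        for (lo += n, hi += n; lo < hi; lo >>= 1, hi >>= 1) {
            if ((lo & 1) == 1) { acc.rows += tree[lo].rows; acc.bytes += tree[lo].bytes; lo++; }
            if ((hi & 1) == 1) { hi--; acc.rows += tree[hi].rows; acc.bytes += tree[hi].bytes; }
        }
        return acc;
    }
}
```

This matches the size argument in the description: for n leaves there are only n-1 inner summary nodes, so the overhead beyond the encoded chunks themselves is negligible, while row/byte estimates over any chunk range become logarithmic.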
[jira] [Updated] (PHOENIX-5137) Index Rebuilder scan increases data table region split time
[ https://issues.apache.org/jira/browse/PHOENIX-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kiran Kumar Maturi updated PHOENIX-5137: Description: [~lhofhansl] [~vincentpoon] [~tdsilva] please review. In order to differentiate between the index rebuilder retries (UngroupedAggregateRegionObserver.rebuildIndices()) and the commits that happen in the loop of UngroupedAggregateRegionObserver.doPostScannerOpen(), blockingMemstoreSize was set to -1 for rebuildIndices as part of PHOENIX-4600: {code:java} commitBatchWithRetries(region, mutations, -1);{code} This blocks the region split, because the check for region closing only happens when blockingMemstoreSize > 0: {code:java} for (int i = 0; blockingMemstoreSize > 0 && region.getMemstoreSize() > blockingMemstoreSize && i < 30; i++) { try{ checkForRegionClosing(); {code} The plan is to perform the check for region closing at least once before committing the batch: {code:java} checkForRegionClosing(); for (int i = 0; blockingMemstoreSize > 0 && region.getMemstoreSize() > blockingMemstoreSize && i < 30; i++) { try{ checkForRegionClosing(); {code} Steps to reproduce: 1. Create a table with one index (note the start time) 2. Add 1-2 million rows 3. Wait till the index is active 4. Disable the index with the start time noted in step 1 5. Once the rebuilder starts, split the data table region. Repeat the steps after applying the patch to check the difference. was: [~lhofhansl] [~vincentpoon] [~tdsilva] please review. In order to differentiate between the index rebuilder retries (UngroupedAggregateRegionObserver.rebuildIndices()) and the commits that happen in the loop of UngroupedAggregateRegionObserver.doPostScannerOpen(), blockingMemstoreSize was set to -1 for rebuildIndices as part of PHOENIX-4600: {code:java} commitBatchWithRetries(region, mutations, -1);{code} This blocks the region split, because the check for region closing only happens when blockingMemstoreSize > 0: {code:java} for (int i = 0; blockingMemstoreSize > 0 && region.getMemstoreSize() > blockingMemstoreSize && i < 30; i++) { try{ checkForRegionClosing(); {code} The plan is to perform the check for region closing at least once before committing the batch: {code:java} checkForRegionClosing(); for (int i = 0; blockingMemstoreSize > 0 && region.getMemstoreSize() > blockingMemstoreSize && i < 30; i++) { try{ checkForRegionClosing(); {code} > Index Rebuilder scan increases data table region split time > --- > > Key: PHOENIX-5137 > URL: https://issues.apache.org/jira/browse/PHOENIX-5137 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.1 >Reporter: Kiran Kumar Maturi >Assignee: Kiran Kumar Maturi >Priority: Major > Attachments: PHOENIX-5137-4.14-Hbase-1.3.01.patch, > PHOENIX-5137-4.14-Hbase-1.3.01.patch > > > [~lhofhansl] [~vincentpoon] [~tdsilva] please review > In order to differentiate between the index rebuilder retries > (UngroupedAggregateRegionObserver.rebuildIndices()) and the commits that > happen in the loop of UngroupedAggregateRegionObserver.doPostScannerOpen(), > blockingMemstoreSize was set to -1 for rebuildIndices as part of > PHOENIX-4600: > {code:java} > commitBatchWithRetries(region, mutations, -1);{code} > This blocks the region split, because the check for region closing only > happens when blockingMemstoreSize > 0: > {code:java} > for (int i = 0; blockingMemstoreSize > 0 && region.getMemstoreSize() > > blockingMemstoreSize && i < 30; i++) { > try{ >checkForRegionClosing(); > {code} > The plan is to perform the check for region closing at least once before > committing the batch > {code:java} > checkForRegionClosing(); > for (int i = 0; blockingMemstoreSize > 0 && region.getMemstoreSize() > > blockingMemstoreSize && i < 30; i++) { > try{ >checkForRegionClosing(); > {code} > Steps to reproduce > 1. Create a table with one index (note the start time) > 2. Add 1-2 million rows > 3. Wait till the index is active > 4. Disable the index with the start time noted in step 1 > 5. Once the rebuilder starts, split the data table region > Repeat the steps after applying the patch to check the difference. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
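The intent of the proposed fix can be shown with a small stand-alone sketch. RegionView below is a hypothetical stand-in for the HBase region calls in the real code; this is not the actual patch:

```java
import java.io.IOException;

// Illustrative sketch of the fix's intent, not the actual patch. RegionView is
// a hypothetical stand-in for the HBase region calls used in the real code.
// With blockingMemstoreSize <= 0 (the rebuild path), the waiting loop never
// runs, so the region-closing check inside it is skipped; hoisting one check
// above the loop guarantees it runs at least once per batch, so a pending
// region split is not blocked indefinitely.
interface RegionView {
    boolean isClosing();
    long memstoreSize();
}

public class CommitGuardSketch {

    public static void commitBatch(RegionView region, long blockingMemstoreSize)
            throws IOException {
        // The hoisted check: runs even when blockingMemstoreSize <= 0.
        if (region.isClosing()) {
            throw new IOException("Region is closing, aborting batch commit");
        }
        for (int i = 0; blockingMemstoreSize > 0
                && region.memstoreSize() > blockingMemstoreSize && i < 30; i++) {
            if (region.isClosing()) {
                throw new IOException("Region is closing, aborting batch commit");
            }
            // ... wait for a memstore flush, then re-check (elided) ...
        }
        // ... commit the mutation batch (elided) ...
    }
}
```

With blockingMemstoreSize = -1 (the rebuild path), the original code skipped the loop and therefore never noticed a closing region; the hoisted check makes the batch fail fast instead of delaying the split.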
[jira] [Resolved] (PHOENIX-5144) C++ JDBC Driver
[ https://issues.apache.org/jira/browse/PHOENIX-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser resolved PHOENIX-5144. - Resolution: Later > C++ JDBC Driver > --- > > Key: PHOENIX-5144 > URL: https://issues.apache.org/jira/browse/PHOENIX-5144 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.14.1 >Reporter: yinghua_zh >Priority: Major > > Can you provide a C++ version of the JDBC driver? > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (PHOENIX-5146) Phoenix missing class definition: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/apache/http/Consts
Narendra Kumar created PHOENIX-5146: --- Summary: Phoenix missing class definition: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/apache/http/Consts Key: PHOENIX-5146 URL: https://issues.apache.org/jira/browse/PHOENIX-5146 Project: Phoenix Issue Type: Bug Affects Versions: 5.0.0 Environment: 3-node Kerberized cluster. HBase 2.0.2 Reporter: Narendra Kumar While running a SparkCompatibility check for Phoenix, we hit this issue: 2019-02-15 09:03:38,470|INFO|MainThread|machine.py:169 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|RUNNING: echo " import org.apache.spark.graphx._; import org.apache.phoenix.spark._; val rdd = sc.phoenixTableAsRDD(\"EMAIL_ENRON\", Seq(\"MAIL_FROM\", \"MAIL_TO\"), zkUrl=Some(\"huaycloud012.l42scl.hortonworks.com:2181:/hbase-secure\")); val rawEdges = rdd.map { e => (e(\"MAIL_FROM\").asInstanceOf[VertexId], e(\"MAIL_TO\").asInstanceOf[VertexId])} ; val graph = Graph.fromEdgeTuples(rawEdges, 1.0); val pr = graph.pageRank(0.001); pr.vertices.saveToPhoenix(\"EMAIL_ENRON_PAGERANK\", Seq(\"ID\", \"RANK\"), zkUrl = Some(\"huaycloud012.l42scl.hortonworks.com:2181:/hbase-secure\")); " | spark-shell --master yarn --jars /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.3.1.0.0-75.jar --properties-file /grid/0/log/cluster/run_phoenix_secure_ha_all_1/artifacts/spark_defaults.conf 2>&1 | tee /grid/0/log/cluster/run_phoenix_secure_ha_all_1/artifacts/Spark_clientLogs/phoenix-spark.txt 2019-02-15 09:03:38,488|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SPARK_MAJOR_VERSION is set to 2, using Spark2 2019-02-15 09:03:39,901|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: Class path contains multiple SLF4J bindings.
2019-02-15 09:03:39,902|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-75/phoenix/phoenix-5.0.0.3.1.0.0-75-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] 2019-02-15 09:03:39,902|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-75/spark2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] 2019-02-15 09:03:39,902|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|SLF4J: See [http://www.slf4j.org/codes.html#multiple_bindings] for an explanation. 2019-02-15 09:03:41,400|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|Setting default log level to "WARN". 2019-02-15 09:03:41,400|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 
2019-02-15 09:03:54,837|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/apache/http/Consts 2019-02-15 09:03:54,838|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.phoenix.shaded.org.apache.http.client.utils.URIBuilder.digestURI(URIBuilder.java:181) 2019-02-15 09:03:54,839|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.phoenix.shaded.org.apache.http.client.utils.URIBuilder.<init>(URIBuilder.java:82) 2019-02-15 09:03:54,839|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468) 2019-02-15 09:03:54,839|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1023) 2019-02-15 09:03:54,840|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:252) 2019-02-15 09:03:54,840|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:249) 2019-02-15 09:03:54,840|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:172) 2019-02-15 09:03:54,841|INFO|MainThread|machine.py:184 - run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.getDelegationToken(LoadBalancingKMSClientProvider.java:249) 2019-02-15 09:03:54,841|INFO|MainThread|machine.py:184 -
run()||GUID=1566a829-b1df-4757-8c3d-73a7fa302b84|at org.apache.hadoop.security.t