[jira] [Updated] (HAWQ-1162) Resource manager does not reference dynamic minimum water level of each segment when it times out YARN containers
[ https://issues.apache.org/jira/browse/HAWQ-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Jin updated HAWQ-1162: Assignee: Amy (was: Yi Jin)

> Resource manager does not reference dynamic minimum water level of each
> segment when it times out YARN containers
>
> Key: HAWQ-1162
> URL: https://issues.apache.org/jira/browse/HAWQ-1162
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Resource Manager
> Reporter: Yi Jin
> Assignee: Amy
>
> When the resource manager decides to time out some containers from segments, the
> minimum water level number is passed as a reference to avoid returning too many
> containers from some segments. There is a hard-coded 2:
>
>     timeoutIdleGRMResourceToRBByRatio(i,
>                                       retcontnum,
>                                       ,
>                                       mark->ClusterVCore > 0 ? 2 : 0);

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
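The fixed floor of 2 described in the issue could instead be derived per segment. Below is a minimal sketch of that idea; every name is hypothetical and this is not HAWQ's actual resource manager API:

```c
/* Hypothetical sketch: instead of a fixed floor of 2 containers, derive a
 * per-segment minimum water level and cap how many idle containers the
 * resource manager may return to YARN. All names are illustrative. */
static int min_water_level(int allocated, int in_use)
{
    /* dynamic floor: keep at least what the segment currently uses */
    return in_use > allocated ? allocated : in_use;
}

int containers_to_return(int allocated, int in_use, int requested)
{
    int floor_containers = min_water_level(allocated, in_use);
    int idle_above_floor = allocated - floor_containers;
    if (idle_above_floor < 0)
        idle_above_floor = 0;
    /* never time out more than what sits above the dynamic floor */
    return requested < idle_above_floor ? requested : idle_above_floor;
}
```

With a dynamic floor, a busy segment (high in_use) keeps more containers than the hard-coded 2 would allow, while a mostly idle segment can release more.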
[jira] [Assigned] (HAWQ-1162) Resource manager does not reference dynamic minimum water level of each segment when it times out YARN containers
[ https://issues.apache.org/jira/browse/HAWQ-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Jin reassigned HAWQ-1162: Assignee: Yi Jin (was: Lei Chang)

> Resource manager does not reference dynamic minimum water level of each
> segment when it times out YARN containers
>
> Key: HAWQ-1162
> URL: https://issues.apache.org/jira/browse/HAWQ-1162
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Resource Manager
> Reporter: Yi Jin
> Assignee: Yi Jin
>
> When the resource manager decides to time out some containers from segments, the
> minimum water level number is passed as a reference to avoid returning too many
> containers from some segments. There is a hard-coded 2:
>
>     timeoutIdleGRMResourceToRBByRatio(i,
>                                       retcontnum,
>                                       ,
>                                       mark->ClusterVCore > 0 ? 2 : 0);
[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...
Github user kavinderd commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1002#discussion_r88318669

--- Diff: src/backend/access/transam/varsup.c ---
@@ -474,74 +483,128 @@ ResetExternalObjectId(void)
 /*
  * master_highest_used_oid
- *	Query the database to find the highest used Oid by
+ *	Uses CAQL and SPI to find the highest used Oid among user and catalog tables
+ *
+ *	Uses CAQL to query catalog tables
+ *	Uses SPI to query user tables, because CAQL supports tables from CatCoreRelation array only
  *	1) Find all the relations that has Oids
  *	2) Find max oid from those relations
  */
 Oid
 master_highest_used_oid(void)
 {
+	Oid oidMaxCatalog = InvalidOid;
+	Oid oidMaxUser = InvalidOid;
 	Oid oidMax = InvalidOid;
+	Oid currentOid;
+	Form_pg_class classForm;
+	cqContext *pcqOuterCtx;
+	cqContext *pcqInnerCtx;
+	HeapTuple outerTuple;
+	HeapTuple innerTuple;
+	/* number of user tables having oids */
+	int userTablesNum = 0;
+	int ret;
+
+	pcqOuterCtx = caql_beginscan(NULL, cql("SELECT * FROM pg_class WHERE relhasoids = :1", BoolGetDatum(true)));

-	if (SPI_OK_CONNECT != SPI_connect())
+	outerTuple = caql_getnext(pcqOuterCtx);
+
+	if (!HeapTupleIsValid(outerTuple))
 	{
-		ereport(ERROR, (errcode(ERRCODE_CDB_INTERNAL_ERROR),
-				errmsg("Unable to connect to execute internal query for HCatalog.")));
+		caql_endscan(pcqOuterCtx);
+		elog(DEBUG1, "Unable to get list of tables having oids");
+		return oidMax;
 	}

-	int ret = SPI_execute("SELECT relname FROM pg_class where relhasoids=true", true, 0);
+	/* construct query to get max oid from all tables with oids */
+	StringInfo sqlstrCatalog = makeStringInfo();
+	StringInfo sqlstrUser = makeStringInfo();
+	appendStringInfo(sqlstrUser, "SELECT max(oid) FROM (");
+	while (HeapTupleIsValid(outerTuple))
+	{
+		classForm = (Form_pg_class) GETSTRUCT(outerTuple);

-	int rows = SPI_processed;
+		/* use CAQL for accessing catalog tables */
+		if (classForm->relnamespace == PG_CATALOG_NAMESPACE)
+		{
+			appendStringInfo(sqlstrCatalog,
+					"SELECT oid FROM %s WHERE oid > :1 ORDER BY oid",
+					classForm->relname.data);

-	char *tableNames[rows];
+			pcqInnerCtx = caql_beginscan(NULL,
+					cql1(sqlstrCatalog->data, __FILE__, __LINE__,
+					ObjectIdGetDatum(oidMaxCatalog)));

-	if (rows == 0 || ret <= 0 || NULL == SPI_tuptable)
-	{
-		SPI_finish();
-		return oidMax;
-	}
+			innerTuple = caql_getnext(pcqInnerCtx);

-	TupleDesc tupdesc = SPI_tuptable->tupdesc;
-	SPITupleTable *tuptable = SPI_tuptable;
+			currentOid = InvalidOid;

-	for (int i = 0; i < rows; i++)
-	{
-		HeapTuple tuple = tuptable->vals[i];
-		tableNames[i] = SPI_getvalue(tuple, tupdesc, 1);
-	}
+			while (HeapTupleIsValid(innerTuple))
+			{
+				currentOid = HeapTupleGetOid(innerTuple);
+				innerTuple = caql_getnext(pcqInnerCtx);
--- End diff --

I see, that sucks

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...
Github user sansanichfb commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1002#discussion_r88316276

--- Diff: src/backend/access/transam/varsup.c --- (same master_highest_used_oid hunk as quoted above) --- End diff --

CAQL only supports ordering in ascending order, so this logic walks over the result set up to the last row, which is supposed to be the max oid for the current table.
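Since CAQL only returns rows in ascending order, the maximum is simply the last row fetched. The walk-to-last-row pattern from the quoted loop can be sketched with a plain array standing in for the scan context (illustrative only, not the CAQL API):

```c
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Walk an ascending-ordered "scan" to its last row; the last value seen
 * is the maximum. A plain array stands in for the scan context here. */
Oid last_oid_of_scan(const Oid *rows, size_t nrows)
{
    Oid current = InvalidOid;
    for (size_t i = 0; i < nrows; i++)
        current = rows[i];   /* keep overwriting until the scan is exhausted */
    return current;          /* InvalidOid when the scan was empty */
}
```

This is O(n) in the number of qualifying rows, which is the cost of an API that cannot order descending or seek to the end.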
[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...
Github user kavinderd commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1002#discussion_r88310416

--- Diff: src/backend/access/transam/varsup.c --- (same master_highest_used_oid hunk as quoted above) --- End diff --

If the caql query is ordered by oid value, why do you need to call caql_getnext repeatedly?
[jira] [Updated] (HAWQ-1161) Refactor PXF to use new Hadoop MapReduce APIs
[ https://issues.apache.org/jira/browse/HAWQ-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Goden Yao updated HAWQ-1161: Assignee: Shivram Mani (was: Lei Chang)

> Refactor PXF to use new Hadoop MapReduce APIs
>
> Key: HAWQ-1161
> URL: https://issues.apache.org/jira/browse/HAWQ-1161
> Project: Apache HAWQ
> Issue Type: Improvement
> Components: PXF
> Reporter: Kyle R Dunn
> Assignee: Shivram Mani
> Fix For: backlog
>
> Several classes in PXF make use of the older `org.apache.hadoop.mapred` API
> rather than the new `org.apache.hadoop.mapreduce` one. As a plugin developer,
> this has been the source of a significant headache. Other HAWQ libraries,
> like hawq-hadoop, use the newer `org.apache.hadoop.mapreduce` API, creating
> unnecessary friction between these two things.
[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...
Github user shivzone commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1002#discussion_r88303706

--- Diff: src/backend/access/transam/varsup.c ---
@@ -408,6 +409,9 @@ GetNewExternalObjectId(void)
 	/*
 	 * must perform check on External Oid range on
 	 * initial access of NextExternalOid
+	 *
+	 * It's needed for upgrade scenario from old version
+	 * of HAWQ which doesn't support dedicated oid pool for HCatalog objects
 	 */
 	if (!IsExternalOidInitialized)
 	{
--- End diff --

Can refactor this:

    if (master_highest_used_oid() < FirstExternalObjectId)
    {
        ResetExternalObjectId();
    }
    else
    {
        ereport();
    }
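The refactor suggested in this comment can be sketched as a self-contained snippet. HAWQ's real helpers (master_highest_used_oid, ResetExternalObjectId, ereport) are stubbed below, and the value of FirstExternalObjectId is purely illustrative, not HAWQ's actual constant:

```c
typedef unsigned int Oid;

/* Illustrative stand-ins for HAWQ internals in varsup.c; values and
 * behavior here are assumptions for the sketch, not the real code. */
#define FirstExternalObjectId ((Oid) 4026531840U)  /* hypothetical boundary */

static int g_reset_called = 0;
static int g_error_reported = 0;

static void ResetExternalObjectId_stub(void) { g_reset_called = 1; }
static void report_oid_overlap_stub(void)    { g_error_reported = 1; }

/* The suggested shape: reset the external oid counter only when no used
 * oid has reached the dedicated HCatalog range; otherwise raise an error. */
void check_external_oid_range(Oid highest_used_oid)
{
    if (highest_used_oid < FirstExternalObjectId)
        ResetExternalObjectId_stub();   /* safe: user oids stay below the pool */
    else
        report_oid_overlap_stub();      /* overlap with the external oid pool */
}
```

The branch-on-comparison shape keeps the upgrade-scenario check in one place instead of spreading the reset and the error path through the initialization block.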
[jira] [Created] (HAWQ-1161) Refactor PXF to use new Hadoop MapReduce APIs
Kyle R Dunn created HAWQ-1161:

    Summary: Refactor PXF to use new Hadoop MapReduce APIs
    Key: HAWQ-1161
    URL: https://issues.apache.org/jira/browse/HAWQ-1161
    Project: Apache HAWQ
    Issue Type: Improvement
    Components: PXF
    Reporter: Kyle R Dunn
    Assignee: Lei Chang
    Fix For: backlog

Several classes in PXF make use of the older `org.apache.hadoop.mapred` API rather than the new `org.apache.hadoop.mapreduce` one. As a plugin developer, this has been the source of a significant headache. Other HAWQ libraries, like hawq-hadoop, use the newer `org.apache.hadoop.mapreduce` API, creating unnecessary friction between these two things.
[jira] [Resolved] (HAWQ-1159) 'hawq check' fails to check namenode settings if hawq not installed on that host
[ https://issues.apache.org/jira/browse/HAWQ-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radar Lei resolved HAWQ-1159.

    Resolution: Fixed

> 'hawq check' fails to check namenode settings if hawq not installed on that
> host
>
> Key: HAWQ-1159
> URL: https://issues.apache.org/jira/browse/HAWQ-1159
> Project: Apache HAWQ
> Issue Type: Bug
> Reporter: Radar Lei
> Assignee: Radar Lei
> Fix For: 2.0.1.0-incubating
>
> In some cases the HDFS namenode is not part of the HAWQ cluster's hosts, so
> there is no hawq binary installed on the namenode. This causes 'hawq check'
> to fail to get namenode settings. We should not error out but skip the
> namenode check if it's not part of the hawq cluster.
> Failed command:
> hawq check -f hostfile --hadoop /usr/hdp/current/hadoop-client/
> BTW, 'hawq check --help' will hang forever; this should be fixed.
[GitHub] incubator-hawq pull request #1015: HAWQ-1159. Skip namenode check while name...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-hawq/pull/1015
[jira] [Commented] (HAWQ-870) Allocate target's tuple table slot in PortalHeapMemory during split partition
[ https://issues.apache.org/jira/browse/HAWQ-870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670112#comment-15670112 ] Hongxu Ma commented on HAWQ-870:

Test OK. But there is a small change needed in partition.sql, in the "appendonly=false" clauses of the PARTITION statement:

    +PARTITION BY RANGE(log_id)
    +(
    +  START (1::int) END (100::int) EVERY (5) WITH (appendonly=false),
    +  PARTITION "Old" START (101::int) END (201::int) WITH (appendonly=false),
    +  DEFAULT PARTITION other_log_ids WITH (statement)
    +);

Since HAWQ does not support "appendonly=false" tables, the appendonly clauses should be removed. The rest of the test makes sense.

> Allocate target's tuple table slot in PortalHeapMemory during split partition
>
> Key: HAWQ-870
> URL: https://issues.apache.org/jira/browse/HAWQ-870
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Query Execution
> Reporter: Venkatesh
> Assignee: Hongxu Ma
> Fix For: backlog
>
> This is a nice fix from the QP team on GPDB. Please port this fix into HAWQ.
> The GPDB commit:
> https://github.com/greenplum-db/gpdb/commit/c0e1f00c2532d1e2ef8d3b409dc8fee901a7cfe2
> PR: https://github.com/greenplum-db/gpdb/pull/866
[GitHub] incubator-hawq issue #1015: HAWQ-1159. Skip namenode check while namenode no...
Github user wengyanqing commented on the issue: https://github.com/apache/incubator-hawq/pull/1015

+1
[GitHub] incubator-hawq issue #1016: HAWQ-1160. Hawq checkperf does not handle hostfi...
Github user radarwave commented on the issue: https://github.com/apache/incubator-hawq/pull/1016

+1
[GitHub] incubator-hawq issue #1015: HAWQ-1159. Skip namenode check while namenode no...
Github user huor commented on the issue: https://github.com/apache/incubator-hawq/pull/1015

+1