[jira] [Updated] (HAWQ-1162) Resource manager does not reference dynamic minimum water level of each segment when it times out YARN containers

2016-11-16 Thread Yi Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Jin updated HAWQ-1162:
-
Assignee: Amy  (was: Yi Jin)

> Resource manager does not reference dynamic minimum water level of each 
> segment when it times out YARN containers
> -
>
> Key: HAWQ-1162
> URL: https://issues.apache.org/jira/browse/HAWQ-1162
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Yi Jin
>Assignee: Amy
>
> When the resource manager decides to time out some containers from segments, the 
> minimum water level number is passed as a reference to avoid returning too many 
> containers from some segments. Currently there is a hard-coded 2:
> timeoutIdleGRMResourceToRBByRatio(i,
>                                   retcontnum,
>                                   ,
>                                   mark->ClusterVCore > 0 ? 2 : 0);
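What the ticket asks for, roughly, is to replace the hard-coded 2 with each segment's dynamically computed minimum water level. Below is a minimal self-contained sketch of that idea; the struct and field names are hypothetical, not the actual HAWQ resource-manager types:

```c
#include <stddef.h>

/* Hypothetical per-segment bookkeeping; field names are illustrative,
 * not the actual HAWQ resource-manager structs. */
typedef struct SegmentResource {
    int grm_container_count;  /* YARN containers currently held */
    int min_water_level;      /* dynamic floor computed per segment */
} SegmentResource;

/* Instead of a hard-coded floor of 2, consult the segment's own dynamic
 * minimum water level when deciding how many idle containers may be
 * returned to the resource broker on timeout. */
static int containers_returnable(const SegmentResource *seg)
{
    int keep = seg->min_water_level > 0 ? seg->min_water_level : 0;
    int excess = seg->grm_container_count - keep;
    return excess > 0 ? excess : 0;
}
```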



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1162) Resource manager does not reference dynamic minimum water level of each segment when it times out YARN containers

2016-11-16 Thread Yi Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Jin reassigned HAWQ-1162:


Assignee: Yi Jin  (was: Lei Chang)

> Resource manager does not reference dynamic minimum water level of each 
> segment when it times out YARN containers
> -
>
> Key: HAWQ-1162
> URL: https://issues.apache.org/jira/browse/HAWQ-1162
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Yi Jin
>Assignee: Yi Jin
>
> When the resource manager decides to time out some containers from segments, the 
> minimum water level number is passed as a reference to avoid returning too many 
> containers from some segments. Currently there is a hard-coded 2:
> timeoutIdleGRMResourceToRBByRatio(i,
>                                   retcontnum,
>                                   ,
>                                   mark->ClusterVCore > 0 ? 2 : 0);





[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...

2016-11-16 Thread kavinderd
Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1002#discussion_r88318669
  
--- Diff: src/backend/access/transam/varsup.c ---
@@ -474,74 +483,128 @@ ResetExternalObjectId(void)
 
 /*
  * master_highest_used_oid
- * Query the database to find the highest used Oid by
+ * Uses CAQL and SPI to find the highest used Oid among user and catalog tables
+ *
+ * Uses CAQL to query catalog tables
+ * Uses SPI to query user tables, because CAQL supports tables from CatCoreRelation array only
  * 1) Find all the relations that has Oids
  * 2) Find max oid from those relations
  */
 Oid
 master_highest_used_oid(void)
 {
+	Oid oidMaxCatalog = InvalidOid;
+	Oid oidMaxUser = InvalidOid;
 	Oid oidMax = InvalidOid;
+	Oid currentOid;
+	Form_pg_class classForm;
+	cqContext *pcqOuterCtx;
+	cqContext *pcqInnerCtx;
+	HeapTuple outerTuple;
+	HeapTuple innerTuple;
+	/* number of user tables having oids */
+	int userTablesNum = 0;
+	int ret;
+
+	pcqOuterCtx = caql_beginscan(NULL, cql("SELECT * FROM pg_class WHERE relhasoids = :1", BoolGetDatum(true)));
 
-	if (SPI_OK_CONNECT != SPI_connect())
+	outerTuple = caql_getnext(pcqOuterCtx);
+
+	if (!HeapTupleIsValid(outerTuple))
 	{
-		ereport(ERROR, (errcode(ERRCODE_CDB_INTERNAL_ERROR),
-				errmsg("Unable to connect to execute internal query for HCatalog.")));
+		caql_endscan(pcqOuterCtx);
+		elog(DEBUG1, "Unable to get list of tables having oids");
+		return oidMax;
 	}
 
-	int ret = SPI_execute("SELECT relname FROM pg_class where relhasoids=true", true, 0);
+	/* construct query to get max oid from all tables with oids */
+	StringInfo sqlstrCatalog = makeStringInfo();
+	StringInfo sqlstrUser = makeStringInfo();
+	appendStringInfo(sqlstrUser, "SELECT max(oid) FROM (");
+	while (HeapTupleIsValid(outerTuple))
+	{
+		classForm = (Form_pg_class) GETSTRUCT(outerTuple);
 
-	int rows = SPI_processed;
+		/* use CAQL for accessing catalog tables */
+		if (classForm->relnamespace == PG_CATALOG_NAMESPACE)
+		{
+			appendStringInfo(sqlstrCatalog,
+					"SELECT oid FROM %s WHERE oid > :1 ORDER BY oid",
+					classForm->relname.data);
 
-	char *tableNames[rows];
+			pcqInnerCtx = caql_beginscan(NULL,
+					cql1(sqlstrCatalog->data, __FILE__, __LINE__,
+							ObjectIdGetDatum(oidMaxCatalog)));
 
-	if (rows == 0 || ret <= 0 || NULL == SPI_tuptable)
-	{
-		SPI_finish();
-		return oidMax;
-	}
+			innerTuple = caql_getnext(pcqInnerCtx);
 
-	TupleDesc tupdesc = SPI_tuptable->tupdesc;
-	SPITupleTable *tuptable = SPI_tuptable;
+			currentOid = InvalidOid;
 
-	for (int i = 0; i < rows; i++)
-	{
-		HeapTuple tuple = tuptable->vals[i];
-		tableNames[i] = SPI_getvalue(tuple, tupdesc, 1);
-	}
+			while (HeapTupleIsValid(innerTuple))
+			{
+				currentOid = HeapTupleGetOid(innerTuple);
+				innerTuple = caql_getnext(pcqInnerCtx);
--- End diff --

I see, that sucks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...

2016-11-16 Thread sansanichfb
Github user sansanichfb commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1002#discussion_r88316276
  

CAQL only supports ordering in ascending order, so this logic walks over the 
result set up to the last row, which is supposed to be the max oid for the 
current table.
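The walk-to-last-row pattern described here can be sketched in isolation. This is a self-contained stand-in (a plain array iterator in place of caql_getnext() and HeapTuples), just to show why an ascending-only ORDER BY forces a full walk to reach the maximum:

```c
#include <stddef.h>

/* Stand-in for an ascending-ordered CAQL scan; the real code iterates
 * HeapTuples via caql_getnext() until no valid tuple remains. */
typedef struct OidScan {
    const unsigned *rows;  /* oids in ascending order */
    size_t n;
    size_t pos;
} OidScan;

static const unsigned *scan_next(OidScan *scan)
{
    return scan->pos < scan->n ? &scan->rows[scan->pos++] : NULL;
}

/* Because the scan only supports ascending order, the max oid is the
 * last row, so every row must be visited to reach it. */
static unsigned max_oid(OidScan *scan)
{
    unsigned current = 0;  /* stand-in for InvalidOid */
    const unsigned *row;
    while ((row = scan_next(scan)) != NULL)
        current = *row;
    return current;
}
```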




[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...

2016-11-16 Thread kavinderd
Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1002#discussion_r88310416
  

If the CAQL query is ordered by oid value, why do you need to call caql_getnext 
repeatedly?




[jira] [Updated] (HAWQ-1161) Refactor PXF to use new Hadoop MapReduce APIs

2016-11-16 Thread Goden Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Goden Yao updated HAWQ-1161:

Assignee: Shivram Mani  (was: Lei Chang)

> Refactor PXF to use new Hadoop MapReduce APIs
> -
>
> Key: HAWQ-1161
> URL: https://issues.apache.org/jira/browse/HAWQ-1161
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Kyle R Dunn
>Assignee: Shivram Mani
> Fix For: backlog
>
>
> Several classes in PXF use the older `org.apache.hadoop.mapred` API 
> rather than the new `org.apache.hadoop.mapreduce` one. As a plugin developer, 
> this has been the source of a significant headache. Other HAWQ libraries, 
> like hawq-hadoop, use the newer `org.apache.hadoop.mapreduce` API, creating 
> unnecessary friction between the two.





[GitHub] incubator-hawq pull request #1002: HAWQ-1130. Make HCatalog integration work...

2016-11-16 Thread shivzone
Github user shivzone commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1002#discussion_r88303706
  
--- Diff: src/backend/access/transam/varsup.c ---
@@ -408,6 +409,9 @@ GetNewExternalObjectId(void)
/*
 * must perform check on External Oid range on
 * initial access of NextExternalOid
+*
+* It's needed for upgrade scenario from old version
+* of HAWQ which doesn't support dedicated oid pool for HCatalog objects
 */
if (!IsExternalOidInitialized)
{
--- End diff --

Can refactor this as:

if (master_highest_used_oid() < FirstExternalObjectId)
{
    ResetExternalObjectId();
}
else
{
    ereport(...);
}
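A self-contained sketch of that suggested control flow, with a hypothetical constant standing in for FirstExternalObjectId and a return code standing in for the ResetExternalObjectId()/ereport() branch (the real HAWQ symbols and values differ):

```c
/* Hypothetical boundary of the dedicated HCatalog oid pool; the real
 * value is FirstExternalObjectId in HAWQ's oid layout. */
#define FIRST_EXTERNAL_OBJECT_ID 4000000000u

/* Returns 1 when it is safe to reset the external oid counter (the
 * upgrade scenario from a version without the dedicated pool), 0 when
 * the highest used oid already intrudes into the external range and
 * the caller should report an error instead. */
static int can_reset_external_oid(unsigned highest_used_oid)
{
    return highest_used_oid < FIRST_EXTERNAL_OBJECT_ID;
}
```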




[jira] [Created] (HAWQ-1161) Refactor PXF to use new Hadoop MapReduce APIs

2016-11-16 Thread Kyle R Dunn (JIRA)
Kyle R Dunn created HAWQ-1161:
-

 Summary: Refactor PXF to use new Hadoop MapReduce APIs
 Key: HAWQ-1161
 URL: https://issues.apache.org/jira/browse/HAWQ-1161
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: PXF
Reporter: Kyle R Dunn
Assignee: Lei Chang
 Fix For: backlog


Several classes in PXF use the older `org.apache.hadoop.mapred` API 
rather than the new `org.apache.hadoop.mapreduce` one. As a plugin developer, 
this has been the source of a significant headache. Other HAWQ libraries, like 
hawq-hadoop, use the newer `org.apache.hadoop.mapreduce` API, creating 
unnecessary friction between the two.





[jira] [Resolved] (HAWQ-1159) 'hawq check' fails to check namenode settings if hawq not installed on that host

2016-11-16 Thread Radar Lei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radar Lei resolved HAWQ-1159.
-
Resolution: Fixed

> 'hawq check' fails to check namenode settings if hawq not installed on that 
> host
> 
>
> Key: HAWQ-1159
> URL: https://issues.apache.org/jira/browse/HAWQ-1159
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Radar Lei
>Assignee: Radar Lei
> Fix For: 2.0.1.0-incubating
>
>
> In some cases, the HDFS namenode is not part of the HAWQ cluster's hosts, so 
> there is no hawq binary installed on the namenode. This causes 'hawq check' 
> to fail to get namenode settings. We should not error out, but should skip the 
> namenode check if the namenode is not part of the hawq cluster.
> Failed command:
> hawq check -f hostfile --hadoop /usr/hdp/current/hadoop-client/
> BTW, 'hawq check --help' will hang forever; this should also be fixed.





[GitHub] incubator-hawq pull request #1015: HAWQ-1159. Skip namenode check while name...

2016-11-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/1015




[jira] [Commented] (HAWQ-870) Allocate target's tuple table slot in PortalHeapMemory during split partition

2016-11-16 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670112#comment-15670112
 ] 

Hongxu Ma commented on HAWQ-870:


Test OK.

But there is a small change needed in partition.sql, in the "appendonly=false" 
clause of the PARTITION statement:
+PARTITION BY RANGE(log_id)
+(
+   START (1::int) END (100::int) EVERY (5) WITH (appendonly=false),
+   PARTITION "Old" START (101::int) END (201::int) WITH (appendonly=false),
+   DEFAULT PARTITION other_log_ids  WITH (statement)
+);

Since HAWQ does not support "appendonly=false" tables, the appendonly clause 
should be removed.
Then the rest of the test makes sense.


> Allocate target's tuple table slot in PortalHeapMemory during split partition
> -
>
> Key: HAWQ-870
> URL: https://issues.apache.org/jira/browse/HAWQ-870
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Venkatesh
>Assignee: Hongxu Ma
> Fix For: backlog
>
>
> This is a nice fix from the QP team on GPDB. Please port this fix into HAWQ. The
> GPDB Commit: 
> https://github.com/greenplum-db/gpdb/commit/c0e1f00c2532d1e2ef8d3b409dc8fee901a7cfe2
> PR: https://github.com/greenplum-db/gpdb/pull/866





[GitHub] incubator-hawq issue #1015: HAWQ-1159. Skip namenode check while namenode no...

2016-11-16 Thread wengyanqing
Github user wengyanqing commented on the issue:

https://github.com/apache/incubator-hawq/pull/1015
  
+1




[GitHub] incubator-hawq issue #1016: HAWQ-1160. Hawq checkperf does not handle hostfi...

2016-11-16 Thread radarwave
Github user radarwave commented on the issue:

https://github.com/apache/incubator-hawq/pull/1016
  
+1




[GitHub] incubator-hawq issue #1015: HAWQ-1159. Skip namenode check while namenode no...

2016-11-16 Thread huor
Github user huor commented on the issue:

https://github.com/apache/incubator-hawq/pull/1015
  
+1

