[GitHub] incubator-hawq pull request #963: HAWQ-1076. Fixed USAGE privilege bug on ne...

2016-10-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/963




[GitHub] incubator-hawq issue #963: HAWQ-1076. nextval should not error out for permi...

2016-10-13 Thread liming01
Github user liming01 commented on the issue:

https://github.com/apache/incubator-hawq/pull/963
  
+1




[GitHub] incubator-hawq pull request #963: HAWQ-1076. nextval should not error out fo...

2016-10-13 Thread stanlyxiang
GitHub user stanlyxiang opened a pull request:

https://github.com/apache/incubator-hawq/pull/963

HAWQ-1076. nextval should not error out for permission denied if have USAGE privilege whether optimizer on or off

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/stanlyxiang/incubator-hawq HAWQ-1076

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/963.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #963


commit b3e9bd392428d477284c790f16fabc053d8b02a8
Author: stanlyxiang 
Date:   2016-10-14T03:59:19Z

HAWQ-1076. nextval should not error out for permission denied if have USAGE 
privilege whether optimizer on or off






[GitHub] incubator-hawq pull request #962: HAWQ-1103. Send constant datatype and leng...

2016-10-13 Thread sansanichfb
GitHub user sansanichfb opened a pull request:

https://github.com/apache/incubator-hawq/pull/962

HAWQ-1103. Send constant datatype and length in filter to PXF.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sansanichfb/incubator-hawq HAWQ-1103

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/962.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #962


commit b318fa0e7d038084ded2bc99c6e631aa26700b21
Author: Oleksandr Diachenko 
Date:   2016-10-13T23:00:03Z

HAWQ-1103. Draft implementation without taking care of different charsets.

commit d9f77da1db31e260092fbce603686f92540848f3
Author: Oleksandr Diachenko 
Date:   2016-10-13T23:45:33Z

HAWQ-1103. Updated unit-tests.






[jira] [Updated] (HAWQ-1049) Enhance PXF Service to support AND,OR,NOT logical operators in Predicate Pushdown

2016-10-13 Thread Kavinder Dhaliwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kavinder Dhaliwal updated HAWQ-1049:

Fix Version/s: (was: backlog)

> Enhance PXF Service to support AND,OR,NOT logical operators in Predicate 
> Pushdown
> -
>
> Key: HAWQ-1049
> URL: https://issues.apache.org/jira/browse/HAWQ-1049
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: PXF
>Reporter: Shivram Mani
>Assignee: Kavinder Dhaliwal
>
> Support additional logical operators OR and NOT along with the currently 
> supported AND.
> Update the PXF ORC Accessor to support these operators as well.
> Currently PXF only receives filters as a list of ANDed expressions. In 
> anticipation of HAWQ-1048, PXF needs to support parsing a filter string that 
> includes AND, OR, and NOT operators. The proposal is to introduce a special 
> character 'l' for logical operators, with the following operator codes:
> AND='0'
> OR='1'
> NOT='2'
> Thus the filter string
> 'a1c2o0a2c5o2l1' would translate to Column 1 < 1 OR Column 2 > 5
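To make the proposed encoding concrete, here is a toy decoder (illustrative Python only; PXF's actual FilterParser is Java and handles multi-digit operands, and the comparison-operator codes below are assumptions rather than the real PXF mapping):

{code}
# Toy decoder for the RPN-style filter strings described above.
# LOGICAL follows the codes proposed in HAWQ-1049; COMPARISON is hypothetical.
LOGICAL = {'0': 'AND', '1': 'OR', '2': 'NOT'}
COMPARISON = {'0': '<', '2': '>', '5': '='}   # assumed codes, for illustration

def decode(filter_string):
    """Decode e.g. 'a1c2o0a2c5o2l1' into an infix expression string."""
    stack, i = [], 0
    while i < len(filter_string):
        marker, code = filter_string[i], filter_string[i + 1]
        i += 2
        if marker == 'a':                      # attribute (column) index
            stack.append('Column %s' % code)
        elif marker == 'c':                    # constant value
            stack.append(code)
        elif marker == 'o':                    # comparison: pop two operands
            rhs, lhs = stack.pop(), stack.pop()
            stack.append('%s %s %s' % (lhs, COMPARISON[code], rhs))
        elif marker == 'l':                    # logical operator; NOT is unary
            if LOGICAL[code] == 'NOT':
                stack.append('NOT (%s)' % stack.pop())
            else:
                rhs, lhs = stack.pop(), stack.pop()
                stack.append('(%s %s %s)' % (lhs, LOGICAL[code], rhs))
    return stack.pop()

print(decode('a1c2o0a2c5o2l1'))   # -> (Column 1 < 2 OR Column 2 > 5)
{code}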





[jira] [Updated] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko updated HAWQ-1103:
--
Description: 
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...

The idea is to send the filter in the following format:

a<id>c<datatype>s<length - bytes to read>d<constant data>o<opr>
or 
c<datatype>s<length - bytes to read>d<constant data>a<id>o<opr>


This approach takes care of escaping symbols.

  was:
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...

The idea is to send the filter in the following format:

a<id>c<datatype>s<length - bytes to read>d<constant data>o<opr>
or 
c<datatype>s<length - bytes to read>d<constant data>a<id>o<opr>


> Send constant datatype and length in filter to PXF
> --
>
> Key: HAWQ-1103
> URL: https://issues.apache.org/jira/browse/HAWQ-1103
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
> Fix For: backlog
>
>
> As of now, HAWQ sends the filter string without datatypes and lengths:
> a<id>c<constant>o<opr>...
> The idea is to send the filter in the following format:
> a<id>c<datatype>s<length - bytes to read>d<constant data>o<opr>
> or 
> c<datatype>s<length - bytes to read>d<constant data>a<id>o<opr>
> This approach takes care of escaping symbols.
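The s<length>/d<data> pairing is what removes the need for escaping: the reader consumes exactly the advertised number of bytes, so the constant may contain any characters. A minimal sketch of the encoder side (illustrative Python; the datatype code below is made up, PXF would use its own type codes):

{code}
# Sketch of the length-prefixed constant encoding proposed above.
def encode_constant(attr_id, datatype_code, value, opr_code):
    data = str(value)
    # s<N> tells the reader to consume exactly N bytes of d<data>,
    # so no escaping of special characters inside the data is needed.
    return 'a%dc%ds%dd%so%d' % (attr_id, datatype_code, len(data), data, opr_code)

print(encode_constant(1, 23, "O'Reilly", 5))   # -> a1c23s8dO'Reillyo5
{code}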





[jira] [Updated] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko updated HAWQ-1103:
--
Description: 
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...

The idea is to send the filter in the following format:

a<id>c<datatype>s<length - bytes to read>d<constant data>o<opr>
or 
c<datatype>s<length - bytes to read>d<constant data>a<id>o<opr>

  was:
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...

The idea is to send the filter in the following format:

a<id>c<datatype>s<length - bytes to read>d<constant data>.


> Send constant datatype and length in filter to PXF
> --
>
> Key: HAWQ-1103
> URL: https://issues.apache.org/jira/browse/HAWQ-1103
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
> Fix For: backlog
>
>
> As of now, HAWQ sends the filter string without datatypes and lengths:
> a<id>c<constant>o<opr>...
> The idea is to send the filter in the following format:
> a<id>c<datatype>s<length - bytes to read>d<constant data>o<opr>
> or 
> c<datatype>s<length - bytes to read>d<constant data>a<id>o<opr>





[GitHub] incubator-hawq pull request #906: HAWQ-964. Support for OR and NOT Logical O...

2016-10-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/906




[jira] [Assigned] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko reassigned HAWQ-1103:
-

Assignee: Oleksandr Diachenko  (was: Lei Chang)

> Send constant datatype and length in filter to PXF
> --
>
> Key: HAWQ-1103
> URL: https://issues.apache.org/jira/browse/HAWQ-1103
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
> Fix For: backlog
>
>






[jira] [Updated] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko updated HAWQ-1103:
--
Description: 
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...

The idea is to send the filter in the following format:

a<id>c<datatype>s<length - bytes to read>d<constant data>.

  was:
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...


> Send constant datatype and length in filter to PXF
> --
>
> Key: HAWQ-1103
> URL: https://issues.apache.org/jira/browse/HAWQ-1103
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
> Fix For: backlog
>
>
> As of now, HAWQ sends the filter string without datatypes and lengths:
> a<id>c<constant>o<opr>...
> The idea is to send the filter in the following format:
> a<id>c<datatype>s<length - bytes to read>d<constant data>.





[jira] [Updated] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko updated HAWQ-1103:
--
Description: 
As of now, HAWQ sends the filter string without datatypes and lengths:

a<id>c<constant>o<opr>...

> Send constant datatype and length in filter to PXF
> --
>
> Key: HAWQ-1103
> URL: https://issues.apache.org/jira/browse/HAWQ-1103
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
> Fix For: backlog
>
>
> As of now, HAWQ sends the filter string without datatypes and lengths:
> a<id>c<constant>o<opr>...





[jira] [Updated] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko updated HAWQ-1103:
--
Fix Version/s: backlog

> Send constant datatype and length in filter to PXF
> --
>
> Key: HAWQ-1103
> URL: https://issues.apache.org/jira/browse/HAWQ-1103
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Lei Chang
> Fix For: backlog
>
>






[jira] [Created] (HAWQ-1103) Send constant datatype and length in filter to PXF

2016-10-13 Thread Oleksandr Diachenko (JIRA)
Oleksandr Diachenko created HAWQ-1103:
-

 Summary: Send constant datatype and length in filter to PXF
 Key: HAWQ-1103
 URL: https://issues.apache.org/jira/browse/HAWQ-1103
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: PXF
Reporter: Oleksandr Diachenko
Assignee: Lei Chang








[GitHub] incubator-hawq pull request #959: HAWQ-1084. Fixed memory allocation for tab...

2016-10-13 Thread sansanichfb
Github user sansanichfb closed the pull request at:

https://github.com/apache/incubator-hawq/pull/959




[jira] [Resolved] (HAWQ-1084) Sometimes psql crashes with "out of memory" on \d hcatalog.* command

2016-10-13 Thread Oleksandr Diachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Diachenko resolved HAWQ-1084.
---
Resolution: Fixed

> Sometimes psql crashes with "out of memory" on \d hcatalog.* command
> 
>
> Key: HAWQ-1084
> URL: https://issues.apache.org/jira/browse/HAWQ-1084
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
> Fix For: backlog
>
>
> Normally it returns definition of all Hive tables:
> {code}
> PXF Hive Table "default.hive_abc"
>  Column |  Type  
> +
>  t0 | text
>  num1   | int4
>  d1 | float8
>  t1 | text
> PXF Hive Table "default.hive_abc222"
>  Column |  Type  
> +
>  d1 | float8
>  t0 | text
>  num1   | int4
>  t1 | text
> PXF Hive Table "default.hive_avro_table"
>  Column |  Type  
> +
>  t0 | text
>  d1 | float8
>  num1   | int4
>  t1 | text
> PXF Hive Table "default.hive_binary"
>  Column | Type  
> +---
>  b1 | bytea
> PXF Hive Table "default.hive_collections_table"
>  Column |  Type  
> +
>  ut1| text
>  sr1| text
>  m1 | text
>  a1 | text
>  s1 | text
>  f1 | float4
> PXF Hive Table "default.hive_many_partitioned_table"
>  Column |   Type
> +---
>  s1 | text
>  s2 | text
>  n1 | int4
>  d1 | float8
>  dc1| numeric
>  tm | timestamp
>  f  | float4
>  bg | int8
>  b  | bool
>  tn | int2
>  sml| int2
>  dt | date
>  vc1| varchar
>  c1 | bpchar
>  bin| bytea
> PXF Hive Table "default.hive_orc_all_types"
>  Column |   Type
> +---
>  bin| bytea
>  c1 | bpchar
>  vc1| varchar
>  dt | date
>  sml| int2
>  tn | int2
>  b  | bool
>  bg | int8
>  f  | float4
>  tm | timestamp
>  dc1| numeric
>  d1 | float8
>  n1 | int4
>  s2 | text
>  s1 | text
> PXF Hive Table "default.hive_orc_snappy"
>  Column |  Type  
> +
>  t0 | text
>  t1 | text
>  num1   | int4
>  d1 | float8
> PXF Hive Table "default.hive_orc_table"
>  Column |  Type  
> +
>  d1 | float8
>  num1   | int4
>  t1 | text
>  t0 | text
> PXF Hive Table "default.hive_orc_zlib"
>  Column |  Type  
> +
>  t0 | text
>  t1 | text
>  num1   | int4
>  d1 | float8
> PXF Hive Table "default.hive_parquet_table"
>  Column |  Type  
> +
>  d1 | float8
>  num1   | int4
>  t1 | text
>  t0 | text
> PXF Hive Table "default.hive_partitioned_clustered_sorted_table"
>  Column |  Type  
> +
>  fmt| text
>  d1 | float8
>  num1   | int4
>  t1 | text
>  t0 | text
> PXF Hive Table "default.hive_partitioned_clustered_table"
>  Column |  Type  
> +
>  fmt| text
>  d1 | float8
>  num1   | int4
>  t1 | text
>  t0 | text
> PXF Hive Table "default.hive_partitioned_skewed_stored_table"
>  Column |  Type  
> +
>  fmt| text
>  d1 | float8
>  num1   | int4
>  t1 | text
>  t0 | text
> PXF Hive Table "default.hive_partitioned_skewed_table"
>  Column |  Type  
> +
>  fmt| text
>  t0 | text
>  d1 | float8
>  num1   | int4
>  t1 | text
> PXF Hive Table "default.hive_partitioned_table"
>  Column |  Type  
> +
>  t0 | text
>  t1 | text
>  num1   | int4
>  d1 | float8
>  fmt| text
> PXF Hive Table "default.hive_rc_table"
>  Column |  Type  
> +
>  t0 | text
>  d1 | float8
>  num1   | int4
>  t1 | text
> PXF Hive Table "default.hive_rc_table_no_serde"
>  Column |  Type  
> +
>  d1 | float8
>  num1   | int4
>  t1 | text
>  t0 | text
> PXF Hive Table "default.hive_sequence_table"
>  Column |  Type  
> +
>  t0 | text
>  t1 | text
>  num1   | int4
>  d1 | float8
> PXF Hive Table "default.hive_small_data"
>  Column |  Type  
> +
>  d1 | float8
>  n1 | int4
>  s2 | text
>  s1 | text
> PXF Hive Table "default.hive_small_data_no_data_file"
>  Column |  Type  
> +
>  d1 | float8
>  n1 | int4
>  s2 | text
>  s1 | text
> PXF Hive Table "default.hive_table"
>  Column |  Type  
> +
>  s1 | text
>  n1 | int4
>  d1 | float8
>  bg | int8
>  b  | bool
> PXF Hive Table "default.hive_types"
>  Column |   Type
> +---
>  bin| bytea
>  c1 | bpchar
>  vc1| varchar
>  dt | date
>  sml| int2
>  tn | int2
>  b  | bool
>  bg | int8
>  f  | float4
>  tm | timestamp
>  dc1   

[GitHub] incubator-hawq issue #940: HAWQ 1078. Implement hawqsync-falcon DR utility.

2016-10-13 Thread kdunn-pivotal
Github user kdunn-pivotal commented on the issue:

https://github.com/apache/incubator-hawq/pull/940
  
Here is the step-by-step process - it may have some gaps but this is likely 
90% of the steps:

# HAWQSYNC initial setup runbook:

1. Ensure network connectivity between source and DR sites 

| Port  | Function | Servers |
|-------|----------|---------|
| 11000 | Oozie    | From Falcon server in each env to Oozie server in other env |
| 15000 | Falcon   | From HAWQ master to Falcon server in other env |
| 50010 | Datanode | From Falcon server & datanodes in each env to datanodes in other env |
| 50070 | Namenode | From Falcon server to namenodes (primary and standby) in other env |
| 8020  | Namenode | From datanodes to namenodes (primary and standby) in other env |
| 8050  | YARN RM  | From Falcon server in each env to YARN ResourceManager server in other env |

2. Install Falcon and Oozie on source and DR HAWQ clusters

3. Make prerequisite directories on both clusters (source, DR):

```
$ sudo su falcon -l -c 'hdfs dfs -mkdir /tmp/{staging,working}'
$ sudo su falcon -l -c 'hdfs dfs -chmod 777 /tmp/staging'
$ sudo su hdfs -l -c 'hdfs dfs -mkdir /apps/data-mirroring/workflows/lib'
$ sudo su hdfs -l -c 'hdfs dfs -chmod -R 777 /apps/data-mirroring'
$ sudo su hdfs -l -c 'hdfs dfs -mkdir /user/falcon && hdfs dfs -chown falcon:falcon /user/falcon'
$ sudo su hdfs -l -c 'hdfs dfs -mkdir /user/gpadmin && hdfs dfs -chown gpadmin:gpadmin /user/gpadmin/'
```

4. Setup cluster entities for source and DR clusters:

```
gpadmin@source $ curl -H "Content-Type:text/xml" -X POST http://<falcon-server>:15000/api/entities/submit/cluster?user.name=falcon -d '
<cluster ...>
    <interfaces>
        <interface type="readonly" endpoint="hftp://sandbox.hortonworks.com:50070" version="2.2.0"/>
        ...
        <interface type="workflow" endpoint="http://sandbox.hortonworks.com:11000/oozie/" version="4.0.0"/>
        ...
    </interfaces>
    ...
</cluster>
'
```

```
gpadmin@dr $ curl -H "Content-Type:text/xml" -X POST http://<falcon-server>:15000/api/entities/submit/cluster?user.name=falcon -d '
<cluster ...>
    <interfaces>
        <interface type="readonly" endpoint="hftp://sandbox2.hortonworks.com:50070" version="2.2.0"/>
        ...
        <interface type="workflow" endpoint="http://sandbox2.hortonworks.com:11000/oozie/" version="4.0.0"/>
        ...
    </interfaces>
    ...
</cluster>
'
```
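The same submissions can be scripted instead of using curl; a minimal sketch with Python's requests library (host name and entity file are placeholders; the REST endpoint is the one the curl commands above call):

```
# Sketch: submit a Falcon cluster entity over REST, as the curl commands above do.
import requests

falcon_uri = "http://falcon.example.com:15000"          # placeholder host
with open("source-cluster.xml") as f:                   # placeholder entity file
    entity_xml = f.read()

resp = requests.post(
    falcon_uri + "/api/entities/submit/cluster",
    params={"user.name": "falcon"},
    headers={"Content-Type": "text/xml"},
    data=entity_xml,
)
print(resp.status_code, resp.text)
```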

5. Stage distcp-based replication workflow on both source, DR HDFS
```
gpadmin@{source,dr} $ hdfs dfs -put - /apps/data-mirroring/workflows/hdfs-replication-workflow-v2.xml <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hdfs-replication-workflow">
    <start to="distcpAction"/>
    <action name="distcpAction">
        <distcp xmlns="uri:oozie:distcp-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.priority</name>
                    <value>${jobPriority}</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <arg>-update</arg>
            <arg>-delete</arg>
            <arg>-m</arg>
            <arg>${distcpMaxMaps}</arg>
            <arg>-bandwidth</arg>
            <arg>${distcpMapBandwidth}</arg>
            <arg>-strategy</arg>
            <arg>dynamic</arg>
            <arg>${drSourceClusterFS}${drSourceDir}</arg>
            <arg>${drTargetClusterFS}${drTargetDir}</arg>
        </distcp>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF
```

# Sync operation runbook:
1. Run hawqsync-extract to capture known-good HDFS file sizes (protects 
against HDFS / catalog inconsistency if a failure occurs during the sync)

2. Run ETL batch

3. Run hawqsync-falcon, which performs the following steps (source safemode 
during sync is only allowable if using a remote Falcon to "pull", meaning 
the distcp job executes on the DR site):
  1. Stop both HAWQ masters (source and target)
  2. Archive source MASTER_DATA_DIRECTORY (MDD) tarball to HDFS
  3. Restart source HAWQ master
  4. Enable HDFS safe mode and force source checkpoint (see the sketch below)
  5. Disable remote HDFS safe mode
  6. Execute Apache 
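The safe-mode and checkpoint toggles in steps 4-5 correspond to standard HDFS admin commands; a minimal sketch (assuming passwordless SSH to each namenode host, which the real tool may not use):

```
# Sketch of runbook steps 4-5: force a checkpoint on the source namenode
# and manage safe mode on both sides. Host names are placeholders.
from subprocess import check_call

def dfsadmin(namenode_host, *args):
    # Run an "hdfs dfsadmin" subcommand on the given namenode host.
    check_call(["ssh", "hdfs@" + namenode_host, "hdfs", "dfsadmin"] + list(args))

dfsadmin("source-nn.example.com", "-safemode", "enter")   # step 4: freeze source HDFS
dfsadmin("source-nn.example.com", "-saveNamespace")       # step 4: force a checkpoint
dfsadmin("target-nn.example.com", "-safemode", "leave")   # step 5: open the DR side
```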

[GitHub] incubator-hawq pull request #940: HAWQ 1078. Implement hawqsync-falcon DR ut...

2016-10-13 Thread kdunn-pivotal
Github user kdunn-pivotal commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/940#discussion_r83278114
  
--- Diff: tools/bin/hawqsync-falcon ---
@@ -0,0 +1,1331 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import sys
+from optparse import OptionParser
+from subprocess import Popen, PIPE
+from hashlib import md5
+from json import loads
+from time import strftime, sleep, time
+from collections import defaultdict
+# TODO - make use of these common HAWQ libs instead of print
+#from gppylib.gplog import setup_hawq_tool_logging, enable_verbose_logging
+#from gppylib.commands.unix import getLocalHostname, getUserName
+try:
+from xml.etree import cElementTree as ElementTree
+except ImportError, e:
+from xml.etree import ElementTree
+
+def parseargs():
+parser = OptionParser(usage="HAWQ sync options.")
+parser.add_option('-v', '--verbose', action='store_true',
+  default=False)
+parser.add_option("-a", "--prompt", action="store_false",
+  dest="prompt", default=True,
+  help="Execute without prompt.")
+parser.add_option("-l", "--logdir", dest="logDir",
+  help="Sets the directory for log files")
+parser.add_option('-d', '--dryRun', action='store_true',
+  default=False,
+  dest='testMode', help="Execute in test mode")
+parser.add_option('-u', '--user', dest='userName', default="gpadmin",
+  help="The user to own Falcon ACLs and run job as")
+parser.add_option('--maxMaps', dest='distcpMaxMaps',
+  default="10",
+  help="The maximum number of map jobs to allow")
+parser.add_option('--mapBandwidth', dest='distcpMaxMBpsPerMap',
+  default="100",
+  help="The maximum allowable bandwidth for each map job, in MB/s")
+parser.add_option('-s', '--sourceNamenode', dest='sourceNamenode',
+  default="",
+  help="The IP or FQDN of the source HDFS namenode")
+parser.add_option('-S', '--sourceEntity', dest='sourceClusterEntityName',
+  default="source",
+  help="The Falcon cluster entity name of the source")
+parser.add_option('-m', '--sourceHawqMaster', dest='sourceHawqMaster',
+  default="",
+  help="The IP or FQDN of the source HAWQ master")
+parser.add_option('-M', '--targetHawqMaster', dest='targetHawqMaster',
+  default="",
+  help="The IP or FQDN of the target HAWQ master")
+parser.add_option('-f', '--falconUri', dest='falconUri',
+  default="http://localhost:15000",
+  help="The URI to use for issuing Falcon REST calls")
+parser.add_option('-t', '--targetNamenode', dest='targetNamenode',
+  default="",
+  help="The IP or FQDN of the target HDFS namenode")
+parser.add_option('-T', '--targetEntity', dest='targetClusterEntityName',
+  default="target",
+  help="The Falcon cluster entity name of the target")
+parser.add_option('-e', '--executionEntity',
+  dest='executionClusterEntityName',
+  default="source",
+  help="The Falcon cluster entity name specifying where to execute the job")
+parser.add_option('-w', '--workflowHdfsFilename', dest='workflowFilename',
+  default="/apps/data-mirroring/workflows/hdfs-replication-workflow.xml",
+  help="The HDFS location of the underlying Oozie workflow to use for sync job")
+parser.add_option('-p', '--pathToSync', dest='pathToSync',
+  

[GitHub] incubator-hawq pull request #957: HAWQ-963. PXF support for IS_NULL and IS_N...

2016-10-13 Thread shivzone
Github user shivzone commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/957#discussion_r83262359
  
--- Diff: pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/FilterParser.java ---
@@ -63,7 +63,9 @@
 HDOP_EQ,
 HDOP_NE,
 HDOP_AND,
-HDOP_LIKE
+HDOP_LIKE,
--- End diff --

I don't think we need a separate enum for unary operators. We currently have 
separate enums based on how the HAWQ bridge separates a/c/o operators. Also, 
we will soon introduce IN and other operators as well.




[GitHub] incubator-hawq pull request #913: HAWQ-1048. Support OR, NOT logical operato...

2016-10-13 Thread shivzone
Github user shivzone commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/913#discussion_r83259075
  
--- Diff: src/backend/access/external/pxffilters.c ---
@@ -153,72 +186,102 @@ Oid pxf_supported_types[] =
CHAROID,
BYTEAOID,
BOOLOID,
-   DATEOID
+   DATEOID,
+   TIMESTAMPOID
 };
 
+static void
+pxf_free_expression_items_list(List *expressionItems, bool freeBoolExprNodes)
+{
+   ListCell*lc = NULL;
+   ExpressionItem  *expressionItem = NULL;
+   int previousLength;
+
+   while (list_length(expressionItems) > 0)
+   {
+   expressionItem = (ExpressionItem *) lfirst(list_head(expressionItems));
+   if (freeBoolExprNodes && nodeTag(expressionItem->node) == T_BoolExpr)
+   {
+   pfree((BoolExpr *)expressionItem->node);
+   }
+   pfree(expressionItem);
+
+   /* to avoid freeing already freed items - delete all occurrences of current expression */
+   previousLength = expressionItems->length + 1;
+   while (expressionItems != NULL && previousLength > expressionItems->length)
+   {
+   previousLength = expressionItems->length;
+   expressionItems = list_delete_ptr(expressionItems, expressionItem);
+   }
+   }
+}
+
 /*
- * pxf_make_filter_list
+ * pxf_make_expression_items_list
  *
  * Given a scan node qual list, find the filters that are eligible to be used
- * by PXF, construct a PxfFilterDesc list that describes the filter information,
+ * by PXF, construct an expressions list, which consists of OpExpr or BoolExpr nodes
  * and return it to the caller.
  *
- * Caller is responsible for pfreeing the returned PxfFilterDesc List.
+ * Basically this function just transforms expression tree to Reversed Polish Notation list.
+ *
+ *
  */
 static List *
-pxf_make_filter_list(List *quals)
+pxf_make_expression_items_list(List *quals, Node *parent, bool *logicalOpsNum)
 {
+   ExpressionItem *expressionItem = NULL;
    List*result = NIL;
    ListCell*lc = NULL;
+   ListCell*ilc = NULL;

    if (list_length(quals) == 0)
        return NIL;
 
-   /*
-    * Iterate over all implicitly ANDed qualifiers and add the ones
-    * that are supported for push-down into the result filter list.
-    */
    foreach (lc, quals)
    {
        Node *node = (Node *) lfirst(lc);
        NodeTag tag = nodeTag(node);
+       expressionItem = (ExpressionItem *) palloc0(sizeof(ExpressionItem));
+       expressionItem->node = node;
+       expressionItem->parent = parent;
+       expressionItem->processed = false;
 
        switch (tag)
        {
            case T_OpExpr:
            {
-               OpExpr  *expr   = (OpExpr *) node;
-               PxfFilterDesc   *filter;
-
-               filter = (PxfFilterDesc *) palloc0(sizeof(PxfFilterDesc));
-               elog(DEBUG5, "pxf_make_filter_list: node tag %d (T_OpExpr)", tag);
-
-               if (opexpr_to_pxffilter(expr, filter))
-                   result = lappend(result, filter);
-               else
-                   pfree(filter);
-
+               result = lappend(result, expressionItem);
                break;
            }
            case T_BoolExpr:
            {
+               (*logicalOpsNum)++;
                BoolExpr    *expr = (BoolExpr *) node;
-               BoolExprType boolType = expr->boolop;
-               elog(DEBUG5, "pxf_make_filter_list: node tag %d (T_BoolExpr), bool node type %d %s",
-                   tag, boolType, boolType==AND_EXPR ? "(AND_EXPR)" : "");
+               List *inner_result = pxf_make_expression_items_list(expr->args, node, logicalOpsNum);
+               result = list_concat(result, inner_result);
+
+               int childNodesNum = 0;
 
-               /* only AND_EXPR is supported */
-               if (expr->boolop == AND_EXPR)
+               /* Find number of child nodes on first level*/
+               foreach (ilc, inner_result)
                {
-                   List *inner_result = pxf_make_filter_list(expr->args);

[GitHub] incubator-hawq pull request #913: HAWQ-1048. Support OR, NOT logical operato...

2016-10-13 Thread shivzone
Github user shivzone commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/913#discussion_r82915549
  
--- Diff: src/backend/access/external/pxffilters.c ---
@@ -153,72 +186,102 @@ Oid pxf_supported_types[] =
CHAROID,
BYTEAOID,
BOOLOID,
-   DATEOID
+   DATEOID,
+   TIMESTAMPOID
 };
 
+static void
+pxf_free_expression_items_list(List *expressionItems, bool freeBoolExprNodes)
+{
+   ListCell*lc = NULL;
+   ExpressionItem  *expressionItem = NULL;
+   int previousLength;
+
+   while (list_length(expressionItems) > 0)
+   {
+   expressionItem = (ExpressionItem *) lfirst(list_head(expressionItems));
+   if (freeBoolExprNodes && nodeTag(expressionItem->node) == T_BoolExpr)
+   {
+   pfree((BoolExpr *)expressionItem->node);
+   }
+   pfree(expressionItem);
+
+   /* to avoid freeing already freed items - delete all occurrences of current expression */
+   previousLength = expressionItems->length + 1;
+   while (expressionItems != NULL && previousLength > expressionItems->length)
+   {
+   previousLength = expressionItems->length;
+   expressionItems = list_delete_ptr(expressionItems, expressionItem);
+   }
+   }
+}
+
 /*
- * pxf_make_filter_list
+ * pxf_make_expression_items_list
  *
  * Given a scan node qual list, find the filters that are eligible to be used
- * by PXF, construct a PxfFilterDesc list that describes the filter information,
+ * by PXF, construct an expressions list, which consists of OpExpr or BoolExpr nodes
--- End diff --

We are no longer checking the validity of the filter in this function, right? 
It is now checked in pxf_serialize_filter_list, so remove this comment. You 
also don't have to mention the supported expression types in the comment, as 
that list will grow over time.
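Since the function comment above describes the qual-tree to Reverse Polish Notation transformation, here is a language-neutral sketch of the idea (toy Python stand-ins, not the actual C structures):

```
# Toy illustration of the transform pxf_make_expression_items_list performs:
# a post-order walk of the expression tree yields an RPN list.
class Op:             # comparison leaf, e.g. "a < 1"
    def __init__(self, text): self.text = text

class BoolNode:       # logical node: AND / OR / NOT over children
    def __init__(self, kind, args): self.kind, self.args = kind, args

def to_rpn(node):
    if isinstance(node, Op):
        return [node.text]
    items = []
    for child in node.args:          # children first ...
        items.extend(to_rpn(child))
    return items + [node.kind]       # ... then the operator itself

tree = BoolNode('OR', [Op('a < 1'), BoolNode('NOT', [Op('b > 5')])])
print(to_rpn(tree))   # ['a < 1', 'b > 5', 'NOT', 'OR']
```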





[jira] [Created] (HAWQ-1102) Expand PXF HBase Filter Data Type support

2016-10-13 Thread Kavinder Dhaliwal (JIRA)
Kavinder Dhaliwal created HAWQ-1102:
---

 Summary: Expand PXF HBase Filter Data Type support
 Key: HAWQ-1102
 URL: https://issues.apache.org/jira/browse/HAWQ-1102
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: PXF
Reporter: Kavinder Dhaliwal
Assignee: Lei Chang


PXF's HBase profile only supports a limited set of data types for filter 
pushdown. With changes in the PXF Bridge (HAWQ-1048), filters now include a 
wider range of data types. PXF should support these data types when setting 
comparator objects for HBase filtering.





[jira] [Created] (HAWQ-1101) PXF Hive Partition Filter for filters with OR and NOT

2016-10-13 Thread Kavinder Dhaliwal (JIRA)
Kavinder Dhaliwal created HAWQ-1101:
---

 Summary: PXF Hive Partition Filter for filters with OR and NOT
 Key: HAWQ-1101
 URL: https://issues.apache.org/jira/browse/HAWQ-1101
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: PXF
Reporter: Kavinder Dhaliwal
Assignee: Lei Chang


PXF currently supports partition filtering based on the user query only when 
clauses are joined by AND. With support for pushing down OR and NOT 
(HAWQ-964), PXF should correctly filter partitions even when the query has an 
OR or NOT condition.





[GitHub] incubator-hawq issue #961: HAWQ-1099. Output yaml file should not contain Bu...

2016-10-13 Thread xunzhang
Github user xunzhang commented on the issue:

https://github.com/apache/incubator-hawq/pull/961
  
@ictmalili done, please review again.




[jira] [Assigned] (HAWQ-1100) Support Decimal Values in PXF Filter

2016-10-13 Thread Kavinder Dhaliwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kavinder Dhaliwal reassigned HAWQ-1100:
---

Assignee: Kavinder Dhaliwal  (was: Lei Chang)

> Support Decimal Values in PXF Filter
> 
>
> Key: HAWQ-1100
> URL: https://issues.apache.org/jira/browse/HAWQ-1100
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
>
> Currently PXF's filter parser assumes that the constant values in a filter 
> are either a String or a Long. With changes in the PXF Bridge (HAWQ-1048), 
> new numeric types are being passed in the filter string, so PXF must handle 
> these appropriately.





[jira] [Created] (HAWQ-1100) Support Decimal Values in PXF Filter

2016-10-13 Thread Kavinder Dhaliwal (JIRA)
Kavinder Dhaliwal created HAWQ-1100:
---

 Summary: Support Decimal Values in PXF Filter
 Key: HAWQ-1100
 URL: https://issues.apache.org/jira/browse/HAWQ-1100
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: PXF
Reporter: Kavinder Dhaliwal
Assignee: Lei Chang


Currently PXF's filter parser assumes that the constant values in a filter are 
either a String or a Long. With changes in the PXF Bridge (HAWQ-1048), new 
numeric types are being passed in the filter string, so PXF must handle these 
appropriately.
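A sketch of the kind of constant coercion the parser would need (illustrative Python; PXF's FilterParser is Java, and this is not its actual logic):

```
# Illustrative coercion: pick the narrowest numeric type that parses the
# literal, falling back to a plain string. Not PXF's actual implementation.
from decimal import Decimal, InvalidOperation

def coerce_constant(raw):
    try:
        return int(raw)           # Long-style integer literals
    except ValueError:
        pass
    try:
        return Decimal(raw)       # exact decimals such as "3.14"
    except InvalidOperation:
        return raw                # everything else stays a string

for raw in ["42", "3.14", "abc"]:
    value = coerce_constant(raw)
    print(raw, type(value).__name__, value)
```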





[GitHub] incubator-hawq pull request #961: HAWQ-1099. Output yaml file should not con...

2016-10-13 Thread ictmalili
Github user ictmalili commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/961#discussion_r83187014
  
--- Diff: tools/bin/hawqregister ---
@@ -60,14 +60,18 @@ def option_parser():
 
 def register_yaml_dict_check(D, table_column_num, src_tablename):
     '''check exists'''
-    check_list = ['DFS_URL', 'Distribution_Policy', 'FileFormat', 'TableName', 'Bucketnum']
+    check_list = ['DFS_URL', 'Distribution_Policy', 'FileFormat', 'TableName']
     for attr in check_list:
         if D.get(attr) is None:
             logger.error('Wrong configuration yaml file format: "%s" attribute does not exist.\n See example in "hawq register --help".' % attr)
             sys.exit(1)
-    if D['Bucketnum'] <= 0:
-        logger.error('Bucketnum should not be zero, please check your yaml configuration file.')
-        sys.exit(1)
+    if D['Distribution_Policy'].startswith('DISTRIBUTED BY'):
+        if D.get('Bucketnum') is None:
+            logger.error('Wrong configuration yaml file format: "%s" attribute does not exist.\n See example in "hawq register --help".' % attr)
+            sys.exit(1)
+        if D['Bucketnum'] <= 0:
+            logger.error('Bucketnum should not be zero, please check your yaml configuration file.')
--- End diff --

Could we modify the error message to "Bucketnum should be a positive integer"?




[GitHub] incubator-hawq pull request #961: HAWQ-1099. Output yaml file should not con...

2016-10-13 Thread ictmalili
Github user ictmalili commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/961#discussion_r83187352
  
--- Diff: tools/bin/hawqregister ---
@@ -432,7 +436,8 @@ class HawqRegister(object):
         if len(params[Format_FileLocations]['Files']):
             files, sizes = [params['DFS_URL'] + d['path'] for d in params[Format_FileLocations]['Files']], [d['size'] for d in params[Format_FileLocations]['Files']]
             encoding = params['Encoding']
-            self._set_yml_dataa(Format, files, sizes, params['TableName'], params['%s_Schema' % Format], params['Distribution_Policy'], params[Format_FileLocations], params['Bucketnum'], partitionby,\
+            bucketNum = params['Bucketnum'] if params['Distribution_Policy'].startswith('DISTRIBUTED BY') else 6
--- End diff --

I don't think the else branch should be "6"; we need to get the default bucket 
number from HAWQ (GUC: default_hash_table_bucket_number), or we don't need to 
assign bucketnum here at all if we want to create the table, i.e., don't 
specify it in the CREATE TABLE DDL.
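For reference, a sketch of looking the default up from HAWQ instead of hard-coding 6 (assumes psql connectivity; the GUC name is the one mentioned above, but this is not the PR's actual code):

```
# Sketch: read default_hash_table_bucket_number from HAWQ rather than
# hard-coding a fallback. Connection details are placeholders.
from subprocess import check_output

def default_bucket_number(dbname='postgres'):
    out = check_output(['psql', '-d', dbname, '-t', '-A', '-c',
                        'SHOW default_hash_table_bucket_number;'])
    return int(out.decode().strip())

params = {'Distribution_Policy': 'DISTRIBUTED RANDOMLY'}   # example input
bucketNum = (params.get('Bucketnum')
             if params['Distribution_Policy'].startswith('DISTRIBUTED BY')
             else default_bucket_number())
```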




[GitHub] incubator-hawq pull request #961: HAWQ-1099. Output yaml file should not con...

2016-10-13 Thread xunzhang
GitHub user xunzhang opened a pull request:

https://github.com/apache/incubator-hawq/pull/961

HAWQ-1099. Output yaml file should not contain Bucketnum attribute with 
random distributed table.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xunzhang/incubator-hawq HAWQ-1099

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #961


commit e357f803638d68e12bb6d1e2af89e4de681a97e8
Author: xunzhang 
Date:   2016-10-13T10:25:40Z

HAWQ-1099. Output yaml file should not contain Bucketnum attribute with 
random distributed table.






[GitHub] incubator-hawq issue #961: HAWQ-1099. Output yaml file should not contain Bu...

2016-10-13 Thread xunzhang
Github user xunzhang commented on the issue:

https://github.com/apache/incubator-hawq/pull/961
  
cc @wcl14 @ictmalili 




[GitHub] incubator-hawq pull request #940: HAWQ 1078. Implement hawqsync-falcon DR ut...

2016-10-13 Thread ictmalili
Github user ictmalili commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/940#discussion_r83185244
  
--- Diff: tools/bin/hawqsync-falcon ---
@@ -0,0 +1,1331 @@

[GitHub] incubator-hawq pull request #940: HAWQ 1078. Implement hawqsync-falcon DR ut...

2016-10-13 Thread ictmalili
Github user ictmalili commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/940#discussion_r83184770
  
--- Diff: tools/bin/hawqsync-falcon ---
@@ -0,0 +1,1331 @@

[GitHub] incubator-hawq pull request #940: HAWQ 1078. Implement hawqsync-falcon DR ut...

2016-10-13 Thread ictmalili
Github user ictmalili commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/940#discussion_r83184402
  
--- Diff: tools/bin/hawqsync-falcon ---
@@ -0,0 +1,1331 @@

[jira] [Resolved] (HAWQ-1098) build error when "configure --prefix" with different directory without running "make distclean" previously

2016-10-13 Thread Ming LI (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming LI resolved HAWQ-1098.
---
Resolution: Fixed

> build error when "configure --prefix" with different directory without 
> running "make distclean" previously
> --
>
> Key: HAWQ-1098
> URL: https://issues.apache.org/jira/browse/HAWQ-1098
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Build
>Reporter: Ming LI
>Assignee: Ming LI
> Fix For: backlog
>
>
> A customer reported this at:
> http://stackoverflow.com/questions/39217467/hawq-installation-on-redhat
> If "configure --prefix" is run with a different directory without running 
> "make distclean" first, it will report a build error:
> {code}
> ld: warning: directory not found for option 
> '-L/Users/gpadmin/workspace/hawq2/apache-hawq/depends/libhdfs3/build/install/Users/gpadmin/workspace/hawq2/hawq-db-devel3/lib'
> ld: warning: directory not found for option 
> '-L/Users/gpadmin/workspace/hawq2/apache-hawq/depends/libyarn/build/install/Users/gpadmin/workspace/hawq2/hawq-db-devel3/lib'
> ld: library not found for -lhdfs3
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> make[2]: *** [postgres] Error 1
> make[1]: *** [all] Error 2
> make: *** [all] Error 2
> {code}





[GitHub] incubator-hawq pull request #960: HAWQ-1098. Fixed building error when "conf...

2016-10-13 Thread liming01
Github user liming01 closed the pull request at:

https://github.com/apache/incubator-hawq/pull/960




[jira] [Assigned] (HAWQ-1099) Output yaml file should not contain Bucketnum attribute with random distributed table

2016-10-13 Thread hongwu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hongwu reassigned HAWQ-1099:


Assignee: hongwu  (was: Lei Chang)

> Output yaml file should not contain Bucketnum attribute with random 
> distributed table
> -
>
> Key: HAWQ-1099
> URL: https://issues.apache.org/jira/browse/HAWQ-1099
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: hongwu
>Assignee: hongwu
>
> Output yaml file should not contain Bucketnum attribute with random 
> distributed table. And hawq register should check the Bucketnum attribute 
> only with hash distributed table.





[jira] [Updated] (HAWQ-1099) Output yaml file should not contain Bucketnum attribute with random distributed table

2016-10-13 Thread hongwu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hongwu updated HAWQ-1099:
-
Issue Type: Sub-task  (was: Bug)
Parent: HAWQ-991

> Output yaml file should not contain Bucketnum attribute with random 
> distributed table
> -
>
> Key: HAWQ-1099
> URL: https://issues.apache.org/jira/browse/HAWQ-1099
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: hongwu
>Assignee: Lei Chang
>
> Output yaml file should not contain Bucketnum attribute with random 
> distributed table. And hawq register should check the Bucketnum attribute 
> only with hash distributed table.





[jira] [Created] (HAWQ-1099) Output yaml file should not contain Bucketnum attribute with random distributed table

2016-10-13 Thread hongwu (JIRA)
hongwu created HAWQ-1099:


 Summary: Output yaml file should not contain Bucketnum attribute 
with random distributed table
 Key: HAWQ-1099
 URL: https://issues.apache.org/jira/browse/HAWQ-1099
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: hongwu
Assignee: Lei Chang


Output yaml file should not contain Bucketnum attribute with random distributed 
table. And hawq register should check the Bucketnum attribute only with hash 
distributed table.


