source IP restriction on routes

2016-10-17 Thread Sebastian Wieseler
Hi guys,

Is it possible with routers (router sharding) to restrict access at the IP level?

We want to expose various applications via various routers, but restrict
access by source IP address,
so that a given source IP address can only reach the applications it is allowed to access.

How can we do that?
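
For what it's worth, on router versions that support per-route IP whitelisting,
this can also be expressed as a route annotation. A minimal sketch, where the
route name and addresses are placeholders:

$ oc annotate route app1 haproxy.router.openshift.io/ip_whitelist="192.168.1.0/24 203.0.113.7"

With that annotation the router only accepts connections to the route from the
listed addresses/CIDRs.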

Thanks a lot in advance.
Greetings,
  Sebastian



Re: access restrictions to private apps

2016-05-15 Thread Sebastian Wieseler
Hi Aleks,
Thanks a lot for your reply.

Can you please give me an example of how to use ROUTE_LABELS? I’m a bit lost.

Thanks and Greetings,
   Sebastian


On 11 May 2016, at 3:55 PM, Aleksandar Lazic <aleksandar.la...@cloudwerkstatt.com> wrote:

Hi Sebastian.

You have two options, from my point of view.

.) Create your own HAProxy image and config:
https://docs.openshift.org/latest/install_config/install/deploy_router.html#deploying-a-customized-haproxy-router

.) Use an internal router with ROUTE_LABELS:
https://github.com/openshift/origin/blob/388478c40e751c4295dcb9a44dd69e5ac65d0e3b/pkg/cmd/infra/router/router.go#L53
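
A minimal sketch of the ROUTE_LABELS approach, assuming a dedicated internal
router and purely illustrative names and label values:

$ oadm router router-internal --service-account=router
$ oc env dc/router-internal ROUTE_LABELS="router=internal"
$ oc label route myapp router=internal

ROUTE_LABELS is a label selector, so router-internal only admits routes
carrying that label; which router (and therefore which load balancer and
source network) serves an application is then controlled per route.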

Best regards
Aleks


From: users-boun...@lists.openshift.redhat.com 
<users-boun...@lists.openshift.redhat.com> on behalf of Sebastian Wieseler 
<sebast...@myrepublic.com.sg>
Sent: Wednesday, May 11, 2016 05:31
To: users
Subject: access restrictions to private apps

Dear community,

Our current setup is *.my.wildcard.domain.example.com -> Load Balancer -> 
{Master1, Master2, Master3}
with the router pods deployed on the master nodes.

Is it possible to allow only app1.my.wildcard.domain.example.com and 
app2.my.wildcard.domain.example.com from the outside (0.0.0.0/0),
and to restrict the rest (*.my.wildcard.domain.example.com) to pre-defined 
IP addresses?

How could we implement those restrictions?
What are best practices to allow only certain IPs to certain applications?
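
A rough sketch of how such a split could look as ACLs in a customized
haproxy-config.template (the ACL names, hostnames and source range below are
illustrative only, not the stock template):

  acl host_public hdr(host) -i app1.my.wildcard.domain.example.com app2.my.wildcard.domain.example.com
  acl src_allowed src 203.0.113.0/24
  http-request deny if !host_public !src_allowed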


Thanks a lot in advance.
Greetings,
  Sebastian





Re: aggregate logging: kibana error

2016-04-20 Thread Sebastian Wieseler
Hi Eric,
I think I got it.
DNS was blocked on the master nodes. After I allowed it, Elasticsearch no longer
throws the resolving warning and Kibana boots up as expected.
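
For anyone hitting the same symptom, a quick check that cluster DNS answers
again from a node (the master IP is a placeholder, and the service name assumes
the default logging namespace):

$ dig +short logging-es.logging.svc.cluster.local @<master-ip>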

Thanks one more time for your help!
Greetings,
   Sebastian



On 20 Apr 2016, at 9:54 PM, Eric Wolinetz <ewoli...@redhat.com> wrote:

Hi Sebastian,

Your Elasticsearch instance does not seem to have started up completely within 
the pod you showed logs for.  Kibana will fail to start up if it is unable to 
reach its Elasticsearch instance after a certain period of time.

Can you send some more of your Elasticsearch logs? It looks like it's currently 
recovering/initializing. Do you see any different ERROR messages in there?

Your Fluentd errors look to be something else. What does the following look like?
$ oc describe pod -l component=fluentd

On Wed, Apr 20, 2016 at 2:54 AM, Sebastian Wieseler <sebast...@myrepublic.com.sg> wrote:
Dear community,
I followed the guide 
https://docs.openshift.org/latest/install_config/aggregate_logging.html

NAME                     READY     STATUS    RESTARTS   AGE
logging-kibana-1-uwob1   1/2       Error     12         43m


$ oc logs logging-kibana-1-uwob1  -c kibana
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.760Z","v":0}
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.762Z","v":0}
[root@MRNZ-TS8-OC-MASTER-01 glusterfs]# oc logs logging-kibana-1-uwob1  -c 
kibana
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:38:40.789Z","v":0}
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:38:40.790Z","v":0}


The Elasticsearch pod is running, but the log shows:
[2016-04-20 
06:57:03,910][ERROR][io.fabric8.elasticsearch.plugin.acl.DynamicACLFilter] 
[Baphomet] Exception encountered when seeding initial ACL
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: 
[SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at 
org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:151)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.checkGlobalBlock(TransportShardSingleOperationAction.java:103)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:132)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)

The Fluentd pod is running too, but the log shows:
2016-04-20 07:47:18 + [error]: fluentd main process died unexpectedly. 
restarting.
2016-04-20 07:47:48 + [error]: unexpected error error="getaddrinfo: Name or 
service not known"
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in 
`initialize'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in `open'
  201

Re: aggregate logging: kibana error

2016-04-20 Thread Sebastian Wieseler
 ] [Baphomet] 
[.operations.2016.04.21] update_mapping [fluentd] (dynamic)
[2016-04-21 00:03:18,723][INFO ][cluster.metadata ] [Baphomet] 
[.operations.2016.04.21] update_mapping [fluentd] (dynamic)
[2016-04-21 00:06:34,760][INFO ][cluster.metadata ] [Baphomet] 
[.operations.2016.04.21] update_mapping [fluentd] (dynamic)
[2016-04-21 00:07:21,827][INFO ][cluster.metadata ] [Baphomet] 
[logging.2016.04.21] creating index, cause [auto(bulk api)], templates [], 
shards [5]/[1], mappings [fluentd]
[2016-04-21 00:07:21,911][INFO ][cluster.metadata ] [Baphomet] 
[logging.2016.04.21] update_mapping [fluentd] (dynamic)

Fluentd

Name: logging-fluentd-6c6bt
Namespace: logging
Node: x/x.x.x.x
Start Time: Wed, 20 Apr 2016 06:37:13 +
Labels: component=fluentd,provider=openshift
Status: Running
IP:  x.x.x.x
Controllers: DaemonSet/logging-fluentd
Containers:
  fluentd-elasticsearch:
Container ID: 
docker://2cbe9d5813c3d771c4fc10c92be61136bb43e11f85fc9c117c64ca5e42060627
Image: docker.io/openshift/origin-logging-fluentd:latest
Image ID: 
docker://f841fe531e980e970d69b353bf49ec8f69dd13f88dccfdaff578d91f0fa58e63
Port:
QoS Tier:
  cpu: Guaranteed
  memory: BestEffort
Limits:
  cpu: 100m
Requests:
  cpu: 100m
State: Running
  Started: Wed, 20 Apr 2016 07:43:35 +
Ready: True
Restart Count: 0
Environment Variables:
  K8S_HOST_URL: https://kubernetes.default.svc.cluster.local
  ES_HOST: logging-es
  ES_PORT: 9200
  ES_CLIENT_CERT: /etc/fluent/keys/cert
  ES_CLIENT_KEY: /etc/fluent/keys/key
  ES_CA: /etc/fluent/keys/ca
  OPS_HOST: logging-es
  OPS_PORT: 9200
  OPS_CLIENT_CERT: /etc/fluent/keys/cert
  OPS_CLIENT_KEY: /etc/fluent/keys/key
  OPS_CA: /etc/fluent/keys/ca
  ES_COPY: false
  ES_COPY_HOST:
  ES_COPY_PORT:
  ES_COPY_SCHEME: https
  ES_COPY_CLIENT_CERT:
  ES_COPY_CLIENT_KEY:
  ES_COPY_CA:
  ES_COPY_USERNAME:
  ES_COPY_PASSWORD:
  OPS_COPY_HOST:
  OPS_COPY_PORT:
  OPS_COPY_SCHEME: https
  OPS_COPY_CLIENT_CERT:
  OPS_COPY_CLIENT_KEY:
  OPS_COPY_CA:
  OPS_COPY_USERNAME:
  OPS_COPY_PASSWORD:
Conditions:
  Type Status
  Ready  True
Volumes:
  varlog:
Type: HostPath (bare host directory volume)
Path: /var/log
  varlibdockercontainers:
Type: HostPath (bare host directory volume)
Path: /var/lib/docker/containers
  certs:
Type: Secret (a volume populated by a Secret)
SecretName: logging-fluentd
  dockerhostname:
Type: HostPath (bare host directory volume)
Path: /etc/hostname
  aggregated-logging-fluentd-token-f7ysx:
Type: Secret (a volume populated by a Secret)
SecretName: aggregated-logging-fluentd-token-f7ysx
No events.



On 20 Apr 2016, at 9:54 PM, Eric Wolinetz <ewoli...@redhat.com> wrote:

Hi Sebastian,

Your Elasticsearch instance does not seem to have started up completely within 
the pod you showed logs for.  Kibana will fail to start up if it is unable to 
reach its Elasticsearch instance after a certain period of time.

Can you send some more of your Elasticsearch logs? It looks like it's currently 
recovering/initializing. Do you see any different ERROR messages in there?

Your Fluentd errors look to be something else. What does the following look like?
$ oc describe pod -l component=fluentd

On Wed, Apr 20, 2016 at 2:54 AM, Sebastian Wieseler <sebast...@myrepublic.com.sg> wrote:
Dear community,
I followed the guide 
https://docs.openshift.org/latest/install_config/aggregate_logging.html

NAME                     READY     STATUS    RESTARTS   AGE
logging-kibana-1-uwob1   1/2       Error     12         43m


$ oc logs logging-kibana-1-uwob1  -c kibana
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.760Z","v":0}
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.762Z

aggregate logging: kibana error

2016-04-20 Thread Sebastian Wieseler
Dear community,
I followed the guide 
https://docs.openshift.org/latest/install_config/aggregate_logging.html

NAME                     READY     STATUS    RESTARTS   AGE
logging-kibana-1-uwob1   1/2       Error     12         43m


$ oc logs logging-kibana-1-uwob1  -c kibana
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.760Z","v":0}
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.762Z","v":0}
[root@MRNZ-TS8-OC-MASTER-01 glusterfs]# oc logs logging-kibana-1-uwob1  -c 
kibana
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:38:40.789Z","v":0}
{"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
 Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after 
5000ms\nat null. 
(/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
at Timer.listOnTimeout [as ontimeout] 
(timers.js:112:15)"},"msg":"","time":"2016-04-20T07:38:40.790Z","v":0}


The Elasticsearch pod is running, but the log shows:
[2016-04-20 
06:57:03,910][ERROR][io.fabric8.elasticsearch.plugin.acl.DynamicACLFilter] 
[Baphomet] Exception encountered when seeding initial ACL
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: 
[SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at 
org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:151)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.checkGlobalBlock(TransportShardSingleOperationAction.java:103)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:132)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)

The Fluentd pod is running too, but the log shows:
2016-04-20 07:47:18 + [error]: fluentd main process died unexpectedly. 
restarting.
2016-04-20 07:47:48 + [error]: unexpected error error="getaddrinfo: Name or 
service not known"
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in 
`initialize'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in `open'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in `block 
in connect'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/timeout.rb:52:in `timeout'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:877:in 
`connect'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:862:in 
`do_start'
  2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:851:in `start'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/request.rb:413:in 
`transmit'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/request.rb:176:in 
`execute'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/request.rb:41:in 
`execute'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/resource.rb:51:in `get'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/kubeclient-1.1.2/lib/kubeclient/common.rb:310:in `block 
in api'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/kubeclient-1.1.2/lib/kubeclient/common.rb:51:in 
`handle_exception'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/kubeclient-1.1.2/lib/kubeclient/common.rb:309:in `api'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/kubeclient-1.1.2/lib/kubeclient/common.rb:304:in 
`api_valid?'
  2016-04-20 07:47:48 + [error]: 
/opt/app-root/src/gems/fluent-plugin-kubernetes_metadata_filter-0.18.0/lib/fluent/plugin/filter_kubernetes_metadata.rb:134:in
 `configure'
…
2016-04-20 07:50:42 + [warn]: emit transaction failed: 
error_class=Fluent::ConfigError error="Exception encountered fetching metadata 
from Kubernetes API endpoint: 

Re: pod deployment error: couldn't get deployment: dial tcp 172.30.0.1:443: getsockopt: no route to host

2016-04-19 Thread Sebastian Wieseler
Hey v,
Hey Clayton,

Thanks for your help.
I didn’t flush the iptables rules in the end, but allowed all communication and
watched netstat -atn closely.

I figured out that you also need port 8443 open for communication between nodes
and masters. Previously I thought that the nodes would establish their
connections to the general master API address rather than directly to the
masters. So you actually need to allow TCP port 8443 for node -> master
communication as well.
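
For reference, a minimal sketch of the missing piece on the masters. The chain
name assumes the ansible-managed OS_FIREWALL_ALLOW chain and the node subnet is
a placeholder; adjust to your environment:

$ iptables -I OS_FIREWALL_ALLOW -p tcp -s 10.0.0.0/24 --dport 8443 -j ACCEPT
$ iptables -S | grep -e 8443 -e icmp-host-prohibited   # verify the rule and spot any reject rules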

Thanks again.
Greetings,
   Sebastian




> On 19 Apr 2016, at 2:21 PM, v <vekt...@gmx.net> wrote:
> 
> Hey,
> 
> I'd try to disable all firewall rules and then see if the error message is 
> still there.
> For example:
> iptables -F
> iptables -t nat -F
> systemctl restart origin-master origin-node docker openvswitch
> 
> Note that all iptables chains have to be set to policy "accept" for this to 
> work.
> "No route to host" can be caused by "--reject-with icmp-host-prohibited" so 
> you can try looking for that in your firewall config too.
> 
> Regards,
> v
> 
> On 2016-04-19 at 07:38, Sebastian Wieseler wrote:
>> Hi Clayton,
>> Thanks for your reply.
>> 
>> I opened now the firewall and have only the iptables rules from ansible in 
>> place.
>> 4789 UDP is open for the OVS as I saw.
>> 
>> I ran ansible again and deployed the pod without any success.
>> Restarting the OVS daemon everywhere in the masters,nodes doesn’t help 
>> either.
>> 
>> What’s the procedure to get it fixed?
>> Thanks again in advance.
>> 
>> Greetings,
>>Sebastian
>> 
>> 
>>> On 19 Apr 2016, at 12:06 PM, Clayton Coleman <ccole...@redhat.com> wrote:
>>> 
>>> This is very commonly a misconfiguration of the network firewall rules
>>> and the Openshift SDN.  Pods attempt to connect over OVS bridges to
>>> the masters, and the OVS traffic is carried over port 4789 (I think
>>> that's the port, you may want to double check).
>>> 
>>> https://access.redhat.com/documentation/en/openshift-enterprise/3.1/cluster-administration/chapter-17-troubleshooting-openshift-sdn
>>> 
>>> Covers debugging network configuration issues
>>> 
>>>> On Apr 18, 2016, at 11:28 PM, Sebastian Wieseler 
>>>> <sebast...@myrepublic.com.sg> wrote:
>>>> 
>>>> Hi community,
>>>> We’re having difficulties deploying pods.
>>>> Our setup includes three masters plus three nodes.
>>>> 
>>>> If we deploy a pod in the default project on a master, everything works 
>>>> fine.
>>>> But when we’re deploying it on a node, we’re getting STATUS Error for the 
>>>> pod and the log shows:
>>>> F0418 09:07:26.429738   1 deployer.go:70] couldn't get deployment 
>>>> project/pod-1: Get 
>>>> https:/172.30.0.1:443/api/v1/namespaces/project/replicationcontrollers/pod-1:
>>>>  dial tcp X.X.X.X:443: getsockopt: no route to host
>>>> 
>>>> 172.30.0.1 is the default address for kubernetes.
>>>> If I execute curl https://172.30.0.1:443/api/v1/namespaces/project/replicationcontrollers/pod-1 on the master or on the nodes, I’ll get a valid response.
>>>> 
>>>> How come the pod doesn’t have a route? I couldn’t find much in the logs.
>>>> First I thought it’s a firewall issue, but even with "allow any" it 
>>>> doesn’t work.
>>>> 
>>>> Our syslog is also full of these messages, on master and nodes:
>>>> 
>>>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 
>>>> 03:15:24.578086   32022 iowatcher.go:103] Unexpected EOF during watch 
>>>> stream event decoding: unexpected EOF
>>>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 
>>>> 03:15:24.947147   32022 iowatcher.go:103] Unexpected EOF during watch 
>>>> stream event decoding: unexpected EOF
>>>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 
>>>> 03:15:24.948047   32022 iowatcher.go:103] Unexpected EOF during watch 
>>>> stream event decoding: unexpected EOF
>>>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 
>>>> 03:15:24.948076   32022 iowatcher.go:103] Unexpected EOF during watch 
>>>> stream event decoding: unexpected EOF
>>>> Apr 19 03:15:25 localhost atomic-openshift-master-api: I0419 
>>>> 03:15:25.576047   32022 iowatcher.go:103] Unexpected EOF during watch 
>>>> stream event decoding: 

Re: pod deployment error: couldn't get deployment: dial tcp 172.30.0.1:443: getsockopt: no route to host

2016-04-18 Thread Sebastian Wieseler
Hi Clayton,
Thanks for your reply.

I have now opened the firewall and only have the iptables rules from ansible in 
place.
UDP 4789 is open for OVS, as far as I can see.
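
For reference, checks along these lines can confirm that (the chain name
assumes the ansible-managed firewall):

$ iptables -L OS_FIREWALL_ALLOW -n | grep 4789
$ ovs-vsctl show   # in a working openshift-sdn setup the vxlan port on br0 is listed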

I ran ansible again and deployed the pod, without any success.
Restarting the OVS daemon on all masters and nodes doesn’t help either.

What’s the procedure to get it fixed?
Thanks again in advance.

Greetings,
   Sebastian


> On 19 Apr 2016, at 12:06 PM, Clayton Coleman <ccole...@redhat.com> wrote:
> 
> This is very commonly a misconfiguration of the network firewall rules
> and the Openshift SDN.  Pods attempt to connect over OVS bridges to
> the masters, and the OVS traffic is carried over port 4789 (I think
> that's the port, you may want to double check).
> 
> https://access.redhat.com/documentation/en/openshift-enterprise/3.1/cluster-administration/chapter-17-troubleshooting-openshift-sdn
> 
> Covers debugging network configuration issues
> 
>> On Apr 18, 2016, at 11:28 PM, Sebastian Wieseler 
>> <sebast...@myrepublic.com.sg> wrote:
>> 
>> Hi community,
>> We’re having difficulties deploying pods.
>> Our setup includes three masters plus three nodes.
>> 
>> If we deploy a pod in the default project on a master, everything works fine.
>> But when we’re deploying it on a node, we’re getting STATUS Error for the 
>> pod and the log shows:
>> F0418 09:07:26.429738   1 deployer.go:70] couldn't get deployment 
>> project/pod-1: Get 
>> https:/172.30.0.1:443/api/v1/namespaces/project/replicationcontrollers/pod-1:
>>  dial tcp X.X.X.X:443: getsockopt: no route to host
>> 
>> 172.30.0.1 is the default address for kubernetes.
>> If I execute curl https://172.30.0.1:443/api/v1/namespaces/project/replicationcontrollers/pod-1 on the master or on the nodes, I’ll get a valid response.
>> 
>> How come the pod doesn’t have a route? I couldn’t find much in the logs.
>> First I thought it’s a firewall issue, but even with "allow any" it doesn’t 
>> work.
>> 
>> Our syslog is also full of these messages, on master and nodes:
>> 
>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 03:15:24.578086 
>>   32022 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 03:15:24.947147 
>>   32022 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 03:15:24.948047 
>>   32022 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:24 localhost atomic-openshift-master-api: I0419 03:15:24.948076 
>>   32022 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:25 localhost atomic-openshift-master-api: I0419 03:15:25.576047 
>>   32022 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:26 localhost atomic-openshift-master-api: I0419 03:15:26.207263 
>>   32022 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:27 localhost origin-master-controllers: I0419 03:15:27.947460   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:28 localhost origin-master-controllers: I0419 03:15:28.580092   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:28 localhost origin-master-controllers: I0419 03:15:28.961733   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:30 localhost origin-master-controllers: I0419 03:15:30.577072   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:31 localhost origin-master-controllers: I0419 03:15:31.947765   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:32 localhost origin-master-controllers: I0419 03:15:32.579114   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:33 localhost origin-master-controllers: I0419 03:15:33.199725   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>> unexpected EOF
>> Apr 19 03:15:34 localhost origin-master-controllers: I0419 03:15:34.199899   
>> 51283 iowatcher.go:103] Unexpected EOF during watch stream event decoding: 
>

Re: OPENSHIFT_KEY_DATA Private Key in log output

2016-04-11 Thread Sebastian Wieseler
Hi Jordan,
Thanks for the explanation. But since I am using the ansible playbook 
(https://github.com/openshift/openshift-ansible/),
I wonder why it is still using the deprecated option. O.o

Greetings,
   Sebastian



On 12 Apr 2016, at 11:50 AM, Jordan Liggitt <jligg...@redhat.com> wrote:

That is logging envvars for either the registry or router pods. If you create 
those using the --service-account option instead of the deprecated 
--credentials option, those envvars won't be created.

See https://github.com/openshift/origin/issues/3951
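
For anyone else wondering, a minimal sketch of the two invocations (the
kubeconfig path is a placeholder):

$ oadm router --credentials=/etc/origin/master/openshift-router.kubeconfig   # deprecated; injects OPENSHIFT_*_DATA env vars
$ oadm router --service-account=router                                       # preferred; no key material in the pod env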


On Apr 11, 2016, at 11:47 PM, Sebastian Wieseler <sebast...@myrepublic.com.sg> wrote:

Hey guys,
I just saw something really scary in journalctl -xe and /var/log/messages:


Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
ValueFrom:} {Name:OPENSHIFT_INSECURE Value:false ValueFrom:} 
{Name:OPENSHIFT_KEY_DATA Value:-BEGIN RSA PRIVATE KEY
Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
MIIEowIBAAKCAQEApryIizAcvx8FyvuKvN6rx9bpcACiqJQ+vdzOwJ3ftFxu+PY
Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
sUJhF1f3Ni/bTj0zwF7QZbwF3KQBZ6IeGRystqE5itXgcl6dIYmOqFRRXypKjtM
Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
tqZq+F7NsXO0SUTlPC6dAVHuW3r70lLTTush1VSyRYlhMeOKtTlLFE++PkNXaL+c


You output the PRIVATE KEY? O.o

Maybe somebody can fix this behaviour?
Thanks in advance.

Best Regards,
   Sebastian



OPENSHIFT_KEY_DATA Private Key in log output

2016-04-11 Thread Sebastian Wieseler
Hey guys,
I just saw something really scary in journalctl -xe and /var/log/messages:


Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
ValueFrom:} {Name:OPENSHIFT_INSECURE Value:false ValueFrom:} 
{Name:OPENSHIFT_KEY_DATA Value:-BEGIN RSA PRIVATE KEY
Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
MIIEowIBAAKCAQEApryIizAcvx8FyvuKvN6rx9bpcACiqJQ+vdzOwJ3ftFxu+PY
Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
sUJhF1f3Ni/bTj0zwF7QZbwF3KQBZ6IeGRystqE5itXgcl6dIYmOqFRRXypKjtM
Apr 12 03:25:54 master.openshift origin-master-controllers[17690]: 
tqZq+F7NsXO0SUTlPC6dAVHuW3r70lLTTush1VSyRYlhMeOKtTlLFE++PkNXaL+c


You output the PRIVATE KEY? O.o

Maybe somebody can fix this behaviour?
Thanks in advance.

Best Regards,
   Sebastian



Re: ansible run with cert errors (certificate signed by unknown authority)

2016-04-08 Thread Sebastian Wieseler
Dear community,
I think the problem lies here:

$ openssl x509 -in /etc/etcd/peer.crt -text -noout
Subject: CN=xxx.xxx
X509v3 Subject Alternative Name:
IP Address:z.z.z.z

CN - master 1
IP - master 3

Plus, this cert /etc/etcd/peer.crt appears on all three masters - with the same 
values.
It should be: (on master1) CN:master1 IP:master1
(on master2) CN:master2 IP:master2
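
A quick loop to compare what each master actually has (the hostnames are
placeholders):

$ for h in master1 master2 master3; do echo "== $h =="; \
    ssh "$h" "openssl x509 -in /etc/etcd/peer.crt -noout -text | grep -A1 -E 'Subject:|Subject Alternative Name'"; done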

Seems like one of the recent commits in this area broke things. It was working 
fine before :(
But I can’t find the commit. :(

Really need help with this.
Thanks a lot!
   Sebastian Wieseler



On 8 Apr 2016, at 12:05 PM, Sebastian Wieseler <sebast...@myrepublic.com.sg> wrote:

Dear community,
I am running the latest ansible playbook version and followed the advanced 
installation guide.
(Updating 6bae443..1b82b1b)


When I execute ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml it 
fails with:
TASK: [openshift_master | Start and enable master api] 
failed: [x.x.x.x] => {"failed": true}
msg: Job for origin-master-api.service failed because the control process 
exited with error code. See "systemctl status origin-master-api.service" and 
"journalctl -xe" for details.



Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:43   etcd[12180]: dropped MsgProp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:45   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:45   etcd[12180]: publish error: etcdserver: request timed out, 
possibly due to connection lost
Apr 08 03:47:45   origin-master-controllers[116866]: E0408 03:47:45.976514  
116866 leaderlease.go:69] unable to check lease 
openshift.io/leases/controllers: 501:
All the given peers are not reachable (failed to propose on members 
[https://xxx.xxx:2379 x509: certificate signed by unknown 
authority]) [0]

Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: the connection to peer af936f5f6ff57c05 is 
unhealthy
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   etcd[12180]: dropped MsgProp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:47   origin-node[26652]: E0408 03:47:47.708378   26652 
kubelet.go:2761] Error updating node status, will retry: error getting node 
“xxx.xxx": error #0: net/http: TLS handshake timeout
Apr 08 03:47:47   origin-node[26652]: error #1: net/http: TLS handshake timeout
Apr 08 03:47:47   origin-node[26652]: error #2: x509: certificate signed by 
unknown authority
Apr 08 03:47:48   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:48   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:48   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
since pipeline's sending buffer is full
Apr 08 03:47:49   origin-node[26652]: E0408 03:47:49.187066   26652 
kubelet.go:2761] Error updating node status, will retry: error getting node 
“xxx.xxx": error #0: x509: certificate signed by unknown authority
Apr 08 03:47:49   origin-node[26652]: error #1: x509: certificate signed by 
unknown authority
Apr 08 03:47:49   origin-node[26652]: error #2: x509: certificate signed by 
unknown authority
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgAppResp to 9dc58f8e2290c613 since 
pipeline's sending buffer is full
Apr 08 03:47:49   etcd[12180]: dropped MsgHeartbeatResp to 9dc58f8e2290c613 
s

ansible run with cert errors (certificate signed by unknown authority)

2016-04-07 Thread Sebastian Wieseler
ve
Apr 08 03:47:49   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(EOF)
Apr 08 03:47:51   etcd[12180]: failed to dial af936f5f6ff57c05 on stream 
Message (x509: certificate is valid for y.y.y.y, not z.z.z.z)
 ———>  z.z.z.z is my master03 and y.y.y.y my master02
Apr 08 03:47:51   etcd[12180]: failed to dial af936f5f6ff57c05 on stream MsgApp 
v2 (x509: certificate is valid for y.y.y.y, not z.z.z.z)
Apr 08 03:47:52   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(net/http: TLS handshake timeout)
Apr 08 03:47:52   etcd[12180]: the connection with 9dc58f8e2290c613 became 
active
Apr 08 03:47:53   etcd[12180]: the connection with 9dc58f8e2290c613 became 
inactive
Apr 08 03:47:53   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(net/http: TLS handshake timeout)
Apr 08 03:47:54   etcd[12180]: etcdserver: request timed out, possibly due to 
connection lost
Apr 08 03:47:56   etcd[12180]: publish error: etcdserver: request timed out, 
possibly due to connection lost
Apr 08 03:47:56   etcd[12180]: the connection with 9dc58f8e2290c613 became 
active
Apr 08 03:48:01   origin-node[26652]: E0408 03:48:01.380964   26652 
kubelet.go:2761] Error updating node status, will retry: error getting node 
“xxx.xxxt": error
Apr 08 03:48:01   origin-node[26652]: error #1: net/http: TLS handshake timeout
Apr 08 03:48:01   origin-node[26652]: error #2: x509: certificate signed by 
unknown authority
Apr 08 03:48:03   etcd[12180]: the connection with 9dc58f8e2290c613 became 
inactive
Apr 08 03:48:03   etcd[12180]: failed to write 9dc58f8e2290c613 on pipeline 
(EOF)
Apr 08 03:48:04   origin-master-controllers[116866]: E0408 03:48:04.691728  
116866 leaderlease.go:69] unable to check lease 
openshift.io/leases/controllers: 501: 
All the given peers are not reachable



My setup includes three masters:
[masters]
x.x.x.x openshift_hostname=xxx.xxx openshift_public_hostname=xxx.xxx
y.y.y.y openshift_hostname=yyy.yyy openshift_public_hostname=yyy.yyy
z.z.z.z openshift_hostname=zzz.zzz openshift_public_hostname=zzz.zzz

[etcd]
x.x.x.x openshift_hostname=xxx.xxx openshift_public_hostname=xxx.xxx
y.y.y.y openshift_hostname=yyy.yyy openshift_public_hostname=yyy.yyy
z.z.z.z openshift_hostname=zzz.zzz openshift_public_hostname=zzz.zzz



I also tried destroying the config:
# yum -y remove openshift openshift-* etcd
# rm -rf /etc/origin /var/lib/openshift /etc/etcd \
/var/lib/etcd /etc/sysconfig/atomic-openshift* \
/root/.kube/config /etc/ansible/facts.d /usr/share/openshift

But ansible fails at the same step and the cert errors persist.

Can somebody help me?

Thanks a lot in advance!
Best Regards,
  Sebastian Wieseler




private vs. public address openshift master (ansible playbook)

2016-03-10 Thread Sebastian Wieseler
Hi guys,
I hope you can help me out with this.

I configured these parameters in the ansible playbook differently:
   openshift_master_cluster_hostname=
   openshift_master_cluster_public_hostname=

But it seems that the public_hostname is overriding the hostname parameter.
I followed the advanced installation guide for OpenShift Origin in a 
multi-master (3 masters, 2 nodes) environment.


By design, I would like to separate the load balancer for access from the 
outside (for the exposed routes, plus cluster administration)
from the load balancer for the node and app connections to the master cluster.
Therefore I need to keep hostname and public_hostname separate. It’s not 
useful that the one overrides the other.
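
For illustration, the intended split in the inventory would look something
like this (the hostnames are placeholders):

[OSEv3:vars]
openshift_master_cluster_hostname=master-internal.example.com
openshift_master_cluster_public_hostname=master-public.example.com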

Can somebody please advise?

Thanks a lot in advance!
Best Regards,
   Sebastian Wieseler



