Hi Frédéric :),

I found the root cause of that behaviour.  You would not believe it.  But
first things first:


It is true that our OSO implementation is, let's say, "unusual".  When
scaling up our cluster we use FQDNs.  We also have a self-service portal
for provisioning new hosts.  Our customer ordered 10 Atomic hosts in a
dedicated VLAN and decided to attach them to the OSO cluster.  Before doing
this, it was decided to change their DNS names.


And here is where the story starts :)  The DNS zone was refreshed after 45
minutes, but the host from which we were executing the Ansible playbooks
had the old IP addresses cached.
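
In hindsight, a simple check before running the playbooks would have
caught it.  A minimal sketch (the host name is hypothetical, and whether
nscd or dnsmasq does the local caching depends on your setup):

# dig +short new-node.example.com     <- what the DNS zone returns now
# getent hosts new-node.example.com   <- what the playbook host resolves
# systemctl restart nscd              <- flush the local cache if they differ

If the two lookups disagree, the playbook host is still resolving the old
addresses.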


So what happened: all Atomic hosts had been properly configured, but the
entries in the OpenShift configuration contained the wrong IP addresses.
That is why the cluster was working at the network layer and all nodes were
reported as "ready", while inside the cluster the configuration was messed
up.

Your link was very helpful. Thanks to it, I found the wrong configuration:

# oc get hostsubnet
NAME                   HOST                   HOST IP           SUBNET
rh71-os1.example.com   rh71-os1.example.com   192.168.122.46    10.1.1.0/24
rh71-os2.example.com   rh71-os2.example.com   192.168.122.18    10.1.2.0/24
rh71-os3.example.com   rh71-os3.example.com   192.168.122.202   10.1.0.0/24

and at first glance I noticed the wrong IP addresses.
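
In case anyone hits the same issue: the HOST IP column can be
cross-checked against live DNS (a sketch; the host name is taken from the
output above):

# dig +short rh71-os1.example.com   <- should match the HOST IP above
# oc get nodes -o wide              <- what the nodes registered with

Re-running the playbook regenerated the records for me. As far as I
understand, deleting a stale record with oc delete hostsubnet <name> and
restarting the node service should also make the node re-register.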


I've re-run the playbook and everything is working like a charm. Thanks a
lot for your help.

Best regards :)


2017-06-21 10:12 GMT+02:00 Frederic Giloux <fgil...@redhat.com>:

> Hi Lukasz,
>
> If you don't have connectivity at the service level, it is likely that the
> iptables rules have not been configured on your new node. You can validate
> that with iptables -L -n. Compare the result on your new node and on one in
> the other VLAN. If this is confirmed, the master may not be able to connect
> to the kubelet on the new node (port TCP 10250, as per my previous email).
> Another thing that could have gone wrong is the population of the OVS
> table. In that case restarting the node would reinitialise it.
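> For example (a minimal sketch; the node names are placeholders):
>
> # ssh new-node "iptables -L -n" > /tmp/new.txt
> # ssh old-node "iptables -L -n" > /tmp/old.txt
> # diff /tmp/new.txt /tmp/old.txt
>
> And for the OVS table on the node:
>
> # ovs-ofctl -O OpenFlow13 dump-flows br0
>
> Missing OpenShift chains in the diff, or a nearly empty flow table on
> br0, would point at the respective problem.
>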
> Another point: the traffic between pods communicating through a service
> should go through the SDN, which means your network team should only see
> SDN packets between nodes at a firewall between VLANs, not traffic to your
> service IP range.
> This resource should also be of help:
> https://docs.openshift.com/container-platform/3.5/admin_guide/sdn_troubleshooting.html#debugging-a-service
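>
> To confirm whether the VXLAN traffic actually crosses the VLAN boundary,
> something like the following on both nodes should do (a sketch; replace
> eth0 with your actual interface):
>
> # tcpdump -i eth0 -nn udp port 4789
>
> Packets leaving one node but never arriving on the other would point at
> the firewall between the VLANs.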
>
> I hope this helps.
>
> Regards,
>
> Frédéric
>
>
> On Wed, Jun 21, 2017 at 9:27 AM, Łukasz Strzelec <
> lukasz.strze...@gmail.com> wrote:
>
>> Hello :)
>>
>> Thanks for the quick reply,
>>
>> I did; I mean, the mentioned ports had been opened.  All nodes are
>> visible to each other, and oc get nodes shows the "Ready" state.
>>
>> But pushing to the registry, or simply testing connectivity to endpoint
>> or service IPs, shows "no route to host".
>>
>> Do you know how to test this properly?
>>
>> The network guy tells me that he sees some denies from VLAN_B to the
>> 172.30.0.0/16 network. He also assures me that traffic on port 4789 is
>> allowed.
>>
>>
>> I ran some tests once again:
>>
>> I tried to deploy the example Ruby application, and it stops when
>> pushing into the registry :(
>>
>> Also, when I deploy a simple pod (hello-openshift) and then expose the
>> service, I cannot reach the website. I'm seeing the default router page
>> with the information that the application doesn't exist.
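>>
>> For reference, the test looks roughly like this (a sketch; the name
>> "hello" is made up):
>>
>> # oc run hello --image=openshift/hello-openshift
>> # oc expose dc hello --port=8080
>> # oc get svc hello                 <- note the cluster IP
>> # curl http://<cluster-ip>:8080
>>
>> Run from a VLAN_A node, the curl answers; run from a VLAN_B node, it
>> gives "no route to host".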
>>
>>
>> Please see the logs below:
>>
>> Fetching gem metadata from https://rubygems.org/...............
>> Fetching version metadata from https://rubygems.org/..
>> Warning: the running version of Bundler is older than the version that
>> created the lockfile. We suggest you upgrade to the latest version of
>> Bundler by running `gem install bundler`.
>> Installing puma 3.4.0 with native extensions
>> Installing rack 1.6.4
>> Using bundler 1.10.6
>> Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
>> Gems in the groups development and test were not installed.
>> Bundled gems are installed into ./bundle.
>> ---> Cleaning up unused ruby gems ...
>> Warning: the running version of Bundler is older than the version that
>> created the lockfile. We suggest you upgrade to the latest version of
>> Bundler by running `gem install bundler`.
>> Pushing image 172.30.123.59:5000/testshared/ddddd:latest ...
>> Registry server Address:
>> Registry server User Name: serviceaccount
>> Registry server Email: serviceacco...@example.org
>> Registry server Password: <<non-empty>>
>> error: build error: Failed to push image: Put
>> http://172.30.123.59:5000/v1/repositories/testshared/ddddd/: dial tcp
>> 172.30.123.59:5000: getsockopt: no route to host
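>>
>> To check whether only the registry's service IP is unreachable from the
>> build node, I suppose the raw connection can be tested (the IP and port
>> are taken from the log above):
>>
>> # nc -zv 172.30.123.59 5000
>>
>> run once from a VLAN_A node and once from a VLAN_B node; if only the
>> latter fails, the inter-VLAN path is the culprit.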
>>
>>
>> 2017-06-21 7:31 GMT+02:00 Frederic Giloux <fgil...@redhat.com>:
>>
>>> Hi Lukasz,
>>>
>>> This is not an unusual setup. You will need:
>>> - the SDN port: 4789 UDP (both directions: masters/nodes to nodes)
>>> - the kubelet port: 10250 TCP (masters to nodes)
>>> - the DNS port: 8053 TCP/UDP (nodes to masters)
>>> If you can't reach VLAN_B pods from VLAN_A, the issue is probably with
>>> the SDN port. Mind that it is using UDP.
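>>>
>>> With firewalld that would look something like this (a sketch; 4789 and
>>> 10250 are opened on the nodes, 8053 on the masters, and the same
>>> openings are needed on the firewall between the VLANs):
>>>
>>> # firewall-cmd --permanent --add-port=4789/udp
>>> # firewall-cmd --permanent --add-port=10250/tcp
>>> # firewall-cmd --permanent --add-port=8053/tcp --add-port=8053/udp
>>> # firewall-cmd --reload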
>>>
>>> Regards,
>>>
>>> Frédéric
>>>
>>> On Wed, Jun 21, 2017 at 4:13 AM, Łukasz Strzelec <
>>> lukasz.strze...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have to install OSO with dedicated HW nodes for one of my customers.
>>>>
>>>> The current cluster is placed in a VLAN called (for the sake of this
>>>> question) VLAN_A.
>>>>
>>>> The customer's nodes have to be placed in another VLAN: VLAN_B.
>>>>
>>>> Now the question: what ports and routes do I have to set up to get
>>>> this to work?
>>>>
>>>> The assumption is that traffic between vlans is filtered by default.
>>>>
>>>>
>>>> Now, what I already did:
>>>>
>>>> I opened the ports in accordance with the documentation, then scaled
>>>> up the cluster (Ansible playbook).
>>>>
>>>> At first sight, everything was working fine. The nodes were ready and
>>>> I could deploy a simple pod (e.g. hello-openshift). But I can't reach
>>>> the service. During the S2I process, pushing into the registry ends
>>>> with the message "no route to host". I've checked this out, and for
>>>> nodes placed in VLAN_A (the same one as the registry and router)
>>>> everything works fine. The problem is the traffic between VLANs
>>>> A <-> B. I can't reach any service IP of the pods deployed on the
>>>> newly added nodes. Thus, traffic between pods over the service subnet
>>>> is not allowed. The question is: what should I open? The whole
>>>> 172.30.0.0/16 between those two VLANs, or dedicated rules to/from the
>>>> registry, router, metrics and so on?
>>>>
>>>>
>>>> --
>>>> Ł.S.
>>>>
>>>>
>>>
>>>
>>> --
>>> *Frédéric Giloux*
>>> Senior Middleware Consultant
>>> Red Hat Germany
>>>
>>> fgil...@redhat.com     M: +49-174-172-4661
>>>
>>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>>> ________________________________________________________________________
>>>
>>> Red Hat GmbH, http://www.de.redhat.com/ Sitz: Grasbrunn,
>>> Handelsregister: Amtsgericht München, HRB 153243
>>> Geschäftsführer: Paul Argiry, Charles Cachera, Michael Cunningham,
>>> Michael O'Neill
>>>
>>
>>
>>
>> --
>> Ł.S.
>>
>
>
>
> --
> *Frédéric Giloux*
> Senior Middleware Consultant
> Red Hat Germany
>
> fgil...@redhat.com     M: +49-174-172-4661
>
> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
> ________________________________________________________________________
> Red Hat GmbH, http://www.de.redhat.com/ Sitz: Grasbrunn,
> Handelsregister: Amtsgericht München, HRB 153243
> Geschäftsführer: Paul Argiry, Charles Cachera, Michael Cunningham, Michael
> O'Neill
>



-- 
Ł.S.
_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
