Re: Immutable DS CDN - resolving Riak/Postgres data coherency

2018-02-14 Thread Steve Malenfant
Would deleting the certificate only remove the "latest" copy/alias? The
certificate and keys should still be retrievable manually.  Yes/No?

On Tue, Feb 13, 2018 at 5:40 PM, Dave Neuman  wrote:

> I think I can get on board with not allowing a user to change the CDN.  If
> you want to change the CDN you need to delete your DS and re-create it or
> create a new DS with a different XML_ID and a regex that matches the first
> DS.
>
> We have gone back and forth several times on deleting the keys from riak
> when you delete a DS.  Each time we decide not to make the change for one
> reason or another.  The worry is that if you delete a DS and then decide
> that it was a mistake you now have to generate a whole new certificate
> which could cost real money.  I am not sure that use-case is common enough
> to warrant us not deleting the certificates for a DS.  For now I am +1 on
> deleting the certificates when a DS is deleted.
>
> Thanks,
> Dave
>
> On Tue, Feb 13, 2018 at 12:14 PM, Nir Sopher  wrote:
>
> >  Hi,
> >
> > I created a delivery service and later on realized it is in the wrong
> CDN.
> > I then changed the CDN.
> > The ssl-keys record in the riak kept referring to the old CDN, even if I
> > generated new certificates. Traffic router was therefore unable to pull
> the
> > certificate.
> >
> > See issue 1847
> > 
> >
> > A fix to this issue can be done by changing the code so the record in the
> > Riak is updated along with the DS update.
> > However, this does not really make sense - if the CDN has changed, the
> > domain usually changes as well and the certificate is no longer valid.
> >
> > Therefore, I suggest blocking DS CDN changes entirely [see
> > https://github.com/apache/incubator-trafficcontrol/pull/1872].
> > Additionally, I added a PR for ssl-keys deletion upon DS deletion, so
> > deleting the DS and recreating it would not cause similar issues.
> >
> > Would appreciate community input for other alternatives.
> >
> > Thanks,
> > Nir
> >
>
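
The proposal quoted above, rejecting any update that moves an existing delivery service to a different CDN, could be sketched in Go roughly as follows. The DeliveryService struct, its fields, and validateUpdate are hypothetical names chosen for illustration; this is not the Traffic Ops code nor the change made in the linked PR.

```go
package main

import (
	"errors"
	"fmt"
)

// DeliveryService is a hypothetical, trimmed-down model of a DS record.
type DeliveryService struct {
	XMLID string
	CDN   string
}

// ErrCDNImmutable is returned when an update tries to change the CDN.
var ErrCDNImmutable = errors.New("the CDN of an existing delivery service cannot be changed")

// validateUpdate rejects updates that would move a DS to another CDN,
// so an ssl-keys record stored for the old CDN cannot go stale.
func validateUpdate(current, updated DeliveryService) error {
	if current.CDN != updated.CDN {
		return ErrCDNImmutable
	}
	return nil
}

func main() {
	current := DeliveryService{XMLID: "my-ds", CDN: "cdn-a"}
	moved := DeliveryService{XMLID: "my-ds", CDN: "cdn-b"}
	fmt.Println(validateUpdate(current, current)) // CDN unchanged
	fmt.Println(validateUpdate(current, moved))   // CDN change attempted
}
```

With a check like this in place, changing the CDN would require deleting and re-creating the DS, which matches the workflow described in the thread.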


Re: Immutable DS CDN - resolving Riak/Postgres data coherency

2018-02-14 Thread Nir Sopher
See the WIP PR:
https://github.com/apache/incubator-trafficcontrol/pull/1868/files
It deletes only the latest.
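
To illustrate what "deleting only the latest" could mean, here is a minimal Go sketch that uses an in-memory map in place of Riak. The key naming (<xml_id>-latest and <xml_id>-<version>) and the keyStore type are assumptions made for this example and are not taken from the PR.

```go
package main

import "fmt"

// keyStore stands in for the Riak ssl-keys bucket (hypothetical).
type keyStore map[string]string

// put stores a certificate under its versioned key and under the
// "latest" alias (assumed naming scheme).
func (s keyStore) put(xmlID string, version int, cert string) {
	s[fmt.Sprintf("%s-%d", xmlID, version)] = cert
	s[xmlID+"-latest"] = cert
}

// deleteLatest removes only the "latest" alias; the versioned copies
// stay in the store and can still be fetched manually.
func (s keyStore) deleteLatest(xmlID string) {
	delete(s, xmlID+"-latest")
}

func main() {
	store := keyStore{}
	store.put("my-ds", 1, "cert-v1")
	store.deleteLatest("my-ds")

	_, hasLatest := store["my-ds-latest"]
	_, hasV1 := store["my-ds-1"]
	fmt.Println("latest present:", hasLatest, "version 1 present:", hasV1)
}
```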

On Wed, Feb 14, 2018 at 4:56 PM, Steve Malenfant 
wrote:

> Would deleting the certificate only remove the "latest" copy/alias? The
> certificate and keys should still be retrievable manually.  Yes/No?
>
> On Tue, Feb 13, 2018 at 5:40 PM, Dave Neuman  wrote:
>
> > I think I can get on board with not allowing a user to change the CDN.
> If
> > you want to change the CDN you need to delete your DS and re-create it or
> > create a new DS with a different XML_ID and a regex that matches the
> first
> > DS.
> >
> > We have gone back and forth several times on deleting the keys from riak
> > when you delete a DS.  Each time we decide not to make the change for one
> > reason or another.  The worry is that if you delete a DS and then decide
> > that it was a mistake you now have to generate a whole new certificate
> > which could cost real money.  I am not sure that use-case is common
> enough
> > to warrant us not deleting the certificates for a DS.  For now I am +1 on
> > deleting the certificates when a DS is deleted.
> >
> > Thanks,
> > Dave
> >
> > On Tue, Feb 13, 2018 at 12:14 PM, Nir Sopher  wrote:
> >
> > >  Hi,
> > >
> > > I created a delivery service and later on realized it is in the wrong
> > CDN.
> > > I then changed the CDN.
> > > The ssl-keys record in the riak kept referring to the old CDN, even if
> I
> > > generated new certificates. Traffic router was therefore unable to pull
> > the
> > > certificate.
> > >
> > > See issue 1847
> > > 
> > >
> > > A fix to this issue can be done by changing the code so the record in
> the
> > > Riak is updated along with the DS update.
> > > However, this does not really make sense - if the CDN has changed, the
> > > domain usually changes as well and the certificate is no longer valid.
> > >
> > > Therefore, I suggest to entirely block DS CDN change [see
> > > https://github.com/apache/incubator-trafficcontrol/pull/1872]
> > > .
> > > Additionally, I added a PR for ssl-keys deletion up-on DS deletion, so
> > > deleting the DS and recreating it would not cause similar issues.
> > >
> > > Would appreciate community input for other alternatives.
> > >
> > > Thanks,
> > > Nir
> > >
> >
>


Traffic Ops API Swagger Doc

2018-02-14 Thread Dewayne Richardson
We've been working diligently on the TO Golang Rewrite project, both to
rewrite the Perl into Go and to improve our testing and documentation
efforts.  Several summits (years) ago I presented the idea of using Swagger
to help drive our Traffic Ops API documentation.  Since then Swagger has
evolved and is becoming the de facto standard for building and documenting
REST APIs (generating TO Golang client and server stubs is now possible).

I would like to propose that, going forward, we use Swagger for future
API-level documentation (because it can be generated out of our Golang
code/structs).  The resources below describe TO API version 1.3 (the
version we decided to rev for the rewritten Golang endpoints).  The intent
behind 1.3 is to provide an improved version of the API (entirely backward
compatible with 1.2) and to give us a starting point for building the
API doc in Swagger.

The following resources are my examples:

Swagger has several implementations and I chose go-swagger because it has
more Golang features.

*Sample TO API doc *

https://app.swaggerhub.com/apis/dewrich/traffic-ops_api/1.3


*Sample TO Golang code with embedded doc*

Generated from the combination of these Golang documentation "hooks"
(there's currently a go-swagger bug that prevents the doc from being tied
directly into the handlers):
https://github.com/dewrich/incubator-trafficcontrol/tree/swagger-demo/traffic_ops/traffic_ops_golang/docs

And the *asns.go*, *cdns.go*, *divisions.go*, *regions.go* and *statuses.go*
structs in my branch here:
https://github.com/dewrich/incubator-trafficcontrol/tree/swagger-demo/lib/go-tc


*TO Client Generated from Swagger*

This new Golang package is only a sample of a generated TO client (based
upon the code-generated swagger.json):

https://github.com/dewrich/incubator-trafficcontrol/tree/swagger-demo/traffic_ops/traffic_ops_golang/toclient
https://github.com/dewrich/incubator-trafficcontrol/tree/swagger-demo/traffic_ops/traffic_ops_golang/toclient/main.go

The hope is that tying the documentation closer to the code will help keep
the API docs up to date, as well as provide more documentation for
developers.

Please give your thoughts on this idea as well as a vote by Feb 21 (a week
from today) so that we can move forward with building our TO API doc.

-Dew
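
As a rough illustration of documentation generated from Go structs, the sketch below annotates a struct with go-swagger comment markers (swagger:model, swagger:response). The CDN and CDNsResponse types here are simplified, hypothetical stand-ins and not the structs from the linked branch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CDN is a hypothetical, trimmed-down API struct.
//
// swagger:model CDN
type CDN struct {
	// Name of the CDN.
	//
	// required: true
	Name string `json:"name"`

	// DNSSECEnabled reports whether DNSSEC is enabled for the CDN.
	DNSSECEnabled bool `json:"dnssecEnabled"`
}

// CDNsResponse is the documented response wrapper for a list of CDNs.
//
// swagger:response CDNsResponse
type CDNsResponse struct {
	// in: body
	Body struct {
		Response []CDN `json:"response"`
	}
}

// marshalCDN renders a CDN the way the API would return it as JSON.
func marshalCDN(c CDN) (string, error) {
	out, err := json.Marshal(c)
	return string(out), err
}

func main() {
	s, err := marshalCDN(CDN{Name: "cdn-a", DNSSECEnabled: true})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The comments do not affect how the program compiles or runs; with go-swagger installed, a spec would typically be produced by scanning the package with something like `swagger generate spec -o swagger.json`.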


Re: Traffic Ops API Swagger Doc

2018-02-14 Thread Durfey, Ryan
I am +1 on the swagger concept.  This makes working with APIs much easier for 
non-developer staff and makes it easier to educate customers as well.

Ryan Durfey
M | 303-524-5099
CDN Support (24x7): 866-405-2993 or cdn_supp...@comcast.com

From: Dewayne Richardson 
Reply-To: "dev@trafficcontrol.incubator.apache.org" 

Date: Wednesday, February 14, 2018 at 9:34 AM
To: "dev@trafficcontrol.incubator.apache.org" 

Subject: Traffic Ops API Swagger Doc




Re: Traffic Router Enhancement - Default Maxmind Geolocation Override

2018-02-14 Thread Nir Sopher
I need to get a better understanding of the DNS infrastructure to be able to
verify this assumption.
I assumed the TR localization feature already solved the problem of getting
to the right router, and from there it is in our hands ...

Nir





On Tue, Feb 13, 2018 at 11:39 PM, Rawlin Peters 
wrote:

> Nir,
>
> You bring up a good point. If we can make the assumption that requests
> coming to a specific Traffic Router are actually somewhat local to
> that Traffic Router, we might be able to localize those
> "country-localized" clients to a cachegroup close to that particular
> Traffic Router. That would have the effect of spreading load around a
> country a bit better if the Traffic Routers were geographically
> distributed well. Maybe that could be Phase 2 of this effort, but how
> much can we rely on that assumption?
>
> -Rawlin
>
> On Tue, Feb 13, 2018 at 1:27 PM, Rivas, Jesse 
> wrote:
> > Nir,
> >
> > This solution does not support that level of granularity.
> >
> > Jesse
> >
> > On 2/13/18, 11:43 AM, "Nir Sopher"  wrote:
> >
> > Hi,
> >
> > Can this solution support different value in different routers?
> > Taking TR localization into account, it might give better
> granularity.
> >
> > Nir
> >
> > On Tue, Feb 13, 2018 at 8:34 PM, Rawlin Peters <
> rawlin.pet...@gmail.com>
> > wrote:
> >
> > > Yeah, this basically solves the problem where MaxMind knows a
> client
> > > is in the US (or another country) but doesn't know the state, city,
> > > zip, etc., so it's not a "true" miss. In that case MaxMind returns
> the
> > > geographic center of that country as the client's location, but we
> > > don't want to route those clients to the cache group closest to
> that
> > > location because it might not be the ideal cachegroup. By using
> this
> > > parameter we can shift this high volume of "US" traffic that is
> > > essentially being localized to a lake in Kansas to a cachegroup
> more
> > > capable of handling that load. And we can do this on a per-country
> > > basis because we can create multiple of these parameters (which we
> > > wouldn't be able to do if we just used the Default Miss Lat/Lon of
> a
> > > DeliveryService).
> > >
> > > -Rawlin
> > >
> > > On Tue, Feb 13, 2018 at 11:10 AM, Rivas, Jesse <
> jesse_ri...@comcast.com>
> > > wrote:
> > > > Steve,
> > > >
> > > > Using the miss location for the DS was a potential solution that
> we
> > > talked about. However, the miss location is intended for use when
> the
> > > client IP falls through MaxMind without any data. Since the default
> > > location doesn't fit this criteria, it was decided to use a profile
> > > parameter to preserve granularity.
> > > >
> > > > Jesse
> > > >
> > > > On 2/13/18, 11:06 AM, "Steve Malenfant" 
> wrote:
> > > >
> > > > Jesse,
> > > >
> > > > I'm not exactly sure how MaxMind return this default value
> but would
> > > there
> > > > be a way to use the MISS location specified in the DS? Seems
> like
> > > that is
> > > > what it was intended for.
> > > >
> > > > Steve
> > > >
> > > > On Tue, Feb 13, 2018 at 12:42 PM, Rivas, Jesse <
> > > jesse_ri...@comcast.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > >
> > > > >
> > > > > At Comcast, we have been seeing a pattern of the same
> cache group
> > > being
> > > > > overloaded nightly as traffic increases on the CDN. The
> cause was
> > > > > determined to be a default location that the geolocation
> provider
> > > MaxMind
> > > > > returns for client IPs that it does not have additional
> data for.
> > > For the
> > > > > US, MaxMind returns a geolocation with the coordinates:
> > > 37.751,-97.822;
> > > > > this is a substantial amount of traffic that is all
> directed to
> > > the nearest
> > > > > cache group.
> > > > >
> > > > >
> > > > >
> > > > > The fix I have introduced is a new profile parameter for
> > > CRConfig.json
> > > > > named 'maxmind.default.override' in the format:
> > > > > ';,'. When MaxMind returns a
> default
> > > location, the
> > > > > code checks for a parameter entry with the same country
> code. If
> > > an entry
> > > > > exists, the default location will be overwritten with the
> > > coordinates of
> > > > > the parameter. This allows users to determine where this
> traffic
> > > should be
> > > > > sent rather than using the cache group closest to the
> MaxMind
> > > default
> > > > > location. The new parameter supports multiple entries so
> that
> > > there can be
> > > > > override coordinates for more than one country.
> > > > >
> > > > >
> > > > >

Re: Traffic Router Fail - Too Many Open Sockets

2018-02-14 Thread Nir Sopher
 Hi,

I implemented the fix and the issue was resolved until today :)

I have 2 routers, and both got stuck at the same time due to a connection
leak, with connections in "CLOSE_WAIT" state towards the monitors.
The only messages in catalina.out were:
WARNING: Imported handshake data with alias 
Feb 13, 2018 2:04:49 PM
com.comcast.cdn.traffic_control.traffic_router.secure.CertificateRegistry
importCertificateDataList

Can it be that in some rare, probably failing, situations, the monitor does
not close the connection?
Nir
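
For anyone trying to reproduce the diagnosis, the Go sketch below counts sockets in CLOSE_WAIT per remote address from text in the Linux /proc/net/tcp layout. In TCP, a socket sits in CLOSE_WAIT when the remote end has closed and the local process has not yet closed its side. The helper name countCloseWait is made up for this example, and the sketch only reads IPv4 sockets (/proc/net/tcp, not /proc/net/tcp6).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// closeWaitState is the hex code Linux uses for CLOSE_WAIT in /proc/net/tcp.
const closeWaitState = "08"

// countCloseWait counts CLOSE_WAIT sockets per remote address in text that
// follows the /proc/net/tcp layout (a header line, then one socket per line).
// Remote addresses are returned as they appear in the file (hex IP:port).
func countCloseWait(procNetTCP string) map[string]int {
	counts := make(map[string]int)
	scanner := bufio.NewScanner(strings.NewReader(procNetTCP))
	for scanner.Scan() {
		// Columns: sl, local_address, rem_address, st, ...
		fields := strings.Fields(scanner.Text())
		if len(fields) < 4 || fields[3] != closeWaitState {
			continue
		}
		counts[fields[2]]++
	}
	return counts
}

func main() {
	data, err := os.ReadFile("/proc/net/tcp")
	if err != nil {
		fmt.Println("cannot read /proc/net/tcp:", err)
		return
	}
	for remote, n := range countCloseWait(string(data)) {
		fmt.Printf("%s %d\n", remote, n)
	}
}
```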

On Thu, Feb 1, 2018 at 11:27 PM, Nir Sopher  wrote:

> Great,
> Thanks!
> Nir
>
> On Thu, Feb 1, 2018 at 11:12 PM, Jeffrey Martin 
> wrote:
>
>> Hi Nir,
>>This issue is defined by:
>>
>>  Jira: https://issues.apache.org/jira/browse/TC-197
>> and Github https://github.com/apache/incubator-trafficcontrol/issues/916
>>
>> I will be working on a pull request to address this issue in 2.2. The work
>> around is in the second link above.
>> Jeff
>>
>>
>> On Thu, Feb 1, 2018 at 4:09 PM, Jeffrey Martin 
>> wrote:
>>
>> > Hi Nir,
>> >
>> >
>> > On Thu, Feb 1, 2018 at 4:01 PM, Nir Sopher  wrote:
>> >
>> >> Hi,
>> >>
>> >> One of my routers got stuck today, not being able to answer http
>> requests
>> >> (routing and API).
>> >> When trying to investigate the issue, I found catalina.log with a lot
>> of
>> >> messages complaining on failure to open a socket due to too many open
>> >> files. See example below.
>> >> No issues were found in the log earlier to that point, beyond a
>> periodic
>> >> warnings of pulling the certificates every 5 minutes.
>> >>
>> >> When trying to understand "what are these open files", I found about 4k
>> >> open connections in "CLOSE_WAIT" towards the monitor.
>> >> Note: I'm running TC2.1 RC3 with golang traffic-monitor.
>> >>
>> >> Have anyone encountered a similar issue?
>> >> Are the warnings for pulling the certificates a normal thing?
>> >>
>> >> Thanks,
>> >> Nir
>> >>
>> >> Feb 01, 2018 7:33:09 AM
>> >> com.comcast.cdn.traffic_control.traffic_router.secure.Certif
>> icateRegistry
>> >> importCertificateDataList
>> >> WARNING: Imported handshake data with alias my-ds.my-cdn.com
>> >> Feb 01, 2018 8:43:13 AM org.apache.tomcat.util.net.Nio
>> Endpoint$Acceptor
>> >> run
>> >> SEVERE: Socket accept failed
>> >> java.io.IOException: Too many open files
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>> >> at
>> >> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChanne
>> >> lImpl.java:422)
>> >> at
>> >> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChanne
>> >> lImpl.java:250)
>> >> at
>> >> org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpo
>> >> int.java:1309)
>> >> at java.lang.Thread.run(Thread.java:745)
>> >>
>> >> Feb 01, 2018 8:43:14 AM org.apache.tomcat.util.net.Nio
>> Endpoint$Acceptor
>> >> run
>> >> SEVERE: Socket accept failed
>> >> java.io.IOException: Too many open files
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>> >> at
>> >> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChanne
>> >> lImpl.java:422)
>> >> at
>> >> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChanne
>> >> lImpl.java:250)
>> >> at
>> >> org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpo
>> >> int.java:1309)
>> >> at java.lang.Thread.run(Thread.java:745)
>> >>
>> >
>> >
>>
>
>


[VOTE] Release Apache Traffic Control (incubating) 2.2.0-RC1

2018-02-14 Thread Robert Butts
Hello All,

I've prepared a release for v2.2.0-RC1

The vote is open for at least 72 hours and passes if a majority of at least
3 +1 PPMC votes are cast.

[ ] +1 Approve the release

[ ] -1 Do not release this package because ...

Changes since 2.1.0:
https://github.com/apache/incubator-trafficcontrol/compare/RELEASE-2.1.0...RELEASE-2.2.0-RC1

This corresponds to git:
 Hash: ea549797d98c4fe96e484c9e88f82e2d7f876c1e
 Tag: RELEASE-2.2.0-RC1

Which can be verified with the following: git tag -v RELEASE-2.2.0-RC1

My code signing key is available here:
http://keys.gnupg.net/pks/lookup?search=0xFDD04F7F&op=vindex

Make sure you refresh from a key server to get all relevant signatures.

The source .tar.gz file, pgp signature (.asc signed with my key from
above), md5 and sha1 checksums are provided here:

https://dist.apache.org/repos/dist/dev/incubator/trafficcontrol/2.2.0/RC1
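
For voters who want to script the checksum comparison, here is a minimal Go sketch that computes the SHA-1 of a downloaded artifact and compares it with the contents of a checksum file. It assumes the checksum file holds the hex digest as its first whitespace-separated token, which may not match the actual file layout; the function names are hypothetical.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"os"
	"strings"
)

// sha1Hex returns the hex-encoded SHA-1 digest of data.
func sha1Hex(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])
}

// matches reports whether digest equals the first token of the checksum
// file content (assumed layout), ignoring letter case.
func matches(digest, checksumFile string) bool {
	fields := strings.Fields(checksumFile)
	return len(fields) > 0 && strings.EqualFold(fields[0], digest)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: verify <artifact> <artifact.sha1>")
		return
	}
	artifact, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	expected, err := os.ReadFile(os.Args[2])
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	if !matches(sha1Hex(artifact), string(expected)) {
		fmt.Println("SHA-1 checksum MISMATCH")
		os.Exit(1)
	}
	fmt.Println("SHA-1 checksum OK")
}
```

A matching checksum only shows the download is intact; the PGP signature still needs to be checked separately, for example with `gpg --verify` on the .asc file.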


Thanks!