[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-15 Thread Francesco Guardiani (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108013#comment-17108013
 ] 

Francesco Guardiani commented on FLINK-17611:
-

> Does the {{unix://}} scheme convention also support endpoint paths?
> Otherwise, I also see a {{http+unix://}} convention used by 
> [requests-unixsocket|https://pypi.org/project/requests-unixsocket/] to 
> specify both the socket file path and url when performing HTTP requests over 
> UDS.

In theory a socket file name could have no extension, so there's no way to 
distinguish where the file path ends and the HTTP path begins. This convention is 
probably based on the assumption that socket file names always end with .sock.

If the tradeoff is between a new field and always assuming that the file name ends 
with .sock, I would probably prefer using the http+unix convention. I'll reflect 
these changes in the PR.
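
For reference, requests-unixsocket resolves this ambiguity by percent-encoding the socket path into the host part of the URL (for example {{http+unix://%2Fvar%2Frun%2Fdocker.sock/info}}), so the first unencoded slash marks where the HTTP path begins and no file extension is needed. Below is a minimal Java sketch of parsing such an endpoint under that assumption. The class name {{HttpUnixEndpoint}} and the example paths are hypothetical and not part of StateFun.

```java
import java.net.URI;

// Hypothetical helper, not part of StateFun: splits an http+unix endpoint
// into the socket file path and the HTTP path.
final class HttpUnixEndpoint {
    final String socketFile;
    final String httpPath;

    private HttpUnixEndpoint(String socketFile, String httpPath) {
        this.socketFile = socketFile;
        this.httpPath = httpPath;
    }

    static HttpUnixEndpoint parse(String endpoint) {
        URI uri = URI.create(endpoint);
        if (!"http+unix".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Not an http+unix endpoint: " + endpoint);
        }
        // getAuthority() returns the percent-decoded authority,
        // which in this convention is the socket file path.
        String socketFile = uri.getAuthority();
        if (socketFile == null || socketFile.isEmpty()) {
            throw new IllegalArgumentException("Missing socket path in: " + endpoint);
        }
        // An empty HTTP path is treated as the root path.
        String path = uri.getPath();
        String httpPath = (path == null || path.isEmpty()) ? "/" : path;
        return new HttpUnixEndpoint(socketFile, httpPath);
    }

    public static void main(String[] args) {
        // The socket file deliberately has no .sock extension.
        HttpUnixEndpoint e = parse("http+unix://%2Fmnt%2Fshared%2Fworker/statefun");
        System.out.println(e.socketFile + " -> " + e.httpPath);
    }
}
```

With this encoding the socket file and the HTTP path can both be carried in the single endpoint field, at the cost of a less readable URL.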

> Support unix domain sockets for sidecar communication in Stateful Functions
> ---
>
> Key: FLINK-17611
> URL: https://issues.apache.org/jira/browse/FLINK-17611
> Project: Flink
>  Issue Type: New Feature
>  Components: Stateful Functions
>Reporter: Francesco Guardiani
>Assignee: Francesco Guardiani
>Priority: Major
>  Labels: pull-request-available
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Hi all,
> I'm quite new to this project and I've started investigating its potential 
> usage in Kubernetes.
> I've found in the past that using Unix domain sockets across several containers 
> in the same pod gives a noticeable performance boost and drastically 
> reduces the overhead of going through the network stack. Given that 
> containers in a pod run on the same host, it's perfectly reasonable to let 
> them communicate through Unix domain sockets.
> If you're interested in such a feature, I'm more than willing to help 
> implement it, provided I get a few pointers on where to start.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-14 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107933#comment-17107933
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-17611:
-

[~igal] concerning the YAML spec format:

Does the {{unix://}} scheme convention also support endpoint paths?
Otherwise, I also see a {{http+unix://}} convention used by 
[requests-unixsocket|https://pypi.org/project/requests-unixsocket/] to specify 
both the socket file path and url when performing HTTP requests over UDS.

Either way, I like that we use the scheme part of the endpoint URL to determine 
whether or not to talk via UDS, instead of an extra field in the YAML spec.
It seems to be a known convention, and it is more compact.



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-14 Thread Francesco Guardiani (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107270#comment-17107270
 ] 

Francesco Guardiani commented on FLINK-17611:
-

Opened the PR here: https://github.com/apache/flink-statefun/pull/110

I've implemented a solution that uses both the endpoint and uds fields, to specify 
both the HTTP path and the socket file. Let me know your thoughts about that.



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-14 Thread Igal Shilman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107244#comment-17107244
 ] 

Igal Shilman commented on FLINK-17611:
--

[~slinkydeveloper] having an endpoint in addition to uds would allow folks to 
serve their statefun endpoint in a different location than "/".

I guess we can simplify here and make it a requirement that, when working with a 
unix domain socket, you must serve the statefun endpoint at "/". I'm okay with 
that.

By the way, peeking at other projects that expose a unix domain socket (for 
example Docker), the way to specify the path for a unix domain socket is to use 
the unix scheme: "unix://".

What do you folks think (cc: [~tzulitai]) about using the endpoint field and 
deciding by the scheme part of the endpoint?

For example:
{code:java}
function:
  spec:
    endpoint: unix://mnt/shared/worker.sock
{code}
would mean that we are using the unix domain socket?
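
A minimal Java sketch of deciding the transport from the scheme part of the endpoint, as proposed above. The class {{EndpointScheme}} and its methods are hypothetical, not StateFun code. One detail the sketch makes visible: with {{java.net.URI}}, {{unix://mnt/shared/worker.sock}} parses {{mnt}} as the authority and {{/shared/worker.sock}} as the path, so an absolute socket path needs three slashes ({{unix:///mnt/shared/worker.sock}}), which is also the form Docker uses ({{unix:///var/run/docker.sock}}).

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not StateFun's actual implementation.
final class EndpointScheme {
    enum Transport { HTTP, UNIX_DOMAIN_SOCKET }

    // Picks the transport purely from the scheme of the endpoint URL.
    static Transport transportOf(String endpoint) {
        String scheme = URI.create(endpoint).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("Endpoint has no scheme: " + endpoint);
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "http":
            case "https":
                return Transport.HTTP;
            case "unix":
                return Transport.UNIX_DOMAIN_SOCKET;
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }

    // Returns the socket file path of a unix:// endpoint.
    // Rejects the two-slash form, where the first path segment
    // would silently be parsed as the authority.
    static String socketPath(String endpoint) {
        if (transportOf(endpoint) != Transport.UNIX_DOMAIN_SOCKET) {
            throw new IllegalArgumentException("Not a unix endpoint: " + endpoint);
        }
        URI uri = URI.create(endpoint);
        if (uri.getAuthority() != null) {
            throw new IllegalArgumentException(
                "Expected unix:///absolute/path, but '" + uri.getAuthority()
                    + "' was parsed as the authority in: " + endpoint);
        }
        return uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(transportOf("http://foobar.com/statefun"));
        System.out.println(socketPath("unix:///mnt/shared/worker.sock"));
    }
}
```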



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-13 Thread Francesco Guardiani (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106407#comment-17106407
 ] 

Francesco Guardiani commented on FLINK-17611:
-

> Can I assign you to this ticket?

Sure, feel free to do it


> endpoint: http://foobar.com/statefun
> uds: /mnt/shared/statefun.sock

I think only one of the two is valid here: either you use uds or you use endpoint.

> What remains to be figured out is, what do we do if the socket file isn't 
> there yet.

In the first iteration we could just let it fail when it tries to connect :)



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-13 Thread Igal Shilman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106405#comment-17106405
 ] 

Igal Shilman commented on FLINK-17611:
--

Thanks [~slinkydeveloper].

If okhttp client works out, then here is a proposal of how we can incorporate 
unix domain sockets:
 * Add a "uds" key to the HTTP function spec that represents the unix domain 
socket path. Here is an example snippet of an HTTP function spec with the uds 
key:
{code:java}
- function:
    meta:
      kind: http
      ...
    spec:
      endpoint: http://foobar.com/statefun
      uds: /mnt/shared/statefun.sock
      states:
        ...
      ...
{code}

 * Add a field at 
[HttpFunctionSpec.java|https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/httpfn/HttpFunctionSpec.java]
 for the unix domain socket path.
 * Populate this field at 
[JsonModule.java|https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/jsonmodule/JsonModule.java#L255]
 * And 
[here|https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/httpfn/HttpFunctionProvider.java#L47]
 you would be able to set a different socket factory if the 
unixDomainSocketPath is present.
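
The steps above can be sketched roughly as follows. This is a simplified stand-in under stated assumptions, not the real {{HttpFunctionSpec}} or {{JsonModule}} code: the class {{HttpFunctionSpecSketch}}, its {{fromSpec}} method, and the flat {{Map}} input are all hypothetical. The socket factory itself is left out; in okhttp it would be passed to {{OkHttpClient.Builder#socketFactory}}, which accepts a {{javax.net.SocketFactory}}.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for HttpFunctionSpec; not the real class.
final class HttpFunctionSpecSketch {
    private final URI endpoint;
    // New field: present only when the spec contains a "uds" key.
    private final Optional<Path> unixDomainSocketPath;

    private HttpFunctionSpecSketch(URI endpoint, Optional<Path> unixDomainSocketPath) {
        this.endpoint = endpoint;
        this.unixDomainSocketPath = unixDomainSocketPath;
    }

    // Mirrors the "populate this field" step: read the optional "uds" key
    // from an already-parsed spec (modelled here as a flat map).
    static HttpFunctionSpecSketch fromSpec(Map<String, String> spec) {
        String endpoint = spec.get("endpoint");
        if (endpoint == null) {
            throw new IllegalArgumentException("Missing 'endpoint' in function spec");
        }
        Optional<Path> uds = Optional.ofNullable(spec.get("uds")).map(p -> Paths.get(p));
        return new HttpFunctionSpecSketch(URI.create(endpoint), uds);
    }

    URI endpoint() {
        return endpoint;
    }

    Optional<Path> unixDomainSocketPath() {
        return unixDomainSocketPath;
    }

    // The provider would branch on this to install a different socket factory.
    boolean usesUnixDomainSocket() {
        return unixDomainSocketPath.isPresent();
    }

    public static void main(String[] args) {
        Map<String, String> spec = new HashMap<>();
        spec.put("endpoint", "http://foobar.com/statefun");
        spec.put("uds", "/mnt/shared/statefun.sock");

        HttpFunctionSpecSketch parsed = fromSpec(spec);
        if (parsed.usesUnixDomainSocket()) {
            System.out.println("UDS socket factory for " + parsed.unixDomainSocketPath().get());
        } else {
            System.out.println("Default socket factory for " + parsed.endpoint());
        }
    }
}
```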

What remains to be figured out is, what do we do if the socket file isn't there 
yet.

Regarding a contribution guide: we follow the [Flink contribution 
guide|https://flink.apache.org/contributing/contribute-code.html], with some 
exceptions to the code style here and there. For example, we use Google 
code style 1.7, enforced with Spotless (instead of Checkstyle).

 

Can I assign you to this ticket?

 

 



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-13 Thread Francesco Guardiani (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106049#comment-17106049
 ] 

Francesco Guardiani commented on FLINK-17611:
-

Hi [~igal], I've used UDS on the JVM in the past with [Eclipse 
Vert.x|http://vertx.io], a popular library for building async applications (based 
on Netty).

What you really gain from UDS is reducing the pressure on the k8s/container 
engine networking stack, because UDS are implemented in a "memory mapped file" 
fashion.

I've also checked that okhttp works with UDS, so I think I can start playing 
with it. I'll let you know what I manage to create. Is there any contributing 
guide I can follow to get started?



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-12 Thread Igal Shilman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105753#comment-17105753
 ] 

Igal Shilman commented on FLINK-17611:
--

Hi [~slinkydeveloper],

Welcome to the project! Looking forward to hearing about your investigation 
results with StateFun on k8s!

 

Regarding unix domain sockets, we've discussed this as a potential performance 
improvement internally, but didn't get to investigate this yet, and any help 
here would be much appreciated. 

Before jumping into the implementation details, I'm wondering if you have ever 
used (or are familiar with) unix domain sockets from the JVM, and whether they 
are such a drastic improvement specifically when used from a JVM, since as far 
as I can see, using UDS requires JNI.

As a side note, I've quickly Googled and found [an 
example|https://github.com/square/okhttp/tree/master/samples/unixdomainsockets/src/main/java/okhttp3/unixdomainsockets] 
of an okhttp client (the client that we use for remote HTTP functions) that works 
with UDS.

So possibly that could be a starting point for a small benchmark.

 



[jira] [Commented] (FLINK-17611) Support unix domain sockets for sidecar communication in Stateful Functions

2020-05-12 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105121#comment-17105121
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-17611:
-

I like this a lot, +1. And thanks a lot for bringing this up and being willing 
to implement it!

For starters, I'd suggest first looking at how to implement a new {{kind}} of 
remote function based on the request-reply protocol. For that, I suggest taking 
a look at the {{JsonModule}} class. In there, you'll find the 
{{configureFunctions}} method, which figures out what functions to provide based 
on the configured kind.

In any case, let's also wait for [~igal] to chime in, as he'll have the best 
judgement here.
