On 09/08/16 19:20, Gustavo Sverzut Barbieri wrote:
> On Tue, Aug 9, 2016 at 12:31 PM, Tom Hacohen <[email protected]> wrote:
>> On 09/08/16 16:24, Gustavo Sverzut Barbieri wrote:
>>> On Tue, Aug 9, 2016 at 9:21 AM, Tom Hacohen <[email protected]> wrote:
>>>> On 08/08/16 21:25, Gustavo Sverzut Barbieri wrote:
>>>>> Hi all,
>>>>>
>>>>> Now it's time to review Ecore_Con_Url towards a better API and then
>>>>> provide that as an Eo object. The current ecore_con_url will stay
>>>>> the same, but the new one can have some enhancements or
>>>>> simplifications if needed.
>>>>>
>>>>> Before going into the review, the core question will be whether to
>>>>> have a single core class with all methods, even if some do not make
>>>>> sense everywhere (ie: HTTP-specific), or multiple subclasses with
>>>>> protocol-specific methods. You can see the methods split below.
>>>>>
>>>>> A second point is whether to allow reuse of an Ecore_Con_URL handle
>>>>> to do other requests. Reviewing the current code and the cURL
>>>>> documentation, I found that we may end up with inconsistent
>>>>> behavior; for example, if a POST is done and then a HEAD, some flags
>>>>> need to be reset and we're not doing that. To me it looks like
>>>>> create for a single operation and then destroy, creating a new one
>>>>> if desired, is the common pattern.
>>>>
>>>> That is not a decision that should be made based on the backend and
>>>> what it supports, but rather based on what we want to support. It's
>>>> easy enough to just destroy the curl handle and create a new one if
>>>> someone wants to reuse the handle.
>>>
>>> Sure; what I wanted to point out is that either we force the
>>> do-not-reuse pattern or we fix the issue if someone tries to reuse.
>>> Currently it's broken (I did not test, but it looks so from reading
>>> the docs).
>>>
>>>> I for one think a new object should be created per request anyway,
>>>> so there is no problem.
>>>
>>> Me too :-)
>>>
>>>
>>>>> * `ecore_con_url_pipeline_set(bool)/get` only usable with HTTP, so
>>>>>   add that to the name?
>>>>
>>>> HTTP in comparison to what?
>>>
>>> Instead of:
>>>
>>>   ecore_con_url_pipeline_set(bool)
>>>
>>> use:
>>>
>>>   ecore_con_url_http_pipeline_set(bool)
>>>
>>> (actually, with the new prefix, so efl_net_url_http_pipeline_set(bool)).
>>>
>>>
>>>>> # Base
>>>>>
>>>>> Should change:
>>>>> * `ecore_con_url_get(con)` executes a regular fetch (HTTP: GET,
>>>>>   FTP: download) request. GET is confusing since it resembles
>>>>>   HTTP-only; maybe rename it to 'fetch'?
>>>>> * `ecore_con_url_head(con)` executes a body-less fetch (HTTP: HEAD,
>>>>>   FTP: no data) request. HEAD also resembles HTTP-only; maybe
>>>>>   rename it to 'fetch_no_body' or something like that? Suggestions?
>>>>
>>>> The problem I have with these two is that yes, head/get are
>>>> confusing in the signature, but they are also the correct standard
>>>> names. ecore_con_url_action_get would maybe be better suited, but
>>>> that's, as you said, an HTTPism. I'm not sure what's the correct way
>>>> of doing it. Maybe it should be:
>>>>
>>>>   ecore_con_url_request_type_set(con, GET);
>>>
>>> This still looks too much like an HTTP-ism. cURL supports POP3, where
>>> "LIST" would make more sense. Do we expect people to provide the
>>> correct string for each protocol? I find that also strange...
>>
>> No, that's bad too. I'm just saying I think we should come up with
>> naming. Set the type separately in a function, and then just have a
>> function that "executes".
>
> Okay, we're on the same page here. But the thing is: should we set a
> default method so others are not required to? Should the method be
> called something generic (ie: FETCH) or specific (ie: GET for
> http/ftp, LIST/RETR for pop3...)?
>
> Looking into more details of the libcurl examples, it's kinda tricky.
> If you really want to do some mail client using it, you do need to use
> the custom method (that I said was not required!)
> and even set some other flags. For instance:
> https://curl.haxx.se/libcurl/c/pop3-dele.html shows that if you need
> to delete a message using POP3, then you need to set "DELE" as the
> custom method and do a no-body (head) request. So providing a
> POP3_DELETE would be more meaningful and we could do both operations.
> Also for things like STAT, UIDL...
>
> But then you come to this example,
> https://curl.haxx.se/libcurl/c/pop3-top.html, where the custom method
> name must include parameters, like "TOP 1 0" to only include the
> headers :-/
>
> I guess this is why other frameworks do not expose cURL, and instead
> call it an "http" module...
I thought I said it, but apparently I haven't. Putting FTP and HTTP in
the same place kinda makes sense just because they are often used
interchangeably. I feel quite uncomfortable putting POP3 in the same
class. We don't need to follow curl's mistake.

>
>> Actually though, and this is an interesting question: I'd expect
>> reuse in POP3. POP3 feels like a protocol where the connection would
>> be reused, no? Maybe I'm just not familiar enough. Same with IMAP. It
>> feels more like a few-roundtrips-on-the-same-resource kind of
>> protocol (in comparison to HTTP).
>
> Is this about request type or pipelining? I hope you mixed up the
> topics here, otherwise I'm completely lost on what you meant :-)
>
> As for pipelining, it seems to be specific to HTTP, as in different
> requests can talk to the same server. For other protocols it's all
> part of a single chat session; however, I couldn't find any libcurl
> example doing chained commands to see how they do that, eventually by
> reusing the same handle?
>
> Again, as before... being generic enough to cover all the protocols
> curl supports will be painful... in that case I'm more inclined to
> expose it as efl_net_curl and map directly to cURL calls, make it
> clear we wrap that, and whenever in doubt point users at the cURL
> manual.
>
> The other option is to restrict it to protocol-specific
> implementations such as efl_net_http_client, efl_net_ftp_client and
> so on...

Exactly my point above. Maybe it's OK to handle both HTTP and FTP in the
same place (but actually, maybe not, maybe that should be split), but I
don't think we should limit ourselves to what curl did.

>
>
>>>>> Okay:
>>>>>
>>>>> * `ecore_con_url_new(url): con`
>>>>> * `ecore_con_url_custom_new(url, http_method)` (works for HTTP,
>>>>>   FTP, IMAP, POP3, SMTP...)
>>>>> * `ecore_con_url_free(con)`
>>>>> * `ecore_con_url_fd_set(con, write_fd)` download to a file
>>>>>   (http/ftp download, etc)
>>>>> * `ecore_con_url_verbose_set(con, bool)` set cURL verbosity
>>>>> * `ecore_con_url_data_set(con, void*)` -> `eo_key_data_set()`
>>>>> * `ecore_con_url_data_get(con)` -> `eo_key_data_get()`
>>>>> * `ecore_con_url_timeout_set(con, double)`
>>>>> * `ecore_con_url_status_code_get(con): int`
>>>>> * `ecore_con_url_response_headers_get(con): list`
>>>>> * `ecore_con_url_ssl_verify_peer_set(con, bool)`
>>>>> * `ecore_con_url_ssl_ca_set(con, file)`
>>>>
>>>> Actually, all of these feel bad. Especially the first 4. They
>>>> should die.
>>>
>>> Please be more specific. The first 2 are constructors; granted, the
>>> "custom_new()" is not used outside of tests, and from the cURL docs
>>> (https://curl.haxx.se/libcurl/c/CURLOPT_CUSTOMREQUEST.html) it is
>>> not that useful in real life, thus it could go.
>>>
>>> But what's the problem with new()/free()? They are pre-Eo, so they
>>> need that.
>>
>> I meant: new/free should die now, in Eo land. The constructors: we
>> don't construct like that anymore; we just set the http_method in
>> eo_add(). By "the first 4 are bad" I meant they have no place in the
>> Eo world, so I found it odd you put them there.
>
> My bad; what I meant is that the existing (legacy) API is okay like
> that. The new API will be a regular Eo one, with eo_add() and
> eo_del(), eo_key_data_get()...

Talking about legacy is moot, because what you are creating will not
have legacy calls at all.

>
>
>>> fd_set() is indeed somewhat of a hack, an optimization to write to a
>>> file (the common case). If we have the "Efl.Loop_User" extra methods
>>> I've mentioned in my other thread about ecore_con, it would be much
>>> more reasonable to give that Eo object here instead of a plain fd.
>>> But that's okay in concept: you make the common case easy to use.
>>>
>>> OTOH, if you mean the name is misleading, as fd_set() doesn't imply
>>> whether it's a writer or a reader...
>>> then okay, it's bad :-)
>>>
>>
>> I meant the "hack" part. It just feels like a super hack in all
>> senses.
>
> But if we handle it as an object with a write()/close() interface,
> like the one I'm proposing in the other thread, would it be better? Or
> do you think this is better handled as a separate object that will
> "link" two objects by doing a copy?
>
> If that is the case, then this Efl.Net.URL (or http.client, or...)
> should also implement the same interface (currently it doesn't, as one
> will only be notified of more data and must call the "read()" method,
> while the current url hands you the read data). It would mean that in
> libcurl's backend I'd get libcurl's data from the callback, store it
> locally in an instance binbuf, then emit the "read" event, which would
> trigger the listener to call read(), which removes from the binbuf.
>
> Current, with libcurl/ecore_con_url:
> - event: ECORE_CON_EVENT_URL_DATA (size, data)
> - no read method; maps well to
>   https://curl.haxx.se/libcurl/c/CURLOPT_WRITEFUNCTION.html
>
> To implement the same Eo interface (proposed in the other email about
> the ecore-con eo-ification):
> - event: "read" (no payload)
> - method: read(size max, uint8 *buffer): ssize_t
>
> Then to implement it using libcurl:
> - in the CURLOPT_WRITEFUNCTION cb:
>   * previously: created ECORE_CON_EVENT_URL_DATA
>   * now: eina_binbuf_append_length() + eo_event_callback_call(read)
> - new read(max, buffer) method: copies from the binbuf to the provided
>   buffer, then eina_binbuf_remove(0, max)
>
> This normalizes the API, but adds an extra copy.
> The extra copy could be solved with a hack:
>
>     pd->tmp_buffer = CURLOPT_WRITEFUNCTION's buffer;
>     pd->tmp_size = CURLOPT_WRITEFUNCTION's size * nelem;
>     pd->tmp_used = 0;
>     eo_event_callback_call(read);
>     if (pd->tmp_used < pd->tmp_size)
>       eina_binbuf_append_length(pd->buffer, pd->tmp_buffer + pd->tmp_used,
>                                 pd->tmp_size - pd->tmp_used);
>     pd->tmp_buffer = NULL;
>     pd->tmp_size = 0;
>     pd->tmp_used = 0;
>
> Then read() would be smarter, reading from tmp_buffer once the binbuf
> is exhausted:
>
>     len = eina_binbuf_length_get(pd->buffer);
>     if (len > 0)
>       {
>          sz = min(max, len);
>          memcpy(buffer, eina_binbuf_string_get(pd->buffer), sz);
>          eina_binbuf_remove(pd->buffer, 0, sz);
>          if (len > sz) return; // existing buffer wasn't exhausted
>          max -= sz;
>       }
>
>     if (pd->tmp_size == pd->tmp_used) return; // no data left
>     sz = min(max, pd->tmp_size - pd->tmp_used);
>     memcpy(buffer, pd->tmp_buffer + pd->tmp_used, sz);
>     pd->tmp_used += sz;
>
> With that extra care we can reduce the copies and the performance
> impact.
>
> All in agreement to go this way?
>

No comment; I don't have the energy to dive deep enough to provide an
intelligent response... If my input is really needed here, please ping
me again next week. :)

My only comment, as always, is: KISS.

--
Tom.

_______________________________________________
enlightenment-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
