On Tue, Aug 9, 2016 at 12:31 PM, Tom Hacohen <[email protected]> wrote:
> On 09/08/16 16:24, Gustavo Sverzut Barbieri wrote:
>> On Tue, Aug 9, 2016 at 9:21 AM, Tom Hacohen <[email protected]> wrote:
>>> On 08/08/16 21:25, Gustavo Sverzut Barbieri wrote:
>>>> Hi all,
>>>>
>>>> Now it's time to review Ecore_Con_Url towards a better API and then
>>>> provide that as an Eo object. The current ecore_con_url will stay the
>>>> same, but the new one can have some enhancements or simplifications if
>>>> needed.
>>>>
>>>> Before going into the review, the core question will be to have a
>>>> single core class with all methods even if they do not make sense (ie:
>>>> HTTP-specific) or multiple subclasses with protocol specific methods.
>>>> You can see methods split below.
>>>>
>>>> A second point is whether to allow reuse of Ecore_Con_URL handle to do
>>>> other requests. Reviewing the current code and cURL documentation I
>>>> found that we may end up with inconsistent behavior: for example, if a
>>>> POST is done and then a HEAD, some flags need to be reset and we are
>>>> not doing that. To me it looks like create for a single operation,
>>>> then destroy and create a new one if desired, is the common pattern.
>>>
>>> That is not a decision that should be made based on the backend and what
>>> it supports, but rather based on what we want to support. It's easy
>>> enough to just destroy the curl handle and create a new one if someone
>>> wants to reuse the handle.
>>
>> sure, what I wanted to point out is that either we enforce the
>> do-not-reuse pattern or we fix the issue when someone tries to reuse.
>> Currently it's broken (I did not test, but it looks so from reading the docs).
>>
>>> I for one think a new object should be
>>> created per request anyway, so there is no problem.
>>
>> me too :-)
>>
>>
>>>>  * `ecore_con_url_pipeline_set(bool)/get` only usable with http, so
>>>> add that to name?
>>>
>>> HTTP in comparison to what?
>>
>> instead of:
>>
>>     ecore_con_url_pipeline_set(bool)
>>
>> use
>>
>>     ecore_con_url_http_pipeline_set(bool)
>>
>> (actually, with the new-prefix, so efl_net_url_http_pipeline_set(bool)).
>>
>>
>>>> # Base
>>>>
>>>> Should change:
>>>>  * `ecore_con_url_get(con)` execute a regular fetch (HTTP: GET, FTP:
>>>> download) request. GET is confusing since it resembles HTTP-only;
>>>> maybe rename to 'fetch'?
>>>>  * `ecore_con_url_head(con)` execute a body-less fetch (HTTP: HEAD,
>>>> FTP: no data) request. HEAD also resembles HTTP-only; maybe rename to
>>>> 'fetch_no_body' or something like that? Suggestions?
>>>
>>> The problem I have with these two is that yes, head/get are confusing in
>>> the signature, but they are also the correct standard names.
>>> ecore_con_url_action_get would maybe be better suited, but that's, as
>>> you said, an HTTPism. I'm not sure what's the correct way of doing it.
>>> Maybe it should be:
>>>
>>> ecore_con_url_request_type_set(con, GET);
>>
>> this still looks too much like an HTTP-ism. cURL supports POP3, where
>> "LIST" would make more sense. Do we expect people to provide the
>> correct string for each protocol? I find that also strange...
>
> No, that's bad too. I'm just saying I think we should come up with
> naming. Set the type separately in a function, and then just have a
> function that "executes".

okay, we're on the same page here. But the thing is: should we set a
default method so others are not required to? And should the method be
called something generic (i.e. FETCH) or specific (i.e. GET for
HTTP/FTP, LIST/RETR for POP3...)?

Looking in more detail at the libcurl examples, it's kinda tricky. If
you really want to write a mail client using it, you do need to use
the custom method (which I said was not required!) and even set some
other flags. For instance:

https://curl.haxx.se/libcurl/c/pop3-dele.html shows that to delete a
message over POP3 you need to set "DELE" as the custom method and do a
body-less (head) request. Providing a POP3_DELETE would then be more
meaningful, and we could do both operations. The same goes for things
like STAT, UIDL...

But then you come to this example,
https://curl.haxx.se/libcurl/c/pop3-top.html, where the custom method
name must include parameters, like "TOP 1 0" to retrieve only the
headers :-/

I guess this is why other frameworks do not expose cURL itself and
instead call these their "http" modules...


> Actually though, and this is an interesting
> question, I'd expect reuse in POP3. POP3 feels like a protocol where the
> connection would be reused, no? Maybe I'm just not familiar enough. Same
> with IMAP. It feels more like a few round trips on the same resource kind of
> protocol (in comparison to HTTP).

is this about request type or pipelining? I hope you just mixed the
topics here, otherwise I'm completely lost on what you meant :-)

As for pipelining, it seems to be specific to HTTP, in the sense that
different requests can talk to the same server. For the other protocols
it's all part of a single chat session; however, I couldn't find any
libcurl example doing chained commands to see how they handle that,
possibly by reusing the same handle?

Again, as before... being generic enough to cover all possible
protocols curl supports will be painful... in that case I'm more
inclined to expose it as efl_net_curl and map directly to cURL calls,
making it clear we wrap that, and whenever in doubt point users to the
cURL manual.

The other option is to restrict to protocol specific implementations
such as efl_net_http_client, efl_net_ftp_client and so on...


>>>> Okay:
>>>>
>>>>  * `ecore_con_url_new(url): con`
>>>>  * `ecore_con_url_custom_new(url, http_method)` (works for HTTP, FTP,
>>>> IMAP, POP3, SMTP...)
>>>>  * `ecore_con_url_free(con)`
>>>>  * `ecore_con_url_fd_set(con, write_fd)` download to file (http/ftp
>>>> download, etc)
>>>>  * `ecore_con_url_verbose_set(con, bool)` set cURL verbosity
>>>>  * `ecore_con_url_data_set(con, void*)` -> `eo_key_data_set()`
>>>>  * `ecore_con_url_data_get(con)` -> `eo_key_data_get()`
>>>>  * `ecore_con_url_timeout_set(con, double)`
>>>>  * `ecore_con_url_status_code_get(con): int`
>>>>  * `ecore_con_url_response_headers_get(con): list`
>>>>  * `ecore_con_url_ssl_verify_peer_set(con, bool)`
>>>>  * `ecore_con_url_ssl_ca_set(con, file)`
>>>
>>> Actually, all of these feel bad. Especially the first 4. They should
>>> die.
>>
>> please be more specific. The first 2 are constructors; granted, the
>> "custom_new()" is not used outside of tests, and from the cURL docs
>> (https://curl.haxx.se/libcurl/c/CURLOPT_CUSTOMREQUEST.html) it is not
>> that useful in real life, thus it could go.
>>
>> but what's the problem with new()/free()? They are pre-Eo, so needs that.
>
> I meant new/free should die now in Eo land. The constructors: we don't
> construct like that anymore; we just set the http_method in eo_add(). By
> "the first 4 are bad" I meant they have no place in the Eo world, so I
> found it odd you put them there.

My bad, what I meant is that the existing (legacy) API is okay like
that. The new API will be a regular Eo one, with eo_add(), eo_del(),
eo_key_data_get()...
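For the record, a single-request Eo-style construction could look
roughly like this (pure pseudocode: EFL_NET_URL_CLASS and the
property/method names are placeholders I made up, nothing here is an
agreed API):

```
/* hypothetical pseudocode, not a real EFL API */
url = eo_add(EFL_NET_URL_CLASS, loop,
             efl_net_url_set(eo_self, "https://example.com/file"),
             efl_net_url_method_set(eo_self, "GET"));
eo_event_callback_add(url, EFL_NET_URL_EVENT_DONE, _done_cb, NULL);
/* ... the single request runs, then ... */
eo_del(url);
```

This would also match the create-once-then-destroy pattern discussed at
the top of the thread.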


>> fd_set() is indeed some hack, an optimization to write to a file
>> (common case). If we have the "Efl.Loop_User" extra methods I've
>> mentioned in my other thread about ecore_con, it would be much more
>> reasonable to give that Eo object here instead of a simple fd. But
>> that's okay in concept, you make a common case easy to use.
>>
>> OTOH, if you mean the name is misleading, as fd_set() doesn't imply
>> it's a writer or reader... then okay, it's bad :-)
>>
>
> I meant the "hack" part. It just feels like a super hack in all senses.

but if we hand it an object with a write()/close() interface, like the
one I'm proposing in the other thread, would that be better? Or do you
think this is better handled as a separate object that "links" two
objects by doing a copy?

If that is the case, then this Efl.Net.URL (or http.client, or...)
should also implement the same interface. Currently it doesn't: the
proposed interface only notifies that more data is available and you
must call the "read()" method, while the current url API hands you the
data already read. In libcurl's backend that would mean getting
libcurl's data in the callback, storing it locally in an instance
binbuf, then emitting the "read" event, which triggers the listener to
call read(), which removes data from the binbuf:

Current with libcurl/ecore_con_url:
 - event: ECORE_CON_EVENT_URL_DATA (size, data)
 - no read method; maps well to
https://curl.haxx.se/libcurl/c/CURLOPT_WRITEFUNCTION.html

To implement the same Eo interface (proposed in the other email about
the ecore-con eoify):
 - event: "read" (no payload)
 - method: read(size max, uint8 *buffer): ssize_t

Then to implement it using libcurl:
 - in the CURLOPT_WRITEFUNCTION cb:
    * previously: generated ECORE_CON_EVENT_URL_DATA
    * now: eina_binbuf_append_length() + eo_event_callback_call(read)
 - new read(max, buffer) method: copy from the binbuf to the provided
buffer, then eina_binbuf_remove(0, max)

This normalizes the API but adds an extra copy. The extra copy could
be avoided with a hack:
   pd->tmp_buffer = CURLOPT_WRITEFUNCTION's buffer
   pd->tmp_size = CURLOPT_WRITEFUNCTION's size * nmemb
   pd->tmp_used = 0
   eo_event_callback_call(read);
   if (pd->tmp_used < pd->tmp_size)
      eina_binbuf_append_length(pd->buffer, pd->tmp_buffer + pd->tmp_used,
                                pd->tmp_size - pd->tmp_used);
   pd->tmp_buffer = NULL;
   pd->tmp_size = 0;
   pd->tmp_used = 0;

Then read() would be smarter and read from tmp_buffer once the binbuf
is exhausted (note it must return how much was read, and the second
copy must continue at buffer + copied):
  len = eina_binbuf_length_get(pd->buffer);
  copied = 0;
  if (len > 0)
    {
       sz = min(max, len);
       memcpy(buffer, eina_binbuf_string_get(pd->buffer), sz);
       eina_binbuf_remove(pd->buffer, 0, sz);
       if (len > sz) return sz; // existing buffer wasn't exhausted
       copied = sz;
       max -= sz;
    }

  if (pd->tmp_size == pd->tmp_used) return copied; // no data left
  sz = min(max, pd->tmp_size - pd->tmp_used);
  memcpy(buffer + copied, pd->tmp_buffer + pd->tmp_used, sz);
  pd->tmp_used += sz;
  return copied + sz;

With that extra care we can reduce the copies and the performance impact.

Are we all in agreement to go this way?

-- 
Gustavo Sverzut Barbieri
--------------------------------------
Mobile: +55 (16) 99354-9890

_______________________________________________
enlightenment-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
