Dario,

Most HTTP libraries/parsers (including the one that Mesos uses internally) 
provide a way to specify a default size for each chunk. If a Mesos Event is too 
big, it gets split across smaller chunks, and vice versa.
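
Because chunk boundaries and event boundaries are independent, a client has to 
buffer the de-chunked bytes and cut events out of them itself. A minimal sketch 
in Python, assuming the "length LF data" framing quoted further down this 
thread; this is not Mesos code, and the RecordIODecoder name is hypothetical:

```python
import json


class RecordIODecoder:
    """Hypothetical helper: cuts 'length LF data' records out of a byte stream."""

    def __init__(self):
        self._buffer = b""

    def feed(self, fragment):
        """Append de-chunked bytes and return every record completed so far."""
        self._buffer += fragment
        records = []
        while True:
            header, sep, rest = self._buffer.partition(b"\n")
            if not sep:
                break  # the length line has not fully arrived yet
            length = int(header)  # decimal size of the event in bytes
            if len(rest) < length:
                break  # the event body has not fully arrived yet
            records.append(rest[:length])
            self._buffer = rest[length:]
        return records


decoder = RecordIODecoder()
# First fragment: one complete event followed by the size of the next one.
first = decoder.feed(b'20\n{"type":"HEARTBEAT"}20')
# Second fragment: the LF and the body that belong to that pending size.
second = decoder.feed(b'\n{"type":"HEARTBEAT"}')
events = [json.loads(record) for record in first + second]
```

The two fragments mimic what curl showed on stdout: the trailing "20" is held 
in the buffer until the rest of that event arrives.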

-anand

> On Aug 28, 2015, at 11:51 AM, dario.re...@me.com wrote:
> 
> Anand,
> 
> in the example from my first mail you can see that curl prints the size of a 
> message and then waits for the next message. Only when it receives that 
> message does it print the prior message plus the size of the next message, 
> but not the next message itself.
> 
> What's the benefit of encoding multiple messages in a single chunk? You could 
> simply create a single chunk per event.
> 
> Cheers,
> Dario
> 
> On 28.08.2015, at 19:43, Anand Mazumdar <an...@mesosphere.io 
> <mailto:an...@mesosphere.io>> wrote:
> 
>> Dario,
>> 
>> Can you shed a bit more light on what you still find puzzling about the CURL 
>> behavior after my explanation?
>> 
>> PS: A single HTTP chunk can have 0 or more Mesos (Scheduler API) Events. So 
>> in your example, the first chunk had complete information about the first 
>> “event”, followed by partial information about the subsequent event from 
>> another chunk.
>> 
>> As for the benefit of using RecordIO format here, how else do you think we 
>> could have demarcated two events in the response?
>> 
>> -anand
>> 
>> 
>>> On Aug 28, 2015, at 10:01 AM, dario.re...@me.com 
>>> <mailto:dario.re...@me.com> wrote:
>>> 
>>> Anand,
>>> 
>>> thanks for the explanation. I'm still a little puzzled why curl behaves so 
>>> strangely. I will check how other clients behave as soon as I have a chance.
>>> 
>>> Vinod,
>>> 
>>> what exactly is the benefit of using recordio here? Doesn't it make the 
>>> content-type somewhat wrong? If I send 'Accept: application/json' and 
>>> receive 'Content-Type: application/json', I actually expect to receive only 
>>> json in the message.
>>> 
>>> Thanks,
>>> Dario
>>> 
>>> On 28.08.2015, at 18:13, Vinod Kone <vinodk...@apache.org 
>>> <mailto:vinodk...@apache.org>> wrote:
>>> 
>>>> I'm happy to add the "\n" after the event (note it's different from chunk) 
>>>> if that makes CURL play nicer. I'm not sure about the "\r" part though? Is 
>>>> that a nice to have or does it have some other benefit?
>>>> 
>>>> The design doc is not set in stone since this has not been released 
>>>> yet. So we definitely want to do the right/easy thing.
>>>> 
>>>> On Fri, Aug 28, 2015 at 7:53 AM, Anand Mazumdar <an...@mesosphere.io 
>>>> <mailto:an...@mesosphere.io>> wrote:
>>>> Dario,
>>>> 
>>>> Thanks for the detailed explanation and for trying out the new API. 
>>>> However, this is not a bug. The output from CURL is the encoding used by 
>>>> Mesos for the events stream. From the user doc 
>>>> <https://github.com/apache/mesos/blob/master/docs/scheduler_http_api.md>:
>>>> 
>>>> "Master encodes each Event in RecordIO format, i.e., string representation 
>>>> of length of the event in bytes followed by JSON or binary Protobuf  
>>>> (possibly compressed) encoded event. Note that the value of length will 
>>>> never be ‘0’ and the size of the length will be the size of unsigned 
>>>> integer (i.e., 64 bits). Also, note that the RecordIO encoding should be 
>>>> decoded by the scheduler whereas the underlying HTTP chunked encoding is 
>>>> typically invisible at the application (scheduler) layer."
>>>> 
>>>> If you run CURL with tracing enabled, i.e. --trace, the output would be 
>>>> something similar to this:
>>>> 
>>>> <= Recv header, 2 bytes (0x2)
>>>> 0000: 0d 0a                                           ..
>>>> <= Recv data, 115 bytes (0x73)
>>>> 0000: 36 64 0d 0a 31 30 35 0a 7b 22 73 75 62 73 63 72 6d..105.{"subscr
>>>> 0010: 69 62 65 64 22 3a 7b 22 66 72 61 6d 65 77 6f 72 ibed":{"framewor
>>>> 0020: 6b 5f 69 64 22 3a 7b 22 76 61 6c 75 65 22 3a 22 k_id":{"value":"
>>>> 0030: 32 30 31 35 30 38 32 35 2d 31 30 33 30 31 38 2d 20150825-103018-
>>>> 0040: 33 38 36 33 38 37 31 34 39 38 2d 35 30 35 30 2d 3863871498-5050-
>>>> 0050: 31 31 38 35 2d 30 30 31 30 22 7d 7d 2c 22 74 79 1185-0010"}},"ty
>>>> 0060: 70 65 22 3a 22 53 55 42 53 43 52 49 42 45 44 22 pe":"SUBSCRIBED"
>>>> 0070: 7d 0d 0a                                        }..
>>>> <others
>>>> 
>>>> In the output above, the chunks are correctly delimited by 'CRLF' (0d 0a) 
>>>> as per the HTTP RFC. As mentioned earlier, the output that you observe on 
>>>> stdout with CURL is the RecordIO encoding used for the events stream 
>>>> (and is not related to the RFC):
>>>> 
>>>> event = event-size LF
>>>>              event-data
>>>> 
>>>> Looking forward to more bug reports as you try out the new API!
>>>> 
>>>> -anand
>>>> 
>>>>> On Aug 28, 2015, at 12:56 AM, Dario Rexin <dario.re...@me.com 
>>>>> <mailto:dario.re...@me.com>> wrote:
>>>>> 
>>>>> -1 (non-binding)
>>>>> 
>>>>> I found a breaking bug in the new HTTP API. The messages do not conform 
>>>>> to the HTTP standard for chunked transfer encoding. In RFC 2616 Sec. 3 
>>>>> (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html 
>>>>> <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html>) a chunk is 
>>>>> defined as:
>>>>> 
>>>>> chunk = chunk-size [ chunk-extension ] CRLF
>>>>>         chunk-data CRLF
>>>>> 
>>>>> The HTTP API currently sends a chunk as:
>>>>> 
>>>>> chunk = chunk-size LF
>>>>>         chunk-data
>>>>> 
>>>>> A standards-conformant HTTP client like curl can’t correctly interpret the 
>>>>> data as a complete chunk. In curl it currently looks like this:
>>>>> 
>>>>> 104
>>>>> {"subscribed":{"framework_id":{"value":"20150820-114552-16777343-5050-43704-0000"}},"type":"SUBSCRIBED"}20
>>>>> {"type":"HEARTBEAT"}666
>>>>> …. waiting …
>>>>> {"offers":{"offers":[{"agent_id":{"value":"20150820-114552-16777343-5050-43704-S0"},"framework_id":{"value":"20150820-114552-16777343-5050-43704-0000"},"hostname":"localhost","id":{"value":"20150820-114552-16777343-5050-43704-O0"},"resources":[{"name":"cpus","role":"*","scalar":{"value":8},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":15360},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":2965448},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"localhost","ip":"127.0.0.1","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}20
>>>>> … waiting …
>>>>> {"type":"HEARTBEAT"}20
>>>>> … waiting …
>>>>> 
>>>>> It will receive a couple of messages after successful registration with 
>>>>> the master and the last thing printed is a number (in this case 666). 
>>>>> Then after some time it will print the first offers message followed by 
>>>>> the number 20. The explanation for this behavior is that curl can’t 
>>>>> interpret the data it gets from Mesos as a complete chunk and waits for 
>>>>> the missing data. So it prints what it thinks is a chunk (a message 
>>>>> followed by the size of the next message) and keeps the rest of the 
>>>>> message until another message arrives and so on. The fix for this is to 
>>>>> terminate both lines, the message size and the message data, with CRLF.
>>>>> 
>>>>> Cheers,
>>>>> Dario
>>>> 
>>>> 
>> 
