Jim Rhyness created TOREE-391:
---------------------------------

             Summary: Messages to Jupyter kernel gateway are dropped in jeromq
                 Key: TOREE-391
                 URL: https://issues.apache.org/jira/browse/TOREE-391
             Project: TOREE
          Issue Type: Bug
    Affects Versions: 0.1.0
         Environment: Linux ( RHEL 7.3 )
            Reporter: Jim Rhyness


Kernel restart from Jupyter kernel gateway fails with a timeout.  The kernel
is restarted, but kernel gateway times out waiting for the kernel_info_reply
message that it expects in response to the kernel_info_request it sends after
initiating the restart.

The problem is reproducible most of the time with something like this:

curl -v -X POST --data '{ "name":"apache_toree_scala" }' http://127.0.0.1:8888/api/kernels
curl -v -X POST --data '{}' http://127.0.0.1:8888/api/kernels/<kernelid-from-above>/restart


From the IPython message protocol doc, this is the message format:

[
  b'u-u-i-d',         # zmq identity(ies)
  b'<IDS|MSG>',       # delimiter
  b'baddad42',        # HMAC signature
  b'{header}',        # serialized header dict
  b'{parent_header}', # serialized parent header dict
  b'{metadata}',      # serialized metadata dict
  b'{content}',       # serialized content dict
  b'blob',            # extra raw data buffer(s)
  ...
]
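
As a rough sketch of how a receiver could split such a message, assuming the
frames arrive as a list of byte strings: everything before the delimiter is an
identity frame and stays as raw bytes. The function name split_frames and the
sample identity are hypothetical, not taken from Toree or Jupyter.

```python
DELIMITER = b"<IDS|MSG>"


def split_frames(frames):
    """Split a multipart message into (identities, signature, body).

    Identity frames are returned as raw bytes and are never decoded.
    """
    idx = frames.index(DELIMITER)  # raises ValueError if the delimiter is missing
    identities = frames[:idx]      # zero or more zmq identity frames
    signature = frames[idx + 1]    # HMAC signature frame
    body = frames[idx + 2:]        # header, parent_header, metadata, content, blobs
    return identities, signature, body


# Hypothetical message with one five-byte identity that is not valid UTF-8.
message = [
    b"\x00\x8a\x3c\x9f\x41",
    DELIMITER,
    b"baddad42",
    b"{}", b"{}", b"{}", b"{}",
]
identities, signature, body = split_frames(message)
```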

The first frame of the message contains the zmq identities.  In some cases a
Router-type socket has no identity supplied by the peer, so jeromq generates
one, and it then consists of five bytes: a zero byte followed by a random
4-byte int.

In Toree, all frames are treated as Strings.  Decoding the zmq id as UTF-8
corrupts it: every byte sequence that is not valid UTF-8 is replaced by the
replacement character U+FFFD, whose UTF-8 encoding is 0xEFBFBD.
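
The effect can be illustrated with a small Python sketch.  It is not Toree
code: decoding with errors="replace" stands in here for the JVM String
conversion, and the identity bytes and function names are made up for the
example.

```python
import os


def make_identity():
    # Hypothetical stand-in for an auto-generated identity:
    # a zero byte followed by four random bytes.
    return b"\x00" + os.urandom(4)


def round_trip_as_string(frame):
    # Decode the frame as UTF-8 text, then encode it again.
    # Invalid sequences become U+FFFD, which encodes as EF BF BD.
    return frame.decode("utf-8", errors="replace").encode("utf-8")


identity = b"\x00\x8a\x3c\x9f\x41"   # fixed example; 0x8a and 0x9f are not valid UTF-8
corrupted = round_trip_as_string(identity)

print(identity.hex())    # 008a3c9f41
print(corrupted.hex())   # 00efbfbd3cefbfbd41
```

The five-byte id comes back as nine bytes, so it no longer matches the
identity that the socket knows.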

When the corrupted id is used to address a message sent through the Router
socket, no peer matching that id is found and the message is dropped.
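
A toy model of identity-based routing shows why the message disappears
without an error.  This is only a sketch under the assumption that the socket
looks peers up by exact identity bytes; ToyRouter is a hypothetical class, not
the jeromq implementation.

```python
class ToyRouter:
    """Toy model of a Router socket: peers are looked up by identity bytes."""

    def __init__(self):
        self.peers = {}    # identity bytes -> list of delivered payloads
        self.dropped = 0   # count of messages with no matching peer

    def connect(self, identity):
        self.peers[identity] = []

    def send(self, identity, payload):
        inbox = self.peers.get(identity)
        if inbox is None:
            # Unknown identity: the message is discarded, not reported.
            self.dropped += 1
            return False
        inbox.append(payload)
        return True


router = ToyRouter()
original_id = b"\x00\x8a\x3c\x9f\x41"               # example identity, not valid UTF-8
corrupted_id = b"\x00\xef\xbf\xbd\x3c\xef\xbf\xbd\x41"  # same id after a String round trip
router.connect(original_id)

delivered_corrupted = router.send(corrupted_id, b"kernel_info_reply")
delivered_original = router.send(original_id, b"kernel_info_reply")
```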

This affects other messages as well, not just kernel_info_reply.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
