[sage-devel] Re: notebook rewrite

William Stein Tue, 21 Jul 2009 13:22:17 -0700

On Tue, Jul 21, 2009 at 1:13 PM, Ondrej Certik<ond...@certik.cz> wrote:
>
> On Tue, Jul 21, 2009 at 12:53 PM, William Stein<wst...@gmail.com> wrote:
>>
>>
>> On Tue, Jul 21, 2009 at 10:21 AM, Ondrej Certik <ond...@certik.cz> wrote:
>>>
>>> On Tue, Jul 21, 2009 at 10:44 AM, William Stein<wst...@gmail.com> wrote:
>>> >
>>> > On Tue, Jul 21, 2009 at 9:39 AM, Ondrej Certik<ond...@certik.cz> wrote:
>>> >>
>>> >> On Tue, Jul 21, 2009 at 1:58 AM, Robert Bradshaw<rober...@math.washington.edu> wrote:
>>> >>>
>>> >>> On Jul 20, 2009, at 9:02 PM, Ondrej Certik wrote:
>>> >>
>>> >>>> Well, let me say that I would much rather run things on the appengine
>>> >>>> than constantly maintain our own servers. I see no reason why the
>>> >>>> notebook cannot run on the appengine, with only the AJAX talking to
>>> >>>> our own server with Sage to actually evaluate the cells (and for
>>> >>>> many people, I think appengine itself could actually be enough). I
>>> >>>> still have to think about the best way to transfer data to the
>>> >>>> database with the worksheets.
>>> >>>
>>> >>> +1, though for Sage we rely heavily on compiled code. I wonder how
>>> >>> much introduced latency there would be if the backend were served on
>>> >>> a university computer, and the front end in appengine.
>>> >>
>>> >> I think none, it would be as fast as it is now (e.g. the browser
>>> >> communicating directly with the engine).
>>> >
>>> > How is it "none", given that there are now three separate computers
>>> > involved instead of two?  There would have to be a little extra
>>>
>>> What I meant is that the latency between typing 1+1 into the cell and
>>> getting the output cell saying 2 should not change at all, because the
>>> javascript in the browser sends a POST request to the Sage engine
>>> (i.e. a web app with a URL interface, just like it is now) and the
>>> engine returns the result directly to the browser.
>>
>> Thanks for the clarification, since I clearly misunderstood you.  Robert
>> said "backend were served on a university computer, and the front end in
>> appengine."  You seem to be eliminating the frontend completely when
>> computations are done.  I.e., do you imagine appengine *just* serving some
>> javascript and a database interface, and basically nothing else?  So what
>> would happen is the following:
>>
>> 1. User visits the appengine server and gets the javascript for the sage
>> notebook (after authenticating).
>> 2. User starts a worksheet.   The javascript in the browser requests a "sage
>> engine token", and the appengine allocates a "compute engine" somewhere for
>> use by that user's worksheet.
>> 3. The user types "factor(2^197-1)" and their javascript *directly* connects
>> to the compute engine and runs the code "factor(2^197-1)".  It also connects
>> to the appengine and stores that "factor(2^197-1)" was input in the
>> database.
>> 4. The javascript in the browser gets back the answer to the factor query
>> and displays the result.
>> 5. The javascript in the browser later also stores the result in the app
>> engine database.
>
> That's exactly correct.
>
> Another possibility is to change 5) into 5'):
>
> 5') the Sage engine talks to the appengine database server directly.
>
> The advantage of 5') over 5) is that the Sage engine should be running
> on some fast network anyways (thus the communication Sage engine <->
> app engine server will be fast), but the user's laptop can be on some
> crappy connection.


Note that there are new security implications to 5' that are not in 5.
Without more careful thought, the Sage engine has to have the same
authentication credentials as the user, since the Sage engine suddenly
gets to change anything in the user's worksheets.  This isn't
necessarily a problem, but it is something to think about.
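
One way to limit that exposure, as a rough sketch only: instead of the
user's credentials, the engine could be handed a token that is valid for
a single worksheet. Everything named below (SERVER_SECRET,
issue_engine_token, engine_may_write, the user and worksheet ids) is made
up for illustration and is not part of any existing notebook code.

```python
import hashlib
import hmac

# Hypothetical secret known only to the appengine front end.
SERVER_SECRET = b"example-secret-not-for-production"


def issue_engine_token(user, worksheet_id):
    """Sign a token that names exactly one worksheet of one user."""
    message = f"{user}:{worksheet_id}".encode()
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def engine_may_write(token, user, worksheet_id):
    """Check that the token was issued for this user and this worksheet."""
    expected = issue_engine_token(user, worksheet_id)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(token, expected)


token = issue_engine_token("ondrej", "ws-17")
print(engine_may_write(token, "ondrej", "ws-17"))  # True
print(engine_may_write(token, "ondrej", "ws-18"))  # False
```

With something like this, a compromised engine could at worst modify the
one worksheet it was allocated for, not everything the user owns.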

>
>>
>> I think there could be some weird security issues/tricks involved with the
>> javascript in the browser directly doing AJAX calls to the "compute engine"
>> above, but there are hacks to get around that.  There's also twice the
>
> Right.
>
>> communications overhead between the user's javascript and remote machines
>> than in the current Sage notebook model where everything goes through the
>> notebook server.    E.g., if the output of a Sage command (in step 4 and 5
>> above)  is large, e.g., a 10MB image, then that image is going to go all
>> over the place, both uploaded and downloaded, which will be incredibly
>> expensive.
>
> I agree, I think we should use 5'). E.g. if the database engine and
> Sage engine are running on the same machine, that's the current design,
> but if they are decoupled and connected by a fast network, it could
> still work.
>
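
To put rough numbers on the 10MB example, here is a small
back-of-the-envelope sketch. It only counts the output itself (not the
input code or HTTP overhead), and the function name and dictionary keys
are my own invention.

```python
def traffic_mb(output_mb, engine_stores_result):
    """Return MB moved per link for one cell evaluation.

    engine_stores_result=False is option 5): the browser uploads the result.
    engine_stores_result=True  is option 5'): the engine stores it directly.
    """
    user_link = output_mb            # engine -> browser, to display the result
    engine_to_db = 0
    if engine_stores_result:
        engine_to_db = output_mb     # engine -> appengine database
    else:
        user_link += output_mb       # browser -> appengine database
    return {"user_link": user_link, "engine_to_db": engine_to_db}


print(traffic_mb(10, engine_stores_result=False))
# {'user_link': 20, 'engine_to_db': 0}
print(traffic_mb(10, engine_stores_result=True))
# {'user_link': 10, 'engine_to_db': 10}
```

So with 5) the user's (possibly slow) connection carries the image twice,
while with 5') the second copy travels between the engine and the
database instead.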

Another issue with 5' is that it means the "sage engine" has to be
able to open new outgoing connections to communicate with the database
server.  This could be a problem if the sage engine is running in some
sort of locked down sandboxed environment.  Again, this isn't
insurmountable, but you should keep it in mind.
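
One possible way around this, assuming the database side is allowed to
connect to the engine, is to have the engine queue its results and let
the database side pull them, so the engine never opens an outgoing
connection. A toy in-memory sketch (the class and method names are
hypothetical, and eval() stands in for actually running Sage):

```python
from collections import deque


class SandboxedEngine:
    """Hypothetical engine that only answers incoming requests."""

    def __init__(self):
        self._outbox = deque()

    def evaluate(self, cell_id, code):
        # Stand-in for running Sage: eval() on trusted demo input only.
        result = str(eval(code, {"__builtins__": {}}))
        self._outbox.append((cell_id, code, result))
        return result

    def drain_outbox(self):
        """Called by the database side; hands over unsaved results once."""
        items = list(self._outbox)
        self._outbox.clear()
        return items


class WorksheetStore:
    """Hypothetical stand-in for the appengine database."""

    def __init__(self):
        self.cells = {}

    def pull_from(self, engine):
        # The store initiates the transfer; the engine stays passive.
        for cell_id, code, result in engine.drain_outbox():
            self.cells[cell_id] = (code, result)


engine = SandboxedEngine()
store = WorksheetStore()
engine.evaluate(1, "1+1")
store.pull_from(engine)
print(store.cells)  # {1: ('1+1', '2')}
```

The cost is that results sit on the engine until the next pull, so a
crash in between loses whatever has not been fetched yet.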

> The appengine database backend has to have some notion of the engine
> anyways, so it might as well retrieve the results from it.
>
> I agree that it might be too complex/tricky/error prone. I simply don't know.

I don't either.  Trickiness is all relative.  If you encapsulate
things with a good design, you can sometimes build up very complicated
tricky systems that seem simple.

>>>
>>> What changes is the database storage: either the javascript in the
>>> browser, once it receives the output of the cells, also sends it to
>>> the appengine (or wherever the database is running), or the engine
>>> sends it itself. I don't know yet which approach is better. So there
>>> are some issues involved, like what happens if one of those
>>> connections fails. But as long as both connections are up and running,
>>> the user would not notice anything at all.
>>
>> This is an interesting design. It hadn't occurred to me before.  It would be
>> interesting to see whether it is any good or not (I can't tell).
>
> Me neither.
>
>>
>> I can tell you one thing, which is that when I start working on the notebook
>> again seriously this September, my first goal will be to create a powerful
>> system for simulating the load of n people all using the notebook at once in
>> a potentially heterogeneous way (say from several different computers,
>> etc.).  This testing code will hopefully be generic enough to work with
>> codenode, sagenb, etc.   I think having actual benchmark testing code will
>> in the long run be a better litmus test for designs than us just thinking
>> about them in the abstract.
>>
>> I could pronounce the design you suggest above as "bad" for several reasons,
>> but what if I'm wrong and in fact the design above, with some tweaks and
>> insights that would result from testing, turns out to be amazingly good?
>
> Exactly. I don't know myself and I am not sure about the exact technical
> details of my design, e.g. 5) vs 5') etc. But my motivation is that I
> really want it to be able to run on the appengine completely if
> needed, because there are tons of situations where I just want to
> show off some simple thing, be it sympy, or just some simple algorithm
> in python, and I really *don't* want to maintain my own server for
> that.
>
> At the same time however, I really would like to just create a simple
> engine with web API (be it Sage, or anything else), and I would like
> to maintain just this engine and if it dies, the frontend (running
> somewhere else) would just use a different engine, or whatever.
>
> So I would like to have that, but whether it's possible to get everything
> right and robust and fast, I simply don't know.
>
>> I strongly encourage you to test pyjamas with the above.  I think that's the
>> best possible next step.
>
> I will report later on this. It seems to work, but I can already see a
> big issue -- it seems a bit slow (e.g. the generated javascript in the
> browser). But it's too early to tell, once I implement the same thing,
> we can then compare which approach is the best in the long run.

That sounds like very useful information.  Benchmarking is super super
important for something like this, since javascript is already slow.
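
To make the load-testing idea concrete, here is a very small sketch of
what simulating n users could look like. It is not the testing system I
have in mind, just an illustration: fake_evaluate is a made-up stand-in
for one round trip to a notebook server (codenode, sagenb, or anything
else), and the 10ms sleep is an arbitrary assumption.

```python
import statistics
import threading
import time


def fake_evaluate(code):
    """Hypothetical stand-in for one round trip to a notebook server."""
    time.sleep(0.01)
    return str(eval(code, {"__builtins__": {}}))


def simulate_users(n_users, cells_per_user, evaluate=fake_evaluate):
    """Run n_users threads, each evaluating cells, and collect latencies."""
    latencies = []
    lock = threading.Lock()

    def one_user():
        for _ in range(cells_per_user):
            start = time.perf_counter()
            evaluate("1+1")
            elapsed = time.perf_counter() - start
            with lock:
                latencies.append(elapsed)

    threads = [threading.Thread(target=one_user) for _ in range(n_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies


latencies = simulate_users(n_users=5, cells_per_user=3)
print(len(latencies), "requests, median", statistics.median(latencies), "s")
```

Swapping fake_evaluate for a function that makes a real HTTP request
would let the same harness compare different notebook designs.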

> Ondrej
>
> >
>



-- 
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org
