Just wanted to send out a note to this list in case the API team or others are
interested.  Ultimately I had high hopes for us in terms of using the API to
incorporate Google Spreadsheets into our application, but it doesn't seem to be
at the point that we can really invest in it yet. Below I'll provide some
details about what we had hoped to do and what we actually found. We will of
course check back periodically to see how things progress, and are not
definitively ruling out using GDocs in the future.

Though I can only provide limited detail in a public forum, I'll try and give a
pertinent description of our application and user base. Our application involves
a significant amount of offline processing ("offline" from GDocs' perspective)
which results in a dynamic number of tabular outputs. Some of these outputs are
refreshed nightly, others hourly, still others in realtime. An output typically
has between 4-20 columns and can have up to a few thousand rows.

Our users tend to be organizations with headcount in the 20-60 range, and are
typically comprised of 75% engineers and 25% marketing, sales, and E-Staff.
Penetration of Google Apps in our users' organizations is high -- to date 100%
of them use Google Apps for their email/docs/calendaring. They are typically
VC-backed companies that have raised or are raising a B or C round. 

There is generally a high level of spreadsheet-literacy in our users'
organizations, or at least a willingness to obtain that literacy. This means
that we can target spreadsheets, charts, and pivots as a primary visualization
of our tabular data.

Given this, in combination with the above, we believed that integrating Google
Spreadsheets into our system to handle the reporting needs had a lot to
recommend it:

    - User management is bootstrapped on top of existing users' Google Accounts.

    - UI is the spreadsheet, something our users understand and can manipulate
      to their hearts' content.

    - Export to CSV/Excel for those that need it is built in.

    - Google Docs can be updated by writer X and have reader Y perceive those
      changes in near real-time.

    - Ops staff does not have to run any reporting servers of its own.

The existing options we would also consider (and are now again considering) are
the reporting solutions from companies like Tableau, Pentaho, JasperSoft, and
the like. There is little about these solutions that ultimately fit the style or
culture of our company or our users', but they do the fundamental things above,
albeit at a higher personnel and licensing cost, and with less engaging UIs than
GDocs in many cases.

After a few days of trying to use the SpreadSheets APIs, I ultimately came to
the conclusion that we couldn't afford to make the engineering investment to
make our solution work, nor did I have the feeling that the ongoing reliability
of a Google SpreadSheets-based solution was something we could put our
reputation behind when dealing with our users. 

Particular details that influenced this decision:

    - The API docs (we are using
      https://developers.google.com/google-apps/spreadsheets/)  are not entirely
      consistent with what is happening. After a few hours of hacking around,
      for instance, I discovered that even though the docs encourage (require?)
      use of the 3.0 API, some of the things that the documentation discusses,
      such as the worksheets feed element, is only available if GDocs-Version is
      set to 1.0. Others have mentioned this in various places, sadly without
      much resolution that I can see:

        - 
http://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=2356
          (Comment 4 especially)

        - 
https://groups.google.com/forum/?fromgroups#!searchin/google-spreadsheets-api/worksheets$20feed/google-spreadsheets-api/hrMEWe3oGtU/PgxtMMEYwFYJ

        - 
https://groups.google.com/forum/?fromgroups#!searchin/google-spreadsheets-api/worksheets$20feed/google-spreadsheets-api/-foRn23J_TY/Yhusg4za3cQJ
    
    - There is a general feeling I get from the docs and the spirit of things
      that there is not a truly stable, supportable, bankable API and system in
      place yet. I can't say much more than what is already said here:

        - 
http://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=2516&can=4&q=label%3AAPI-Spreadsheets&colspec=API%20ID%20Type%20Status%20Priority%20Stars%20Opened%20Summary
    
    - Pivots are important to us. Unless I've overlooked it in the docs, there
      is no official support for generating or manipulating pivots. Of course we
      might be able to deduce it and hack at it, but I'm not comfortable with us
      coding to undocumented APIs.
    
    - The reliance on Atom APIS, even with the 'alt=json' alternative, is
      unnecessarily complicated and verbose, especially when considering
      realtime needs (below). I remember 2005-6. They were heady times when
      Atom, Rss, any format or protocol with the word "Open" in it, and being
      able to claim that your application re-used some other existing protocol
      for some new, unintended purpose were all badges of honor. That being
      said, it's not really where we are today as developers, and a more
      straightforward RPC style API would be preferable to us.  Indeed, many of
      Google's own APIs as described through the discovery mechanism
      (http://code.google.com/apis/discovery/) appear to follow a more RPC
      approach. 

    - Related to the point above, there do not appear to be efficient ways to
      push frequent but small updates. The docs do not discuss this, and though
      we could guess at it and experiment with various HTTP-level strategies for
      connection re-use, it would seem to make sense to introduce a WebSockets
      API for more efficient use of the network. This is admittedly minor and
      not as much of a show-stopper for us as the other things above, but it's
      worth pointing out as something that I would expect to see addressed in a
      network-based API.
    
    - When using the GDocs UI to import a tab-separated file of >20K rows, the
      import process completed after some time but ultimately truncated the data
      set silently. The truncation part was unfortunate, but more alarming to me
      was the silence in which it occurred. Although this isn't directly related
      to the APIs, it's worth pointing out that it did nevertheless play into
      the decision and overall feeling that the system may be either a bit
      immature or that there may be an issue of squashing or failing to surface
      errors properly.

Ultimately, I set out a goal of publishing a single snapshot of one tabular data
set to an existing sheet that I had set up for us to try things out. Ultimately
after 2.5 days of working with the APIs (using combinations of curl and Go), I
wasn't able to accomplish it definitively. Although I did finally see a path to
success (involving using GData-Version 1.0 for some calls and 3.0 for others), I
was not able to see a final result. 

My ultimate feeling was that if I, as a senior engineer, was not able to
accomplish this task in 2.5 days, and given that this was just the starting
point of our potential integration, I was uncomfortable with the longterm
prognosis.  Ultimately I had to make the judgement call that the development
cost to us and our users would be more than the IT and ops cost of setting up a
competing reporting solution.

So, that's where we are, and that's the postmortem. Perhaps others will have
found other ways through this or will have feedback for us, though I won't hold
anyone to it of course. I hope that future versions of the APIs will evolve to
suit our needs more, and I wish the GDocs team great luck and success with
everything, and of course I'm very much rooting for you all as things move
forward,

jonathan
http://www.bitgirder.com

Reply via email to