Re: Retrieve header names for list feeds.

Vic Fryzel Mon, 04 Apr 2011 09:17:11 -0700

Again, I'm looking into just giving back header information with each list
entry, which should resolve everyone's issues.  With a valid workaround of
using the cells feed to look up the first row positionally though, this
isn't a huge deal, in my honest opinion.


-Vic



On Thu, Mar 17, 2011 at 8:22 AM, David Bullock <[email protected]> wrote:

> Hey Jason, that's a great write-up, and a great illustration of the
> pain of (as you nicely put it) content-dependent keys.
>
> Just dreaming here, but follows might be the definition of a list.
> (Borrowing heavily from the deprecated Table API).
>
>  <gs:data>
>      <gs:header height="1" nativenames="1" />
>      <gs:column idx="1" key="id" name="Item" />
>      <gs:column idx="2" key="descr" name="Description" />
>      <gs:column idx="3" key="qty" name="Quantity" />
>      <gs:column idx="4" key="ea" name="Each" />
>      <gs:column idx="5" key="total" name="Total" c1="=qty * ea" />
>  </gs:data>
>
>
> The 'name' attribute would be the content-dependent key, while the
> 'key' would be the content-independent key, conforming to existing
> list-feed naming constraints.  User changes to the sheet would modify
> 'idx' and 'name', but 'key' would stay put (and the E-Tag and Last-
> Modified of the resource containing this metatdata would change).
> When updating this metadata via the ATOM/REST API, 'key' would be
> required always, and either 'name' or 'idx' would suffice to establish
> the binding (so long as 'name' could be distinctly found in the header
> row when the metadata is POSTed or PUT or else a 4xx error).  If
> *both* 'name' and 'idx' are supplied, the index establishes the
> binding, and the content of the header cell is rewritten.
>
> Where to put this metadata?  How to integrate it with the existing
> list feed?  OK, here's my proposal in baby-steps:
>
> Put this metadata into entries in a separate feed.  Call it the 'List
> Definition' feed to avoid confusion with Tables.  Now, for each
> worksheet in the spreadsheet, there is an automatic List Definition,
> which have the property that their 'key' is always a function of
> 'name' according to the present name-munging rules in the List API.
> These automatic List Definitions can't have their column 'key' values
> set via the API (except as a side-effect of setting 'name') or else
> they give a 4xx HTTP status.
>
> Existing clients of the API access the lists defined by these
> 'automatic list definitions' in the same way they do now.
>
> I guess by now we have to fill in a few details:
>
> <entry>
>     <title>tableA</title>
>     <gs:listtype>auto</gs:listtype>
>     <gs:worksheet name="Invoices" />
>     <gs:data>
>      <gs:header height="1" nativenames="1" />
>      <gs:column idx="1" key="id" name="Item" />
>      <gs:column idx="2" key="descr" name="Description" />
>      <gs:column idx="3" key="qty" name="Quantity" />
>      <gs:column idx="4" key="ea" name="Each" />
>      <gs:column idx="5" key="total" name="Total" c1="=qty * ea" />
>  </gs:data>
>  <content src="https://foo/bar"; type=http://schemas.google.com/
> spreadsheets/2006#listfeed" />
> </entry>
>
> Now, if we want, we can add more list definitions (unique by <title/>
> within the feed), defining whatever we liked.  Additional list feeds
> are created for each definition we POST/PUT, using the key names for
> the <gsx:*/> elements in the list entry.  (The <gs:field name=""
> idx=""/> was probably a bit of overkill in the defunct Records feed).
>
> Any cool things that might have come with the now-defunct Record feed
> in terms of queries can be added to the list feed's querying abillity.
>
> I slipped in the 'c1' attribute to <gs:column/> ... that's all about
> declaring the formula for a given column, but only in terms of other
> fields in the row.  Thought would need to be given to naming and
> synatx, but think of it more as a template for a legal spreadsheet
> expresion rather than a legal spreadsheet expression itself.  It might
> only be valid in the 'automatic' list feed definitions. Such columns
> would exhibit the property that, when you inserted a row via the API
> or via the GUI, it auto-filled the formula into the cell.
>
> That API would certainly solve my use-cases.  What about you Jason?
>
> What about you, Google?
>
> cheers,
> David.
>
> On Mar 17, 3:20 am, Jason Cipriani <[email protected]> wrote:
> > I agree with David's sentiments. Personally, however, my use cases are
> very
> > different.
> >
> > I've been doing mostly web application development with PHP (and some
> > desktop application development with Java). The Zend client library does
> not
> > support tables, only list and cell feeds.
> >
> > I map Google spreadsheet column names to application-specific column
> names
> > myself, all of my applications have a central configuration that defines
> > this mapping, and I have the spreadsheet access hidden behind a layer of
> > abstraction in my software. I abstract it away like this for two reasons:
> 1)
> > because, as David mentions, list feed keys are content-dependent and
> > unreliable, and 2) so I can move away from Google spreadsheets some day
> (the
> > only reason I am using them is because I've been building applications on
> > top of existing systems powered by Google Forms).
> >
> > What I do to combat unreliable column headers is each time I access the
> list
> > feed, I compare the key names (and order) with the last time I accessed
> the
> > list feed. This is a virtually free operation compared to the list feed
> > access itself. If they change I have the application alert me via email
> so I
> > can go adjust the configuration. If the keys change to the point where
> they
> > might break the mapping in my application, the applications will report
> an
> > error to the user until I fix it -- for the applications I've been
> writing
> > it is more important to make sure the data is not corrupt rather than
> making
> > sure the apps have 100% uptime. Column changes happen infrequently enough
> > that this works well for me, the only reason they change is because some
> > users access the spreadsheets directly and change things they shouldn't
> be
> > changing (something I can't control).
> >
> > Because the spreadsheets I work with are generated by Google Forms, and
> > because Google Forms puts the wording of the question in the column
> headers,
> > and because sometimes the questions are very, very long, sometimes I map
> by
> > column index rather than list key because I simply don't feel comfortable
> > using 500 character key names that can change any time somebody, say,
> fixes
> > a grammatical error in a Google Forms question.
> >
> > Here is the algorithm I've been using to map column header values to
> indexes
> > and list feed keys. So far it has been working but it makes a lot of
> > undocumented assumptions, and only works if there is at least one
> non-header
> > row in the spreadsheet (list feed must return a result). Weird pseudocode
> > follows:
> >
> > === BEGIN ===
> >
> > struct COLUMN {
> >    string header;
> >    int index;
> >    string listkey;
> >
> > };
> >
> > headers = retrieve first row via cell feed, no empty cell return;
> > columns = a list of COLUMN structures, initially empty;
> >
> > for each cell in headers {
> >   COLUMN column;
> >   column.header = cell.text;
> >   column.index = cell.columnnumber;
> >   // column.listkey not determined yet
> >   columns.add(column);
> >
> > }
> >
> > sort columns by .index; // no guarantees that cells returned in order.
> >
> > // now 'columns' is a list of nonempty column headers and column
> > // indices, from left to right in the spreadsheet.
> >
> > listitem = retrieve a single row via list feed;
> > int curindex = 0;
> >
> > // assumptions:
> > //   - keys for columns with empty headers will start with an underscore.
> > //   - list entry fields are in order left to right, and correspond with
> > //     column headers.
> > for each field in listitem {
> >   if (field.key starts with '_')
> >     continue; // skip
> >   columns[curindex].listkey = field.key;
> >
> > }
> >
> > // columns now contains column header, index, and list key name for each
> > // column. this data can be cached, and this cache invalidated if the
> > // application detects a change in the list entry keys, which it can do
> > // as it retrieves items.
> >
> > === END ===
> >
> > *So for me, it would solve a ton of problems if the list feed simply had
> a
> > new column metadata element that contained column header names, column
> > indices, and key names.*
> >
> > It's all very messy and hard to work with right now.
> >
> > Jason
> >
> >
> >
> > On Wed, Mar 16, 2011 at 9:01 AM, David Bullock <[email protected]>
> wrote:
> > > In the lately deprecated tables API, if it had ever worked the way I
> > > read that it was intended to work (which it once did), one wouldn't
> > > ever need to discover the column names by reading the header row.
> > > Instead, one would say something like the following when defining a
> > > table:
> >
> > >  *in future, for my own convenience, I will refer to column B as
> > > 'Description'"
> >
> > > The next day, when some sensible spreadsheet user inserts a column to
> > > the left of B and renames the new C1 to "Item Name", your app doesn't
> > > break.  Just like a named range, the sheet adjusts the reference so
> > > that the name your code relies on - 'Description' - now refers to
> > > column C ... and further, while the name your code chose and the value
> > > of the header cell C1 are now different, it's no big deal - the column
> > > has the same *semantic* meaning.  You're just isolated from changes in
> > > its position and its user-facing description.
> >
> > > How nice would it have been?  Very nice.  We could even have defined
> > > table P to include columns A-F, and table Q to include just columns A
> > > and E (useful since the Spreadsheet API doesn't support partial
> > > response/update).
> >
> > > Instead, Google screwed it up by introducing a two-way binding between
> > > the name the application chose to refer to the colum, and the value of
> > > the header cell.  Oomph.  Quick - back to the cells feed to see if we
> > > can work out (somehow) what to call these columns when addressing them
> > > via the list feed!
> >
> > > All kinds of wierd behaviour ensued (I could point you to a few bugs I
> > > personally filed), among the worst of which was the stupid, stupid
> > > message "This operation is not permitted because it would alter the
> > > structure of this table" when the spreadsheet user tried to do such
> > > nefarious things as oh, fill-down a formula cell, or cut and paste a
> > > range.
> >
> > > Having thus reduced the grand concept of tables to an annoying
> > > repetition of the List API, some grown up in the playground obviously
> > > canned it.
> >
> > > And now we are stuck with this rigmarole of *looking  up* the column
> > > names ... and Google are 'considering a feature that would allow you
> > > to fetch the header row'.  Just let us bind our application-specific
> > > names to column indexes already, and adjust them the way you do named
> > > ranges.
> >
> > > But it's like talking to a brick wall ...
> >
> > > David.
> >
> > > On Mar 16, 6:38 am, Vic Fryzel <[email protected]> wrote:
> > > > Unfortunately, at the moment, you must use the cells feed to
> accomplish
> > > > this.  I'm considering a feature that would allow you to fetch the
> header
> > > > row with the lists feed, but I don't have an ETA yet.
> >
> > > > -Vic
> >
> > > > On Fri, Mar 4, 2011 at 9:56 PM, Jason Cipriani <
> [email protected]
> > > >wrote:
> >
> > > > >  Using PHP Zend bindings for spreadsheet API.
> >
> > > > > If I have a WorksheetEntry; I can use getContentsAsRows() to get
> all of
> > > the
> > > > > entries in that worksheet's list feed.
> >
> > > > > The keys in each list entry are the column headers converted to
> > > lowercase
> > > > > with all non alpha-numeric characters stripped out.
> >
> > > > > How do I get the real column names for this worksheet entry, and
> the
> > > > > corresponding list entry keys for each column? Alternatively, a way
> to
> > > get
> > > > > the real column name given the list entry key would work too.
> >
> > > > > My goal is to be able to both display the real column header text
> > > > > (spreadsheet column headers are Google Forms questions) to the user
> of
> > > my
> > > > > application, and also to know what the list entry key is for each
> > > column so
> > > > > I can display the values of the columns as well.
> > > > >  One way I've been toying with is to use a cell feed to get the
> first
> > > row
> > > > > and read the headers that way.
> >
> > > > > But then, how do I build a correspondence between the column
> headers
> > > and
> > > > > the list entry values?
> >
> > > > > Am I guaranteed that the values in a list entry will be returned in
> > > > > left-to-right GUI order, and empty values will be present (value
> count
> > > will
> > > > > be equal to column count)?
> >
> > > > > Being able to display the column headers and the corresponding
> values
> > > in an
> > > > > application seems like a very reasonable goal, I feel like I'm
> missing
> > > > > something because this is surprisingly clunky...
> >
> > > > > Thanks!
> > > > > Jason- Hide quoted text -
> >
> > > > - Show quoted text -- Hide quoted text -
> >
> > - Show quoted text -
>

Re: Retrieve header names for list feeds.

Reply via email to