[google-appengine] Re: LIFO key name strategy?

Julian Namaro Mon, 14 Feb 2011 22:04:06 -0800

I am not sure about the mathematics of it, but intuitively there is no
perfect algorithm for constructing timestamps in a reverse
lexicographical ordering, because adding a character to a string will
always make it lexicographically superior.


But I noticed the mapreduce library just pick a "ridiculously future
time" and go in reverse order from there:
http://code.google.com/p/appengine-mapreduce/source/browse/trunk/python/src/mapreduce/model.py#190

The library also add a random string to reduce the chance of
duplicates, maybe that can be replaced by an UUID if you're really
concerned by uniqueness.



On Feb 14, 5:57 am, Joseph Letness <joe.letn...@gmail.com> wrote:
> Hi Calvin and Robert, thanks for your replies.  I should have been
> more clear about what I am doing, here is some more info:
>
> Calvin, thanks for the link to Ikai's blog post, I haven't seen that
> one and it was very interesting.
>
> Robert, here are specific answers to your questions:
>
> >>Why do you say: " I can't use a composite index since it would explode with 
> >>my use case"?
>
> I'm using Brett Slatkin's "Relation Index" method of building and
> querying set memberships (Google I/O 2009 - Building Scalable, Complex
> Apps on App Engine).  According to Brett, using a composite index on
> this kind would cause explosion, so any ordering of results will need
> to be done in-memory during the request. If the sort order is
> immutable, sorted key names can be used to order results based on the
> their lexicographical position.
>
> Since a creation timestamp is "immutable" data, I figured that using
> lexicographic key names would be the way to go.
>
> >>What would be fine if you could handle your entire result set in one 
> >>request?
>
> Ordering the result set in-memory.
>
> >>What are you trying to do?
>
> The app is a digital-asset manager.  Users need to be able to query a
> set (using the relation index method) and have the results return the
> most recent additions first.  The result set could easily be a few
> thousand, so I want to use cursor-pagination to display the results
> which would preclude any in-memory ordering.
>
> (I'm actually refactoring my existing app that I use to manage/deliver
> graphic assets to my clients so that they can add their own data.)
>
> >>Is there a single global LIFO stack, or are there multiple stacks?
>
> The entities are all of the same kind, however, LIFO behavior is
> localized to individual user groups.
>
> >>How are new items added to the stack(s)?,  What is the addition rate?
>
> Just one item per user request.  User groups would be just a few
> individual users probably less than twenty. The rate per group would
> be so low that chances of contention on any sort of accumulator would
> be almost nonexistent.
>
> >>Is there a requirement that the items are precisely ordered or are some (or 
> >>small) mis-orderings acceptable?
>
> Precision is NOT critical.  Close approximation of chronology is just
> fine.
>
> --The auto-generated ids are not strictly increasing
>
> I did not know that.  Thanks!
>
> --Using the current time may also be problematic since the machines
> will have slight variations, and in some cases significant variations.
>
> I was aware of that, but since absolute precision is not necessary I
> could still use the timestamp as an accumulator if there is some thing
> as an "inverse-timestamp algorithm"!?!?
>
> So...
>
> After spending some more time thinking about this, here is what I plan
> to do:
>
> Create a counter model kind that is created with an IntegerProperty
> starting value of ten billion (I'd like to see somebody reach the
> bottom of that!). Give each user group it's own counter and de-count
> the values in a transaction (or not, it might be simpler to dismiss
> contention and write a handler that ensures uniqueness of the key name
> but maintains approximate lexicographic position).  When the counter
> value is read, convert the value to a padded string and concatenate it
> with the user group name and a leading lowercase letter (k9999999836/
> usergroupname) and use that as the key name for the new asset.
>
> Furthermore, it occurred to me that as a result set is reduced to a
> manageable in-memory size, I could test for the length of results and
> offer the user the ability to custom order their results (asset name
> alphanumeric or asset kind, for example).  Just a thought.
>
> Thanks again for the replies, If anyone thinks there is a better
> approach please let me know, I kind of make this stuff up as I go
> along..
>
> --Joe
>
> On Feb 12, 10:52 pm, Robert Kluin <robert.kl...@gmail.com> wrote:
>
> > Hi Joe,
> >   What are you actually trying to do?  Is there a single global LIFO
> > stack, or are there multiple stacks?  How are new items added to the
> > stack(s)?  In batches to one stack at a time, batches across stacks?
> > What is the addition rate?  How are items removed / processed from the
> > stack(s)?  Is there a requirement that the items are precisely ordered
> > or are some (or small) mis-orderings acceptable?
>
> >   Why do you say: " I can't use a composite index since it would
> > explode with my use case"?
>
> >   The auto-generated ids are not strictly increasing.  What would be
> > fine if you could handle your entire result set in one request?
>
> >   Using the current time may also be problematic since the machines
> > will have slight variations, and in some cases significant variations.
>
> > Robert
>
> > On Sat, Feb 12, 2011 at 14:38, Joseph Letness <joe.letn...@gmail.com> wrote:
> > > Hi everybody, I was wondering if anybody has any good ideas for
> > > generating LIFO (Last In FIrst Out) key names.  I can't use a
> > > composite index since it would explode with my use case.
>
> > > Currently, I can think of two methods:
>
> > > Use the auto generated id (which, I believe is accumulative), query
> > > for keys only and reverse the list in memory.  This would be fine if I
> > > can guarantee that my entire result set can be handled within a single
> > > request.
>
> > > OR
>
> > > Create a de-accumulator Entity in the datastore and have it count down
> > > from some reasonably high integer and create my key name with that (a
> > > composite of the de-accumulation and the entity nam).  The draw back
> > > for this method is that I'm incurring an additional read-write every
> > > time a new LIFO entity is created and possible contention on the de-
> > > accumulator if I run it in a transaction (I haven't decided if
> > > consistency of the de-accumulation is imperative for my use case yet).
>
> > > I'm using Python.  If anybody has any better ideas it would be much
> > > appreciated!
>
> > > Thanks,
>
> > > --Joe
>
> > > --
> > > You received this message because you are subscribed to the Google Groups 
> > > "Google App Engine" group.
> > > To post to this group, send email to google-appengine@googlegroups.com.
> > > To unsubscribe from this group, send email to 
> > > google-appengine+unsubscr...@googlegroups.com.
> > > For more options, visit this group 
> > > athttp://groups.google.com/group/google-appengine?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

[google-appengine] Re: LIFO key name strategy?

Reply via email to