On Tue, Mar 23, 2010 at 1:57 PM, homunq <jameson.qu...@gmail.com> wrote:

>
> >
> > > Watching my deletion process start to get trapped in molasses, as Eli
> > > Jones mentions above, I have to ask two things again:
> >
> > > 1. Is there ANY way to delete all indexes on a given property
> > > name? Without worrying about keeping indexes in order when I'm just
> > > paring them down to 0, I'd just be running through key names and
> > > deleting them. It seems that would be much faster. (If it's any help,
> > > I strongly suspect that most of my key names are globally unique
> > > across all of Google).
> >
> > No - that would violate the constraint that indexes are always kept in
> > sync with the data they refer to.
> >
>
> It seems to me that having no index at all is the same situation as if
> the property was indexed=False from the beginning. If that's so, it
> can't be violating a hard constraint.
>

Internally, indexed fields are stored in the 'properties' list in the Entity
Protocol Buffer, while unindexed fields are stored in the
'unindexed_properties' list in the same PB. The only way to change how a
property is indexed is to fetch the affected entities and store them again.
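
A minimal sketch of that fetch-and-re-store approach (untested; it assumes
the Python runtime and a hypothetical 'MyModel' kind whose 'payload' property
you want to stop indexing):

    from google.appengine.ext import db

    class MyModel(db.Model):
        # Previously: payload = db.StringProperty()  (indexed by default)
        payload = db.StringProperty(indexed=False)

    def unindex_batch(cursor=None, batch_size=100):
        # Re-store one batch under the new definition; put() moves the
        # value into unindexed_properties and drops the old index rows.
        q = MyModel.all()
        if cursor:
            q.with_cursor(cursor)
        entities = q.fetch(batch_size)
        if not entities:
            return None  # finished
        db.put(entities)
        return q.cursor()  # resume point for the next batch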


>
> >
> > > 2. What is the reason for the slowdown? If I understand his suggestion
> > > to delete every 10th record, Eli Jones seems to suspect that it's
> > > because there's some kind of resource conflict on specific sections of
> > > storage, thus the solution is to attempt to spread your load across
> > > machines. I don't see why that would cause a gradual slowdown. My best
> > > theory is that write-then-delete leaves the index somehow a little
> > > messier (for instance, maybe the index doesn't fully recover the
> > > unused space because it expects you to fill it again) and that when
> > > you do it on a massive scale you get massively messy and slow indexes.
> > > Thus, again, I suspect this question reduces to question 1, although I
> > > guess that if my theory is right a compress/garbage-collect/degunking
> > > call for the indexes would be (for me) second best after a way to nuke
> > > them.
> >
> > Deletes using the naive approach slow down because when a record is
> > deleted in Bigtable, it simply inserts a 'tombstone' record indicating
> > the original record is deleted - the record isn't actually removed
> > entirely from the datastore until the tablet it's on does its next
> > compaction cycle. Until then, every subsequent query has to skip over
> > the tombstone records to find the live records.
> >
> > This is easy to avoid: Use cursors to delete records sequentially. That
> > way, your queries won't be skipping the same tombstoned records over
> > and over again - O(n) instead of O(n^2)!
> >
>
> Thanks for explaining. Can you say anything about how often the
> compaction cycles run? Just an order of magnitude - hours, days, or
> weeks?
>

They're based on the quantity of modifications to data in a given tablet.
Doing many inserts, updates or deletes will, sooner or later, cause a
compaction.
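
For reference, here's a minimal sketch of the cursor-based deletion loop
described above (untested; assuming the Python runtime and a hypothetical
'MyModel' kind being purged):

    from google.appengine.ext import db

    class MyModel(db.Model):
        pass  # hypothetical kind being deleted

    def delete_batch(cursor=None, batch_size=500):
        # Each call resumes where the last one stopped, so queries never
        # re-scan the tombstoned rows left by earlier deletes.
        q = MyModel.all(keys_only=True)
        if cursor:
            q.with_cursor(cursor)
        keys = q.fetch(batch_size)
        if not keys:
            return None  # finished
        db.delete(keys)
        return q.cursor()  # pass to the next call (e.g. a task queue task)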

-Nick Johnson


>
> Thanks,
> Jameson
>


-- 
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047
