I’m running backups (both as a scheduled task and manually) from 
GAE to a GCS bucket. The GAE source datastore has about 45 kinds.  The 
datastore is a small preliminary test store.  Both the manual and the 
scheduled backups exhibit the following characteristics.  I’m interested in 
understanding the 2nd part of the output.

The first part of the output in the GCS bucket looks to be (meta)info about 
the kind backups:  for each kind, there is an object

    ah[long ID #].[kindName].backup_info     

All of these are of size < 1000B.

The second part of the output in the GCS bucket contains two to four 
objects for each kind in the GAE datastore:

    datastore_backup_[backup name]_[date]_[kindName]-[ID#]-output-[N]-attempt-1    SIZE

where 0 <= N <= 3; every kind has at least 2 such objects.

The ID# appears to be the same for all 2nd half objects.
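
In case it helps anyone reproduce the tally, here is a minimal sketch of how the 2nd-half object names could be grouped by kind. The regular expression is only my reading of the pattern above (it assumes the kind name contains no "_" or "-" and that the ID# is alphanumeric), and the sample names, sizes, and the helper names group_by_kind and nonzero_outputs are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred only from the names listed above. It assumes the
# kind name contains no "_" or "-" and that the ID is alphanumeric.
NAME_RE = re.compile(
    r"^datastore_backup_(?P<prefix>.+)_(?P<kind>[^_-]+)-(?P<id>[0-9A-Za-z]+)"
    r"-output-(?P<n>\d+)-attempt-(?P<attempt>\d+)$"
)


def group_by_kind(objects):
    """Map kind -> {N: size} for (name, size) pairs; skip non-matching names."""
    grouped = defaultdict(dict)
    for name, size in objects:
        match = NAME_RE.match(name)
        if match is None:
            continue  # e.g. the *.backup_info objects
        grouped[match.group("kind")][int(match.group("n"))] = size
    return dict(grouped)


def nonzero_outputs(grouped):
    """Map kind -> sorted list of N whose object size is non-zero."""
    return {
        kind: sorted(n for n, size in outputs.items() if size > 0)
        for kind, outputs in grouped.items()
    }


# Made-up listing: hypothetical backup name, date, ID, and kinds.
SAMPLE = [
    ("datastore_backup_test_2015_03_01_Customer-123-output-0-attempt-1", 0),
    ("datastore_backup_test_2015_03_01_Customer-123-output-1-attempt-1", 32768),
    ("datastore_backup_test_2015_03_01_Empty-123-output-0-attempt-1", 0),
    ("datastore_backup_test_2015_03_01_Empty-123-output-1-attempt-1", 0),
    ("ah123.Customer.backup_info", 512),
]

if __name__ == "__main__":
    grouped = group_by_kind(SAMPLE)
    for kind, outputs in sorted(grouped.items()):
        print(kind, outputs, "non-zero N:", nonzero_outputs(grouped)[kind])
```

The (name, size) pairs would come from whatever bucket listing is at hand; nothing here talks to GCS itself.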

The SIZE of most of the objects is 0B.

Even kinds that I know have no entities in the datastore have at least 2 
such objects, though they each have SIZE = 0B.

Most of the kinds in the datastore have just one entity recorded.
For kinds that do have entities in the datastore, usually there is 
only one corresponding object (output-N) whose size is non-zero.  WHICH N 
has the non-zero object appears to be unpredictable.  
Also, for the kinds which have multiple entities stored (up to 10-15), 2 
of the corresponding objects are non-zero.  

Almost all the SIZE values are multiples of 32KB:  32KB, 64KB, 224KB, 
256KB, 352KB, 448KB, 768KB, 1.47MB, 1.75MB
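
The multiple-of-32KB observation can be checked with a few lines. This is just a sketch of the arithmetic, assuming "KB" in the listing means 1024 bytes and "MB" means 1024 * 1024 bytes; the function name split_into_blocks is made up, and the last size below is an invented non-multiple for contrast.

```python
BLOCK = 32 * 1024  # assumption: "KB" in the listing means 1024 bytes


def split_into_blocks(size_bytes):
    """Return (whole 32KB blocks, leftover bytes) for an object size."""
    return divmod(size_bytes, BLOCK)


if __name__ == "__main__":
    # 32KB, 224KB and 1.75MB from the listing, plus a made-up 1000-byte size.
    for size in (32 * 1024, 224 * 1024, 1835008, 1000):
        blocks, leftover = split_into_blocks(size)
        print(size, "bytes =", blocks, "blocks +", leftover, "bytes")
```

A leftover of 0 means the size is an exact multiple of 32KB.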

Can someone provide an explanation of the structure of this 2nd half? 
Simple tests show that the backup/restore is working.  Generally, the 
variation in SIZEs of the non-zero objects seems to roughly correspond to 
the variation in size of the corresponding stored entities in the 
datastore.  But all the 0B objects are confusing, as is the somewhat random 
mapping of non-zero objects to the output-N objects.  If possible, I’d 
like to understand.

Another (small) question: what is the relationship (if any) between the 
long ID # of the 1st half (backup_info) objects, and the ID# and backup 
name for the 2nd half objects?

Thanks in advance,
—Ken Bowen
