Query limit, not bandwidth, I meant. Sorry :)

Tom Wilson
Freelance Google Base Developer and Consultant
www.tomthedeveloper.com
Google Base Tools - http://dev.tomthedeveloper.com/googlebase

On Dec 30, 11:08 pm, icebackhaz <[email protected]> wrote:
> Well, it's certainly nice to know that if we're successful beyond our
> wildest dreams we can beg for more bandwidth!
>
> Cheers.
>
> On Dec 27, 8:34 pm, Tom Wilson <[email protected]> wrote:
> > No problem; if you want to discuss specifics, feel free to email me
> > off list.
> >
> > As for hitting the limits, I've never done so, but the poller would
> > eventually pick up anything that fell through the net: if an update
> > to an item wasn't picked up, the item would be flagged. The system
> > is also a lot more complex than I'm making out; it has grown over
> > time through additions, reiteration, and learning from problems and
> > issues.
> >
> > Are you also aware that the support team can raise your limits? I'm
> > not sure exactly how you go about it, but I'm sure Nicolas or Eric
> > could help with that.
> >
> > Tom Wilson
> > Freelance Google Base Developer and Consultant
> > www.tomthedeveloper.com
> >
> > Google Base Tools - http://dev.tomthedeveloper.com/googlebase
> >
> > On Dec 17, 3:18 pm, icebackhaz <[email protected]> wrote:
> > > Very nice. Thanks a ton for your input. We're in a slightly
> > > different mode vis-a-vis your poller: any event changing data
> > > will (if necessary) queue the affected items for (re-)sending,
> > > but I guess the result is the same (consistency). Did you ever
> > > intentionally exceed the 5-queries-per-second rule? If so, does
> > > GB generate an intelligible error condition? This harks back to
> > > the start of this thread: the response doesn't say "TOO BIG", as
> > > opposed to "BAD ITEM". Actually, it would be saying "BAD ITEM",
> > > but doesn't even do that because of the (alleged) bug.
> > >
> > > I would like to make ~18,000 maximally sized queries per hour and
> > > would like to properly detect the occasional oversized load.
> > > We'll see what happens.
> > >
> > > Thanks again.
> > >
> > > On Dec 17, 7:17 am, Tom Wilson <[email protected]> wrote:
> > > > 1.
> > > > No, before we send; the file is normally deleted after being
> > > > successfully submitted, as I stated.
> > > >
> > > > 2. Yes, deletes are done in batches. Since all you need to pass
> > > > is the ID, I worked out exactly how many can be sent without
> > > > hitting the size limits.
> > > >
> > > > 3. Yep. Since there's a 5-queries-per-second limit (~18,000 per
> > > > hour), the system spreads the load.
> > > >
> > > > You're not labouring the point at all; you sound to me to be in
> > > > the same mindset I was in when I first started using the API.
> > > > One of the best parts of the system is the poller, which checks
> > > > for consistency between my database and Google Base.
> > > >
> > > > Matching prices is an important element, but it checks a number
> > > > of things for matches.
> > > >
> > > > For example, the system actually handled the recent changes to
> > > > product_types quite well; it had to update all items with the
> > > > new structure. It's one of those non-critical changes, so items
> > > > can go without the update for days. The poller was updated to
> > > > scan and match product types; if an item didn't have them, it
> > > > was queued for processing as a low priority.
> > > >
> > > > Low-priority items are done after everything else, then updates
> > > > to expiration dates.
> > > >
> > > > On Dec 17, 12:46 am, icebackhaz <[email protected]> wrote:
> > > > > Do you grab the XML after the send, i.e. from the HTTP
> > > > > request, or do you write it out yourself?
> > > > >
> > > > > Hate to belabour the point: are you only using batch() for
> > > > > deletes? The rest are singletons (1000 movements + ~500000/30
> > > > > resends per day)?
> > > > >
> > > > > On Dec 15, 3:19 pm, Tom Wilson <[email protected]> wrote:
> > > > > > Timing and load is critical.
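(A quick illustration of the load spreading described above: a minimal fixed-rate throttle in Java. This is a hypothetical sketch; the class name and structure are invented here, not taken from the poster's system, which also involves priority queues and a consistency poller.)

```java
// Minimal fixed-rate throttle: spaces calls at least
// 1000 / ratePerSecond milliseconds apart. Hypothetical sketch only;
// the real system described in the thread is more elaborate.
public class RateThrottle {
    private final long minIntervalMillis;
    private long nextAllowedAt = 0;

    public RateThrottle(int ratePerSecond) {
        this.minIntervalMillis = 1000L / ratePerSecond;
    }

    // Blocks until the next call is permitted, then reserves the slot.
    public synchronized void acquire() throws InterruptedException {
        long now = System.currentTimeMillis();
        if (now < nextAllowedAt) {
            Thread.sleep(nextAllowedAt - now);
            now = nextAllowedAt;
        }
        nextAllowedAt = now + minIntervalMillis;
    }
}
```

With a rate of 5 per second, calls are spaced at least 200 ms apart, which works out to at most ~18,000 per hour, matching the figure in the thread.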
> > > > > > This set-up is used to keep 500,000 or so items up to
> > > > > > date, of which there are roughly 1000 item movements a day
> > > > > > (update/delete/addition); after that, the items are simply
> > > > > > touched once in thirty days to update the expiration date.
> > > > > >
> > > > > > Deletions are handled in bulk because all that is required
> > > > > > is the Google Base itemID; updates are handled next, and
> > > > > > then it continues on, updating items to extend the
> > > > > > expiration dates. So it is basically a rolling process,
> > > > > > remembering though that static items can only be
> > > > > > resubmitted once in a thirty-day period.
> > > > > >
> > > > > > The XML is saved to check its size and to keep a record,
> > > > > > then deleted once submitted successfully. If there's a
> > > > > > problem, it's stored for problem-solving purposes; if
> > > > > > there's a high number of failures, the system raises an
> > > > > > alert and halts processing until the problem is resolved.
> > > > > > That said, each action runs separately (deletes, additions,
> > > > > > etc.); there's also a clean-up process that runs in the
> > > > > > background and checks items against the Google Base
> > > > > > database for problems.
> > > > > >
> > > > > > On Dec 15, 5:01 pm, icebackhaz <[email protected]> wrote:
> > > > > > > Let me see if I'm following along.
> > > > > > >
> > > > > > > 1. You save the XML for every transmission? At what point
> > > > > > > do you hit write-the-file? (And delete the file?)
> > > > > > >
> > > > > > > 2. You put only one item in a submission? The
> > > > > > > communication overhead is of no concern to you? How thick
> > > > > > > is your pipe!
> > > > > > > :)
> > > > > > >
> > > > > > > On Dec 14, 5:44 pm, Tom Wilson <[email protected]> wrote:
> > > > > > > > Writing the XML message to a temp file, checking its
> > > > > > > > size, and then deciding from there whether to send it
> > > > > > > > was the default precaution I took, as I do with other
> > > > > > > > projects that fire messages between servers. Then you
> > > > > > > > have a handy reference of failed messages.
> > > > > > > >
> > > > > > > > I stick to one item per submittal, and it has never
> > > > > > > > failed me yet. Even with over 500,000 items, if the
> > > > > > > > load is spread well and the system that drives it is
> > > > > > > > efficient, it has always been more than enough.
> > > > > > > >
> > > > > > > > On Dec 12, 4:33 pm, icebackhaz <[email protected]> wrote:
> > > > > > > > > Good question. Yes, it is clear (and provable :) )
> > > > > > > > > that one may only put 1 MB in the XML payload, but it
> > > > > > > > > is not at all clear to me how one either a) checks
> > > > > > > > > how full the payload is or b) detects after the fact
> > > > > > > > > that the problem was an overly large payload.
> > > > > > > > >
> > > > > > > > > We have a lot to send and want to do it efficiently
> > > > > > > > > and correctly. Not checking beforehand means we would
> > > > > > > > > need the issues (742, 921) taken care of, or we will
> > > > > > > > > not know the exact problem. If I could check
> > > > > > > > > beforehand, I would never encounter the overhead of
> > > > > > > > > the double failure (921).
> > > > > > > > >
> > > > > > > > > Alternatives abound but don't really appeal.
> > > > > > > > > Continuously calculate the XML size based on known
> > > > > > > > > overhead and data lengths: fraught with
> > > > > > > > > miscalculation errors and seriously exposed to Google
> > > > > > > > > API/XML changes.
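(One way to check how full the payload is, per the discussion above: serialize each entry, measure it in UTF-8 bytes, and close a batch before the documented 1 MB cap is reached. A rough sketch only; the `BatchPacker` name and the envelope margin are invented for illustration, and only the 1 MB cap itself comes from the documentation cited in this thread.)

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Packs pre-serialized <entry> fragments into batch payloads, each kept
// under a byte budget: the documented 1 MB cap minus an assumed margin
// for the <feed> envelope. Names and the overhead figure are
// illustrative assumptions, not measurements from the Google Base API.
public class BatchPacker {
    static final int MAX_PAYLOAD_BYTES = 1_000_000; // documented 1 MB cap
    static final int ENVELOPE_BYTES = 2_000;        // assumed envelope overhead

    public static List<List<String>> pack(List<String> entryXmlFragments) {
        int budget = MAX_PAYLOAD_BYTES - ENVELOPE_BYTES;
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int used = 0;
        for (String entry : entryXmlFragments) {
            // Measure bytes, not chars: multi-byte characters make
            // String.length() undercount the wire size.
            int size = entry.getBytes(StandardCharsets.UTF_8).length;
            if (!current.isEmpty() && used + size > budget) {
                batches.add(current);
                current = new ArrayList<>();
                used = 0;
            }
            current.add(entry);
            used += size;
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }
}
```

Measuring the serialized bytes directly avoids the "known overhead and data lengths" arithmetic objected to above, at the cost of serializing before packing.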
> > > > > > > > > Send ultra-conservative batch sizes (30 items might
> > > > > > > > > be the upper limit for perfectly full (1000-char)
> > > > > > > > > values, if I've done the arithmetic correctly):
> > > > > > > > > seriously under-uses the payload and increases the
> > > > > > > > > number of submissions, i.e. traffic overhead. Send
> > > > > > > > > less conservatively (perhaps aggressively) sized
> > > > > > > > > batches and, on failure, assume the payload was too
> > > > > > > > > large and split it (recursively): many possible other
> > > > > > > > > reasons for failure.
> > > > > > > > >
> > > > > > > > > So yes, I'm trying to solidify our end, and both of
> > > > > > > > > these, um, ah, features of the API get in the way.
> > > > > > > > > Btw, what we've decided to do on failure is to
> > > > > > > > > generate the XML with our own Writer, test the size
> > > > > > > > > of the generated XML, and react accordingly. Seems a
> > > > > > > > > reasonable compromise, no?
> > > > > > > > >
> > > > > > > > > Keep in mind, I'm not sure 921 will be accepted as a
> > > > > > > > > bug. I do believe the java.io.IOException is more the
> > > > > > > > > result of mis-communication (using a closed
> > > > > > > > > connection) than anything else. I'm pretty sure they
> > > > > > > > > thought they would be sending back a ServiceException
> > > > > > > > > (an InvalidEntryException, really), and that's also
> > > > > > > > > wrong. HTTP_BAD_REQUEST is the HTTP error, and in
> > > > > > > > > this case it is the result of a payload > 1 MB, not
> > > > > > > > > of a particular entry being malformed.
> > > > > > > > >
> > > > > > > > > On Dec 11, 5:37 pm, Tom Wilson <[email protected]> wrote:
> > > > > > > > > > Can I ask why exactly you're looking at this?
> > > > > > > > > > The documentation states that it accepts batch
> > > > > > > > > > requests up to 1 MB, so why are you checking the
> > > > > > > > > > size before posting?
> > > > > > > > > >
> > > > > > > > > > There's only so much the API will do for you;
> > > > > > > > > > building a solid system/app relies on checks on
> > > > > > > > > > both ends.
> > > > > > > > > >
> > > > > > > > > > Tom Wilson
> > > > > > > > > > Freelance Google Base Developer and Consultant
> > > > > > > > > > www.tomthedeveloper.com
> > > > > > > > > >
> > > > > > > > > > Google Base Tools - http://dev.tomthedeveloper.com/googlebase
> > > > > > > > > > Featured Project:
> > > > > > > > > > http://google-code-featured.blogspot.com/2008/02/google-base-competit...
> > > > > > > > > >
> > > > > > > > > > On Dec 11, 11:28 pm, icebackhaz <[email protected]> wrote:
> > > > > > > > > > > Trying to see what happens when one stuffs too
> > > > > > > > > > > much into the XML payload, we have discovered
> > > > > > > > > > > that there is no facility for detecting the exact
> > > > > > > > > > > problem, either before or after the send.
> > > > > > > > > > >
> > > > > > > > > > > The response from the server is
> > > > > > > > > > > HttpURLConnection.HTTP_BAD_REQUEST. It looks like
> > > > > > > > > > > the intention was to generate an
> > > > > > > > > > > InvalidEntryException, and that looks bogus too,
> > > > > > > > > > > but at least getReason() might be useful.
> > > > > > > > > > >
> > > > > > > > > > > Unfortunately, the built-in second attempt to
> > > > > > > > > > > send the payload chokes big time, and we get a
> > > > > > > > > > > java.io.IOException from deep in Sun's code, and
> > > > > > > > > > > the 401 floats off into the ether.
> > > > > > > > > > >
> > > > > > > > > > > I've sent the to ...

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Google Base Data API" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/Google-Base-data-API?hl=en
-~----------~----~----~----~------~----~------~--~---
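(For completeness, the "split on failure" fallback mentioned earlier in the thread could look like the following sketch. The `SplitSender` name and the `Predicate` stand-in for the actual send call are invented here; as the thread points out, a real failure can have causes other than payload size, which is why a single item that still fails is treated as genuinely bad rather than split further.)

```java
import java.util.List;
import java.util.function.Predicate;

// Sketch of the "split on failure" fallback: try to send a batch; if
// the server rejects it (abstracted here as a Predicate returning
// false), split the batch in half and retry each half recursively.
// Hypothetical illustration only.
public class SplitSender {
    // Returns the number of successfully sent items; single items that
    // still fail are counted as permanently bad and skipped.
    public static int sendWithSplit(List<String> batch,
                                    Predicate<List<String>> trySend) {
        if (batch.isEmpty()) return 0;
        if (trySend.test(batch)) return batch.size();
        if (batch.size() == 1) return 0; // a bad item, not a size problem
        int mid = batch.size() / 2;
        return sendWithSplit(batch.subList(0, mid), trySend)
             + sendWithSplit(batch.subList(mid, batch.size()), trySend);
    }
}
```

The recursion terminates because each failing batch of more than one item is split strictly smaller, bottoming out at single items.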
