> > Your method is right, but the limit is there whatever method we use.

There's always going to be a limit for scalable applications -
appengine just exposes it.

> Because the offset is limited to 1000,
> I cannot sort data by fields in result sets larger than some limited number of items.

Don't sort.  Use indices.  They can handle multiple fields.

Indices are the only way to build scalable applications.
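
To make the idea concrete, here is a minimal Python sketch of paging by
position in an index instead of by offset.  It is not App Engine code: the
sorted list of (field, key) pairs only stands in for a datastore index, and
build_index, fetch_page and the sample data are hypothetical names made up
for this illustration.  Comparing (field, key) pairs gives the same order as
the "field = last and __key__ > last key" query followed by the
"field > last" query described further down the thread.

```python
import bisect

def build_index(entities):
    """Simulate a composite index: entries sorted by (field, key)."""
    return sorted((e["field"], e["key"]) for e in entities)

def fetch_page(index, page_size, last=None):
    """Return (page, next_last).

    `last` is the (field, key) pair of the final entry on the previous
    page.  `next_last` is None once a short (final) page is returned.
    """
    # Seek to just past the last entry seen; no offset is involved.
    start = 0 if last is None else bisect.bisect_right(index, last)
    page = index[start:start + page_size]
    next_last = page[-1] if len(page) == page_size else None
    return page, next_last

# Hypothetical data: eight entities, several sharing field value 3.
entities = [{"key": k, "field": f}
            for k, f in enumerate([5, 3, 3, 3, 9, 1, 3, 7])]
index = build_index(entities)

pages, last = [], None
while True:
    page, last = fetch_page(index, 3, last)
    if page:
        pages.append(page)
    if last is None:
        break
```

Each call only needs the last (field, key) pair from the previous page, so
nothing in it depends on how deep into the data set the page is.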


On Dec 24, 9:16 pm, ajaxer <calid...@gmail.com> wrote:
> Your method is right, but the limit is there whatever method we use.
>
> And you didn't get what I mean.
>
> Because the offset is limited to 1000,
> I cannot sort data by fields in result sets larger than some limited number of items.
>
> Without the offset limit, we could do it easily.
>
> On Dec 24, 2:10 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > Any application that requires fetching an unbounded amount of data for a
> > single page view is not scalable, no matter what technology you use to
> > build it, so this problem is not appengine specific.
>
> > If you need aggregations (average, median, total, etc), you have to
> > compute them incrementally or with an off-line process.
>
> > > > Even with the "datetime <=" filter you still get a big set; how can you
> > > > handle it?
>
> > We're talking about paging through a dataset, presenting n (for small
> > n) elements at a time to a user.
>
> > If we're paging through by the value of a field with distinct values and
> > we want to present 20 results per page, the query for the first page
> > is "order by field" with limit 20.  That query has a "last" result.
> > The query for the next page is "field > {last result's field value}
> > order by field", again with limit 20.  That query also has a last
> > result so the form of subsequent queries should be obvious.  (If
> > you've got other conditions, such as user id or key, you need to add
> > those as well.)
>
> > Suppose that entities can have the same field value.  If you don't
> > care how those entities are ordered, the first query's order by clause
> > can be "order by field, __key__", again limit 20.  The next query
> > tries to pick up entities with the same field as the last result from
> > the previous query.  It looks like "field = {last result's field's
> > value} and __key__ > {last result's key} order by __key__" and you
> > keep using it until it fails.  You then use a query like the "next
> > page" query from the previous case.  (I stopped mentioning limit
> > because the value depends on what you need to fill the current page.)
>
> > On Dec 22, 8:50 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > Even with the "datetime <=" filter you still get a big set; how can you
> > > > handle it?
> > > > For example, you get 10000 items with the most specific filtering query,
> > > > and for that query you need statistics, such as how many items
> > > > there are.
>
> > > > How do you expect appengine to handle this problem?
> > > > What about one request with many of these actions?
>
> > > On Dec 21, 11:09 pm, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > What statistics are you talking about?
>
> > > > You're claiming that one can't page through an entity type without
> > > > fetching all instances and sorting them.  That claim is wrong because
> > > > the order by constraint does exactly that.
>
> > > > For example, suppose that you want to page through by a date/time
> > > > field named "datetime".  The query for the first page uses order by
> > > > datetime while queries for subsequent pages have a "datetime <="
> > > > clause for the last datetime value from the previous page and continue
> > > > to order by datetime.
>
> > > > What part of that do you think doesn't work?
>
> > > > > Do you think that Nick was wrong when he said that the time to
> > > > > execute such a query depends on the number of entities?
>
> > > > You can even do random access by using markers that are added/
> > > > maintained by a sequential process like the above.
>
> > > > On Dec 20, 7:34 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > > You misunderstand.
> > > > > > If not, show me a site with statistics on many fields,
> > > > > > with more than 1000 pages, please.
> > > > > > Thanks.
>
> > > > > On Dec 21, 9:06 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > > > You misunderstand.
>
> > > > > > If you have an ordering based on one or more indexed properties, you
> > > > > > can page efficiently wrt that ordering, regardless of the number of
> > > > > > data items.  (For the purposes of this discussion, __key__ is an
> > > > > > indexed property, but you don't have to use it or can use it just to
> > > > > > break ties.)
>
> > > > > > > If you're fetching a large number of items and sorting so you can find
> > > > > > > a contiguous subset, you're doing it wrong.
>
> > > > > > On Dec 19, 10:26 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > Obviously, if you have to page a data set of more than 50000 items which
> > > > > > > > is not ordered by __key__,
>
> > > > > > > > you may find that the __key__ is of no use, because the filtered data
> > > > > > > > is ordered not by key
> > > > > > > > but by the field's value, and for that reason you need to loop the query as
> > > > > > > > you may like to do.
>
> > > > > > > > But you will encounter a timeout exception before you have really finished
> > > > > > > > the action.
>
> > > > > > > On Dec 19, 8:26 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > > > > > > > > If the type of data has more than 10000 items, you need reindexing
> > > > > > > > > > for this result,
> > > > > > > > > > and a recount each time to get the proper item.
>
> > > > > > > > > What kind of reindexing are you talking about?
>
> > > > > > > > > Global reindexing is only required when you change the indices in
> > > > > > > > > index.yaml.  It doesn't occur when you add more entities and/or have big
> > > > > > > > > entities.
>
> > > > > > > > > Of course, when you change an entity, it gets reindexed, but that's a
> > > > > > > > > constant cost.
>
> > > > > > > > > Surely you're not planning to change all your entities fairly often,
> > > > > > > > > are you?  (You're going to have problems if you try to maintain
> > > > > > > > > sequence numbers and do insertions, but that doesn't scale anyway.)
>
> > > > > > > > > > > It seems you have not encountered such a problem.
> > > > > > > > > > In this situation, the indexes on the fields do not help, because the
> > > > > > > > > > bulk of data you have to sort is really big.
>
> > > > > > > > > Actually I have.  I've even done difference and at-least-#
> > > > > > > > > (intersection and union are special cases - at-least-# also handles
> > > > > > > > > majority), at-most-# (binary xor is the only common case that I came
> > > > > > > > > up with), and combinations thereof on paged queries.
>
> > > > > > > > > Yes, I know that offset is limited to 1000 but that's irrelevant
> > > > > > > > > because the paging scheme under discussion doesn't use offset.  It
> > > > > > > > > keeps track of where it is using __key__ and indexed data values.
>
> > > > > > > > On Dec 16, 7:56 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > > > Of course the time is related to the type of data you are fetching in one
> > > > > > > > > > query.
>
> > > > > > > > > > If the type of data has more than 10000 items, you need reindexing
> > > > > > > > > > for this result,
> > > > > > > > > > and a recount each time to get the proper item.
>
> > > > > > > > > > It seems you have not encountered such a problem.
> > > > > > > > > > In this situation, the indexes on the fields do not help, because the
> > > > > > > > > > bulk of data you have to sort is really big.
>
> > > > > > > > > > On Dec 17, 12:20 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > > > > > > > > > it can still result in a timeout if the data is really big
>
> > > > > > > > > > > How so?  If you don't request "too many" items with a page query, it
> > > > > > > > > > > won't time out.  You will run into runtime.DeadlineExceededErrors if
> > > > > > > > > > > you try to use too many page queries for a given request, but ....
>
> > > > > > > > > > > > It is not of much use to most of us if we really have big data to sort and
> > > > > > > > > > > > page.
>
> > > > > > > > > > > You do know that the sorting for the page queries is done with the
> > > > > > > > > > > indexing and not user code, right?  Query time is independent of the
> > > > > > > > > > > total amount of data and depends only on the size of the result set.
> > > > > > > > > > > (Indexing time is constant per inserted/updated entity.)
>
> > > > > > > > > > On Dec 16, 12:13 am, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > > > > > It is too complicated for most of us,
> > > > > > > > > > > > and it can still result in a timeout if the data is really big.
>
> > > > > > > > > > > > It is not of much use to most of us if we really have big data to sort and
> > > > > > > > > > > > page.
>
> > > > > > > > > > > On Dec 15, 11:35 pm, Stephen <sdea...@gmail.com> wrote:
>
> > > > > > > > > > > > On Dec 15, 8:04 am, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > > > > > > > Also, the 1000 index limit makes it impossible to fetch older data when
> > > > > > > > > > > > > > paging.
>
> > > > > > > > > > > > > > If we need an indexed page beyond 10000 items,
> > > > > > > > > > > > > > it would cost us a lot of CPU time to calculate the base for GQL
> > > > > > > > > > > > > > to fetch the data with an index less than 1000.
>
> > > > > > > > > > > > > http://code.google.com/appengine/articles/paging.html

--

You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

