Your method is right, but the limit is there whatever method we use, and you didn't get what I mean.
Because the offset is limited to 1000, I cannot sort data by fields when the results contain more than a limited number of items.
Without the offset limit, we could do it easily.

On Dec 24, 2:10 am, Andy Freeman <ana...@earthlink.net> wrote:
> Any application that requires fetching an unbounded amount of data for a
> single page view is not scalable, no matter what technology you use to
> build it, so this problem is not appengine specific.
>
> If you need aggregations (average, median, total, etc.), you have to
> compute them incrementally or with an off-line process.
>
> > when even with the "datetime <=" you still get a big set, how you can
> > handle it?
>
> We're talking about paging through a dataset, presenting n (for small
> n) elements at a time to a user.
>
> If we're paging through by the value of a field with distinct values and
> we want to present 20 results per page, the query for the first page
> is "order by field" with limit 20.  That query has a "last" result.
> The query for the next page is "field > {last result's field value}
> order by field", again with limit 20.  That query also has a last
> result so the form of subsequent queries should be obvious.  (If
> you've got other conditions, such as user id or key, you need to add
> those as well.)
>
> Suppose that entities can have the same field value.  If you don't
> care how those entities are ordered, the first query's order by clause
> can be "order by field, __key__", again limit 20.  The next query
> tries to pick up entities with the same field as the last result from
> the previous query.  It looks like "field = {last result's field's
> value} and __key__ > {last result's key} order by __key__" and you
> keep using it until it fails.  You then use a query like the "next
> page" query from the previous case.  (I stopped mentioning limit
> because the value depends on what you need to fill the current page.)
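
The paging scheme described in the message above can be sketched in a few lines. This is only an illustration under stated assumptions, not App Engine code: the index is simulated with a sorted in-memory list of (field, key) pairs, and the names build_index, fetch_page and iterate_pages are hypothetical. The single seek past the last result stands in for the two queries from the message ("field = ... and __key__ > ..." followed by "field > ..."), because both continue from the same position in an index ordered by field, then __key__.

```python
from bisect import bisect_right


def build_index(entities):
    """Sort (field, key) pairs the way an index on field, then __key__, would."""
    return sorted((e["field"], e["key"]) for e in entities)


def fetch_page(index, page_size, last=None):
    """Return the next page after `last`, the final (field, key) of the previous page."""
    # Seek directly to the first entry greater than the last result.
    # Entries with the same field value and a larger key come first,
    # followed by entries with larger field values.
    start = 0 if last is None else bisect_right(index, last)
    return index[start:start + page_size]


def iterate_pages(index, page_size):
    """Yield pages until one comes back empty; no offset is ever used."""
    last = None
    while True:
        page = fetch_page(index, page_size, last)
        if not page:
            return
        yield page
        last = page[-1]  # remember where the previous page ended


# Hypothetical data: ten entities, up to three per field value, so ties occur.
entities = [{"key": "k%02d" % i, "field": i // 3} for i in range(10)]
index = build_index(entities)
pages = list(iterate_pages(index, 4))
```

Each page request only needs the last (field, key) pair of the previous page, so the cost of a request depends on the page size and not on how deep into the data set the page is.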
>
> On Dec 22, 8:50 pm, ajaxer <calid...@gmail.com> wrote:
> > when even with the "datetime <=" you still get a big set, how can you
> > handle it?
> > for example, you get 10000 items with the most specific filtering sql.
> > and on this filtering sql, you should have statistics info, like how
> > many items there are.
> >
> > how do you expect the appengine to handle this problem?
> > how about one request with many of these actions?
> >
> > On Dec 21, 11:09 pm, Andy Freeman <ana...@earthlink.net> wrote:
> > > What statistics are you talking about?
> > >
> > > You're claiming that one can't page through an entity type without
> > > fetching all instances and sorting them.  That claim is wrong because
> > > the order by constraint does exactly that.
> > >
> > > For example, suppose that you want to page through by a date/time
> > > field named "datetime".  The query for the first page uses order by
> > > datetime while queries for subsequent pages have a "datetime <="
> > > clause for the last datetime value from the previous page and continue
> > > to order by datetime.
> > >
> > > What part of that do you think doesn't work?
> > >
> > > Do you think that Nick was wrong when he said that the time to
> > > execute such a query depends on the number of entities?
> > >
> > > You can even do random access by using markers that are added/
> > > maintained by a sequential process like the above.
> > >
> > > On Dec 20, 7:34 pm, ajaxer <calid...@gmail.com> wrote:
> > > > You misunderstand.
> > > > if not, show me a site with statistics on many fields.
> > > > with more than 1000 pages please.
> > > > thanks.
> > > >
> > > > On Dec 21, 9:06 am, Andy Freeman <ana...@earthlink.net> wrote:
> > > > > You misunderstand.
> > > > >
> > > > > If you have an ordering based on one or more indexed properties, you
> > > > > can page efficiently wrt that ordering, regardless of the number of
> > > > > data items.
> > > > > (For the purposes of this discussion, __key__ is an
> > > > > indexed property, but you don't have to use it or can use it just to
> > > > > break ties.)
> > > > >
> > > > > If you're fetching a large number of items and sorting so you can find
> > > > > a contiguous subset, you're doing it wrong.
> > > > >
> > > > > On Dec 19, 10:26 pm, ajaxer <calid...@gmail.com> wrote:
> > > > > > obviously, if you have to page a data set of more than 50000 items
> > > > > > which is not ordered by __key__,
> > > > > >
> > > > > > you may find that the __key__ is of no use, because the filtered
> > > > > > data is ordered not by key
> > > > > > but by the field's value, and for that reason you need to loop the query
> > > > > > as you may like to do.
> > > > > >
> > > > > > but you will encounter a timeout exception before you have really
> > > > > > finished the action.
> > > > > >
> > > > > > On Dec 19, 8:26 am, Andy Freeman <ana...@earthlink.net> wrote:
> > > > > > > > if the type of data is larger than 10000 items, you need
> > > > > > > > reindexing for this result.
> > > > > > > > and recount each time for getting the proper item.
> > > > > > >
> > > > > > > What kind of reindexing are you talking about?
> > > > > > >
> > > > > > > Global reindexing is only required when you change the indices in
> > > > > > > index.yaml.  It doesn't occur when you add more entities and/or
> > > > > > > have big entities.
> > > > > > >
> > > > > > > Of course, when you change an entity, it gets reindexed, but
> > > > > > > that's a constant cost.
> > > > > > >
> > > > > > > Surely you're not planning to change all your entities fairly
> > > > > > > often, are you?  (You're going to have problems if you try to maintain
> > > > > > > sequence numbers and do insertions, but that doesn't scale
> > > > > > > anyway.)
> > > > > > >
> > > > > > > > it seems you have not encountered such a problem.
> > > > > > > > in this situation, the indexes on the fields help nothing, for
> > > > > > > > the bulk of data you have to sort is really big.
> > > > > > >
> > > > > > > Actually I have.  I've even done difference and at-least-#
> > > > > > > (intersection and union are special cases - at-least-# also
> > > > > > > handles majority), at-most-# (binary xor is the only common case that I
> > > > > > > came up with), and combinations thereof on paged queries.
> > > > > > >
> > > > > > > Yes, I know that offset is limited to 1000 but that's irrelevant
> > > > > > > because the paging scheme under discussion doesn't use offset.  It
> > > > > > > keeps track of where it is using __key__ and indexed data values.
> > > > > > >
> > > > > > > On Dec 16, 7:56 pm, ajaxer <calid...@gmail.com> wrote:
> > > > > > > > of course the time is related to the type of data you are fetching
> > > > > > > > by one query.
> > > > > > > >
> > > > > > > > if the type of data is larger than 10000 items, you need
> > > > > > > > reindexing for this result.
> > > > > > > > and recount each time for getting the proper item.
> > > > > > > >
> > > > > > > > it seems you have not encountered such a problem.
> > > > > > > > in this situation, the indexes on the fields help nothing, for
> > > > > > > > the bulk of data you have to sort is really big.
> > > > > > > >
> > > > > > > > On Dec 17, 12:20 am, Andy Freeman <ana...@earthlink.net> wrote:
> > > > > > > > > > it still can result in timeout if the data is really big
> > > > > > > > >
> > > > > > > > > How so?  If you don't request "too many" items with a page
> > > > > > > > > query, it won't time out.  You will run into
> > > > > > > > > runtime.DeadlineExceededErrors if
> > > > > > > > > you try to use too many page queries for a given request, but
> > > > > > > > > ....
> > > > > > > > >
> > > > > > > > > > of not much use to most of us if we really have big data to
> > > > > > > > > > sort and page.
> > > > > > > > >
> > > > > > > > > You do know that the sorting for the page queries is done
> > > > > > > > > with the indexing and not user code, right?  Query time is independent
> > > > > > > > > of the total amount of data and depends only on the size of the
> > > > > > > > > result set.
> > > > > > > > > (Indexing time is constant per inserted/updated entity.)
> > > > > > > > >
> > > > > > > > > On Dec 16, 12:13 am, ajaxer <calid...@gmail.com> wrote:
> > > > > > > > > > it is too complicated for most of us.
> > > > > > > > > > and it still can result in timeout if the data is really big
> > > > > > > > > >
> > > > > > > > > > of not much use to most of us if we really have big data to
> > > > > > > > > > sort and page.
> > > > > > > > > >
> > > > > > > > > > On Dec 15, 11:35 pm, Stephen <sdea...@gmail.com> wrote:
> > > > > > > > > > > On Dec 15, 8:04 am, ajaxer <calid...@gmail.com> wrote:
> > > > > > > > > > > > also the 1000 index limit makes it impossible to fetch
> > > > > > > > > > > > older data on paging.
> > > > > > > > > > > >
> > > > > > > > > > > > for if we need an indexed page of more than 10000 items,
> > > > > > > > > > > > it would cost us a lot of cpu time to calculate the
> > > > > > > > > > > > base for GQL
> > > > > > > > > > > > to fetch the data with index less than 1000.
> > > > > > > > > > >
> > > > > > > > > > > http://code.google.com/appengine/articles/paging.html

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.