Your method is right, but the limit is there whatever method we use.

And you didn't get what I mean.

Because the offset is limited to 1000, I cannot sort data by fields in the
results beyond some limited number of items.

Without the offset limit, we could do it easily.


On Dec 24, 2:10 am, Andy Freeman <ana...@earthlink.net> wrote:
> Any application that requires fetching an unbounded amount of data for a
> single page view is not scalable, no matter what technology you use to
> build it, so this problem is not appengine specific.
>
> If you need aggregations (average, median, total, etc), you have to
> compute them incrementally or with an off-line process.
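
A minimal sketch of the incremental approach, in Python. The `RunningStats` class and its fields are hypothetical names for illustration, not App Engine APIs. The only point is that each write updates a small summary, so a page view reads the summary instead of scanning every item. Count, total, and average work this way; a median needs more bookkeeping than two counters.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical summary record, updated whenever an item is written."""
    count: int = 0
    total: float = 0.0

    def add(self, value: float) -> None:
        # Called when an item is inserted: constant work per write.
        self.count += 1
        self.total += value

    def remove(self, value: float) -> None:
        # Called when an item is deleted.
        self.count -= 1
        self.total -= value

    @property
    def average(self) -> float:
        # Reading the statistic never touches the underlying items.
        return self.total / self.count if self.count else 0.0


stats = RunningStats()
for value in (3.0, 5.0, 10.0):
    stats.add(value)
```

In a real application the summary would itself be stored (for example as one entity per filter of interest) and updated together with the write it describes.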
>
> > Even with the "datetime <=" filter, when you still get a big set, how can
> > you handle it?
>
> We're talking about paging through a dataset, presenting n (for small
> n) elements at a time to a user.
>
> If we're paging through by the value of field with distinct values and
> we want to present 20 results per page, the query for the first page
> is "order by field" with limit 20.  That query has a "last" result.
> The query for the next page is "field > {last result's field value}
> order by field", again with limit 20.  That query also has a last
> result so the form of subsequent queries should be obvious.  (If
> you've got other conditions, such as user id or key, you need to add
> those as well.)
>
> Suppose that entities can have the same field value.  If you don't
> care how those entities are ordered, the first query's order by clause
> can be "order by field, __key__", again limit 20.  The next query
> tries to pick up entities with the same field as the last result from
> the previous query.  It looks like "field = {last result's field's
> value} and __key__ > {last result's key} order by __key__" and you
> keep using it until it fails.  You then use a query like the "next
> page" query from the previous case.  (I stopped mentioning limit
> because the value depends on what you need to fill the current page.)
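
The two-step scheme described above can be sketched in Python as follows. This is a toy under stated assumptions: `Entity`, `run_query`, and `next_page` are hypothetical names, and `run_query` filters and sorts an in-memory list, whereas a real datastore would answer the same queries from its indexes without touching the other entities. No offset is used anywhere; the only state carried between pages is the last result.

```python
from typing import Callable, List, NamedTuple, Optional


class Entity(NamedTuple):
    key: int      # stands in for __key__
    field: str    # the indexed property we page by


def run_query(entities: List[Entity], where: Callable[[Entity], bool],
              order_by: Callable[[Entity], object], limit: int) -> List[Entity]:
    # Toy stand-in for an indexed query with a filter, an order, and a limit.
    return sorted((e for e in entities if where(e)), key=order_by)[:limit]


def field_then_key(e: Entity):
    # "order by field, __key__"
    return (e.field, e.key)


def next_page(entities: List[Entity], last: Optional[Entity],
              page_size: int = 20) -> List[Entity]:
    if last is None:
        # First page: order by field, __key__ with a limit.
        return run_query(entities, lambda e: True, field_then_key, page_size)
    # Step 1: entities that tie with the last result's field value:
    # field = {last.field} and __key__ > {last.key} order by __key__
    page = run_query(entities,
                     lambda e: e.field == last.field and e.key > last.key,
                     lambda e: e.key, page_size)
    # Step 2: once the ties run out, fill the rest of the page with
    # field > {last.field} order by field, __key__
    if len(page) < page_size:
        page += run_query(entities, lambda e: e.field > last.field,
                          field_then_key, page_size - len(page))
    return page


entities = [Entity(key=k, field=f) for k, f in enumerate("aabbbbcd")]
pages = []
last = None
while True:
    page = next_page(entities, last, page_size=3)
    if not page:
        break
    pages.append([(e.field, e.key) for e in page])
    last = page[-1]
```

With the sample data, the four entities sharing field value "b" are split across the first two pages without being skipped or repeated.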
>
> On Dec 22, 8:50 pm, ajaxer <calid...@gmail.com> wrote:
>
>
>
> > Even with the "datetime <=" filter, when you still get a big set, how can
> > you handle it?
> > For example, you get 10000 items with the most specific filtering query,
> > and for that query you need statistics, like how many items there are.
>
> > How do you expect appengine to handle this problem?
> > What about one request with many such actions?
>
> > On Dec 21, 11:09 pm, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > What statistics are you talking about?
>
> > > You're claiming that one can't page through an entity type without
> > > fetching all instances and sorting them.  That claim is wrong because
> > > the order by constraint does exactly that.
>
> > > For example, suppose that you want to page through by a date/time
> > > field named "datetime".  The query for the first page uses order by
> > > datetime while queries for subsequent pages have a "datetime <="
> > > clause for the last datetime value from the previous page and continue
> > > to order by datetime.
>
> > > What part of that do you think doesn't work?
>
> > > > Do you think that Nick was wrong when he said that the time to
> > > > execute such a query depends on the number of entities?
>
> > > You can even do random access by using markers that are added/
> > > maintained by a sequential process like the above.
>
> > > On Dec 20, 7:34 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > You misunderstand.
> > > > > If not, show me a site with statistics on many fields,
> > > > > with more than 1000 pages, please.
> > > > > Thanks.
>
> > > > On Dec 21, 9:06 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > > You misunderstand.
>
> > > > > If you have an ordering based on one or more indexed properties, you
> > > > > can page efficiently wrt that ordering, regardless of the number of
> > > > > data items.  (For the purposes of this discussion, __key__ is an
> > > > > indexed property, but you don't have to use it or can use it just to
> > > > > break ties.)
>
> > > > > If you're fetching a large number of items and sorting so you can find
> > > > > a contiguous subset, you're doing it wrong.
>
> > > > > On Dec 19, 10:26 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > Obviously, if you have to page through a data set of more than
> > > > > > > 50000 items that is not ordered by __key__,
>
> > > > > > > you may find that __key__ is of no use, because the filtered
> > > > > > > data is ordered not by key but by the field's value, and for that
> > > > > > > reason you need to loop over queries, as you may like to do.
>
> > > > > > > But you will hit a timeout exception before the action has
> > > > > > > really finished.
>
> > > > > > On Dec 19, 8:26 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > > > > > > If that kind of data has more than 10000 items, you need
> > > > > > > > > reindexing for this result, and a recount each time to get
> > > > > > > > > the proper item.
>
> > > > > > > > What kind of reindexing are you talking about?
>
> > > > > > > > Global reindexing is only required when you change the indices in
> > > > > > > > app.yaml.  It doesn't occur when you add more entities and/or
> > > > > > > > have big entities.
>
> > > > > > > > Of course, when you change an entity, it gets reindexed, but
> > > > > > > > that's a constant cost.
>
> > > > > > > > Surely you're not planning to change all your entities fairly
> > > > > > > > often, are you?  (You're going to have problems if you try to
> > > > > > > > maintain sequence numbers and do insertions, but that doesn't
> > > > > > > > scale anyway.)
>
> > > > > > > > > It seems you have not encountered such a problem.
> > > > > > > > > In this situation, the indexes on the fields help nothing,
> > > > > > > > > because the bulk of data you have to sort is really big.
>
> > > > > > > > Actually I have.  I've even done difference and at-least-#
> > > > > > > > (intersection and union are special cases - at-least-# also
> > > > > > > > handles majority), at-most-# (binary xor is the only common case
> > > > > > > > that I came up with), and combinations thereof on paged queries.
>
> > > > > > > Yes, I know that offset is limited to 1000 but that's irrelevant
> > > > > > > because the paging scheme under discussion doesn't use offset.  It
> > > > > > > keeps track of where it is using __key__ and indexed data values.
>
> > > > > > > On Dec 16, 7:56 pm, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > > Of course the time is related to the kind of data you are
> > > > > > > > > fetching in one query.
>
> > > > > > > > > If that kind of data has more than 10000 items, you need
> > > > > > > > > reindexing for this result, and a recount each time to get
> > > > > > > > > the proper item.
>
> > > > > > > > > It seems you have not encountered such a problem.
> > > > > > > > > In this situation, the indexes on the fields help nothing,
> > > > > > > > > because the bulk of data you have to sort is really big.
>
> > > > > > > > On Dec 17, 12:20 am, Andy Freeman <ana...@earthlink.net> wrote:
>
> > > > > > > > > > > it can still result in a timeout if the data is really big
>
> > > > > > > > > > How so?  If you don't request "too many" items with a page
> > > > > > > > > > query, it won't time out.  You will run into
> > > > > > > > > > runtime.DeadlineExceededErrors if you try to use too many
> > > > > > > > > > page queries for a given request, but ....
>
> > > > > > > > > > > It is of not much use to most of us if we really have big
> > > > > > > > > > > data to sort and page.
>
> > > > > > > > > > You do know that the sorting for the page queries is done
> > > > > > > > > > with the indexing and not user code, right?  Query time is
> > > > > > > > > > independent of the total amount of data and depends only on
> > > > > > > > > > the size of the result set.
> > > > > > > > > > (Indexing time is constant per inserted/updated entity.)
>
> > > > > > > > > On Dec 16, 12:13 am, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > > > > It is too complicated for most of us.
> > > > > > > > > > > And it can still result in a timeout if the data is really big.
>
> > > > > > > > > > > It is of not much use to most of us if we really have big
> > > > > > > > > > > data to sort and page.
>
> > > > > > > > > > On Dec 15, 11:35 pm, Stephen <sdea...@gmail.com> wrote:
>
> > > > > > > > > > > On Dec 15, 8:04 am, ajaxer <calid...@gmail.com> wrote:
>
> > > > > > > > > > > > > Also, the 1000 index limit makes it impossible to fetch
> > > > > > > > > > > > > older data when paging.
>
> > > > > > > > > > > > > For if we need an indexed page beyond 10000 items,
> > > > > > > > > > > > > it would cost us a lot of CPU time to calculate the
> > > > > > > > > > > > > base for GQL to fetch the data with an index less than
> > > > > > > > > > > > > 1000.
>
> > > > > > > > > > > > http://code.google.com/appengine/articles/paging.html

--

You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

