[appengine-java] Objectify-Appengine 2.1 released, supports Partial Indexes

2010-03-25 Thread Jeff Schnitzer
Today we released Objectify v2.1, the latest version of our open-source
replacement for JDO/JPA on the Google App Engine datastore.

This version includes a major new feature, Partial Indexes.  If you
aren't sure what partial indexes are, the Wikipedia page
(http://en.wikipedia.org/wiki/Partial_index) describes them so:

"A partial index, also known as filtered index is a database index
which has some condition applied to it such that it only includes a
portion of the rows in the table.  This can allow the index to remain
small even though the table may be rather large, and have fairly
extreme selectivity."

Here is an example of an Objectify entity using partial indexes:

public class Player {
    @Id Long id;

    // Simple conditions: IfFalse, IfTrue, IfZero, IfNull, etc.
    @Unindexed(IfFalse.class) boolean admin;

    // Smarter: sensitive to the field's actual default value
    @Unindexed(IfDefault.class) Team team = Team.NOTCHOSEN;

    // You can make your own conditions
    @Unindexed(IfCustomCondition.class) Status status;

    // Leaves the status unindexed for dead or retired players
    static class IfCustomCondition extends ValueIf<Status> {
        public boolean matches(Status value) {
            return (value == Status.DEAD || value == Status.RETIRED);
        }
    }
}

Why should you care about optimizing indexes?

All queries in the datastore require indexes, which are a sort of
reverse-mapping from value to key.  These indexes occupy space and
consume CPU resources whenever an entity is written to the datastore.
With just a few indexes, this overhead quickly doubles or triples the
cost of storing the original entity:

 * A basic entity with no indexes costs 48 api_cpu_ms to store.
 * Each single-property indexed field adds an additional 17 api_cpu_ms.

These numbers appear stable and consistent; App Engine seems to have a
static formula for computing datastore costs.  Storage size costs are
harder to measure, but from watching mailing list traffic it seems
quite easy to double or triple your storage size with unnecessary
indexes.
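
To make the arithmetic concrete, here is a minimal sketch that applies
the two figures quoted above.  It assumes the cost is strictly linear in
the number of indexed fields, which is an inference from the observed
numbers and not a documented App Engine formula.  The class and method
names are hypothetical.

```java
// Back-of-envelope write cost, using only the two figures quoted above.
// Assumption: cost grows linearly with the number of indexed fields.
final class WriteCost {
    static final int BASE_API_CPU_MS = 48;       // entity with no indexes
    static final int PER_INDEX_API_CPU_MS = 17;  // each single-property index

    /** Estimated api_cpu_ms for one write with the given number of indexed fields. */
    static int apiCpuMs(int indexedFields) {
        if (indexedFields < 0) {
            throw new IllegalArgumentException("indexedFields must be >= 0");
        }
        return BASE_API_CPU_MS + PER_INDEX_API_CPU_MS * indexedFields;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 6; n++) {
            double ratio = apiCpuMs(n) / (double) BASE_API_CPU_MS;
            System.out.printf("%d indexed fields: %d api_cpu_ms (%.2fx base)%n",
                    n, apiCpuMs(n), ratio);
        }
    }
}
```

Under this assumption, three indexed fields cost 99 api_cpu_ms (about
double the base) and six cost 150 api_cpu_ms (about triple).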

When should you care about optimizing indexes?

 * Removing unnecessary indexes will not make writes faster; it will
make them /cheaper/.  All indexes are written in parallel, so indexes
do not add latency to writes.  Instead, indexes add $ to the bill you
get at the end of the week - and push you closer to your quota limits.

 * If your application has relatively small quantities of relatively
static data, index optimization is probably pointless.  On the other
hand, if you have large data volumes or heavy write loads, you must
carefully choose your indexes (or be very rich).

Do I need partial indexes, as opposed to just declaring whole fields
indexed or not?

It depends on your dataset and your queries.  In the Player example
above, partial indexes can be extremely effective:

 * You only ever filter on the admin field for actual admins, and most
players are not admins.
 * You only ever filter on the team field for players who have chosen
a team, and the bulk of players are not associated with a team.
 * You only ever filter on the status field for players who have
active statuses, and you have a large number of inactive players.
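
The effect of the three conditions in the Player example can be
sketched in plain Java.  This is only a model of the decision "is an
index entry written for this field?"; it does not use the Objectify
API.  The enum values other than NOTCHOSEN, DEAD and RETIRED are
made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the three conditions; this is NOT Objectify code.
final class PartialIndexSketch {
    enum Team { NOTCHOSEN, RED, BLUE }     // RED and BLUE are hypothetical
    enum Status { ACTIVE, DEAD, RETIRED }  // ACTIVE is hypothetical

    static final class Player {
        final boolean admin;
        final Team team;
        final Status status;

        Player(boolean admin, Team team, Status status) {
            this.admin = admin;
            this.team = team;
            this.status = status;
        }
    }

    /** Number of single-property index entries written for one player. */
    static int indexEntries(Player p) {
        int entries = 0;
        if (p.admin) entries++;                   // skipped when false
        if (p.team != Team.NOTCHOSEN) entries++;  // skipped when still the default
        if (p.status != Status.DEAD && p.status != Status.RETIRED) {
            entries++;                            // skipped for inactive players
        }
        return entries;
    }

    /** Total index entries written for a whole population of players. */
    static int totalIndexEntries(List<Player> players) {
        int total = 0;
        for (Player p : players) {
            total += indexEntries(p);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Player> players = Arrays.asList(
                new Player(false, Team.NOTCHOSEN, Status.RETIRED),
                new Player(false, Team.BLUE, Status.DEAD),
                new Player(true, Team.RED, Status.ACTIVE));
        System.out.println("fully indexed:     " + 3 * players.size());
        System.out.println("partially indexed: " + totalIndexEntries(players));
    }
}
```

The more players that are non-admin, teamless, or inactive, the larger
the gap between the two totals, and the more each write saves.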

Objectify's partial indexes can also decide whether to index a field
based on "the whole entity" rather than on that field's value alone.
This allows you to perform certain kinds of limited multiple-property
queries (including double inequality queries) without creating a
multi-property index.  As an example, it is very easy to model this
index from the Wikipedia page:

create index partial_salary on employee(age) where salary > 2100;

An example of this is documented in the Objectify manual.
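
For readers who want the idea without opening the manual, here is a
rough plain-Java model of that SQL index.  It is not the Objectify API
(see the manual for the real annotations); the class, field and method
names are hypothetical, and the index is simulated with a TreeMap.  The
point it illustrates: because age is only indexed when salary > 2100,
a range scan on the single-property age index already answers "age
below X and salary above 2100".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Simulates "create index partial_salary on employee(age) where salary > 2100".
// A sketch only: it ignores updates and deletes of existing employees.
final class WholeEntityIndexSketch {
    static final int SALARY_THRESHOLD = 2100;

    static final class Employee {
        final long id;
        final int age;
        final int salary;

        Employee(long id, int age, int salary) {
            this.id = id;
            this.age = age;
            this.salary = salary;
        }
    }

    // age -> ids of employees whose age was written to the index
    private final TreeMap<Integer, List<Long>> ageIndex = new TreeMap<>();

    /** Stores the employee; age is indexed only if the whole-entity condition holds. */
    void put(Employee e) {
        if (e.salary > SALARY_THRESHOLD) {
            ageIndex.computeIfAbsent(e.age, k -> new ArrayList<>()).add(e.id);
        }
    }

    /** Range scan on the age index: ids with age < maxAge, in ascending age order. */
    List<Long> idsWithAgeBelow(int maxAge) {
        List<Long> ids = new ArrayList<>();
        for (List<Long> bucket : ageIndex.headMap(maxAge, false).values()) {
            ids.addAll(bucket);
        }
        return ids;
    }

    public static void main(String[] args) {
        WholeEntityIndexSketch index = new WholeEntityIndexSketch();
        index.put(new Employee(1, 30, 3000));
        index.put(new Employee(2, 35, 1500));  // salary too low: age not indexed
        index.put(new Employee(3, 50, 4000));
        index.put(new Employee(4, 25, 2101));
        System.out.println(index.idsWithAgeBelow(40));
    }
}
```

The trade-off is that any query on age silently excludes employees at
or below the salary threshold, so this only works when the threshold is
fixed and known to every query that uses the index.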

Thanks,
The Objectify Team
Jeff, Scott, and Matt

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine for Java" group.
To post to this group, send email to google-appengine-j...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine-java+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine-java?hl=en.



Re: [appengine-java] Objectify-Appengine 2.1 released, supports Partial Indexes

2010-03-25 Thread Duong BaTien
Hi:

Congratulations, and thanks for the tremendous efforts of the Objectify Team.

By the way, has anyone tried Objectify with a potentially very large
index of subscribers and publishers for web-hook pub-sub (Google
PubSubHubbub over Atom, or Twitter-style short messages)?

Thanks
Duong BaTien
DBGROUPS and BudhNet


On Thu, 2010-03-25 at 13:45 -0700, Jeff Schnitzer wrote:
> Today we released Objectify v2.1, the latest version of our opensource
> replacement for JDO/JPA on the Google App Engine datastore.