[google-appengine] Re: Question about indexing of properties

2011-12-02 Thread Ice13ill
It wasn't a problem of revealing secrets. I just thought of this
situation while playing with different data models for my entities,
so that is why I generalized.
So, to use your example, I could model my entity like this:
- name="SimpleEntity"
- type="animal"
- properties=["cat","big"] (which is a list)

Or, for the other model, let's say I have this equivalent:
- name="SimpleEntity"
- properties=["cat", "big", "type=animal"]

The first two questions that came to mind were:
1. Which index will generate more metadata (if there is a
difference)?
2. Which index needs more "write" operations?




Re: [google-appengine] Re: Question about indexing of properties

2011-11-30 Thread Ikai Lan (Google)
Note: I wrote the first part of this email before I understood what you
were doing, but since I think it is useful information, I am leaving it in.

Original email
---

Basically, there are a few rules to remember when considering tradeoffs:

- get by key is always best. It's the most cost efficient. If you can
perform your query using a named key, you'll see the most benefits
- from a cost perspective, writing an index is always 2 datastore
operations. If you UPDATE an index (change a value), that's 4 datastore
operations because you need to delete the old indexes.
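
A minimal sketch of the arithmetic behind the second rule, applied to the two entity shapes discussed in this thread. It uses only the figures quoted above (2 operations per index write, 4 per index update) and assumes, as a simplification that is not stated in the thread, that every indexed value, including each element of a list property, produces exactly one index entry. The function and constant names are made up for illustration.

```python
# Rough write-cost estimate using only the rule quoted above:
# 2 datastore operations per index entry written, 4 per entry updated.
# Assumption (not from the thread): every indexed property value,
# including each element of a list property, produces one index entry.

OPS_PER_INDEX_WRITE = 2
OPS_PER_INDEX_UPDATE = 4


def count_index_entries(entity):
    """Count indexed values; a list property contributes one per element."""
    total = 0
    for value in entity.values():
        total += len(value) if isinstance(value, list) else 1
    return total


def estimate_write_ops(entity):
    """Estimated operations to write all index entries for a new entity."""
    return count_index_entries(entity) * OPS_PER_INDEX_WRITE


def estimate_update_ops(changed_values):
    """Estimated operations when `changed_values` indexed values change."""
    return changed_values * OPS_PER_INDEX_UPDATE


# The two shapes discussed in this thread.
separate_type = {
    "name": "SimpleEntity",
    "type": "animal",
    "properties": ["cat", "big"],
}
folded_type = {
    "name": "SimpleEntity",
    "properties": ["cat", "big", "type=animal"],
}

print(estimate_write_ops(separate_type))
print(estimate_write_ops(folded_type))
print(estimate_update_ops(1))
```

Under these assumptions both shapes carry the same number of indexed values, so the estimate does not separate them; any real difference would have to come from factors this sketch leaves out.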

In general, most websites and web apps are read heavy. The rule of thumb is
that you might do 10 or more reads for every write, so you optimize for
reads when possible.

One pattern I generally recommend is where you store records both as
individual rows as well as child fields in the parent entity. I was talking
to someone yesterday about the best way to store, say, travel data. I
recommended a structure that looked something like this:

Trip
- date
- description
- traveler_id

Traveler
- name
- trips <--- serialized trips

This was a situation where "Traveler" would have been read far more often
than "Trip" would have been queried, but we would treat the "Trip" as the
source of truth so we can always regenerate the Traveler's "trips"
property. The tradeoff here is additional storage, but the benefits are
that we have a source of truth, and reads are really, really fast since we
only need to fetch travelers by ID.
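
The pattern above can be sketched roughly as follows. This is an in-memory illustration only, not the datastore API: the dictionaries stand in for the two kinds, the sample names and trips are invented, and JSON is just one possible choice for the serialized "trips" field.

```python
import json

# Hypothetical in-memory stand-ins for the two kinds; not the datastore API.
# All sample data below is invented for illustration.
trips = [
    {"date": "2011-11-01", "description": "Conference", "traveler_id": 1},
    {"date": "2011-11-20", "description": "Vacation", "traveler_id": 1},
    {"date": "2011-11-25", "description": "Client visit", "traveler_id": 2},
]
travelers = {
    1: {"name": "Alice", "trips": "[]"},
    2: {"name": "Bob", "trips": "[]"},
}


def regenerate_trips(traveler_id):
    """Rebuild the denormalized 'trips' field from the Trip records."""
    own = [t for t in trips if t["traveler_id"] == traveler_id]
    own.sort(key=lambda t: t["date"])
    travelers[traveler_id]["trips"] = json.dumps(own)


def add_trip(trip):
    """Write the source-of-truth record, then refresh the serialized copy."""
    trips.append(trip)
    regenerate_trips(trip["traveler_id"])


def read_traveler(traveler_id):
    """A single lookup by ID returns the traveler and all of their trips."""
    traveler = travelers[traveler_id]
    return traveler["name"], json.loads(traveler["trips"])


for tid in travelers:
    regenerate_trips(tid)

add_trip({"date": "2011-12-02", "description": "Workshop", "traveler_id": 2})
name, bob_trips = read_traveler(2)
print(name, [t["description"] for t in bob_trips])
```

Every write does extra work to keep the serialized copy current, which is the price paid for the single-lookup read path.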

Answer to your question
---
In your case, I think the only read tradeoff, if I understand your problem
correctly, is that you cannot query by property equality independent of
type. In exchange, it'll take fewer indexes. Example:

Let's say you have a property value "cat" and a type "animal". If you use a
list property of "animal=cat", you can't ever find ALL the entities with the
property "cat" regardless of type. It'll cost less in terms of indexes. If
you wanted to find all the "animals" (type), you would do an inequality
query on ">animal".
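
A rough sketch of that tradeoff, using a sorted Python list as a stand-in for an index on the list property. The sample values and entity keys are invented, and bounding the scan from above with the highest code point is an assumption added here to keep the range inside one type; the message only mentions the lower bound.

```python
import bisect

# Hypothetical sorted index of (stored value, entity key) pairs, standing in
# for an index on the list property. Values use the "type=value" encoding.
index = sorted([
    ("animal=cat", "e1"),
    ("animal=dog", "e2"),
    ("cartoon=cat", "e3"),
    ("plant=fern", "e4"),
])


def keys_with_prefix(prefix):
    """Range scan: prefix <= stored value < prefix + highest code point."""
    lo = bisect.bisect_left(index, (prefix,))
    hi = bisect.bisect_left(index, (prefix + "\U0010ffff",))
    return [key for _, key in index[lo:hi]]


def keys_equal(value):
    """Equality match against the full stored string."""
    return [key for stored, key in index if stored == value]


# All entities of type "animal": a range over the "animal=" prefix.
print(keys_with_prefix("animal="))

# All entities with the value "cat", whatever the type: equality finds
# nothing, because no stored string is exactly "cat".
print(keys_equal("cat"))
```

The second lookup comes back empty even though two stored values end in "cat", which is the limitation described above.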

I can't think of a material difference in terms of performance, but maybe
someone else in this group can.

Also, one more general tip that would have made the original question
easier to understand: don't overgeneralize the problem (type, property,
value). You probably aren't going to reveal any secret details, and it's
easier for readers to conceptualize if they can map to concrete object
types.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com | twitter.com/ikai



> On Nov 29, 10:57 pm, "Brandon Wirtz" wrote:
> > I think you are doing someone's homework :-)
> >
> > From: google-appengine@googlegroups.com
> > [mailto:google-appengine@googlegroups.com] On Behalf Of Ikai Lan (Google)
> > Sent: Tuesday, November 29, 2011 10:58 AM
> > To: google-appengine@googlegroups.com
> > Subject: Re: [google-appengine] Question about indexing of properties
> >
> > Can you give a more concrete example of the two cases (maybe provide some
> > code)? I'm trying to figure out what you're doing so I can list off
> > tradeoffs that I see.
> >
> > --
> >
> > Ikai Lan
> > Developer Programs Engineer, Google App Engine
> >
> > plus.ikailan.com | twitter.com/ikai
> >
> > On Sat, Nov 26, 2011 at 4:17 AM, Ice13ill wrote:
> >
> > Hi, I was wondering which of the following data models is more
> > efficient. Let's say I have three fields for my entity: name (a
> > simple field), some_property (a simple field), list_of_properties (a
> > list of properties).
> > - case 1. index: ^name, ^some_property, ^list_of_properties.
> > - case 2. index: ^name, ^list_of_properties (but all the information
> > that relates to the field "some_property" is stored in
> > "list_of_properties" as a string: some_property=a_property_value)
> >
> > What are the advantages/disadvantages of each case (performance,
> > metadata generated / size of all data, etc.)?

RE: [google-appengine] Re: Question about indexing of properties

2011-11-30 Thread Brandon Wirtz
For both performance and cost/quota, you want whichever approach results in
the fewest queries. Depending on the number of returned entities, you may
find that getting "too much" and then filtering via arrays is cheaper; in
other cases, getting 3 results and combining them may be cheaper.

This will vary a lot based on the number of results that might be returned,
because you have to page through results (which is another cost/quota hit).
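
A small sketch of the two approaches being compared, with invented in-memory entities. The `query` helper is hypothetical and only counts how many entities each "query" returns; it does not model paging or real datastore pricing.

```python
# Hypothetical in-memory entities; all sample data is invented.
entities = [
    {"name": "a", "type": "animal", "properties": ["cat", "big"]},
    {"name": "b", "type": "animal", "properties": ["dog", "big"]},
    {"name": "c", "type": "plant", "properties": ["fern", "small"]},
    {"name": "d", "type": "animal", "properties": ["cat", "small"]},
]


def query(predicate):
    """Stand-in for one query; returns the matches and how many were read."""
    results = [e for e in entities if predicate(e)]
    return results, len(results)


# Option 1: one broad query, then filter the returned list in memory.
broad, reads_broad = query(lambda e: e["type"] == "animal")
big_cats = [
    e["name"] for e in broad
    if "cat" in e["properties"] and "big" in e["properties"]
]

# Option 2: several narrow queries, combined by intersecting the names.
cats, reads_cats = query(lambda e: "cat" in e["properties"])
bigs, reads_bigs = query(lambda e: "big" in e["properties"])
combined = sorted({e["name"] for e in cats} & {e["name"] for e in bigs})

print(big_cats, "queries: 1, entities read:", reads_broad)
print(combined, "queries: 2, entities read:", reads_cats + reads_bigs)
```

With this sample data the single broad query reads fewer entities, but a different distribution of values could reverse that, which is why the answer depends on how many results come back.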


--
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com.
To unsubscribe from this group, send email to
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.





[google-appengine] Re: Question about indexing of properties

2011-11-30 Thread Ice13ill
Well, let's say I have an Entity with the following fields:
- String name
- String type
- List properties

The purpose is to search these entities (for example: the user inputs
the search words to be matched against the "properties" field, and the
"name" and "type" are matched according to some other settings/
options).
So let's say I can use this index: name^ , type^ , properties^.

But another option would be to remove the "type" field and insert the
same information in the "properties" list field as
"type=some_value_of_type", so I could use this index: name^ ,
properties^.

I believe that either of these is a solution for searching my Entities (is
it possible that I'm missing something?), but I was wondering what
the differences are (advantages/disadvantages) regarding query
performance, quotas, etc.

