Re: [appengine-java] Objectify - Twig - approaches to persistence

John Patterson Thu, 11 Mar 2010 05:00:19 -0800

On 11 Mar 2010, at 16:27, Jeff Schnitzer wrote:

On Wed, Mar 10, 2010 at 7:39 PM, John Patterson<jdpatter...@gmail.com> wrote:
On 11 Mar 2010, at 03:40, Jeff Schnitzer wrote:
That is an empty claim with no example or evidence. Every comparison we
have seen so far is cleaner and more readable in Twig.
Nonsense.  The only example that was "cleaner and more readable in
Twig" was a single and extremely contrived case based on a pattern
that nobody is actually using.  And even that is no longer prettier in
Twig.

Well then, where are your examples? I responded to the "edge cases" you asked about with a simple example. Now it's your turn to show me how Objectify would code the example I asked about.

So where's the code? Where is this clean Objectify code you claim is possible? The question is still waiting below.

Readability is not even the most important issue here: the Objectify model's dependence on low-level Keys means that *all* code that uses your data (the entire application) must also be dependent on the low-level datastore.
Let's think for a minute why anyone cares if Key, Email, GeoPt, Link,
etc shows up in higher levels of an application:  portability.  Unless
you're working on Twig for other datastores, this entire line of
thought is moot.  If you use Twig for data access, you're stuck on
GAE.  If you use Objectify for data access, you're stuck on GAE.

You are still mixing the concepts of data model portability and data access layer (DAOs) portability.

Of course the code to actually manipulate data uses a particular interface and is dependent on that interface. The only persistence framework I am aware of that approaches complete transparency is Terracotta.

I never claimed that Twig code would run on other platforms. What I claimed was that the *data models* are not dependent on the datastore. Generally a lot of application code simply uses data without needing to persist changes - especially in webapps. All of this code can remain portable ... unless your data models are not portable.
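To make the distinction concrete, here is a minimal sketch. It is not taken from Twig or Objectify: DatastoreKey is a hypothetical stand-in for a low-level key type, and KeyedCar, Car, Person and Reports are invented purely for illustration. The read-only code in Reports compiles without any datastore type only because Car holds a plain reference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a low-level datastore key type.
final class DatastoreKey {
    final String kind;
    final long id;

    DatastoreKey(String kind, long id) {
        this.kind = kind;
        this.id = id;
    }
}

// Key-based model: code that wants the owner has to resolve the key
// through the datastore, so it depends on the datastore too.
class KeyedCar {
    DatastoreKey ownerKey;
}

// Plain model: the owner is an ordinary Java reference.
class Person {
    final String name;

    Person(String name) {
        this.name = name;
    }
}

class Car {
    Person owner;
}

// Application code that only reads data. No datastore type appears here.
final class Reports {
    static List<String> ownerNames(List<Car> cars) {
        List<String> names = new ArrayList<>();
        for (Car car : cars) {
            names.add(car.owner.name);
        }
        return names;
    }
}
```

The same report written against KeyedCar would need a datastore lookup for every owner, which is the dependency being argued about.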

The only data access API that gives you any hope (tenuous as it is) of
real portability is JDO/JPA.

Are you having a laugh? After all your rants about how portability with JDO-GAE is impossible... that is really grasping at straws.

I claim that Twig data models are more portable than JDO-GAE, JPA and Objectify data models. Why? Because they are pure POJOs with no low-level datastore dependencies - simple. Yes you can run JDO on other platforms - but

I do not claim that the data access layer is portable - although certainly neither is Objectify.

Without portability, all this talk of "polluting" the higher tiers of
your application with datastore classes is a whole lot of religious
claptrap.

It is not important until you hit one of the many "show-stoppers" that these mailing lists are full of. Then it becomes a little more interesting.

In my development with Objectify, I haven't found it necessary to use
Key objects in the higher levels of my app - just at the level of
DAOs.  However, I do use GeoPt a lot.  There are a lot of things that
might change if I had to port away from GAE, and this is just one of
them.


Obviously a rather large one.

The trick will be threading the needle between features and
complexity.  You weigh features more heavily, I weigh simplicity more
heavily.

In a system such as GAE, which makes so much so hard, productivity features are a great relief. No OR queries? Of course I value simplicity - but the whole idea of a framework is to push complexity from the user code into the framework. Objectify leaves too much to the developer for my liking. It is simple - I grant you that - but that makes app code more complex.

Oh come on!  What is so difficult to understand here?  Calling
refresh(Object) gets the latest data from the datastore.


Actually, it was quite confusing, and I'm not just being
argumentative.  Until you posted your delete example, it was not
apparent that your entity POJOs have lifecycle state; a POJO in an
entity graph might have been loaded from the database or it might be
empty, despite having a valid key.

Now it makes sense why you need a refresh() method.

It is covered in the docs. Activation is a central concept with Twig - modelled on Db4o.

I don't like the idea of entities that look like User #1234 but don't
have the data of User #1234.  This feels like it has a lot of bug
potential to me.

All referenced objects are activated by default - the developer actually has to explicitly choose to use this feature as an optimisation so there are no surprises.

Objectify has, basically,
four methods: get(), put(), delete(), and query(). They work exactly
the way you would expect.
I wouldn't say that - why does Query extend QueryResultIterator? This is not at all expected. I could call offset() on your "Query" in the middle of iterating through its results... the API seems to suggest that is possible. Quite weird.
Once again, this allows queries to be elegantly run like this:

for (Car car: ofy.query(Car.class).filter("weight >", 5000)) {
   doSomethingWith(car);
}

...which, while not being a major bullet point, is at least a mild
ergonomic win for Objectify.

Cramming a query into a for loop is hardly a gain worth risking programming errors over.

Twig adds quite a few API methods and the docs don't make it clear
what they do.  What are refresh(), update(), storeOrUpdate(),
associate(), and disassociate()?

Of course Twig has more API methods than Objectify - Objectify does not manage instance identity.


Managing instance identity is not that hard.  It's a simple @Id field,
just like the @Id you are familiar with in JPA.  Maybe an additional
@Parent in the rare case you need one.  Not that complicated.

Managing object identity is a lot more than just keeping an Id field. It is a guarantee that if you load the same instance twice from the datastore, you get back identical instances: obj1 == obj2

Objectify has no such guarantee, so you could end up loading two referenced instances with the same id, making changes and putting them again, only for one to overwrite the other.
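As a rough illustration of what such a guarantee involves, here is a minimal identity-map sketch. IdentitySession and its loader function are hypothetical names made up for this example, and this is not Twig's actual implementation: it only shows the idea of handing back the already-loaded instance when the same id is requested again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical session that hands out exactly one instance per id.
final class IdentitySession<T> {
    private final Map<Long, T> loaded = new HashMap<>();
    private final Function<Long, T> loader; // stands in for a datastore read

    IdentitySession(Function<Long, T> loader) {
        this.loader = loader;
    }

    // Returns the cached instance if this id was loaded before;
    // otherwise reads it once through the loader and remembers it.
    T load(long id) {
        return loaded.computeIfAbsent(id, loader);
    }
}
```

With a session like this, two loads of the same id return the same object, so changes made through one reference are visible through the other and there is no second copy to overwrite the first.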

Take for example these Objectify methods: filter(), ancestor(), sort(), cursor(): Sure, they are succinct, but there is no indication whether any of them add to the query state or replace it, i.e. is more than one filter allowed?
You're deluding yourself if you think people are more confused by the
Objectify API than the Twig API.  Say what you will about the feature
set but Objectify has the advantage of simplicity, and the query
interface follows the GAE/Python API fairly closely.  There is no
getting lost in a menagerie of Strategies and Commands.

The Objectify framework has the advantage of simplicity because it has fewer features. The Objectify user code is more complex because the user must do more "manually". The merged OR query is just one example of this.

In my opinion the goal for any GAE framework should be to help overcome the great and unusual limitations of the environment, and to encapsulate solutions in one place to avoid repeating them again and again.

Person owner = datastore.load(Car.class, carKey).owner;
// now activate the inactive owner instance
datastore.refresh(owner);


You need this example somewhere in your docs to explain that "empty"
POJOs get loaded, and how you convert them to "full" POJOs.  It wasn't
clear to me until I saw this example.


Good point.  I'll add some more examples.

Now for Objectify: How do you implement my earlier example in Objectify?

"Find highly skilled .NET or Java programmers ordered by skill"
Twig:
Iterator<User> users = datastore.find()
        .type(UserSkill.class)
        .addSort("ability")
        .addFilter("ability", GREATER_THAN, 5)
        .addChildQuery()
                .addFilter("skill", EQUAL, "java")
        .addChildQuery()
                .addFilter("skill", EQUAL, ".net")
        .returnParentsNow();


Yes yes, you implemented OR and in-memory sorting.  I don't personally
love these features.

I would not call this "in-memory sorting", which implies that the objects are loaded and then sorted. The way it is implemented merges the already sorted results as a _stream_ into a single iterator.
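For readers who have not seen the technique, a streaming merge of already-sorted results can be sketched as follows. MergingIterator is a hypothetical class written for this illustration, not Twig's code. It keeps only one pending element per source in a priority queue, so nothing is loaded ahead of what the caller consumes.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Lazily merges several already-sorted iterators into one sorted iterator.
final class MergingIterator<T> implements Iterator<T> {

    // The element most recently pulled from a source, plus that source.
    private static final class Head<E> {
        final E value;
        final Iterator<E> source;

        Head(E value, Iterator<E> source) {
            this.value = value;
            this.source = source;
        }
    }

    private final PriorityQueue<Head<T>> heads;

    MergingIterator(List<Iterator<T>> sources, Comparator<? super T> order) {
        Comparator<Head<T>> byValue = (a, b) -> order.compare(a.value, b.value);
        heads = new PriorityQueue<>(byValue);
        // Prime the queue with the first element of every non-empty source.
        for (Iterator<T> source : sources) {
            if (source.hasNext()) {
                heads.add(new Head<>(source.next(), source));
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !heads.isEmpty();
    }

    @Override
    public T next() {
        Head<T> smallest = heads.poll();
        if (smallest == null) {
            throw new NoSuchElementException();
        }
        // Refill from the source that just supplied the smallest element.
        if (smallest.source.hasNext()) {
            heads.add(new Head<>(smallest.source.next(), smallest.source));
        }
        return smallest.value;
    }
}
```

Each source must already be sorted in the same order as the comparator, which is what a sorted datastore query provides. The sketch does not remove duplicates that appear in more than one source.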

With Objectify you must run two queries and sort
yourself.  I'm happy to concede this checkbox on the feature list.

Does a save of the UserSkill cascade to the User?  Whatever the
answer, what do people who want the "other behavior" do?  There is a
reason JDO & JPA have all those cascade options.
Yes, these types of cascade options will become necessary once automatic dirty detection is available. But non-bytecode-enhanced classes will always be an option.
Without dirty detection, aren't you likely to persist quite a large
object graph every time you do a put?

No. Updating or storing an item makes each reachable item persistent _only_ if it is not already persistent. Each item that changes must have update called.

I have to ask something:  Why not just use JPA?  Or work on the
DataNucleus plugin?

Twig is already much more capable than JPA on GAE for a very good reason: It is designed specifically to work with GAE. Integrating performance features like "Parallel Asynchronous Commands" into JPA or JDO would be impossible.

I can honestly say that without Twig my app would be at least 100 times slower. Why? Embedded collections have probably reduced the multiple queries required in JDO 10-fold. Parallel queries have now reduced the query time another 10-fold.
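The general idea behind running queries in parallel can be sketched like this. ParallelQueries and runAll are hypothetical names, and the sketch uses a plain thread pool purely for illustration. It is not how Twig implements its parallel asynchronous commands, only a demonstration that the total wait approaches the slowest query rather than the sum of all of them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelQueries {

    // Runs every query concurrently and returns results in submission order.
    static <T> List<T> runAll(List<Callable<T>> queries) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, queries.size()));
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every submitted task has completed.
            for (Future<T> future : pool.invokeAll(queries)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for queries", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each Callable here stands in for one independent query. The results come back in the order the queries were submitted, so they can be fed straight into a merge step.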

In fact a complex geo-spatial query that now runs in 50ms did not even return in 30 seconds using my first approach with JDO.


Enough said.

I see the direction in which you're taking the project - and it's not
someplace that Objectify is likely to follow.  As impressive as
whatever you eventually come up with will be, it will be at least as
complicated as JPA and still be utterly tied to Appengine.

Specific to App Engine, yes. As complicated as JPA, no. The standard interface implementations have much more to contend with because they must follow a spec.

Philosophically, Objectify is more likely to head the other direction.
Scott and I have bantered about the idea of creating a parallel
Objectify implementation for Cassandra -

Who's that? Your girlfriend? Ha ha, I just looked at the project page... very interesting.

including a Key<?> class,
which makes quite a bit of sense there.  Now that would be *real*
portability.  For the moment it's just banter, but if I were going to
put as much effort into the API as you seem to be planning, this is
where I would go.

I have also considered how Twig could be made to run on other platforms... but there the danger is taking a lowest common denominator approach and ending up like JDO/JPA. Twig was started to solve problems that were not possible before. Now it's time to put that to work.

Okay, how about this: How do you delete instances without loading them first?

Give me your equivalent code:
ofy.delete(ofy.query(Car.class).filter("weight >",5000).fetchKeys());
I'll break this into separate statements just for clarity:

Iterator<Car> carsToDelete = datastore.find()
        .type(Car.class)
        .addFilter("weight", GREATER_THAN, 5000)
        .returnNoFields() // does a keys-only query
        .getResultsNow();
datastore.deleteAll(carsToDelete);


While you have indeed eliminated Key by using an empty POJO entity as
a surrogate, I hardly think your API is cleaner or more elegant.  I
know which of these I would rather type.

Reducing keystrokes is a much lower priority in Twig than creating readable, maintainable code.


It seems that method naming in Objectify takes the opposite approach.

I also don't love the empty-POJO-as-surrogate concept.  User POJO
classes can contain all sorts of odd logic, and allowing some to sit
around in an uninitialized state is potentially quite dangerous.  It
seems rather easy for a developer to load EntityA (with an Activate(0)
field of type EntityB), make a change to what is actually an
uninitialized reference to EntityB, then try to save EntityB.  You'll
wipe out the contents of EntityB.  Do you prevent this from happening?

And the
idea that you can take any class and make it a data model class is
pretty fanciful.  You might be able to get away with it in limited
cases, but in the real world you pretty quickly will start needing
@Activation, @Embed, or other annotations that control persistence.
Not at all. The docs make it quite clear that annotations are a completely optional way to configure a data model. Doing so in Java code is extremely easy and gives more control.


I looked at the documentation, and "extremely easy" is not the way I
would term it.  But fair enough, I'll concede "persistence of
arbitrary objects".

Looks like the debate started early :)


Indeed :-)

Hey, I just noticed something. Does Twig lack simple batch get() operations?

I frequently write code like this:

Set<Long> ids = ... // fetch user's friend ids
Collection<Person> friends = ofy.get(Person.class, ids).values();
What is the Twig equivalent?

This would be handled with a simple direct collection reference in Twig like this:


user.friends

So the batch get by key is really not so useful in Twig - or at least I haven't had the need for it yet. But I can see cases where it would be useful and am planning on adding fluent load (and delete) commands which - just like the find and store commands - have "advanced" features such as returnResultsLater() and bulk load.

I just noticed, how do you set the chunk size in Objectify? I can't see the option anywhere.


In Twig it is getResultsBy(50)

Now, back to work John! :)
