Hi Vaclav,

Thanks for chiming in. I am aware of audao, but did not consider it because I wanted to keep the list of new dependencies as small as possible. Specifically, in the case of ANTLR there are backwards-incompatible changes between 2.7 and 3.0, which would force us to jarjar it into appengine-mapreduce.
This would be a lot of complexity for such a tiny feature, when Fred's approach is simpler. Anyway, I think it would make a great addition to audao, where a Filter implementation based on the GQL parser should be easy to implement.

-- Nacho.

On Thu, Nov 25, 2010 at 8:50 AM, Vaclav Bartacek <vaclav.barta...@spolecne.cz> wrote:

> Hi,
>
> you can find the open-source Java GQL parser (based on ANTLR) here:
> http://code.google.com/p/audao/wiki/ExtendedGQLParser
>
> Vaclav
>
> On Nov 24, 7:46 pm, Nacho Coloma <icol...@gmail.com> wrote:
> > > One other thought: instead of adding a GQL interpreter, you might
> > > just add a hook for loading a class provided by the user. That class
> > > would implement a Filter interface with a method that takes a
> > > Configuration and returns a Query object, so in your example, mailing
> > > and timestamp would get passed in as Configuration parameters, and a
> > > query object corresponding to the GQL statement you put would be
> > > built by a Filter class provided by the user. It would act kind of
> > > like a templating language for building queries. Make sense/sound
> > > like a good idea?
> >
> > Actually, Filter would be simpler to implement, and GQL can be added
> > later as a concrete Filter implementation if someone is still missing
> > it (I doubt it). It also solves the problem of specifying the type of
> > arguments.
> >
> > BTW, arguments should be passed in as request parameters, not
> > configuration attributes (like "timestamp greater than" or "process
> > all comments by user X" for example). This means that Filter may need
> > encapsulated access to some methods of AppEngineJobContext.request.
> >
> > It seems that it can be implemented in a couple of hours. I will still
> > wait for 1.4.0, though.
> >
> > > On Nov 18, 7:01 am, Nacho Coloma <icol...@gmail.com> wrote:
> > > > > I'm not entirely sure I understand the scope of the proposed
> > > > > patch. Are you thinking about adding filters at the
> > > > > DatastoreRecordReader level? It's not entirely clear to me that
> > > > > that provides a benefit over just applying the filter at the
> > > > > start of the map() function. Totally willing to believe I'm
> > > > > missing something, though.
> > > >
> > > > The map() filter runs against your quota. This is OK for once-only
> > > > tasks such as schema upgrades, but Mappers can also be used for
> > > > repetitive tasks such as mailing, data cleanup, etc. For these
> > > > cases, being able to work on a subset of data is important (process
> > > > only user accounts with mailing enabled, for example).
> > > >
> > > > The biggest problem to resolve is how to specify the filter clause
> > > > in mapreduce.xml. I am considering implementing a GQL parser that is
> > > > as simple as possible and injecting servlet request parameters.
> > > > Something like:
> > > >
> > > > <property>
> > > >   <name>mapreduce.mapper.inputformat.datastoreinputformat.query</name>
> > > >   <value>select * from users where mailing=:value1 and timestamp<=:value2</value>
> > > > </property>
> > > >
> > > > This implies porting the GQL implementation from Python to Java, or
> > > > implementing an ANTLR-based parser. I feel like I am reinventing
> > > > the wheel, so any suggestion to use something that exists (or aim
> > > > for a simpler design) is welcome.
> > > >
> > > > > On a logistical note, for nontrivial contributions, we require a
> > > > > CLA from either you or your employer (depending on who owns the
> > > > > copyright for your work) before we can accept significant
> > > > > contributions. The relevant forms are at:
> > > > > http://code.google.com/legal/individual-cla-v1.0.html and
> > > > > http://code.google.com/legal/corporate-cla-v1.0.html. Feel free
> > > > > to email me privately if this is an issue.
> > > >
> > > > No problem with that.
> > > >
> > > > Regards,
> > > >
> > > > Nacho.
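To make the Filter hook discussed in this thread more concrete, here is a minimal sketch in Java. Everything in it is an assumption for illustration: the interface name `Filter` comes from the thread, but the method name `buildQuery`, the `QuerySketch` stand-in (used here in place of the real datastore Query class so the example has no dependencies), and the `MailingFilter` example class are hypothetical and not part of appengine-mapreduce. It only shows how a user-provided class could turn string request parameters into typed filter values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a datastore query; it only records its filters.
final class QuerySketch {
    final String kind;
    final List<String> filters = new ArrayList<>();

    QuerySketch(String kind) {
        this.kind = kind;
    }

    QuerySketch addFilter(String property, String operator, Object value) {
        filters.add(property + " " + operator + " " + value);
        return this;
    }
}

// Hypothetical hook: a user-provided class builds the query from request parameters.
interface Filter {
    QuerySketch buildQuery(Map<String, String> requestParameters);
}

// Example for the "mailing = :value1 and timestamp <= :value2" case from the thread.
final class MailingFilter implements Filter {
    @Override
    public QuerySketch buildQuery(Map<String, String> requestParameters) {
        // The user's code decides the argument types, so no type syntax is needed in XML.
        boolean mailing = Boolean.parseBoolean(requestParameters.get("mailing"));
        long timestamp = Long.parseLong(requestParameters.get("timestamp"));
        return new QuerySketch("users")
                .addFilter("mailing", "=", mailing)
                .addFilter("timestamp", "<=", timestamp);
    }
}
```

With this shape, mapreduce.xml would only need to name the Filter class, and a GQL-based Filter could be added later as just another implementation.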
--
You received this message because you are subscribed to the Google Groups "Google App Engine for Java" group.
To post to this group, send email to google-appengine-j...@googlegroups.com.
To unsubscribe from this group, send email to google-appengine-java+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en.