Re: [google-appengine] MapReduce not stopping MR controllers when completed - frontend charges increasing in bill
Indeed, that might be the issue. Thanks for sharing, Tom!

-CAS

On Wednesday, July 15, 2015 at 7:43:55 PM UTC-4, Tom Kaitchuck wrote:

I think this may be your problem: https://github.com/GoogleCloudPlatform/appengine-mapreduce/issues/69

On Wed, Jul 15, 2015 at 9:03 AM, Camilo Silva camilo...@citrix.com wrote:

So I've been working with the Google MapReduce library for quite a while on some Python App Engine projects. To this day, I cannot understand why mrcontrollers stay alive and keep making callbacks right after all processing is done (i.e., all shards terminated successfully). I always have to go back and purge the task queue so that the unnecessary frontend calls stop. This is an issue because it affects billing. Any feedback is welcome. Thanks for your help.

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group. To unsubscribe from this group and stop receiving emails from it, send an email to google-appengine+unsubscr...@googlegroups.com. To post to this group, send email to google-appengine@googlegroups.com. Visit this group at http://groups.google.com/group/google-appengine. To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/5b67ac84-d04e-41f0-8393-c316e21dbe6c%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: [google-appengine] MapReduce not stopping MR controllers when completed - frontend charges increasing in bill
I think this may be your problem: https://github.com/GoogleCloudPlatform/appengine-mapreduce/issues/69

On Wed, Jul 15, 2015 at 9:03 AM, Camilo Silva camilo.si...@citrix.com wrote:

So I've been working with the Google MapReduce library for quite a while on some Python App Engine projects. To this day, I cannot understand why mrcontrollers stay alive and keep making callbacks right after all processing is done (i.e., all shards terminated successfully). I always have to go back and purge the task queue so that the unnecessary frontend calls stop. This is an issue because it affects billing. Any feedback is welcome. Thanks for your help.
Re: [google-appengine] MapReduce import woes
Did you copy the mapreduce/python/src directory (from GitHub) to the top of your app, at the same level as app.yaml?

PK
http://www.gae123.com

On November 13, 2014 at 9:09:10 AM, Marijn Vriens (marijnvri...@gmail.com) wrote:

Hi all,

I'm relatively new to GAE and certainly to the mapreduce part of it, so the following is probably screamingly obvious, but it has been eluding me for a good while now.

I'm trying to make use of mapreduce on GAE. The first hurdle is that the Python MapReduce demo on https://github.com/GoogleCloudPlatform/appengine-mapreduce seems to have been removed from the repository with commit ac4483dfae8e77ead7d0844799ed75b95afcbb99 about two days ago, while all the documentation still points to it. (Oops?)

Looking at https://github.com/GoogleCloudPlatform/appengine-mapreduce/blob/1acb1e9165552dbff8cdfe32c8836b44013ab2d5/python/demo/app.yaml I see nothing special in the way of includes. Just import it, like in python/demo/main.py, right? But

    from mapreduce import base_handler

fails with "ImportError: No module named mapreduce". Hmm. Maybe it's my dev server (SDK 1.9.15 on Windows 7, running 32-bit Python 2.7.8, by the way), but uploading a simple instance to GAE production gives me the same ImportErrors, so the dev server is not to blame.

I guess there was a reason the demo was removed, so I tried the next most obvious thing: maybe mapreduce has to be added, like a library. Looking around, I see mapreduce in google_appengine\google\appengine\ext\builtins. So according to https://cloud.google.com/appengine/docs/python/config/appconfig?csw=1#Python_app_yaml_Builtin_handlers adding the following to app.yaml should work:

    builtins:
    - mapreduce: on

But this fails with "google.appengine.ext.builtins.InvalidBuiltinName: mapreduce is not the name of a valid builtin." Okay, so it's not that. Maybe it's a straight include after all, just more explicit. So:

    from google.appengine.ext.mapreduce2 import base_handler

This gives me a warning from the dev server:

    WARNING 2014-11-13 16:45:09,437 __init__.py:43] You should not use the mapreduce library that is bundled with the SDK. Use the one from https://pypi.python.org/pypi/GoogleAppEngineMapReduce instead.

and fails with "ImportError: No module named simplejson". No, it's not that either. Maybe add some magic handlers like in the app.yaml of the demo? I added the handlers, but still get the ImportError on mapreduce. I then added the mapreduce code to the project itself, but that gives:

    ... \mapreduce\third_party\pipeline\__init__.py, line 26, in _fix_path
        all_paths = os.environ.get('PYTHONPATH').split(os.pathsep)
    AttributeError: 'NoneType' object has no attribute 'split'

I added the mapreduce.yaml file with these strange "Make messages lowercase" mapreduce functions that don't seem to be related to the rest of the demo. That didn't make a difference either.

So I'm now a bit lost as to how to actually include mapreduce. Anybody care to tell me how they did it?

Kind regards,
Marijn Vriens
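The AttributeError in the traceback above is what Python raises when os.environ.get('PYTHONPATH') returns None, which happens whenever the PYTHONPATH variable is not set at all. The following is a minimal sketch of that failure mode and of a defensive way to read the variable. The function name python_path_entries is hypothetical and is not part of the mapreduce or pipeline libraries; this is an illustration of the error, not a patch to _fix_path.

```python
import os


def python_path_entries(environ=None):
    """Return the PYTHONPATH entries, or an empty list if the variable is unset.

    Hypothetical helper for illustration only; not library code.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get('PYTHONPATH')  # None when the variable is not set
    if not raw:
        return []
    return raw.split(os.pathsep)


# Reproduce the failure from the traceback using an empty environment mapping:
# calling .split() on the None returned by .get() raises AttributeError.
try:
    {}.get('PYTHONPATH').split(os.pathsep)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

If this is indeed the cause, setting PYTHONPATH (even to an empty-looking value such as the app directory) before starting the dev server would be one thing to try, though the thread does not confirm that this resolves the import problem.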
Re: [google-appengine] MapReduce import woes
Take a look at the build.sh script: https://github.com/GoogleCloudPlatform/appengine-mapreduce/blob/master/python/build.sh

It compiles (and runs) the demo application using the checked-out MapReduce. It currently has an issue that makes it kind of annoying: https://github.com/GoogleCloudPlatform/appengine-mapreduce/issues/17

However, if you just want to depend on MapReduce in your application, the easiest thing is to get it from PyPI.
Re: [google-appengine] MapReduce Failures
A RetrySliceError will result in a retry. If it does not, it could be that the maximum number of retry attempts on your task queue is set too low. (Because MapReduce manages retries based on its own configuration, it is safe to set this to unlimited.)

Also, you may want to take a look at shard retry: https://code.google.com/p/appengine-mapreduce/wiki/PythonShardRetry which is a new feature designed to make Python MapReduce more reliable.

On Fri, May 24, 2013 at 8:20 AM, Ranjit Chacko rjcha...@gmail.com wrote:

I'm seeing shards abruptly fail in my MR jobs for no apparent reason and without retrying:

    task_name=appengine-mrshard-1581047187783C3601732-14-2-retry-0 app_engine_release=1.8.0 instance=00c61b117c53a40e120ac864168a3fe51c2ce
    Shard 1581047187783C3601732-14 failed permanently.

Is there some adjustment I can make to my queue parameters to avoid or reduce these issues? Recently I had been having problems with MR jobs throwing UnknownErrors and ApplicationErrors followed by RetrySliceErrors, and setting min_backoff_seconds to 1 seemed to help reduce the retry errors.
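To make the point about the queue's retry limit concrete, here is a toy model of one task being re-run until it either succeeds or the queue gives up. It is only a sketch under stated assumptions: the RetrySliceError class below is a local stand-in rather than the library's exception, the function name run_task and its parameters are hypothetical, and the loop does not reproduce how Task Queue or the MapReduce library actually schedule retries.

```python
class RetrySliceError(Exception):
    """Local stand-in for a retryable slice failure (not the library's class)."""


def run_task(failures_before_success, task_retry_limit=None):
    """Simulate a queue re-running one task.

    failures_before_success: how many attempts fail before one succeeds.
    task_retry_limit: queue-level cap on retries; None models "unlimited".
    """
    retries = 0
    while True:
        try:
            if retries < failures_before_success:
                raise RetrySliceError("transient failure")
            return "succeeded after %d retries" % retries
        except RetrySliceError:
            # The queue gives up once its own retry budget is used,
            # even though the error itself was retryable.
            if task_retry_limit is not None and retries >= task_retry_limit:
                return "failed permanently after %d retries" % retries
            retries += 1


print(run_task(3, task_retry_limit=2))
print(run_task(3, task_retry_limit=None))
```

In this model, a task that needs three retries never completes when the queue allows only two, which matches the suggestion above to leave the queue-level limit unlimited and let MapReduce's own configuration decide when to stop.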
[appengine-java] Re: Google AppEngine MapReduce is not working anymore... HELP needed!!
Done! Hope to find some helpful comments. Thanks!!

On 24 Jan, 21:09, Emanuele Ziglioli theb...@emanueleziglioli.it wrote:

Could you post it on the MapReduce group too? http://groups.google.com/group/app-engine-pipeline-api

On Jan 25, 12:15 am, imanol00 imano...@gmail.com wrote:

Hi guys. I have a problem using the MapReduce feature of Google App Engine. In general, it works fine, but I am experiencing some strange behaviour in the MapReduce job management. It looks as if some jobs are running as zombies in my application.

--
You received this message because you are subscribed to the Google Groups "Google App Engine for Java" group. To post to this group, send email to google-appengine-java@googlegroups.com. To unsubscribe from this group, send email to google-appengine-java+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en.
[appengine-java] Re: Google AppEngine MapReduce is not working anymore... HELP needed!!
Could you post it on the MapReduce group too? http://groups.google.com/group/app-engine-pipeline-api

On Jan 25, 12:15 am, imanol00 imano...@gmail.com wrote:

Hi guys. I have a problem using the MapReduce feature of Google App Engine. In general, it works fine, but I am experiencing some strange behaviour in the MapReduce job management. It looks as if some jobs are running as zombies in my application.
Re: [google-appengine] MapReduce as Cron jobs: How to specify the number of shards?
Today I updated the mapreduce library. I see the same thing, only 1 shard, when I use the dev_server. The dev_server does not set the __scatter__ property on entities, so the mapreduce library falls back to a single shard. In production it depends on how many entities have a __scatter__ property: if fewer than shard_count entities have __scatter__, you get fewer shards.

GAE Team: What determines if an entity gets a __scatter__ property?

2011/2/11 djidjadji djidja...@gmail.com:

In your cron_mapreduce.py add these two lines:

    shard_count = int(self.request.get("shard_count", mr_control._DEFAULT_SHARD_COUNT))
    mr_control.start_map(
        self.request.get("name"),
        self.request.get("reader_spec", "your_mapreduce.map"),
        self.request.get("reader_parameters", "mapreduce.input_readers.DatastoreInputReader"),
        {
            "entity_kind": self.request.get("entity_kind", "models.YourModel"),
            "processing_rate": int(self.request.get("processing_rate", 100)),
        },
        shard_count=shard_count,
        mapreduce_parameters={"done_callback": self.request.get("done_callback", None)},
    )

2011/2/10 Andrin von Rechenberg andri...@gmail.com:

Hey there,

Today I created a library to run MapReduces as cron jobs in Python. See here: http://devblog.miumeet.com/2011/02/schedule-mapreduce-daily-on-appengine.html

However, I didn't figure out how to set the shard_count programmatically. In mapreduce/control.py there is a function I call:

    def start_map(name, handler_spec, reader_spec, reader_parameters, shard_count=_DEFAULT_SHARD_COUNT, [...])

However, no matter what I pass as the shard_count argument, it is ignored. Any ideas?

Cheers,
-Andrin
Re: [google-appengine] MapReduce as Cron jobs: How to specify the number of shards?
This document explains the strategy: http://code.google.com/p/appengine-mapreduce/wiki/ScatterPropertyImplementation

It says that there is a .8% chance of an entity getting this property. That seems really low. I wonder if they meant 8%, not .8%?

Stephen

On Mon, Feb 14, 2011 at 7:12 AM, djidjadji djidja...@gmail.com wrote:

Today I updated the mapreduce library. I see the same thing, only 1 shard, when I use the dev_server. The dev_server does not set the __scatter__ property on entities, so the mapreduce library falls back to a single shard. In production it depends on how many entities have a __scatter__ property: if fewer than shard_count entities have __scatter__, you get fewer shards.

GAE Team: What determines if an entity gets a __scatter__ property?

2011/2/11 djidjadji djidja...@gmail.com:

In your cron_mapreduce.py add these two lines:

    shard_count = int(self.request.get("shard_count", mr_control._DEFAULT_SHARD_COUNT))
    mr_control.start_map(
        self.request.get("name"),
        self.request.get("reader_spec", "your_mapreduce.map"),
        self.request.get("reader_parameters", "mapreduce.input_readers.DatastoreInputReader"),
        {
            "entity_kind": self.request.get("entity_kind", "models.YourModel"),
            "processing_rate": int(self.request.get("processing_rate", 100)),
        },
        shard_count=shard_count,
        mapreduce_parameters={"done_callback": self.request.get("done_callback", None)},
    )

2011/2/10 Andrin von Rechenberg andri...@gmail.com:

Hey there,

Today I created a library to run MapReduces as cron jobs in Python. See here: http://devblog.miumeet.com/2011/02/schedule-mapreduce-daily-on-appengine.html

However, I didn't figure out how to set the shard_count programmatically. In mapreduce/control.py there is a function I call:

    def start_map(name, handler_spec, reader_spec, reader_parameters, shard_count=_DEFAULT_SHARD_COUNT, [...])

However, no matter what I pass as the shard_count argument, it is ignored. Any ideas?

Cheers,
-Andrin
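A quick back-of-the-envelope calculation shows why small datasets can end up with fewer shards than requested. The sketch below takes the .8% figure quoted in this thread at face value. The rule that N scatter entities allow at most N + 1 key ranges is an assumption made for illustration, not something confirmed by the thread or the library, and both function names are hypothetical.

```python
def expected_scatter_entities(entity_count, scatter_probability=0.008):
    """Expected number of entities carrying the scatter property,
    using the .8% chance quoted in the thread."""
    return entity_count * scatter_probability


def usable_shards(scatter_entities, requested_shards):
    """Assumed rule (not from the library): N split points cut the key
    space into at most N + 1 ranges, and there is always at least one."""
    return max(1, min(requested_shards, scatter_entities + 1))


for entity_count in (100, 1000, 100000):
    expected = expected_scatter_entities(entity_count)
    print("%d entities -> about %.1f with __scatter__" % (entity_count, expected))
```

Under these assumptions, a kind with a hundred entities would on average have less than one scatter entity, so it would usually run as a single shard, which is consistent with the dev_server behaviour described above where no entity has the property.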
Re: [google-appengine] MapReduce as Cron jobs: How to specify the number of shards?
In your cron_mapreduce.py add these two lines:

    shard_count = int(self.request.get("shard_count", mr_control._DEFAULT_SHARD_COUNT))
    mr_control.start_map(
        self.request.get("name"),
        self.request.get("reader_spec", "your_mapreduce.map"),
        self.request.get("reader_parameters", "mapreduce.input_readers.DatastoreInputReader"),
        {
            "entity_kind": self.request.get("entity_kind", "models.YourModel"),
            "processing_rate": int(self.request.get("processing_rate", 100)),
        },
        shard_count=shard_count,
        mapreduce_parameters={"done_callback": self.request.get("done_callback", None)},
    )

2011/2/10 Andrin von Rechenberg andri...@gmail.com:

Hey there,

Today I created a library to run MapReduces as cron jobs in Python. See here: http://devblog.miumeet.com/2011/02/schedule-mapreduce-daily-on-appengine.html

However, I didn't figure out how to set the shard_count programmatically. In mapreduce/control.py there is a function I call:

    def start_map(name, handler_spec, reader_spec, reader_parameters, shard_count=_DEFAULT_SHARD_COUNT, [...])

However, no matter what I pass as the shard_count argument, it is ignored. Any ideas?

Cheers,
-Andrin
Re: [google-appengine] mapreduce and cron job
You must use the control module from the mapreduce package. One line in the docs says that, to chain MR tasks, you use a done callback and start a new task with the control submodule. Use the same method to start a task from inside a cron handler. Have a look at mapreduce/control.py and the test code for control.py.

    # webapp import added for completeness; it provides RequestHandler.
    from google.appengine.ext import webapp
    from mapreduce import control as mr_control

    class MRCronHandler(webapp.RequestHandler):
        def get(self):
            # Start the mapper job; the done callback fires when it finishes.
            mr_control.start_map(
                "Name of the task",
                "mymr.sometask",
                "mapreduce.input_readers.DatastoreInputReader",
                {"entity_kind": "models.SomeModel"},
                mapreduce_parameters={"done_callback": "/mr-end-task"},
            )
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write('MR Cron Started')

2010/10/24 slash ewan...@gmail.com:

Hi, everybody

I learned a lot from this group and Nick's blog. Thanks. I want to combine mapreduce and a cron job in order to run the mapping automatically. I searched a lot in this group and in other resources but found nothing. I'm able to run a cron job and mapreduce separately, but I really don't know how to combine them. Maybe I missed something really basic. Could anyone tell me how to run mapreduce with a cron job? I didn't find the corresponding URL to trigger mapreduce. Thank you.

PS: I know that it's possible to do this, and I subscribed to issue 41 in the appengine-mapreduce project. Here is the link: http://code.google.com/p/appengine-mapreduce/issues/detail?id=41.
Re: [google-appengine] mapreduce and cron job
Hello,

Java: Put something like this in your cron.xml (for Python modify accordingly).

    <cron>
      <url>/cron/map-reduce-start?map-reduce-path=/mapreduce/&amp;mapper-class=YOUR.PACKAGE.YOURMAPPERCLASSHERE&amp;entity-kind=YOURENTITYKINDHERE&amp;num-shards=YOURNUMBERSHARDSHERE&amp;processing-rate=YOURPROCESSINGRATEHERE&amp;controller-queue=YOURCONTROLLERQUEUEHERE&amp;worker-queue=YOURWORKERQUEUEHERE&amp;callback-queue=YOURCALLBACKQUEUEHERE&amp;callback-url=YOURCALLBACKURLHERE</url>
      <description>Some description</description>
      <schedule>every 1 hours</schedule>
    </cron>

Stephen

On Sun, Oct 24, 2010 at 3:14 AM, slash ewan...@gmail.com wrote:

Hi, everybody

I learned a lot from this group and Nick's blog. Thanks. I want to combine mapreduce and a cron job in order to run the mapping automatically. I searched a lot in this group and in other resources but found nothing. I'm able to run a cron job and mapreduce separately, but I really don't know how to combine them. Maybe I missed something really basic. Could anyone tell me how to run mapreduce with a cron job? I didn't find the corresponding URL to trigger mapreduce. Thank you.

PS: I know that it's possible to do this, and I subscribed to issue 41 in the appengine-mapreduce project. Here is the link: http://code.google.com/p/appengine-mapreduce/issues/detail?id=41.
Re: [google-appengine] mapreduce and cron job
Oops! My bad, this won't work. It must be Monday.

Sorry,
Steve

On Mon, Oct 25, 2010 at 11:56 AM, Stephen Johnson onepagewo...@gmail.com wrote:

Hello,

Java: Put something like this in your cron.xml (for Python modify accordingly).

    <cron>
      <url>/cron/map-reduce-start?map-reduce-path=/mapreduce/&amp;mapper-class=YOUR.PACKAGE.YOURMAPPERCLASSHERE&amp;entity-kind=YOURENTITYKINDHERE&amp;num-shards=YOURNUMBERSHARDSHERE&amp;processing-rate=YOURPROCESSINGRATEHERE&amp;controller-queue=YOURCONTROLLERQUEUEHERE&amp;worker-queue=YOURWORKERQUEUEHERE&amp;callback-queue=YOURCALLBACKQUEUEHERE&amp;callback-url=YOURCALLBACKURLHERE</url>
      <description>Some description</description>
      <schedule>every 1 hours</schedule>
    </cron>

Stephen

On Sun, Oct 24, 2010 at 3:14 AM, slash ewan...@gmail.com wrote:

Hi, everybody

I learned a lot from this group and Nick's blog. Thanks. I want to combine mapreduce and a cron job in order to run the mapping automatically. I searched a lot in this group and in other resources but found nothing. I'm able to run a cron job and mapreduce separately, but I really don't know how to combine them. Maybe I missed something really basic. Could anyone tell me how to run mapreduce with a cron job? I didn't find the corresponding URL to trigger mapreduce. Thank you.

PS: I know that it's possible to do this, and I subscribed to issue 41 in the appengine-mapreduce project. Here is the link: http://code.google.com/p/appengine-mapreduce/issues/detail?id=41.
Re: [google-appengine] MapReduce - Map task time limit - Is it 30 seconds?
Yes, I think so. But each task only handles one entity, so processing it will almost never exceed 30 seconds.

--
keakon

2010/10/24 Jaganathan (ஜ௧நாதன்) ksja...@gmail.com:

Hi

Just curious and want to confirm the following. In MapReduce (http://code.google.com/p/appengine-mapreduce/), should any single map task *still* run within the 30-second limit?

Thanks
Jagan

--
Let the words of our mouth and the meditations of our heart
Be acceptable in Thy sight here tonight!
- Rivers of Babylon (album)
Re: [google-appengine] [mapreduce] Can i run the tasks via code?
You could use cron jobs: http://code.google.com/appengine/docs/python/config/cron.html

On Mon, Oct 4, 2010 at 6:40 AM, BarrenTeam barren8...@gmail.com wrote:

We currently run the jobs manually. Is there any way to launch them automatically? For example, through a servlet?
Re: [google-appengine] [mapreduce] Can i run the tasks via code?
Never mind, I see that you mean mapreduce jobs. I guess you could read the code behind the page that lets you start the jobs manually. Then, if you're lucky and clicking "run" really just fires off a call to a handler, you can find out the parameters and configure a cron job to do the same thing.

On Mon, Oct 4, 2010 at 1:40 PM, Eli Jones eli.jo...@gmail.com wrote:

You could use cron jobs: http://code.google.com/appengine/docs/python/config/cron.html

On Mon, Oct 4, 2010 at 6:40 AM, BarrenTeam barren8...@gmail.com wrote:

We currently run the jobs manually. Is there any way to launch them automatically? For example, through a servlet?
Re: [google-appengine] MapReduce: any way to get the value of counters easily?
Here's how you'd get the counter value (excuse the comments, this is copied and pasted from a blog post):

    MapReduceState mrState = MapReduceState.getMapReduceStateFromJobID(datastore, jobId);

    // There's a bit of ceremony to get the actual counter. This
    // example is intentionally verbose for clarity. First get all the
    // Counters, then we get the CounterGroup, then we get the Counter,
    // then we get the count.
    Counters counters = mrState.getCounters();
    CounterGroup counterGroup = counters.getGroup("CommentWords");
    Counter counter = counterGroup.findCounter("count");
    long wordCount = counter.getValue(); // Finally!

On Sun, Aug 29, 2010 at 12:20 AM, mac macwa...@gmail.com wrote:

Hi, I tried to use MapReduce (http://code.google.com/p/appengine-mapreduce/) to do some jobs in my project. I am wondering how to get the value of the counters. It seems that it is stored in the counters_map field in MapreduceState, in JSON format. Is reading and parsing it from MapreduceState directly the correct way to get the value?

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine
Re: [google-appengine] mapreduce multi-entity datastore input reader
File a bug in the appengine-mapreduce project for best visibility. We'll likely contribute more functionality to that library, but our priority is working through the shuffle/reduce features and not necessarily more InputFormats.

This is a case where I'd suggest that you scratch your own itch. DatastoreInputFormat is open source, so you can build the feature yourself:
http://code.google.com/p/appengine-mapreduce/source/browse/trunk/java/src/com/google/appengine/tools/mapreduce/DatastoreInputFormat.java?spec=svn37&r=37

Change that to use a kindless Entity Query:
http://code.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/Query.html#Query()

On Fri, Jul 9, 2010 at 7:06 AM, Luís Marques luismarq...@gmail.com wrote:
> Hello Googlers,
>
> (I didn't find a mailing list for mapreduce, so I'm posting this here. Is there one?)
>
> Would you please consider adding a new datastore input reader (or extending existing ones) to allow specifying more than one entity kind? Or an input reader which processes all entity kinds? (perhaps being intelligent enough to skip over mapreduce control entities, if necessary)
>
> The main use I'm thinking about is re-putting entities, to regenerate indexes, but there are other uses.
>
> Thanks,
> Luís

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine
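The behaviour Luís asks for (visit every entity regardless of kind, but skip bookkeeping kinds) can be sketched in plain Python. This is only an illustration over an in-memory list standing in for the datastore, not the library's input reader or the datastore API: the `Entity` tuple, `kindless_scan`, and the `_AE_MR_` prefix used for control entities are assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for a datastore entity.
Entity = namedtuple("Entity", ["kind", "key_name", "properties"])

# Assumed prefixes of kinds that a "map everything" job should skip:
# "__" for reserved kinds, "_AE_MR_" as a made-up control-entity prefix.
SKIPPED_KIND_PREFIXES = ("__", "_AE_MR_")


def kindless_scan(entities, skipped_prefixes=SKIPPED_KIND_PREFIXES):
    """Yield every entity in (kind, key name) order, whatever its kind."""
    for entity in sorted(entities, key=lambda e: (e.kind, e.key_name)):
        if entity.kind.startswith(skipped_prefixes):
            continue  # bookkeeping entity, not user data
        yield entity


datastore = [
    Entity("Greeting", "g1", {"text": "hi"}),
    Entity("_AE_MR_ShardState", "job1-0", {}),
    Entity("Account", "a1", {"plan": "free"}),
    Entity("__Stat_Total__", "total", {}),
]

# A re-put job would write each yielded entity back unchanged.
reput = [(e.kind, e.key_name) for e in kindless_scan(datastore)]
print(reput)  # [('Account', 'a1'), ('Greeting', 'g1')]
```

In a real reader the sort and filter would be done by the kindless query and its cursor rather than in memory.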
Re: [google-appengine] MapReduce - less than half of the shards have items
Hi Jason,

The current implementation of the datastore mapper uses lexicographical sharding over keys to assign datastore shards. Unfortunately, this can lead to very inconsistent shard sizes, as you observe.

-Nick Johnson

On Fri, Jun 11, 2010 at 4:17 PM, Jason C jason.a.coll...@gmail.com wrote:
> We've been using MapReduce for App Engine for a couple of different jobs. Typically, we use 8 shards (the default), but it seems that only 3, sometimes 4, of the shards have any items in them?
>
> E.g., we're currently running one job and three of the shards have 218,000 items processed, but the other 5 shards appear to have zero. I can understand that a particular key distribution would have different amounts in each shard, but with so many at zero, I suspect there is something else happening?
>
> BTW, we have applied the mapreduce-recommended __key__ DESC index, but we still see this strange shard distribution.
>
> Is anyone else seeing this?
>
> j

--
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047
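To see why lexicographical sharding can leave most shards empty, here is a small simulation in plain Python. It is a sketch under stated assumptions, not the mapper's actual splitting code: it assumes the key space is cut into equal slices of the printable ASCII range by first character, and the function names and sample key names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def split_points(num_shards, low=0x20, high=0x7F):
    """Cut the printable-ASCII range into equal slices by first character."""
    width = high - low
    return [chr(low + i * width // num_shards) for i in range(1, num_shards)]


def shard_for(key_name, points):
    """Return the index of the lexicographic range containing key_name."""
    return bisect_right(points, key_name)


def shard_sizes(key_names, num_shards=8):
    """Count how many key names fall into each lexicographic range."""
    points = split_points(num_shards)
    counts = Counter(shard_for(name, points) for name in key_names)
    return [counts.get(i, 0) for i in range(num_shards)]


# Hypothetical key names that share a handful of lowercase prefixes.
key_names = ["%s-%05d" % (prefix, i)
             for prefix in ("account", "keyword", "user")
             for i in range(1000)]

sizes = shard_sizes(key_names)
print(sizes)  # [0, 0, 0, 0, 0, 1000, 1000, 1000]
```

Because every key name starts with a lowercase letter, all of them land in the last three slices and the other five shards receive nothing, which resembles the distribution reported in the quoted message.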
Re: [google-appengine] MapReduce - less than half of the shards have items
Are you using random keynames on your entities?

I was not using random keynames, and I'm seeing similar behaviour on my mapper runs. I'm refactoring my datastore (using the mapper API itself) to use random keynames to better balance the runs across several shards.

There's an open issue related to this:
http://code.google.com/p/appengine-mapreduce/issues/detail?id=3

-risto

On Fri, Jun 11, 2010 at 8:17 AM, Jason C jason.a.coll...@gmail.com wrote:
> We've been using MapReduce for App Engine for a couple of different jobs. Typically, we use 8 shards (the default), but it seems that only 3, sometimes 4, of the shards have any items in them?
>
> E.g., we're currently running one job and three of the shards have 218,000 items processed, but the other 5 shards appear to have zero. I can understand that a particular key distribution would have different amounts in each shard, but with so many at zero, I suspect there is something else happening?
>
> BTW, we have applied the mapreduce-recommended __key__ DESC index, but we still see this strange shard distribution.
>
> Is anyone else seeing this?
>
> j
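A rough sketch of the idea behind random keynames, in plain Python. Hash-derived key names stand in for random ones here so the result is reproducible, and the sharding model (equal slices of the hex key space) is an assumption for illustration rather than the library's splitter; `hashed_key_name` and `shard_for` are hypothetical helpers.

```python
import hashlib


def hashed_key_name(natural_id):
    """Derive a well-spread key name from a sequential or prefixed ID."""
    return hashlib.sha256(natural_id.encode("utf-8")).hexdigest()


def shard_for(key_name, num_shards=8):
    """Assumed model: shards are equal slices of the hex key space."""
    prefix = int(key_name[:8], 16)  # first 32 bits of the digest
    return prefix * num_shards // 16 ** 8


# Sequential IDs that would all sort next to each other lexicographically.
natural_ids = ["account-%05d" % i for i in range(8000)]

sizes = [0] * 8
for natural_id in natural_ids:
    sizes[shard_for(hashed_key_name(natural_id))] += 1

print(sizes)  # roughly 1000 per shard
```

The trade-off is that key names no longer carry meaning or order, so range scans over the natural ID are lost.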
Re: [google-appengine] mapreduce, with a very strange warning message in logs
There is no desc index on __key__. Each indexed property has an asc and desc index on it, but that doesn't apply to keys.

- alkis

On Wed, Jun 9, 2010 at 12:39 AM, Jason C jason.a.coll...@gmail.com wrote:
> I'm attempting to use mapreduce for app engine as presented at I/O 2010.
>
> I just noticed the following warning message in our logs (on /mapreduce/command/start_job):
>
>   Cannot create accurate approximation of keyspace, guessing instead. Please address this problem: no matching index found.
>   This query needs this index:
>   - kind: AccountKeyword
>     properties:
>     - name: __key__
>       direction: desc
>
> This seems to suggest that there is no (desc) index for the key of our Model. However, isn't this created automatically? Indeed, I can't even think of a way to _not_ have it created (i.e., I can't think of a way to specify indexed=False for the key).
>
> Is this an error in reporting, or is something very wrong with our Datastore?
>
> Thanks,
> j
Re: [google-appengine] mapreduce
Hi,

On Tue, Jun 1, 2010 at 11:13 PM, Ikai L (Google) ika...@google.com wrote:
> Hi,
>
> The first release of the mapper API will require a manual invocation of the task. If you need to map all entities, you will have to run the task. There's no continuous process monitoring changed entities. You can probably build this into your update workflow.

Is this planned for a future release, though? Building a CouchDB-like map/reduce solution with this would be a lot of work and it might not even be reliable.

Bye,
Waldemar Kornewald

--
Django on App Engine, MongoDB, SimpleDB, ...?
http://www.allbuttonspressed.com/blog/django
Re: [google-appengine] mapreduce
It's not on our roadmap. My suspicion is no, but it depends on how developers end up using map-reduce.

On Wed, Jun 2, 2010 at 12:39 PM, Waldemar Kornewald wkornew...@gmail.com wrote:
> Hi,
>
> On Tue, Jun 1, 2010 at 11:13 PM, Ikai L (Google) ika...@google.com wrote:
> > Hi,
> >
> > The first release of the mapper API will require a manual invocation of the task. If you need to map all entities, you will have to run the task. There's no continuous process monitoring changed entities. You can probably build this into your update workflow.
>
> Is this planned for a future release, though? Building a CouchDB-like map/reduce solution with this would be a lot of work and it might not even be reliable.
>
> Bye,
> Waldemar Kornewald
>
> --
> Django on App Engine, MongoDB, SimpleDB, ...?
> http://www.allbuttonspressed.com/blog/django

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine
Re: [google-appengine] mapreduce
Take a look at Brett Slatkin's session from Google I/O:
http://code.google.com/events/io/2010/sessions/high-throughput-data-pipelines-appengine.html

He specifically shows how to build materialized views on changing data using the task queue. I haven't tried it out, but it is impressive and seems reliable, as he is using the technique with http://pubsubhubbub.appspot.com/ for the fan out problem.

Justin
Re: [google-appengine] mapreduce
Hi,

The first release of the mapper API will require a manual invocation of the task. If you need to map all entities, you will have to run the task. There's no continuous process monitoring changed entities. You can probably build this into your update workflow.

On Wed, May 26, 2010 at 1:29 AM, Waldemar Kornewald wkornew...@gmail.com wrote:
> Hi,
>
> looking at the screenshots of the mapreduce library http://code.google.com/p/appengine-mapreduce/ gives me the impression that I'll have to manually execute those mapreduce tasks. Will this change in the final release?
>
> Also, will I always have to periodically rerun the mapreduce task over the *whole* DB even if just a single entity has changed? Or will there be some CouchDB-like mapreduce tasks which run continuously in the background and react to DB changes such that only a subset of the DB needs to be recomputed?
>
> Bye,
> Waldemar Kornewald

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine
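One way to read "build this into your update workflow" is to adjust a derived value at write time instead of recomputing it over the whole datastore. The following is a minimal sketch in plain Python, not App Engine code: `Store`, `put`, and `full_recount` are hypothetical names, and the in-memory dict stands in for the datastore.

```python
from collections import Counter


class Store(object):
    """In-memory stand-in for a datastore plus one derived view."""

    def __init__(self):
        self.entities = {}       # key -> category
        self.counts = Counter()  # derived view: entities per category

    def put(self, key, category):
        """Write an entity and adjust the view by the delta only."""
        old = self.entities.get(key)
        if old is not None:
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]
        self.entities[key] = category
        self.counts[category] += 1


def full_recount(store):
    """What a manually started map job over every entity would compute."""
    return Counter(store.entities.values())


store = Store()
store.put("post-1", "news")
store.put("post-2", "news")
store.put("post-1", "sports")  # a single changed entity

print(dict(store.counts))  # {'news': 1, 'sports': 1}
```

On App Engine the read-modify-write in `put` would need a transaction or a task-queue hand-off to stay consistent under concurrent writes; that part is left out of the sketch.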