Jannis, I started looking at this the other day, but I haven't had a chance to fix it because the Amazon datacenter outage took up all of my time over the past few days.
Here is what I found out. b.pypi.python.org lives on GAE and is currently stuck. Looking at the logs (see the log snippet at the end of the email [3]), I figured out what went wrong, but I'm not sure how to fix it.

Basically, there is a Python package called '__past__' (see [0] link below) that is causing the sync process to break. We use the project name as the key_name for the Product model [1], and a GAE model key_name must not be of the form __*__, that is, start and end with a double underscore [2].

I'm not sure how to fix the issue without possibly breaking other things. My first thought was to remove the underscores, but that might break something else, or conflict with another project with a similar name. I wrote to Martin, who gave me the following advice.

From "Martin v. Löwis":
> Renaming/escaping sounds good. I'd check if there is any string that
> can be used in a GAE key name, but not be used in a PyPI package name.
> If not, standard escaping needs to be applied: a prefix of "dunder"
> is added to any package whose name starts and ends with __, as well
> as to any package whose name starts with "dunder".
>
> When looking at all child nodes, remove "dunder" from any name;
> when doing lookups by name, escape as above.
>
> If you do find a character/string that can be in a key name but
> not in a package name, just escape the string with that name -
> no need to worry about escaping the escape character. However it
> may be that the only possible choice is "/" (which I know cannot
> appear in a package name).

I looked through most of the PyPI code, and I think the only character you can't have is "/"; all other characters look like they work.

So I know what is causing it; we just need to fix the issue, test it, and roll out the fix. I was planning on doing it this past weekend, but thanks to AWS I didn't have any time to work on it.

If anyone has any free time, feel free to take over or help. Just let others know so there isn't any duplicate effort.

Let me know if you have any questions.
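For whoever picks this up, here is a rough sketch of the "dunder" prefix escaping Martin describes. It is not taken from fetch.py: the function names and the PREFIX constant are hypothetical, and the reserved-name check only follows the "must not be of the form __*__" wording in [2], so treat it as a starting point rather than a tested fix.

```python
# Hypothetical helpers, not part of the existing pypi-appengine code.
PREFIX = "dunder"  # escape prefix, taken from Martin's suggestion


def is_reserved(name):
    """True if the name starts and ends with a double underscore.

    Assumption: this approximates the datastore's "__*__" rule from [2].
    It may flag a few extra names (e.g. "__"), which is harmless because
    escaping more names than necessary is still reversible.
    """
    return name.startswith("__") and name.endswith("__")


def escape_key_name(package_name):
    """Map a PyPI package name to a key_name the datastore should accept."""
    if is_reserved(package_name) or package_name.startswith(PREFIX):
        return PREFIX + package_name
    return package_name


def unescape_key_name(key_name):
    """Invert escape_key_name when reading names back out of the datastore."""
    if key_name.startswith(PREFIX):
        return key_name[len(PREFIX):]
    return key_name
```

Names that already start with "dunder" get the prefix as well, which is what keeps the mapping reversible: anything stored with the prefix was escaped, so stripping exactly one prefix on the way out recovers the original name. Lookups by name would go through escape_key_name, and listings of child nodes through unescape_key_name.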
Ken Cochrane

Footnotes:

[0] http://pypi.python.org/pypi/__past__/0.0.1.dev

[1] https://bitbucket.org/loewis/pypi-appengine/src/fa6596a427e1/fetch.py#cl-62

[2] Information about model key_names:
    https://developers.google.com/appengine/docs/python/datastore/modelclass#Model

    key_name
        The key name for the entity. The name becomes part of the primary
        key. If None, a system-generated numeric ID is used for the key.
        The value for key_name must not be of the form __*__.

[3] Log snippet:

1. 2012-06-28 06:45:18.222
   step package '__past__'
2. E2012-06-28 06:45:18.778
   illegal name in key path element: __past__
   Traceback (most recent call last):
     File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
       handler.get(*groups)
     File "/base/data/home/apps/pypi/3.358089379617981219/handlers.py", line 171, in get
       self.response.out.write(fetch.cron())
     File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 293, in cron
       return step()
     File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 259, in step
       actions[action](m, todo, param)
     File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 91, in package
       data = simple_page(m, name)
     File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 70, in simple_page
       obj.put()
     File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1074, in put
       return datastore.Put(self._entity, **kwargs)
     File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 579, in Put
       return PutAsync(entities, **kwargs).get_result()
     File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 604, in get_result
       return self.__get_result_hook(self)
     File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1579, in __put_hook
       self.check_rpc_success(rpc)
     File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1216, in check_rpc_success
       raise _ToDatastoreError(err)
   BadRequestError: illegal name in key path element: __past__

On Mon, Jul 2, 2012 at 8:11 AM, Jannis Leidel <jan...@leidel.info> wrote:
> On 02.05.2011, at 22:10, Martin v. Löwis <mar...@v.loewis.de> wrote:
>
> > On 02.05.2011 at 19:24, Jannis Leidel wrote:
> >>
> >> On 02.05.2011, at 18:12, Maurits van Rees wrote:
> >>
> >>> Hi,
> >>>
> >>> I noticed that some distributions are not on all mirrors. For example
> >>> http://a.pypi.python.org/simple/plone.app.referenceablebehavior/
> >>> has 0.1 and 0.2 (last one released 30 April)
> >>> but 0.2 is missing from
> >>> http://b.pypi.python.org/simple/plone.app.referenceablebehavior/
> >>>
> >>> Same for c and d. Ah, no: those two have it now. I know for sure
> >>> that at least d did not have it five minutes ago. And this version has
> >>> been released two days ago, so it should have been slightly faster. :-)
> >>
> >> Hm, d doesn't seem to have the file on disk even though it's on the
> >> simple page, see
> >> http://d.pypi.python.org/packages/source/p/plone.app.referenceablebehavior/
> >>
> >> Martin: Anything I can do to make sure this doesn't happen again?
> >
> > As the starting point, we should figure out why it happened in
> > the first place - it shouldn't have, of course. Most likely,
> > it's a bug :-)
>
> Looks like http://b.pypi.python.org is out of date again:
> http://www.pypi-mirrors.org
>
> Can we do something about that?
>
> Jannis
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG@python.org
> http://mail.python.org/mailman/listinfo/catalog-sig