Hi all. This post gives my take on how to build a sitemap. The base code is in this post: https://groups.google.com/forum/#!searchin/web2py/sitemap/web2py/TUMn6R3BJ10/TRSLCY_JQ8UJ
With many URLs (more than 2000 in my case, mostly products), Google Webmaster Tools reported an error with my XML sitemap. Splitting the sitemap into several files was a bit tricky, so I switched to a plain-text sitemap, and the errors went away. To get a "fresh" sitemap every day, I put the function in the scheduler. It works fine. These products are sold all over the world by many online merchants, and my website is always in first or second place in a Google search.

Here is the code; I hope it helps someone:

    def sitemap_txt_auto():
        import os
        from gluon.myregex import regex_expose

        # Function URLs: controller functions that must not appear in the sitemap
        exclusions = ['index', 'user', 'unsubscribe', 'download', 'call',
                      'data', 'upload', 'browse', 'delete']
        ctldir = os.path.join(request.folder, 'controllers')
        ctls = os.listdir(ctldir)
        for skipped in ('appadmin.py', 'manage.py'):
            if skipped in ctls:
                ctls.remove(skipped)

        urls = ['http://www.mydomain.com/it/index.html',
                'http://www.mydomain.com/en/index.html']
        for ctl in ctls:
            if not ctl.endswith('.bak'):
                filename = os.path.join(ctldir, ctl)
                with open(filename, 'r') as ctl_file:
                    data = ctl_file.read()
                for f in regex_expose.findall(data):
                    # Exact match, so a function such as 'load' is not
                    # dropped just because it is a substring of 'upload'
                    if f not in exclusions:
                        urls.append('http://www.mydomain.com/it/%s' % f)
                        urls.append('http://www.mydomain.com/en/%s' % f)

        # Products
        products = db().select(db.products.ALL, orderby=db.products.id)
        pdf_paths = []
        for item in products:
            # Product pages
            urls.append('http://www.mydomain.com/it/products_listing/view/products/%s' % item.id)
            urls.append('http://www.mydomain.com/en/products_listing/view/products/%s' % item.id)
            # PDF files: some products share the same PDF, so list each one once
            if item.pdf_path not in pdf_paths:
                pdf_paths.append(item.pdf_path)
                urls.append(item.pdf_path)

        # One URL per line
        sitemap_path = os.path.join(request.folder, 'static', 'sitemaps', 'sitemap.txt')
        with open(sitemap_path, 'w') as sitemap_file:
            sitemap_file.write('\r\n'.join(urls))
        db.commit()

    from gluon.scheduler import Scheduler
    Scheduler(db, dict(sitemap_txt_auto=sitemap_txt_auto))
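
A possible extension, not part of the code above: the sitemap protocol caps each sitemap file at 50,000 URLs, so a list that keeps growing would eventually need more than one text file. The following is only a minimal sketch of how a URL list could be de-duplicated and split into several bodies. The helper names `dedupe`, `chunk_urls` and `build_sitemaps` and the `max_urls` parameter are made up for illustration and are not web2py functions.

```python
def dedupe(urls):
    """Return urls without duplicates, keeping first-occurrence order."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


def chunk_urls(urls, max_urls=50000):
    """Split urls into consecutive lists of at most max_urls items."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def build_sitemaps(urls, max_urls=50000):
    """Return one sitemap text body per chunk, one URL per line."""
    chunks = chunk_urls(dedupe(urls), max_urls)
    return ['\r\n'.join(chunk) for chunk in chunks]
```

Each returned body could then be written to its own file (for example sitemap1.txt, sitemap2.txt, hypothetical names) in the same way the single sitemap.txt is written, and each file submitted separately.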