Re: google webmaster tools finding status 302 on robots.txt
Googlebot seems to be getting status 302 occasionally (according to Webmaster Tools) on other pages that should never redirect. I can never reproduce this myself. I'm not sure if there is a way to log/trace this. It seems to be recurring though. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups Heroku group. To post to this group, send email to heroku@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/heroku?hl=en -~--~~~~--~~--~--~---
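Since the poster wonders whether there is a way to log/trace these redirects: in a Rack-based Ruby app (which Heroku apps are), one sketch is a small middleware that logs every 3xx response along with the path and User-Agent, so intermittent redirects leave a trail in the app logs. The class name and log format here are illustrative, not an existing library:

```ruby
# Hypothetical sketch: a Rack middleware that logs every 3xx response,
# so intermittent redirects (like the 302s Webmaster Tools reports)
# become visible in the logs even when you can't reproduce them yourself.
class RedirectLogger
  def initialize(app, logger = nil)
    @app = app
    @logger = logger || ->(msg) { $stderr.puts(msg) }
  end

  def call(env)
    status, headers, body = @app.call(env)
    if (300..399).cover?(status)
      # Record the path, redirect target, and requesting user agent.
      @logger.call("REDIRECT #{status} #{env['PATH_INFO']} -> " \
                   "#{headers['Location'].inspect} " \
                   "UA=#{env['HTTP_USER_AGENT'].inspect}")
    end
    [status, headers, body]
  end
end
```

Mounted with `use RedirectLogger` in `config.ru`, any 302 that Googlebot triggers would then show up in the application's log output.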
Re: google webmaster tools finding status 302 on robots.txt
If it helps... one of the files that got status 302 was /javascripts/prototype.js. A little odd. On Jun 12, 9:13 pm, Mark [EMAIL PROTECTED] wrote: Googlebot seems to be getting status 302 occasionally (according to Webmaster Tools) on other pages that should never redirect. I can never reproduce this myself. I'm not sure if there is a way to log/trace this. It seems to be recurring though.
google webmaster tools finding status 302 on robots.txt
Google is finding status 302 on robots.txt; however, whenever I browse to the file I get a 200 and the correct robots.txt. Any ideas?
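One way to see something closer to what the crawler sees is to request robots.txt without following redirects and with a crawler-like User-Agent, so any 302 and its Location target become visible. A sketch in Ruby (the app URL is a placeholder, and this won't necessarily reproduce an intermittent 302):

```ruby
require 'net/http'
require 'uri'

# Sketch: fetch a URL the way a crawler would, WITHOUT following
# redirects, and report the status code plus any Location header.
# Net::HTTP does not follow redirects by itself, which is what we want.
def fetch_status(url, user_agent: 'Mozilla/5.0')
  uri = URI(url)
  req = Net::HTTP::Get.new(uri)
  req['User-Agent'] = user_agent
  res = Net::HTTP.start(uri.host, uri.port,
                        use_ssl: uri.scheme == 'https') do |http|
    http.request(req)
  end
  [res.code.to_i, res['Location']]
end

# Placeholder app name -- compare a browser-like vs. crawler-like fetch:
# fetch_status('http://yourapp.heroku.com/robots.txt')
# fetch_status('http://yourapp.heroku.com/robots.txt',
#              user_agent: 'Googlebot/2.1')
```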
Re: google webmaster tools finding status 302 on robots.txt
Morten, I just realized you are one of the few who just joined the Heroku team. Welcome aboard! Let me know if you need any additional info to troubleshoot this issue.
Re: robots.txt
I found one of my app's pages in a google search yesterday and suspected that something was up. Now I suppose I need to think about friendlier link names :-P On May 9, 8:25 pm, Adam Wiggins [EMAIL PROTECTED] wrote: Blessed apps now default to being crawlable by Googlebot and friends. That is, your app serves up whatever you have in your public/robots.txt. Non-blessed apps will still serve a robots.txt that prevents all crawling - this is to prevent link spammers from abusing Heroku. Please don't hesitate to request a blessing for any app you wish to be crawlable! Adam
robots.txt
Blessed apps now default to being crawlable by Googlebot and friends. That is, your app serves up whatever you have in your public/robots.txt. Non-blessed apps will still serve a robots.txt that prevents all crawling - this is to prevent link spammers from abusing Heroku. Please don't hesitate to request a blessing for any app you wish to be crawlable! Adam
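For reference, the two behaviors described above correspond roughly to robots.txt contents like these (illustrative examples, not Heroku's exact files):

```text
# public/robots.txt for a blessed app: allow all crawlers
User-agent: *
Disallow:

# robots.txt served for non-blessed apps: block all crawlers
User-agent: *
Disallow: /
```

An empty `Disallow:` permits everything; `Disallow: /` blocks the whole site.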
Re: robots.txt
Thanks. Yeah, Live HTTP Headers is an even better choice; I couldn't remember its name. As to the 304 Not Modified: yes, that's telling you the file hasn't been modified since you last fetched it, but be careful -- make sure you have Live HTTP Headers started and recording before you make your first HTTP request. Your first request will return a 200 or 301; the next time you reload that page it will give you a 304, because the browser resends the request with the same validator (an ETag or Last-Modified date) and the web server recognizes it. Stephan http://quickspikes.com On Apr 7, 3:22 am, [EMAIL PROTECTED] wrote: Wow! Nice reply man! I saw the other post talking about robots.txt but my app is already blessed and it's not working. I'm using Live HTTP Headers (a Firefox extension) and my app's robots.txt is returning a 304 Not Modified. I'm already using Google Webmaster Tools but for now it's useless because Googlebot is blocked. Thanks for your reply! #sorry for my english, i'm spanish On 7 Apr, 02:34, stephan [EMAIL PROTECTED] wrote: Masylum, according to Orion Henry last February 29th on this Google Group, "That's a feature that will be available for blessed accounts a little ways down the line." On a side note, when you make your request for blessing to them -- do let them know their own robots.txt is working, but only by chance. If you go to http://heroku.com/robots.txt or to http://www.heroku.com/robots.txt, you'll find that their robots.txt is written correctly; it's just that it is returning the wrong HTTP status code in both cases. It's returning a 301 instead of a 200 OK. A 301 status code means the file has been permanently redirected to another location, and the Google bot won't recognize a robots.txt (or a sitemap) that returns a 301. Now luckily for them -- and they probably knew that already, since they're saying it will only be available down the line -- they're not blocking anything using that file.
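The revalidation mechanics behind that 304 can be sketched with Ruby's Net::HTTP: the first request carries no validators, while a reload sends back the ETag and Last-Modified values received with the earlier 200. The URL and validator values below are made up for illustration:

```ruby
require 'net/http'
require 'uri'

# Sketch of the conditional GET behind a 304: the first request gets a
# 200 plus validators (ETag / Last-Modified); on reload the browser
# sends them back, and an unchanged resource earns a 304 Not Modified.
uri = URI('http://example.com/robots.txt')   # placeholder URL

first = Net::HTTP::Get.new(uri)              # plain request -> expect 200

reload = Net::HTTP::Get.new(uri)             # revalidating request -> may get 304
reload['If-None-Match']     = '"abc123"'     # hypothetical ETag from the 200
reload['If-Modified-Since'] = 'Mon, 07 Apr 2008 00:00:00 GMT'
```

If the server's copy still matches those validators, it answers 304 with an empty body; otherwise it sends a fresh 200 with the new content.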
So the fact that Google is not recognizing that file as valid doesn't matter. If the Google bot doesn't see a valid robots.txt file with the right status code, its default behavior is to index everything anyway. On the other hand, if you go to your own public app's robots.txt -- let's say your app is called foobar2000 -- you'll find that http://foobar2000.heroku.com/robots.txt returns the right 200 OK. That's because your own robots.txt request is being intercepted by their HTTP proxy/web server, and that web server returns the right 200 OK code. Ruby can also return the right 200 OK status code; it's just that you have to tell it explicitly to do so, and it's not an error most developers have come across unless they've been bitten by it first. To double-check the status codes of an HTTP request, use your own sniffer. An easy sniffer to install is TamperData, a Firefox extension: https://addons.mozilla.org/en-US/firefox/addon/966 (it must be enabled once installed, and then explicitly started from its dialog menu). Anyway, good luck Masylum, and if you haven't done so already -- try the Google Webmaster Tools when you get this robots.txt working: https://www.google.com/webmasters/tools/siteoverview?hl=en (it's an important tool). - Stephan http://quickspikes.com On Apr 6, 5:05 am, [EMAIL PROTECTED] wrote: Anybody know how to change the robots.txt? I modified the file in the public folder but the bots remain blocked. Maybe I'm doing something wrong, but I need to be accessible for Googlebot at least. Thanks in advance! :)
Re: robots.txt
Masylum, according to Orion Henry last February 29th on this Google Group, "That's a feature that will be available for blessed accounts a little ways down the line." On a side note, when you make your request for blessing to them -- do let them know their own robots.txt is working, but only by chance. If you go to http://heroku.com/robots.txt or to http://www.heroku.com/robots.txt, you'll find that their robots.txt is written correctly; it's just that it is returning the wrong HTTP status code in both cases. It's returning a 301 instead of a 200 OK. A 301 status code means the file has been permanently redirected to another location, and the Google bot won't recognize a robots.txt (or a sitemap) that returns a 301. Now luckily for them -- and they probably knew that already, since they're saying it will only be available down the line -- they're not blocking anything using that file. So the fact that Google is not recognizing that file as valid doesn't matter. If the Google bot doesn't see a valid robots.txt file with the right status code, its default behavior is to index everything anyway. On the other hand, if you go to your own public app's robots.txt -- let's say your app is called foobar2000 -- you'll find that http://foobar2000.heroku.com/robots.txt returns the right 200 OK. That's because your own robots.txt request is being intercepted by their HTTP proxy/web server, and that web server returns the right 200 OK code. Ruby can also return the right 200 OK status code; it's just that you have to tell it explicitly to do so, and it's not an error most developers have come across unless they've been bitten by it first. To double-check the status codes of an HTTP request, use your own sniffer. An easy sniffer to install is TamperData, a Firefox extension.
https://addons.mozilla.org/en-US/firefox/addon/966 (it must be enabled once installed, and then explicitly started from its dialog menu). Anyway, good luck Masylum, and if you haven't done so already -- try the Google Webmaster Tools when you get this robots.txt working: https://www.google.com/webmasters/tools/siteoverview?hl=en (it's an important tool). - Stephan http://quickspikes.com On Apr 6, 5:05 am, [EMAIL PROTECTED] wrote: Anybody know how to change the robots.txt? I modified the file in the public folder but the bots remain blocked. Maybe I'm doing something wrong, but I need to be accessible for Googlebot at least. Thanks in advance! :)
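To illustrate the point about Ruby needing to be told explicitly to return a 200: in a bare Rack-style endpoint, the status code is a literal value you choose. This is only a sketch assuming a plain Rack app; the constant names and robots.txt content are invented, not Heroku's actual setup:

```ruby
# Sketch: serving robots.txt from Ruby with an explicit 200 OK.
# A Rack app is just a callable returning [status, headers, body];
# the status must be set deliberately -- nothing guesses it for you.
ROBOTS_TXT = "User-agent: *\nDisallow:\n".freeze  # illustrative content

ROBOTS_APP = lambda do |env|
  if env['PATH_INFO'] == '/robots.txt'
    [200, { 'Content-Type' => 'text/plain' }, [ROBOTS_TXT]]  # explicit 200
  else
    [404, { 'Content-Type' => 'text/plain' }, ["not found\n"]]
  end
end
```

With a `run ROBOTS_APP` in config.ru, a sniffer like TamperData should then show a plain 200 for /robots.txt rather than a 301 or 302.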