Re: google webmaster tools finding status 302 on robots.txt

2008-06-12 Thread Mark

Googlebot seems to be getting status 302 occasionally (according to
Webmaster Tools) on other pages that should never redirect.  I can
never reproduce this myself.  I'm not sure if there is a way to log or
trace this.  It seems to be recurring, though.
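
One way to trace it, if the app is Rack-based: a small piece of middleware
that logs the status code returned to anything identifying itself as
Googlebot.  This is only a sketch -- the class name, log path and log
format are illustrative, not anything Heroku provides:

    require 'logger'

    # Hypothetical middleware: logs the status returned to Googlebot requests.
    class GooglebotLogger
      def initialize(app, logger = Logger.new('log/googlebot.log'))
        @app = app
        @logger = logger
      end

      def call(env)
        status, headers, body = @app.call(env)
        if env['HTTP_USER_AGENT'].to_s =~ /Googlebot/i
          @logger.info("#{env['PATH_INFO']} -> #{status}")
        end
        [status, headers, body]
      end
    end

If you have a config.ru, you would register it there with
"use GooglebotLogger" before the line that runs the app.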



Re: google webmaster tools finding status 302 on robots.txt

2008-06-12 Thread Mark

If it helps: one of the files that got status 302 was
/javascripts/prototype.js

A little odd.

On Jun 12, 9:13 pm, Mark [EMAIL PROTECTED] wrote:
 Googlebot seems to be getting status 302 occasionally (according to
 Webmaster Tools) on other pages that should never redirect.  I can
 never reproduce this myself.  I'm not sure if there is a way to log or
 trace this.  It seems to be recurring, though.



google webmaster tools finding status 302 on robots.txt

2008-05-27 Thread Mark

Google is finding status 302 on robots.txt; however, whenever I browse
to the file I get 200 and the correct robots.txt.

Any ideas?
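
If the difference depends on who is asking, one quick check is to request
the file once with a browser-like User-Agent and once with a Googlebot one
and compare the status codes.  A minimal sketch (foobar2000.heroku.com is
just a placeholder app name, borrowed from later in this thread):

    require 'net/http'
    require 'uri'

    # Placeholder URL; substitute your own app's robots.txt.
    uri = URI.parse('http://foobar2000.heroku.com/robots.txt')

    agents = ['Mozilla/5.0', 'Googlebot/2.1 (+http://www.google.com/bot.html)']
    agents.each do |agent|
      response = Net::HTTP.start(uri.host, uri.port) do |http|
        http.get(uri.path, 'User-Agent' => agent)
      end
      puts "#{agent} => #{response.code}"
    end

If both come back 200, the 302 is probably coming from somewhere else (a
redirect on a particular hostname, for example) rather than from the
User-Agent.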



Re: google webmaster tools finding status 302 on robots.txt

2008-05-27 Thread Mark

Morten, I just realized you are one of the few who just joined the
Heroku team.  Welcome aboard!

Let me know if you need any additional info to troubleshoot this issue.



Re: robots.txt

2008-05-10 Thread justindz

I found one of my app's pages in a Google search yesterday and
suspected that something was up.  Now I suppose I need to think about
friendlier link names :-P

On May 9, 8:25 pm, Adam Wiggins [EMAIL PROTECTED] wrote:
 Blessed apps now default to being crawlable by Googlebot and friends.
 That is, they serve up whatever you have in your public/robots.txt.

 Non-blessed apps will still serve a robots.txt that prevents all
 crawling - this is to prevent link spammers from abusing Heroku.
 Please don't hesitate to request a blessing for any app you wish to be
 crawlable!

 Adam



robots.txt

2008-05-09 Thread Adam Wiggins

Blessed apps now default to being crawlable by Googlebot and friends.
That is, they serve up whatever you have in your public/robots.txt.

Non-blessed apps will still serve a robots.txt that prevents all
crawling - this is to prevent link spammers from abusing Heroku.
Please don't hesitate to request a blessing for any app you wish to be
crawlable!

Adam
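
For reference, the deny-all form of robots.txt looks like this (the thread
doesn't show the exact file non-blessed apps serve, but this is the
standard way to block all crawlers):

    User-agent: *
    Disallow: /

and a public/robots.txt that allows everything is simply:

    User-agent: *
    Disallow:

Whatever you put in public/robots.txt is what a blessed app will serve.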




Re: robots.txt

2008-04-07 Thread stephan

Thanks

Yeah, Live HTTP Headers is an even better choice. I couldn't remember
its name.

As to the 304 Not Modified: yes, that's telling you the file hasn't
been modified since you last requested it, but be careful --
make sure you have Live HTTP Headers started and recording before
you make your first HTTP request.

Your first request will return a 200 or a 301; the next time you
reload that page it's going to give you a 304, because the browser
re-sends the cache validators from the first response (If-Modified-Since
and/or If-None-Match), and the web server can tell nothing has changed.
Stephan
http://quickspikes.com
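
To see the 200-then-304 behaviour outside the browser, here is a minimal
sketch in Ruby that makes the request once, then repeats it with the cache
validators the server handed back.  The URL is a placeholder, and the
second request only comes back 304 if the server actually sets
Last-Modified or ETag headers:

    require 'net/http'
    require 'uri'

    # Placeholder URL; substitute your own app's robots.txt.
    uri = URI.parse('http://foobar2000.heroku.com/robots.txt')

    Net::HTTP.start(uri.host, uri.port) do |http|
      first = http.get(uri.path)
      puts "first request:  #{first.code}"   # expect 200 (or a redirect)

      # Re-send the validators from the first response, like a browser reload.
      headers = {}
      headers['If-Modified-Since'] = first['Last-Modified'] if first['Last-Modified']
      headers['If-None-Match']     = first['ETag']          if first['ETag']

      second = http.get(uri.path, headers)
      puts "second request: #{second.code}"  # expect 304 if nothing changed
    end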

On Apr 7, 3:22 am, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Wow!

 Nice reply, man!

 I saw the other post talking about robots.txt, but my app is already
 blessed and it's not working. I'm using Live HTTP Headers (a Firefox
 extension) and my app's robots.txt is returning a 304 Not Modified.

 I'm already using Google Webmaster Tools, but right now it's useless
 because Googlebot is blocked.

 Thanks for your reply!

 (Sorry for my English, I'm Spanish.)

 On 7 Apr, 02:34, stephan [EMAIL PROTECTED] wrote:

  Masylum,
  According to Orion Henry on February 29th on this Google Group,
  "That's a feature that will be available for blessed accounts a
  little ways down the line."

  On a side note, when you make your request for blessing to them --
  do let them know their own robots.txt is only working by chance.
  If you go to http://heroku.com/robots.txt or http://www.heroku.com/robots.txt,
  you'll find that their robots.txt is written correctly; it's just
  that it is returning the wrong HTTP status code in both cases.
  It's returning a 301 instead of a 200 OK. A 301 is a permanent
  redirect, so the crawler gets redirected instead of being served
  the file directly.

  And Googlebot won't recognize a robots.txt (or a sitemap) that
  returns a 301. Now, luckily for them -- and they probably knew that
  already, since they're saying it will only be available down the
  line -- they're not blocking anything with that file. So the fact that
  Google is not recognizing the file as valid doesn't matter. If
  Googlebot doesn't see a valid robots.txt file with the right status
  code, its default behavior is to index everything anyway.

  On the other hand, if you go to your own public app's robots.txt --
  let's say your app is called foobar2000 -- you'll find that
  http://foobar2000.heroku.com/robots.txt returns the right 200 OK.

  That's because your own robots.txt request is being intercepted by
  their HTTP/proxy web server, and that web server returns the
  right 200 OK code. Ruby can also return the right 200 OK status code;
  it's just that you have to tell it to do so explicitly, and it's not
  an error most developers have come across unless they've been bitten
  by it first.

  To double-check the status codes of an HTTP request, you can use a
  sniffer. An easy one to install is TamperData, a Firefox extension:
  https://addons.mozilla.org/en-US/firefox/addon/966 (it must be enabled
  once installed, and then it must be explicitly started from its dialog
  menu).

  Anyway, good luck Masylum, and if you haven't done so already -- try
  the Google Webmaster Tools when you get this robots.txt working:
  https://www.google.com/webmasters/tools/siteoverview?hl=en (it's an
  important tool)

  - Stephan
  http://quickspikes.com

  On Apr 6, 5:05 am, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

   Does anybody know how to change the robots.txt?

   I modified the file in the public folder, but the bots remain blocked.

   Maybe I'm doing something wrong, but I need to be accessible to
   Googlebot at least.

   Thanks in advance! :)



Re: robots.txt

2008-04-06 Thread stephan

Masylum,
According to Orion Henry on February 29th on this Google Group,
"That's a feature that will be available for blessed accounts a
little ways down the line."

On a side note, when you make your request for blessing to them --
do let them know their own robots.txt is only working by chance.
If you go to http://heroku.com/robots.txt or http://www.heroku.com/robots.txt,
you'll find that their robots.txt is written correctly; it's just
that it is returning the wrong HTTP status code in both cases.
It's returning a 301 instead of a 200 OK. A 301 is a permanent
redirect, so the crawler gets redirected instead of being served
the file directly.

And Googlebot won't recognize a robots.txt (or a sitemap) that
returns a 301. Now, luckily for them -- and they probably knew that
already, since they're saying it will only be available down the
line -- they're not blocking anything with that file. So the fact that
Google is not recognizing the file as valid doesn't matter. If
Googlebot doesn't see a valid robots.txt file with the right status
code, its default behavior is to index everything anyway.

On the other hand, if you go to your own public app's robots.txt --
let's say your app is called foobar2000 -- you'll find that
http://foobar2000.heroku.com/robots.txt returns the right 200 OK.

That's because your own robots.txt request is being intercepted by
their HTTP/proxy web server, and that web server returns the
right 200 OK code. Ruby can also return the right 200 OK status code;
it's just that you have to tell it to do so explicitly, and it's not
an error most developers have come across unless they've been bitten
by it first.
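
To make the "tell it explicitly" part concrete, here is a minimal sketch of
serving robots.txt from a Rails controller with an explicit 200.  The
controller name and route are illustrative, not Heroku's actual setup:

    # app/controllers/robots_controller.rb
    class RobotsController < ApplicationController
      def show
        # Read the file from public/ and state the status explicitly.
        render :text         => File.read(File.join(RAILS_ROOT, 'public', 'robots.txt')),
               :status       => 200,
               :content_type => 'text/plain'
      end
    end

    # config/routes.rb (Rails 2-era routing, inside the Routes.draw block)
    map.connect 'robots.txt', :controller => 'robots', :action => 'show'

Whether the 200 comes from the controller or from the front-end proxy, the
point is the same: the crawler needs to see a plain 200 on /robots.txt.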

To double-check the status codes of an HTTP request, you can use a
sniffer. An easy one to install is TamperData, a Firefox
extension:
https://addons.mozilla.org/en-US/firefox/addon/966 (it must be enabled
once installed, and then it must be explicitly started from its dialog
menu)

Anyway, good luck Masylum, and if you haven't done so already -- try
the Google Webmaster Tools when you get this robots.txt working.
https://www.google.com/webmasters/tools/siteoverview?hl=en  (it's an
important tool)

- Stephan
http://quickspikes.com

On Apr 6, 5:05 am, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Does anybody know how to change the robots.txt?

 I modified the file in the public folder, but the bots remain blocked.

 Maybe I'm doing something wrong, but I need to be accessible to
 Googlebot at least.

 Thanks in advance! :)
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Heroku group.
To post to this group, send email to heroku@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/heroku?hl=en
-~--~~~~--~~--~--~---