I think he is giving an example: Stef wants to demonstrate that web2py is
returning 200 instead of 400. Is this a bug?

On Aug 21, 10:26, mdipierro <mdipie...@cs.depaul.edu> wrote:
> why are the urls in the first set truncated?
>
> On Aug 21, 8:07 am, Stef Mientki <stef.mien...@gmail.com> wrote:
>
>
>
> >  On 21-08-2010 14:46, mdipierro wrote:
>
> > > what do you find that is strange?
>
> > This is the result with the last letter removed, so all links should give 
> > an error,
> > but they differ with the 2 methods,
> > and some of them produce 200, while they are definitely wrong
> > 404 500 http://127.0.0.1:8000/welcome/default/user/logi
> > 404 500 http://127.0.0.1:8000/welcome/default/user/registe
> > 404 500 http://127.0.0.1:8000/welcome/default/user/request_reset_passwor
> > 200 500 http://127.0.0.1:8000/welcome/default
> > 400 500 http://127.0.0.1:8000/welcome/default/inde
> > 200 500 http://127.0.0.1:8000/admin/default/design/welcom
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/controllers/default.p
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/default/index.htm
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/layout.htm
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/static/base.cs
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/db.p
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/menu.p
> > 400 500 http://127.0.0.1:8000/welcome/appadmin/inde
> > 200 500 http://127.0.0.1:8000/admin/default/inde
> > 400 400 http://127.0.0.1:8000/examples/default/inde
> > 200 -1 http://web2py.co
> > 400 400 http://web2py.com/boo
> > 400 500 http://127.0.0.1:8000/welcome/default/inde
> > 200 500 http://127.0.0.1:8000/welcome/default
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/controllers/default.p
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/views/default/index.htm
> > 200 -1 http://www.web2py.co
>
> > This is the normal result
> > 200 500 http://127.0.0.1:8000/welcome/default/user/login
> > 200 500 http://127.0.0.1:8000/welcome/default/user/register
> > 200 500 http://127.0.0.1:8000/welcome/default/user/request_reset_password
> > 200 500 http://127.0.0.1:8000/welcome/default
> > 200 500 http://127.0.0.1:8000/welcome/default/index
> > 200 500 http://127.0.0.1:8000/admin/default/design/welcome
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/controllers/default.py
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/default/index....
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/views/layout.html
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/static/base.css
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/db.py
> > 200 500 http://127.0.0.1:8000/admin/default/edit/welcome/models/menu.py
> > 200 500 http://127.0.0.1:8000/welcome/appadmin/index
> > 200 500 http://127.0.0.1:8000/admin/default/index
> > 200 200 http://127.0.0.1:8000/examples/default/index
> > 200 200 http://web2py.com
> > 200 500 http://web2py.com/book
> > 200 500 http://127.0.0.1:8000/welcome/default/index
> > 400 500 http://127.0.0.1:8000/welcome/default/index#
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/controllers/default.py
> > 200 500 http://127.0.0.1:8000/admin/default/peek/welcome/views/default/index....
> > 200 200 http://www.web2py.com
>
> > So when is a URL valid ?
>
> > thanks,
> > Stef
>
> > > On Aug 21, 7:32 am, Stef Mientki <stef.mien...@gmail.com> wrote:
> > >>> Graphical representation of links or pages that don't get linked to.
> > >> I tried to test the links (with 2 algorithms, code below) in a generated
> > >> webpage, but the results I get are very weird.
> > >> Perhaps one of you knows a better way?
>
> > >> cheers,
> > >> Stef
>
> > >> from BeautifulSoup import BeautifulSoup
> > >> from urllib        import urlopen
> > >> from httplib       import HTTP
> > >> from urlparse      import urlparse
>
> > >> def Check_URL_1 ( URL ) :
> > >>   try:
> > >>     fh = urlopen ( URL )
> > >>     return fh.code == 200
> > >>   except :
> > >>     return False
>
> > >> def Check_URL_2 ( URL ) :
> > >>   p = urlparse ( URL )
> > >>   h = HTTP ( p[1] )
> > >>   h.putrequest ( 'HEAD', p[2] )
> > >>   h.endheaders()
> > >>   if h.getreply()[0] == 200:
> > >>     return True
> > >>   else:
> > >>     return False
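One possible reason the two columns disagree is that the two checks do not send the same request: the first issues a GET through urlopen, the second a bare HEAD carrying only the path component. Below is a minimal sketch for comparing both methods on equal footing. It is written for Python 3 (where `http.client` replaces `httplib`), and the names `request_target` and `status_of`, as well as the -1 fallback, are my own assumptions rather than part of Stef's code:

```python
import http.client
from urllib.parse import urlsplit


def request_target(url):
    """Return the path plus query string to put on the request line."""
    parts = urlsplit(url)
    target = parts.path or "/"          # an empty path must be sent as "/"
    if parts.query:
        target += "?" + parts.query     # keep the query; drop the fragment
    return target


def status_of(url, method="HEAD", timeout=5):
    """Return the raw HTTP status for one request, or -1 if it fails.

    Redirects are not followed, so a 3xx is reported as-is.
    """
    parts = urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.netloc, timeout=timeout)
    try:
        conn.request(method, request_target(url))
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return -1                       # assumption: -1 marks "no reply"
    finally:
        conn.close()


if __name__ == "__main__":
    url = "http://127.0.0.1:8000/welcome/default/index"
    print(status_of(url, "GET"), status_of(url, "HEAD"), url)
```

Printing the GET and HEAD status side by side for the same URL should make it easier to tell whether the server answers the two methods differently or whether the difference comes from the client code.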
>
> > >> def Verify_Links ( URL ) :
> > >>   Parts   = URL.split('/')
> > >>   Site    = '/'.join ( Parts [:3] )
> > >>   Current = '/'.join ( Parts [:-1] )
>
> > >>   fh = urlopen ( URL )
> > >>   lines = fh.read ()
> > >>   fh.close()
>
> > >>   Soup = BeautifulSoup ( lines )
> > >>   hrefs = Soup.findAll ( 'a', href=True )   # skip anchors without an href
>
> > >>   for href in hrefs :
> > >>     href = href [ 'href' ] #[:-1]     ## <== remove "#" to generate all errors
>
> > >>     if href.startswith ( '/' ) :
> > >>       href = Site + href
> > >>     elif href.startswith ('#' ) :
> > >>       href = URL + href
> > >>     elif href.startswith ( 'http' ) :
> > >>       pass
> > >>     else :
> > >>       href = Current + '/' + href   # Current has no trailing slash
>
> > >>     try:
> > >>       fh = urlopen ( href )   # only urlopen is imported, not urllib
> > >>     except :
> > >>       pass
> > >>     print Check_URL_1 ( href ), Check_URL_2 ( href ), href
>
> > >> URL = 'http://127.0.0.1:8000/welcome/default/index'
> > >> fh = Verify_Links ( URL )
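The prefix handling in Verify_Links can also be left to the standard library. Here is a sketch, assuming Python 3 (where `urllib.parse` replaces `urlparse`); `resolve_link` is a hypothetical helper name. It resolves each href against the page URL and drops the fragment, since the part after '#' is handled by the browser and is not meant to be sent to the server:

```python
from urllib.parse import urldefrag, urljoin


def resolve_link(page_url, href):
    """Turn an href found on page_url into an absolute URL without fragment."""
    absolute = urljoin(page_url, href)   # handles '/x', 'x', '#x', 'http://...'
    return urldefrag(absolute).url       # strip '#...' before requesting


if __name__ == "__main__":
    page = "http://127.0.0.1:8000/welcome/default/index"
    for href in ["/admin/default/index", "user/login", "#",
                 "http://web2py.com/book"]:
        print(resolve_link(page, href))
```

With this, an href of just '#' resolves to the page URL itself, so the `index#` entry in the list above would be checked as plain `index`.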
