I am using the Python `web.py` framework to build a small web app.

It does the following:

1. Shows a `Home page` that takes a URL as input
2. Reads the `anchor text` and `anchor tags` (links) from that URL
3. Writes them to a CSV file and downloads it

Steps 2 and 3 happen when the `export the links` button is clicked.
Below is my code, followed by the `home` template:


    import web
    from web import form
    import urlparse
    from urlparse import urlparse as ue
    import urllib2
    from BeautifulSoup import BeautifulSoup
    import csv
    from cStringIO import StringIO

    urls = (
        '/', 'index',
        '/export', 'export',
    )

    app = web.application(urls, globals())
    render = web.template.render('templates/')

    class index:
        def GET(self):
            return render.home()

    class export:
        def GET(self):
            i = web.input()
            if i.has_key('url') and i['url'] != '':
                url = i['url']
                page = urllib2.urlopen(url)
                html = page.read()
                page.close()

                decoded = ue(url).hostname
                if decoded.startswith('www.'):
                    decoded = ".".join(decoded.split('.')[1:])
                file_name = str(decoded.split('.')[0])
                csv_file = StringIO()
                csv_writer = csv.writer(csv_file)
                csv_writer.writerow(['Name', 'Link'])
                soup = BeautifulSoup(html)
                for anchor_tag in soup.findAll('a', href=True):
                    csv_writer.writerow([anchor_tag.text, anchor_tag['href']])
                web.header('Content-Type', 'text/csv')
                web.header('Content-disposition', 'attachment; 
                return csv_file.getvalue()

    if __name__ == "__main__":
        app.run()
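The file-name slicing in `export.GET` can be tried on its own, outside the web app. The following is only a minimal sketch of that step, not the code the app runs: it assumes Python 3 (`urllib.parse` instead of `urlparse`), and the helper name `file_name_from_url` is made up for illustration.

```python
from urllib.parse import urlparse


def file_name_from_url(url):
    """Return the first label of the host name, ignoring a leading 'www.'.

    Hypothetical helper that mirrors the slicing done in export.GET.
    """
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host.split(".")[0]


print(file_name_from_url("http://www.google.co.in"))
print(file_name_from_url("http://stackoveflow.com"))
```

With the two URLs used in this post, the helper should give a different name for each, so the name itself is derived per request.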


    $def with()
    <html>
     <head>
       <title>Home Page</title>
     </head>
     <body>
         <form method="GET" action='/export'>
            <input type="text" name="url" maxlength="500" />
            <input class="button" type="submit" name="export the links" value="export the links" />
         </form>
     </body>
    </html>

The above HTML displays a form with a text box that takes a URL and an
`export the links` button that downloads (exports) a CSV file containing
the anchor links and their text.
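To make clear what the form sends, here is a rough sketch of the request path a browser would build from it, written with Python 3's standard library. The function name `export_request_path` is hypothetical; the field names come from the template, and `web.input()` in the `export` class receives the same `url` key.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def export_request_path(url):
    """Build the GET path the form submits (hypothetical helper)."""
    # The submit button has a name, so it is sent along with the URL.
    query = urlencode({"url": url, "export the links": "export the links"})
    return "/export?" + query


path = export_request_path("http://www.google.co.in")
print(path)

# Decoding the query string gives back the submitted fields.
params = parse_qs(urlsplit(path).query)
print(params["url"][0])
```

Because the form uses `GET`, the submitted URL is part of the request path, so each different URL produces a different `/export?...` request.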

1. For example, when I submit `http://www.google.co.in` and click `export
the links`, all the anchor URLs and anchor text are saved to a CSV file,
which downloads successfully.

2. But when I then immediately submit another URL, such as
`http://stackoveflow.com`, and click the `export the links` button, the
CSV file (named after the domain of the URL, as shown in the code above)
downloads with that page's links, but it also contains the data (anchor
text and links) of the previous URL.
That is, data from different URLs is ending up in the same CSV file. Can
anyone please tell me what is wrong in the code above (the `export` class)
that generates the CSV file? Why is the old data carried over instead of a
new CSV file, with its own dynamically generated name, being created?

Ultimately, my intention is that each time a new URL is submitted, a new
CSV file is downloaded, named after the URL's domain (sliced as in my code
above) and containing only the data (anchor text and links) from that URL.
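To make the intended behaviour concrete, here is a minimal sketch of one request handled in isolation. It is not a fix and not my actual handler: it assumes Python 3, uses the standard library's `html.parser` in place of BeautifulSoup, and takes the page HTML as an argument instead of fetching it. The names `AnchorCollector` and `build_csv_download` are made up, and the `filename=` part of the `Content-Disposition` value is an assumption for this sketch.

```python
import csv
from html.parser import HTMLParser
from io import StringIO
from urllib.parse import urlparse


class AnchorCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> in a document."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append(("".join(self._text).strip(), self._href))
            self._href = None


def build_csv_download(url, html):
    """Return (headers, body) for one URL (hypothetical helper)."""
    host = urlparse(url).hostname or "links"
    if host.startswith("www."):
        host = host[len("www."):]
    file_name = host.split(".")[0]

    collector = AnchorCollector()
    collector.feed(html)
    collector.close()

    buffer = StringIO()  # created inside the call, so nothing is shared
    writer = csv.writer(buffer)
    writer.writerow(["Name", "Link"])
    writer.writerows(collector.anchors)

    headers = {
        "Content-Type": "text/csv",
        # Assumed header value: the filename parameter is illustrative.
        "Content-Disposition": 'attachment; filename="%s.csv"' % file_name,
    }
    return headers, buffer.getvalue()
```

In a `web.py` handler, each header pair would be passed to `web.header(name, value)` and the body returned from `GET`. Calling the function twice with different URLs should give two bodies that share nothing except the `Name,Link` header row, which is the result I expect from the app.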

Can anyone please extend or change my code above so that a separate CSV
file is downloaded for each URL?
