I am using the Python `web.py` framework to build a small web app.

It does the following:

1. Shows a home page that takes a URL as input
2. Reads the anchor text and anchor links (`href` values) from that URL
3. Writes them to a CSV file and downloads it

Steps 2 and 3 happen when the `export the links` button is clicked. Below is my code.

**code.py**


    import web
    from web import form
    import urlparse
    from urlparse import urlparse as ue
    import urllib2
    from BeautifulSoup import BeautifulSoup
    import csv
    from cStringIO import StringIO

    urls = (
        '/', 'index',
        '/export', 'export',
    )

    app = web.application(urls, globals())
    render = web.template.render('templates/')


    class index:
        def GET(self):
            return render.home()


    class export:

        def GET(self):
            i = web.input()
            if i.has_key('url') and i['url'] != '':
                url = i['url']
                page = urllib2.urlopen(url)
                html = page.read()
                page.close()

                decoded = ue(url).hostname
                if decoded.startswith('www.'):
                    decoded = ".".join(decoded.split('.')[1:])
                file_name = str(decoded.split('.')[0])

                csv_file = StringIO()
                csv_writer = csv.writer(csv_file)
                csv_writer.writerow(['Name', 'Link'])

                soup = BeautifulSoup(html)
                for anchor_tag in soup.findAll('a', href=True):
                    csv_writer.writerow([anchor_tag.text, anchor_tag['href']])
                web.header('Content-Type', 'text/csv')
                web.header('Content-disposition', 'attachment; filename=%s.csv' % file_name)
                return csv_file.getvalue()


    if __name__ == "__main__":
        app.run()

**home.html**:

    $def with()
    <html>
     <head>
       <title>Home Page</title>
     </head>
     <body>
         <form method="GET" action='/export'>
            <input type="text" name="url" maxlength="500" />
            <input class="button" type="submit" name="export the links" value="export the links" />
         </form>
     </body>
    </html>

 
The HTML above displays a form with a text box that takes a URL and an 
`export the links` button that downloads a CSV file containing the anchor 
links and anchor text.

1. For example, when I submit `http://www.google.co.in` and click `export 
the links`, all the anchor URLs and anchor text are saved to a CSV file, 
which downloads successfully.

2. But when I then immediately submit another URL, such as 
`http://stackoveflow.com`, and click the `export the links` button, the CSV 
file (named after the domain of the URL, as shown in the code above) 
downloads with that page's links, but it also contains the data (anchor 
text and links) from the previous URL, `http://www.google.co.in`.
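
One way to narrow this down is to run the CSV-building step on its own, outside web.py, and check whether two consecutive calls share any rows. The following is a minimal sketch, not the code from `code.py`: it assumes Python 3, swaps BeautifulSoup for the standard library's `html.parser` so that it has no dependencies, and `AnchorCollector` and `links_to_csv` are hypothetical names introduced only for this illustration.

```python
import csv
import io
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def links_to_csv(html):
    """Return CSV text for the anchors in `html`, using a fresh buffer per call."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()

    buf = io.StringIO(newline="")  # new in-memory file on every call
    writer = csv.writer(buf)
    writer.writerow(["Name", "Link"])
    writer.writerows(parser.links)
    return buf.getvalue()


if __name__ == "__main__":
    first = links_to_csv('<a href="http://a.example/">A</a>')
    second = links_to_csv('<a href="http://b.example/x">B link</a>')
    print(first)
    print(second)
```

If the second result contains only its own rows, which is what a buffer created inside the function should give, the mixing is likely happening somewhere other than the CSV-writing step.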

That is, data from different URLs ends up in the same CSV file. Can anyone 
please tell me what is wrong in the code above (the `export` class) that 
generates the CSV file? Why is the old data carried over instead of a new 
CSV file, with its own dynamically created name, being generated?

My intention is that each time a new URL is submitted, a new CSV file is 
downloaded, named after the URL's domain (sliced as in my code above) and 
containing the data (anchor text and links) from that URL.
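
The file-name derivation described here can be checked in isolation. This is a minimal sketch that mirrors the slicing in `export.GET`, assuming Python 3 (where `urlparse` lives in `urllib.parse`); `csv_filename` and `content_disposition` are hypothetical helper names, and the fallback to an empty host for URLs without a scheme is an addition that is not in the original code.

```python
from urllib.parse import urlparse  # Python 3 location of Python 2's urlparse


def csv_filename(url):
    """Derive '<first domain label>.csv' from a URL, dropping a leading 'www.'."""
    # hostname is None when the URL has no network location (e.g. no scheme);
    # falling back to "" is an assumption, the original code does not handle it.
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host.split(".")[0] + ".csv"


def content_disposition(url):
    """Build the Content-Disposition header value for the download."""
    return "attachment; filename=%s" % csv_filename(url)


if __name__ == "__main__":
    for u in ("http://www.google.co.in", "http://stackoveflow.com"):
        print(u, "->", content_disposition(u))
```

Each URL yields its own header value, so the download name should differ per request as long as the header is set from the URL submitted in that request.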

Can anyone please extend or change my code above so that an individual CSV 
file is downloaded for each URL?
