Re: Sort by domain name?

2006-10-02 Thread js
> How about sorting the strings as they are reversed?
>
> urls = """\
> http://mail.google.com
> http://reader.google.com
> http://mail.yahoo.co.uk
> http://google.com
> http://mail.yahoo.com""".split("\n")
>
> sortedList = [ su[1] for su in sorted([ (u[::-1],u) for u in urls ]) ]
>
> for url in so
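
For readers following the reversed-string idea quoted above, here is a minimal sketch of the same sort written with the `key` argument of `sorted` instead of the decorate-sort-undecorate list comprehension. The variable names are illustrative, and the approach only groups URLs that share a trailing suffix; it does not isolate the company name.

```python
# Sample URLs taken from the quoted message.
urls = [
    "http://mail.google.com",
    "http://reader.google.com",
    "http://mail.yahoo.co.uk",
    "http://google.com",
    "http://mail.yahoo.com",
]

# Decorate-sort-undecorate, as in the quoted message: pair each URL with
# its reversed form, sort the pairs, then keep only the original URL.
dsu_sorted = [u for _, u in sorted((u[::-1], u) for u in urls)]

# Equivalent form using a key function: compare URLs by their reversed text.
key_sorted = sorted(urls, key=lambda u: u[::-1])

for url in key_sorted:
    print(url)
```

Because the comparison runs from the end of each string, `mail.yahoo.co.uk` sorts apart from `mail.yahoo.com`, which is the limitation other replies in the thread raise.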

Re: Sort by domain name?

2006-10-02 Thread js
On 2 Oct 2006 08:56:09 -0700, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> js:
> > All I want to do is to sort out a list of url by companyname,
> > like oreilly, ask, skype, amazon, google and so on, to find out
> > how many company's url the list contain.
>
> Then if you can define a good enoug

Re: Sort by domain name?

2006-10-02 Thread js
> Gentle reminder: is this homework? And you can expect better responses
> if you show you've bootstrapped yourself on the problem to some extent.

Sure thing. First I tried to solve this by using a list of domains found at http://www.neuhaus.com/domaincheck/domain_list.htm
I converted this to a li

Re: Sort by domain name?

2006-10-02 Thread Paul McGuire
"js " <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Hi list, > > I have a list of URL and I want to sort that list by the domain name. > > Here, domain name doesn't contain subdomain, > or should I say, domain's part of 'www', mail, news and en should be > excluded. > > For exampl

Re: Sort by domain name?

2006-10-02 Thread Paul Rubin
"js " <[EMAIL PROTECTED]> writes: > All I want to do is to sort out a list of url by companyname, > like oreilly, ask, skype, amazon, google and so on, to find out > how many company's url the list contain. Here's a function I used to use. It makes no attempt to be exhaustive, but did a reasonabl

Re: Sort by domain name?

2006-10-02 Thread jay graves
gene tani wrote:
> Plus, how do you order "https:", "ftp", URLs with "www.", "www2." ,
> named anchors etc?

Now is a good time to point out the urlparse module in the standard library. It will help the OP with all of this stuff.

Just adding my 2 cents.

... jay graves
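
As a rough illustration of the suggestion above: in Python 3 the `urlparse` module lives at `urllib.parse`. The sketch below uses `urlsplit` to pull the host name out of URLs with different schemes, ports, and anchors. The helper name `host_of` and the sample URLs are made up for this example.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host name of a URL, or "" if there is none."""
    # .hostname drops the scheme, port, path, query and fragment.
    return urlsplit(url).hostname or ""


# Hypothetical sample URLs covering several schemes and a named anchor.
sample_urls = [
    "https://www.example.com/a#top",
    "ftp://ftp.example.org/pub",
    "http://www2.example.net:8080/",
]

hosts = [host_of(u) for u in sample_urls]
print(hosts)
```

This only normalizes the host part; deciding which label of the host is the "company" is a separate step.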

Re: Sort by domain name?

2006-10-02 Thread bearophileHUGS
js:
> All I want to do is to sort out a list of url by companyname,
> like oreilly, ask, skype, amazon, google and so on, to find out
> how many company's url the list contain.

Then if you can define a good enough list of such company names, you can just do a search of such names inside each url.
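
One possible reading of this suggestion, sketched with `collections.Counter`. The company names are the ones mentioned in the thread; the function name `count_companies` and the sample URLs are assumptions for illustration. Plain substring search is crude: a short name such as "ask" would also match inside an unrelated host name.

```python
from collections import Counter

# Company names mentioned in the thread; a real list would be longer.
COMPANIES = ["oreilly", "ask", "skype", "amazon", "google"]


def count_companies(urls, companies=COMPANIES):
    """Count how many URLs contain each company name as a substring."""
    counts = Counter()
    for url in urls:
        lowered = url.lower()
        for name in companies:
            if name in lowered:
                counts[name] += 1
    return counts


# Hypothetical sample input.
sample_urls = [
    "http://mail.google.com",
    "http://www.amazon.com/books",
    "http://google.com",
    "http://www.skype.com",
]

counts = count_companies(sample_urls)
for name, n in counts.most_common():
    print(name, n)
```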

Re: Sort by domain name?

2006-10-02 Thread bearophileHUGS
Tim Chase:
> to give you a sorting function. It assumes http rather than
> having mixed url-types, such as ftp or mailto. They're easy
> enough to strip off as well, but putting them back on becomes a
> little more exercise.

With a modern Python you don't need to do all that work, you can do: s

Re: Sort by domain name?

2006-10-02 Thread gene tani
Paul Rubin wrote:
> "js " <[EMAIL PROTECTED]> writes:
> > Here, domain name doesn't contain subdomain,
> > or should I say, domain's part of 'www', mail, news and en should be
> > excluded.
>
> It's a little more complicated, you have to treat co.uk about
> the same way as .com, and similarly for
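
The point quoted above about treating co.uk like .com can be sketched as follows. This is not the function posted in the thread; it is a minimal illustration that assumes a small, hand-written set of two-part suffixes (`TWO_PART_SUFFIXES`, hypothetical and far from exhaustive) and a made-up helper name, `company_key`.

```python
from urllib.parse import urlsplit

# Illustrative only: a real solution needs a much fuller suffix list.
TWO_PART_SUFFIXES = {"co.uk", "co.jp", "com.au"}


def company_key(url):
    """Return the host label that sits just before the public suffix."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # "mail.yahoo.co.uk": the last two labels form a known suffix,
    # so the company label is the third from the end.
    if len(labels) >= 3 and ".".join(labels[-2:]) in TWO_PART_SUFFIXES:
        return labels[-3]
    # "mail.google.com": the company label is the second from the end.
    if len(labels) >= 2:
        return labels[-2]
    return host


urls = [
    "http://mail.yahoo.co.uk",
    "http://reader.google.com",
    "http://mail.yahoo.com",
    "http://google.com",
    "http://mail.google.com",
]

# sorted() is stable, so URLs of the same company keep their input order.
by_company = sorted(urls, key=company_key)
for url in by_company:
    print(url)
```

With this key, both yahoo URLs end up adjacent regardless of whether they end in .com or .co.uk.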

Re: Sort by domain name?

2006-10-02 Thread js
Thanks for your quick reply. Yeah, it's a hard task, and unfortunately even Google doesn't help me much. All I want to do is to sort out a list of url by companyname, like oreilly, ask, skype, amazon, google and so on, to find out how many company's url the list contain.

Re: Sort by domain name?

2006-10-02 Thread Tim Chase
>> Here, domain name doesn't contain subdomain, or should I
>> say, domain's part of 'www', mail, news and en should be
>> excluded.
>
> It's a little more complicated, you have to treat co.uk about
> the same way as .com, and similarly for some other countries
> but not all. For example, subd

Re: Sort by domain name?

2006-10-02 Thread Paul Rubin
"js " <[EMAIL PROTECTED]> writes: > Here, domain name doesn't contain subdomain, > or should I say, domain's part of 'www', mail, news and en should be > excluded. It's a little more complicated, you have to treat co.uk about the same way as .com, and similarly for some other countries but not al

Sort by domain name?

2006-10-02 Thread js
Hi list, I have a list of URL and I want to sort that list by the domain name. Here, domain name doesn't contain subdomain, or should I say, domain's part of 'www', mail, news and en should be excluded. For example, if the list was the following --