Downloading the feed using feedparser
Hello all, I am trying to download the feed of http://blogs.forrester.com/feed but I am stuck with a problem. import feedparser d = feedparser.parse('http://blogs.forrester.com/feed') d.etag u'1378291653-1' d.modified 'Wed, 04 Sep 2013 10:47:33 +' feedparser.parse('http://blogs.forrester.com/feed', etag=d.etag, modified=d.modified).status 200 When I am running this, should not this be 304 ( The content can't be change so fast in a moment or this server is not configured properly ). If I rely on this then whenever I run the code, I will download the content irrespective of content changed or not. Could some one please suggest me how to avoid the duplicate download ? The below one is working fine so if I try to download again then I will get 304 response since no data is changed on server. d = feedparser.parse(feed://feeds.huffingtonpost.com/HP/MostPopular) d.etag u'Vx5oxwMUzEFvFpd6BNR23912Zk4' d.modified 'Wed, 04 Sep 2013 10:32:06 GMT' feedparser.parse(feed://feeds.huffingtonpost.com/HP/MostPopular, etag= d.etag, modified=d.modified).status 304 Thank you Mukesh Tiwari -- https://mail.python.org/mailman/listinfo/python-list
Re: Improving the web page download code.
On Wednesday, 28 August 2013 04:03:15 UTC+5:30, MRAB wrote: On 27/08/2013 21:53, mukesh tiwari wrote: On Wednesday, 28 August 2013 01:49:59 UTC+5:30, MRAB wrote: On 27/08/2013 20:41, mukesh tiwari wrote: [snip] if __name__== '__main__': u = Downloader() signal.signal( signal.SIGINT , u.handleexception) thread.start_new_thread ( u.createurl , () ) for i in xrange ( 5 ) : thread.start_new_thread ( u.downloadurl , () ) while True : pass My preferred method when working with background threads is to put a sentinel such as None at the end and then when a worker gets an item from the queue and sees that it's the sentinel, it puts it back in the queue for the other workers to see, and then returns (terminates). The main thread can then call each worker thread's .join method to wait for it to finish. You currently have the main thread running in a 'busy loop', consuming processing time doing nothing! Hi MRAB, Thank you for the reply. I wrote this while loop only because of there is no thread.join in thread[1] library but I got your point. I am simply running a while loop for doing nothing. So if somehow I can block the main without too much computation then it will great. Why don't you use the 'threading' module instead? creator = threading.Thread(target=u.createurl) workers = [] for i in xrange(5): workers.append(threading.Thread(target=u.downloadurl)) creator.start() for w in workers: w.start() creator.join() for w in workers: w.join() Hi MRAB, Initially I blocked the main using raw_input('') and it was working fine. u = Downloader() signal.signal( signal.SIGINT , u.handleexception) thread.start_new_thread ( u.createurl , () ) for i in xrange ( 5 ) : thread.start_new_thread ( u.downloadurl , () ) #This is for blocking main raw_input('') When I pressed ctrl-c then it's responding fine but now after switching to threading module, I am not able to kill my program using SIGINT ( ctrl-c ). Any idea how to signal SIGINT to threads ? Now the changed code and I have to catch the SIGINT. u = Downloader() signal.signal( signal.SIGINT , u.handleexception) urlcreator = threading.Thread ( target = u.createurl ) workers = [] for i in xrange ( 5 ): workers.append ( threading.Thread( target = u.downloadurl ) ) urlcreator.start() for w in workers: w.start() urlcreator.join() for w in workers: w.join() -Mukesh Tiwari -- http://mail.python.org/mailman/listinfo/python-list
Improving the web page download code.
Hello All, I am doing web stuff first time in python so I am looking for suggestions. I wrote this code to download the title of webpages using as much less resource ( server time, data download) as possible and should be fast enough. Initially I used BeautifulSoup for parsing but the person who is going to use this code asked me not to use this and use regular expressions ( The reason was BeautifulSoup is not fast enough ? ). Also initially, I was downloading the the whole page but finally I restricted to only 3 characters to get the title of almost all the pages. Write now I can see only two shortcomings of this code, one when I kill the code by SIGINT ( ctrl-c ) then it dies instantly. I can modify this code to process all the elements in queue and let it die. The second is one IO call per iteration in download url function ( May be I can use async IO call but I am not sure ). I don't have much web programming experience so I am looking for suggestion to make it more robust. top-1m.csv is file downloaded from alexa[1]. Also some suggestions to write more idiomatic python code. -Mukesh Tiwari [1]http://www.alexa.com/topsites. import urllib2, os, socket, Queue, thread, signal, sys, re class Downloader(): def __init__( self ): self.q = Queue.Queue( 200 ) self.count = 0 def downloadurl( self ) : #open a file in append mode and write the result ( Improvement think of writing in chunks ) with open('titleoutput.dat', 'a+' ) as file : while True : try : url = self.q.get( ) data = urllib2.urlopen ( url , data = None , timeout = 10 ).read( 3 ) regex = re.compile('title.*(.*?)/title' , re.IGNORECASE) #Read data line by line and as soon you find the title go out of loop. #title = None #for r in data: # if not r : # raise StopIteration # else: # title = regex.search( r ) # if title is not None: break title = regex.search( data ) result = ', '.join ( [ url , title.group(1) ] ) #data.close() file.write(''.join( [ result , '\n' ] ) ) except urllib2.HTTPError as e: print ''.join ( [ url, ' ', str ( e ) ] ) except urllib2.URLError as e: print ''.join ( [ url, ' ', str ( e ) ] ) except Exception as e : print ''.join ( [ url, ' ', str( e ) ] ) #With block python calls file.close() automatically. def createurl ( self ) : #check if file exist. If not then create one with default value of 0 bytes read. if os.path.exists('bytesread.dat'): f = open ( 'bytesread.dat','r') self.count = int ( f.readline() ) else: f=open('bytesread.dat','w') f.write('0\n') f.close() #Reading data in chunks is fast but we can miss some sites due to reading the data in chunks( It's worth missing because reading is very fast) with open('top-1m.csv', 'r') as file: prefix = '' file.seek( self.count * 1024 ) #you will land into the middle of bytes so discard upto newline if ( self.count ): file.readline() for lines in iter ( lambda : file.read( 1024 ) , ''): l = lines.split('\n') n = len ( l ) l[0] = ''.join( [ prefix , l[0] ] ) for i in xrange ( n - 1 ) : self.q.put ( ''.join ( [ 'http://www.', l[i].split(',')[1] ] ) ) prefix = l[n-1] self.count += 1 #do graceful exit from here. def handleexception ( self , signal , frame) : with open('bytesread.dat', 'w') as file: print
Re: Improving the web page download code.
On Wednesday, 28 August 2013 01:49:59 UTC+5:30, MRAB wrote: On 27/08/2013 20:41, mukesh tiwari wrote: Hello All, I am doing web stuff first time in python so I am looking for suggestions. I wrote this code to download the title of webpages using as much less resource ( server time, data download) as possible and should be fast enough. Initially I used BeautifulSoup for parsing but the person who is going to use this code asked me not to use this and use regular expressions ( The reason was BeautifulSoup is not fast enough ? ). Also initially, I was downloading the the whole page but finally I restricted to only 3 characters to get the title of almost all the pages. Write now I can see only two shortcomings of this code, one when I kill the code by SIGINT ( ctrl-c ) then it dies instantly. I can modify this code to process all the elements in queue and let it die. The second is one IO call per iteration in download url function ( May be I can use async IO call but I am not sure ). I don't have much web programming experience so I am looking for suggestion to make it more robust. top-1m. c sv is file downloaded from alexa[1]. Also some suggestions to write more idiomatic python code. -Mukesh Tiwari [1]http://www.alexa.com/topsites. import urllib2, os, socket, Queue, thread, signal, sys, re class Downloader(): def __init__( self ): self.q = Queue.Queue( 200 ) self.count = 0 def downloadurl( self ) : #open a file in append mode and write the result ( Improvement think of writing in chunks ) with open('titleoutput.dat', 'a+' ) as file : while True : try : url = self.q.get( ) data = urllib2.urlopen ( url , data = None , timeout = 10 ).read( 3 ) regex = re.compile('title.*(.*?)/title' , re.IGNORECASE) #Read data line by line and as soon you find the title go out of loop. #title = None #for r in data: # if not r : # raise StopIteration # else: # title = regex.search( r ) # if title is not None: break title = regex.search( data ) result = ', '.join ( [ url , title.group(1) ] ) #data.close() file.write(''.join( [ result , '\n' ] ) ) except urllib2.HTTPError as e: print ''.join ( [ url, ' ', str ( e ) ] ) except urllib2.URLError as e: print ''.join ( [ url, ' ', str ( e ) ] ) except Exception as e : print ''.join ( [ url, ' ', str( e ) ] ) #With block python calls file.close() automatically. def createurl ( self ) : #check if file exist. If not then create one with default value of 0 bytes read. if os.path.exists('bytesread.dat'): f = open ( 'bytesread.dat','r') self.count = int ( f.readline() ) else: f=open('bytesread.dat','w') f.write('0\n') f.close() #Reading data in chunks is fast but we can miss some sites due to reading the data in chunks( It's worth missing because reading is very fast) with open('top-1m.csv', 'r') as file: prefix = '' file.seek( self.count * 1024 ) #you will land into the middle of bytes so discard upto newline if ( self.count ): file.readline() for lines in iter ( lambda : file.read( 1024 ) , ''): l = lines.split('\n') n = len ( l ) l[0] = ''.join( [ prefix , l[0] ] ) for i in xrange ( n - 1 ) : self.q.put ( ''.join ( [ 'http://www.', l[i].split(',')[1] ] ) ) prefix = l[n-1
Regular expression problem
Hello all I am trying to solve this problem[1] using regular expression. I wrote this code but I am getting time limit exceed. Could some one please tell me how to make this code run faster. import re if __name__ == __main__: n = int ( raw_input() ) c = 1 while c = n : email = filter ( lambda x : x != None , [ re.search ( '[^~!@#$%^*()?,.]*[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*@[a-zA-Z0-9]+.(com|edu|org|co.in)[^~!@#$%^*()?,.a-zA-Z0-9]*' , x ) for x in raw_input().split(' ') ] ) t = len ( email ) print 'Case #' + str ( c ) + ': ' + str ( t ) for i in xrange ( t ): print email[i].group() c += 1 Also rather than breaking the string at space, I tried to use findall but I am not getting correct result. re.findall ( '[^~!@#$%^*()?,.]*[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*@[a-zA-Z0-9]+.(com|edu|org|co.in)[^~!@#$%^*()?,.a-zA-Z0-9]*[ ]+','mukeshtiwari.iiitm...@gmail.com mukeshtiwari.iiitmm...@gmail.com') ['com'] I am suppose to get [ mukeshtiwari.iiit...@gmail.com , mukeshtiwari.iiit...@gmail.com] ? Regards Mukesh Tiwari [1] http://www.spoj.com/problems/MAIN12C/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Regular expression problem
Hi Chris On the problem page, it is 3 second. What is the time limit? I just tried it (Python 2.6 under Windows) and it finished in a humanly-immeasurable amount of time. Are you sure that STDIN (eg raw_input()) is where your test data is coming from? Yes, on SPOJ we read data from STDIN. Regards Mukesh Tiwari ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: Regular expression problem
Hi Chris Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again! import re if __name__ == __main__: n = int ( raw_input() ) c = 1 while c = n : email = filter ( lambda x : x != None , [ re.search ( '[a-zA-Z0-9][a-zA-Z0-9._]{4,}@[a-zA-Z0-9]+.(com|edu|org|co.in)' , x ) for x in raw_input().split(' ') ] ) t = len ( email ) print 'Case #' + str ( c ) + ': ' + str ( t ) for i in xrange ( t ) : print email[i].group() c += 1 Regards Mukesh Tiwari -- http://mail.python.org/mailman/listinfo/python-list
Reading pen drive data
Hello all I am trying to write a python script which can mount a pen drive and read the data from pen drive. I am using pyudev [ http://packages.python.org/pyudev/api/index.html ] and wrote a small python code import pyudev, sys if __name__ ==__main__: context = pyudev.Context() devices = context.list_devices(subsystem=usb) for device in devices : print device which prints Device(u'/sys/devices/pci:00/:00:1a.0/usb3') Device(u'/sys/devices/pci:00/:00:1a.0/usb3/3-0:1.0') Device(u'/sys/devices/pci:00/:00:1a.1/usb4') Device(u'/sys/devices/pci:00/:00:1a.1/usb4/4-0:1.0') Device(u'/sys/devices/pci:00/:00:1a.2/usb5') Device(u'/sys/devices/pci:00/:00:1a.2/usb5/5-0:1.0') Device(u'/sys/devices/pci:00/:00:1a.7/usb1') Device(u'/sys/devices/pci:00/:00:1a.7/usb1/1-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.0/usb6') Device(u'/sys/devices/pci:00/:00:1d.0/usb6/6-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.0/usb6/6-2') Device(u'/sys/devices/pci:00/:00:1d.0/usb6/6-2/6-2:1.0') Device(u'/sys/devices/pci:00/:00:1d.1/usb7') Device(u'/sys/devices/pci:00/:00:1d.1/usb7/7-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.2/usb8') Device(u'/sys/devices/pci:00/:00:1d.2/usb8/8-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.7/usb2') Device(u'/sys/devices/pci:00/:00:1d.7/usb2/2-0:1.0') My problem is, how to know out of these usb which one to read . The other solution I got is whenever I insert the pen drive , my system mounts it in /media directory so I can read the /media directory to check if its empty or not but this does not seems promising to me because it may be possible that my system is not able to mount the pen drive so I have to manually do it . Kindly some one please tell me how to solve this problem. -- http://mail.python.org/mailman/listinfo/python-list
Re: Reading pen drive data
On Sep 3, 7:40 pm, mukesh tiwari mukeshtiwari.ii...@gmail.com wrote: Hello all I am trying to write a python script which can mount a pen drive and read the data from pen drive. I am using pyudev [http://packages.python.org/pyudev/api/index.html] and wrote a small python code import pyudev, sys if __name__ ==__main__: context = pyudev.Context() devices = context.list_devices(subsystem =usb) for device in devices : print device which prints Device(u'/sys/devices/pci:00/:00:1a.0/usb3') Device(u'/sys/devices/pci:00/:00:1a.0/usb3/3-0:1.0') Device(u'/sys/devices/pci:00/:00:1a.1/usb4') Device(u'/sys/devices/pci:00/:00:1a.1/usb4/4-0:1.0') Device(u'/sys/devices/pci:00/:00:1a.2/usb5') Device(u'/sys/devices/pci:00/:00:1a.2/usb5/5-0:1.0') Device(u'/sys/devices/pci:00/:00:1a.7/usb1') Device(u'/sys/devices/pci:00/:00:1a.7/usb1/1-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.0/usb6') Device(u'/sys/devices/pci:00/:00:1d.0/usb6/6-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.0/usb6/6-2') Device(u'/sys/devices/pci:00/:00:1d.0/usb6/6-2/6-2:1.0') Device(u'/sys/devices/pci:00/:00:1d.1/usb7') Device(u'/sys/devices/pci:00/:00:1d.1/usb7/7-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.2/usb8') Device(u'/sys/devices/pci:00/:00:1d.2/usb8/8-0:1.0') Device(u'/sys/devices/pci:00/:00:1d.7/usb2') Device(u'/sys/devices/pci:00/:00:1d.7/usb2/2-0:1.0') My problem is, how to know out of these usb which one to read . The other solution I got is whenever I insert the pen drive , my system mounts it in /media directory so I can read the /media directory to check if its empty or not but this does not seems promising to me because it may be possible that my system is not able to mount the pen drive so I have to manually do it . Kindly some one please tell me how to solve this problem. I got this link and its working fine [ http://packages.python.org/pyudev/api/monitor.html ] context = pyudev.Context() monitor = pyudev.Monitor.from_netlink(context) monitor.filter_by(subsystem='input') for action, device in monitor: ... print('{0}: {1}'.format(action, device)) but when I am trying to execute code [ http://packages.python.org/pyudev/api/toolkit.html ] import pyudev, sys context = pyudev.Context() monitor = pyudev.Monitor.from_netlink(context) monitor.filter_by(subsystem='usb') observer = QUDevMonitorObserver(monitor) def device_connected(device): print('{0!r} added'.format(device)) observer.deviceAdded.connect(device_connected) monitor.start() I am getting Traceback (most recent call last): File Mount.py, line 7, in module observer = QUDevMonitorObserver(monitor) NameError: name 'QUDevMonitorObserver' is not defined Could any one please tell me how to avoid this error . Thank you -- http://mail.python.org/mailman/listinfo/python-list
Listing HAL devices
Hello all I am trying to write a python script which detects usb pen drive and copy all the data into my home directory. After bit of searching , i found these two links 1] http://en.wikibooks.org/wiki/Python_Programming/Dbus and 2] http://stackoverflow.com/questions/469243/how-can-i-listen-for-usb-device-inserted-events-in-linux-in-python . I just copied the wiki program but after running it , i got error import dbus class BusListener: def __init__( self ): self.bus = dbus.SystemBus() self.hal_obj = self.bus.get_object('org.freedesktop.Hal' , '/org/freedesktop/Hal/ Manager' ) print self.proxy if __name__ == __main__: obj = BusListener() Traceback (most recent call last): File Mount.py, line 13, in module obj = BusListener() File Mount.py, line 7, in __init__ self.hal_obj = self.bus.get_object('org.freedesktop.Hal' , '/org/ freedesktop/Hal/Manager' ) File /usr/lib/pymodules/python2.7/dbus/bus.py, line 244, in get_object follow_name_owner_changes=follow_name_owner_changes) File /usr/lib/pymodules/python2.7/dbus/proxies.py, line 241, in __init__ self._named_service = conn.activate_name_owner(bus_name) File /usr/lib/pymodules/python2.7/dbus/bus.py, line 183, in activate_name_owner self.start_service_by_name(bus_name) File /usr/lib/pymodules/python2.7/dbus/bus.py, line 281, in start_service_by_name 'su', (bus_name, flags))) File /usr/lib/pymodules/python2.7/dbus/connection.py, line 630, in call_blocking message, timeout) dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Hal was not provided by any .service files Kindly some one please tell me why i am getting this error . Thank you -- http://mail.python.org/mailman/listinfo/python-list
Re: Listing HAL devices
On Sep 1, 8:46 pm, mukesh tiwari mukeshtiwari.ii...@gmail.com wrote: Hello all I am trying to write a python script which detects usb pen drive and copy all the data into my home directory. After bit of searching , i found these two links 1]http://en.wikibooks.org/wiki/Python_Programming/Dbus and 2]http://stackoverflow.com/questions/469243/how-can-i-listen-for-usb-de... . I just copied the wiki program but after running it , i got error import dbus class BusListener: def __init__( self ): self.bus = dbus.SystemBus() self.hal_obj = self.bus.get_object('org.freedesktop.Hal' , '/org/freedesktop/Hal/ Manager' ) print self.proxy if __name__ == __main__: obj = BusListener() Traceback (most recent call last): File Mount.py, line 13, in module obj = BusListener() File Mount.py, line 7, in __init__ self.hal_obj = self.bus.get_object('org.freedesktop.Hal' , '/org/ freedesktop/Hal/Manager' ) File /usr/lib/pymodules/python2.7/dbus/bus.py, line 244, in get_object follow_name_owner_changes=follow_name_owner_changes) File /usr/lib/pymodules/python2.7/dbus/proxies.py, line 241, in __init__ self._named_service = conn.activate_name_owner(bus_name) File /usr/lib/pymodules/python2.7/dbus/bus.py, line 183, in activate_name_owner self.start_service_by_name(bus_name) File /usr/lib/pymodules/python2.7/dbus/bus.py, line 281, in start_service_by_name 'su', (bus_name, flags))) File /usr/lib/pymodules/python2.7/dbus/connection.py, line 630, in call_blocking message, timeout) dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Hal was not provided by any .service files Kindly some one please tell me why i am getting this error . Thank you I am using Ubuntu 11.04 . -- http://mail.python.org/mailman/listinfo/python-list
Elliptic Curve Prime factorisation
Hello all , I have implemented Elliptic curve prime factorisation using wikipedia [ http://en.wikipedia.org/wiki/Lenstra_elliptic_curve_factorization]. I think that this code is not optimised and posting for further improvement. Feel free to comment and if you have any link regarding Elliptic curve prime factorisation , kindly post it. Thank you import math import random #y^2=x^3+ax+b mod n def extended_gcd(a,b): # taken from wikipedia x,y,lastx,lasty=0,1,1,0 while b!=0: q=a/b a,b=b,a%b x,lastx=(lastx-q*x,x) y,lasty=(lasty-q*y,y) if a0: return (-a,-lastx,-lasty) else: return (a,lastx,lasty) def gcd(a,b): if a 0: a = -a if b 0: b = -b if a == 0: return b if b == 0: return a while b != 0: (a, b) = (b, a%b) return a def randomCurve(N): A,u,v=random.randrange(N),random.randrange(N),random.randrange(N) B=(v*v-u*u*u-A*u)%N return [(A,B,N),(u,v)] def addPoint(E,p_1,p_2): if p_1==Identity: return [p_2,1] if p_2==Identity: return [p_1,1] a,b,n=E (x_1,y_1)=p_1 (x_2,y_2)=p_2 x_1%=n y_1%=n x_2%=n y_2%=n if x_1 != x_2 : d,u,v=extended_gcd(x_1-x_2,n) s=((y_1-y_2)*u)%n x_3=(s*s-x_1-x_2)%n y_3=(-y_1-s*(x_3-x_1))%n else: if (y_1+y_2)%n==0:return [Identity,1] else: d,u,v=extended_gcd(2*y_1,n) s=((3*x_1*x_1+a)*u)%n x_3=(s*s-2*x_1)%n y_3=(-y_1-s*(x_3-x_1))%n return [(x_3,y_3),d] def mulPoint(E,P,m): Ret=Identity d=1 while m!=0: if m%2!=0: Ret,d=addPoint(E,Ret,P) if d!=1 : return [Ret,d] # as soon as i got anything otherthan 1 return P,d=addPoint(E,P,P) if d!=1 : return [Ret,d] m=1 return [Ret,d] def ellipticFactor(N,m,times=5): for i in xrange(times): E,P=randomCurve(N); Q,d=mulPoint(E,P,m) if d!=1 : return d return N if __name__==__main__: n=input() m=int(math.factorial(1000)) while n!=1: k=ellipticFactor(n,m) n/=k print k -- http://mail.python.org/mailman/listinfo/python-list
Re: Elliptic Curve Prime factorisation
On Jan 15, 7:02 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Fri, 14 Jan 2011 11:52:21 -0800, mukesh tiwari wrote: Hello all , I have implemented Elliptic curve prime factorisation using wikipedia [ http://en.wikipedia.org/wiki/Lenstra_elliptic_curve_factorization]. I think that this code is not optimised and posting for further improvement. Feel free to comment and if you have any link regarding Elliptic curve prime factorisation , kindly post it. Thank you I don't think you can optimize it further in pure Python, although it is probably a good candidate for something like Cython, Pyrex or Shedskin. I think the code can be optimized for easier reading by putting single spaces around operators, following commas, etc. I find your style difficult to read. It could do with a docstring explaining what it does and how to use it, and some doctests. But other than that, it looks good. Have you considered putting it up on the ActiveState Python cookbook? -- Steven Thank you for your suggestion. I posted it ActiveState with comments. #!/usr/local/bin/python # -*- coding: utf-8 -*- import math import random #y^2=x^3+ax+b mod n # ax+by=gcd(a,b). This function returns [gcd(a,b),x,y]. Source Wikipedia def extended_gcd(a,b): x,y,lastx,lasty=0,1,1,0 while b!=0: q=a/b a,b=b,a%b x,lastx=(lastx-q*x,x) y,lasty=(lasty-q*y,y) if a0: return (-a,-lastx,-lasty) else: return (a,lastx,lasty) def gcd(a,b): if a 0: a = -a if b 0: b = -b if a == 0: return b if b == 0: return a while b != 0: (a, b) = (b, a%b) return a # pick first a point P=(u,v) with random non-zero coordinates u,v (mod N), then pick a random non-zero A (mod N), # then take B = u^2 - v^3 - Ax (mod N). # http://en.wikipedia.org/wiki/Lenstra_elliptic_curve_factorization def randomCurve(N): A,u,v=random.randrange(N),random.randrange(N),random.randrange(N) B=(v*v-u*u*u-A*u)%N return [(A,B,N),(u,v)] # Given the curve y^2 = x^3 + ax + b over the field K (whose characteristic we assume to be neither 2 nor 3), and points # P = (xP, yP) and Q = (xQ, yQ) on the curve, assume first that xP != xQ. Let the slope of the line s = (yP - yQ)/(xP - xQ); since K # is a field, s is well-defined. Then we can define R = P + Q = (xR, - yR) by # s=(xP-xQ)/(yP-yQ) Mod N # xR=s^2-xP-xQMod N # yR=yP+s(xR-xP) Mod N # If xP = xQ, then there are two options: if yP = -yQ, including the case where yP = yQ = 0, then the sum is defined as 0[Identity]. # thus, the inverse of each point on the curve is found by reflecting it across the x-axis. If yP = yQ != 0, then R = P + P = 2P = # (xR, -yR) is given by # s=3xP^2+a/(2yP) Mod N # xR=s^2-2xP Mod N # yR=yP+s(xR-xP) Mod N # http://en.wikipedia.org/wiki/Elliptic_curve#The_group_law''') def addPoint(E,p_1,p_2): if p_1==Identity: return [p_2,1] if p_2==Identity: return [p_1,1] a,b,n=E (x_1,y_1)=p_1 (x_2,y_2)=p_2 x_1%=n y_1%=n x_2%=n y_2%=n if x_1 != x_2 : d,u,v=extended_gcd(x_1-x_2,n) s=((y_1-y_2)*u)%n x_3=(s*s-x_1-x_2)%n y_3=(-y_1-s*(x_3-x_1))%n else: if (y_1+y_2)%n==0:return [Identity,1] else: d,u,v=extended_gcd(2*y_1,n) s=((3*x_1*x_1+a)*u)%n x_3=(s*s-2*x_1)%n y_3=(-y_1-s*(x_3-x_1))%n return [(x_3,y_3),d] # http://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication # Q=0 [Identity element] # while m: # if (m is odd) Q+=P # P+=P # m/=2 # return Q') def mulPoint(E,P,m): Ret=Identity d=1 while m!=0: if m%2!=0: Ret,d=addPoint(E,Ret,P) if d!=1 : return [Ret,d] # as soon as i got anything otherthan 1 return P,d=addPoint(E,P,P) if d!=1 : return [Ret,d] m=1 return [Ret,d] def ellipticFactor(N,m,times=5): for i in xrange(times): E,P=randomCurve(N); Q,d=mulPoint(E,P,m) if d!=1 : return d return N if __name__==__main__: n=input() m=int(math.factorial(1000)) while n!=1: k=ellipticFactor(n,m) n/=k print k -- http://mail.python.org/mailman/listinfo/python-list
Re: File transfer on network
On Jul 16, 4:08 am, MRAB pyt...@mrabarnett.plus.com wrote: mukesh tiwari wrote: Hello all Currently i am trying to develop a client and server in python. Client takes screenshot in every 20 seconds and send it to server. Server store the received file in folder. Here is code for Client import sys import socket import gtk.gdk import time if __name__ == __main__: s=socket.socket(socket.AF_INET,socket.SOCK_STREAM); host,port=localhost,50007 s.connect((host,port)) while 1: w = gtk.gdk.get_default_root_window() sz = w.get_size() pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB,False,8,sz[0],sz[1]) pb = pb.get_from_drawable(w,w.get_colormap(),0,0,0,0,sz[0],sz[1]) if (pb != None): pb.save('/home/tmp_1/screenshot.png',png) f=open('/home//tmp_1/screenshot.png','rb') while 1: blk=f.read(2048) if not blk:break s.send(blk) f.close() print 'file transfer done'; time.sleep(20) s.close() Server Code import sys import socket import time if __name__ == __main__: s=socket.socket(socket.AF_INET,socket.SOCK_STREAM) host,port=,50007 s.bind((host,port)) s.listen(1) conn,add=s.accept() i=0 while 1: f=open('/home/tmp_2/copyscreenshot_'+str(i)+'.png','wb') while 1: data=conn.recv(2048) if not data:break f.write(data) f.close() print 'file received' time.sleep(20) i+=1; #print ' file received ' conn.close() My problem is that server is copying only first image copyscreenshot_0.png while my client continuously taking screen shot. Kindly tell me what is wrong with my server code. The .recv will return an empty string only when the connection is closed by the client, but the client keeps the connection open. You could either open and close a connection for each image, or have the client tell the server how many bytes it's going to send, followed by the bytes (my preference would be to send the size as a string ending with, say, a newline). Another point: the .send method doesn't guarantee that it'll send all the bytes (it'll return the number of bytes that it has actually sent); use the .sendall method instead. Thank you MARB. I used your first idea and opening and closing connection for each image. Just wanted to know if i have to take pictures at small intervals , this method is efficient or not? It would be great if you can provide a bit code for second method. Keeping the connection open but how to know that this is end a particular image and save it. -- http://mail.python.org/mailman/listinfo/python-list
File transfer on network
Hello all Currently i am trying to develop a client and server in python. Client takes screenshot in every 20 seconds and send it to server. Server store the received file in folder. Here is code for Client import sys import socket import gtk.gdk import time if __name__ == __main__: s=socket.socket(socket.AF_INET,socket.SOCK_STREAM); host,port=localhost,50007 s.connect((host,port)) while 1: w = gtk.gdk.get_default_root_window() sz = w.get_size() pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB,False,8,sz[0],sz[1]) pb = pb.get_from_drawable(w,w.get_colormap(),0,0,0,0,sz[0],sz[1]) if (pb != None): pb.save('/home/tmp_1/screenshot.png',png) f=open('/home//tmp_1/screenshot.png','rb') while 1: blk=f.read(2048) if not blk:break s.send(blk) f.close() print 'file transfer done'; time.sleep(20) s.close() Server Code import sys import socket import time if __name__ == __main__: s=socket.socket(socket.AF_INET,socket.SOCK_STREAM) host,port=,50007 s.bind((host,port)) s.listen(1) conn,add=s.accept() i=0 while 1: f=open('/home/tmp_2/copyscreenshot_'+str(i)+'.png','wb') while 1: data=conn.recv(2048) if not data:break f.write(data) f.close() print 'file received' time.sleep(20) i+=1; #print ' file received ' conn.close() My problem is that server is copying only first image copyscreenshot_0.png while my client continuously taking screen shot. Kindly tell me what is wrong with my server code. Thank you Mukesh Tiwari import sys import socket import gtk.gdk import time if __name__ == __main__: s=socket.socket(socket.AF_INET
Re: File transfer on network
Kindly see this post as above post contains some garbage code in end. Sorry for that. Hello all Currently i am trying to develop a client and server in python. Client takes screenshot in every 20 seconds and send it to server. Server store the received file in folder. Here is code for Client import sys import socket import gtk.gdk import time if __name__ == __main__: s=socket.socket(socket.AF_INET,socket.SOCK_STREAM); host,port=localhost,50007 s.connect((host,port)) while 1: w = gtk.gdk.get_default_root_window() sz = w.get_size() pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB,False, 8,sz[0],sz[1]) pb = pb.get_from_drawable(w,w.get_colormap(), 0,0,0,0,sz[0],sz[1]) if (pb != None): pb.save('/home/tmp_1/screenshot.png',png) f=open('/home//tmp_1/screenshot.png','rb') while 1: blk=f.read(2048) if not blk:break s.send(blk) f.close() print 'file transfer done'; time.sleep(20) s.close() Server Code import sys import socket import time if __name__ == __main__: s=socket.socket(socket.AF_INET,socket.SOCK_STREAM) host,port=,50007 s.bind((host,port)) s.listen(1) conn,add=s.accept() i=0 while 1: f=open('/home/tmp_2/copyscreenshot_'+str(i) +'.png','wb') while 1: data=conn.recv(2048) if not data:break f.write(data) f.close() print 'file received' time.sleep(20) i+=1; #print ' file received ' conn.close() My problem is that server is copying only first image copyscreenshot_0.png while my client continuously taking screen shot. Kindly tell me what is wrong with my server code. Thank you Mukesh Tiwari -- http://mail.python.org/mailman/listinfo/python-list
Precision issue in python
Hello everyone. I think it is related to the precision with double arithmetic so i posted here.I am trying with this problem (https:// www.spoj.pl/problems/CALCULAT) and the problem say that Note : for all test cases whose N=100, its K=15. I know precision of doubles in c is 16 digits. Could some one please help me with this precision issue.I used stirling (http://en.wikipedia.org/wiki/ Stirling's_approximation) to calculate the first k digits of N. Thank you. __author__=Administrator __date__ =$Feb 19, 2010 3:16:47 PM$ import math if __name__ == __main__: n=int(raw_input()) while(n0): n-=1 raw_input() l=raw_input(); m=l.split( ) N,K,L=int(m[0]),int(m[1]),int(m[2]) fwd,bkd=1,1 s_1,s_2=, if(N=200): for i in range(1,N+1): fwd=fwd*i; s=str(fwd) s_1=s[:K] else: d_1=(N*math.log(N)-N+0.5*math.log(2*math.pi*N)+(1/12/N)- (1/360/pow(N,3))+(1/1260/pow(N,5))-(1/1680/pow(N,7))+(1/1188/pow(N,9))- (691/2730/12/11/pow(N,11))+(7/6/14/13/pow(N,13)))*math.log10(math.e) d_2=d_1-int(d_1)+K-1 fwd=pow(10,d_2) #print fwd fwd=int(fwd) s_1=str(fwd) if(N=500): for i in range(1,N+1): bkd=bkd*i bkd=bkd%pow(10,L) if(bkd==0): s_2=0*L else: s_2=str(bkd) else: s_2=0*L print s_1+ +s_2 -- http://mail.python.org/mailman/listinfo/python-list
Re: Precision issue in python
On Feb 20, 5:44 pm, Mark Dickinson dicki...@gmail.com wrote: On Feb 20, 11:17 am, mukesh tiwari mukeshtiwari.ii...@gmail.com wrote: Hello everyone. I think it is related to the precision with double arithmetic so i posted here.I am trying with this problem (https://www.spoj.pl/problems/CALCULAT) and the problem say that Note : for all test cases whose N=100, its K=15. I know precision of doubles in c is 16 digits. Could some one please help me with this precision issue.I used stirling (http://en.wikipedia.org/wiki/ Stirling's_approximation) to calculate the first k digits of N. Thank you. If I understand you correctly, you're trying to compute the first k digits in the decimal expansion of N!, with bounds of k = 15 and 100 = N 10**8. Is that right? Python's floats simply don't give you enough precision for this: you'd need to find the fractional part of log10(N!) with = 15 digits of precision. But the integral part needs ~ log10(log10(N!)) digits, so you end up needing to compute log10(N!) with at least 15 + log10(log10(N!)) ~ 15 + log10(N) + log10(log10(N)) digits (plus a few extra digits for safety); so that's at least 25 digits needed for N close to 10**8. The decimal module would get you the results you need (are you allowed imports?). Or you could roll your own log implementation based on integer arithmetic. -- Mark Yes i am trying to first k digits of N!.I will try your method. Thank you -- http://mail.python.org/mailman/listinfo/python-list
Re: Precision issue in python
On Feb 20, 8:13 pm, mukesh tiwari mukeshtiwari.ii...@gmail.com wrote: On Feb 20, 5:44 pm, Mark Dickinson dicki...@gmail.com wrote: On Feb 20, 11:17 am, mukesh tiwari mukeshtiwari.ii...@gmail.com wrote: Hello everyone. I think it is related to the precision with double arithmetic so i posted here.I am trying with this problem (https://www.spoj.pl/problems/CALCULAT) and the problem say that Note : for all test cases whose N=100, its K=15. I know precision of doubles in c is 16 digits. Could some one please help me with this precision issue.I used stirling (http://en.wikipedia.org/wiki/ Stirling's_approximation) to calculate the first k digits of N. Thank you. If I understand you correctly, you're trying to compute the first k digits in the decimal expansion of N!, with bounds of k = 15 and 100 = N 10**8. Is that right? Python's floats simply don't give you enough precision for this: you'd need to find the fractional part of log10(N!) with = 15 digits of precision. But the integral part needs ~ log10(log10(N!)) digits, so you end up needing to compute log10(N!) with at least 15 + log10(log10(N!)) ~ 15 + log10(N) + log10(log10(N)) digits (plus a few extra digits for safety); so that's at least 25 digits needed for N close to 10**8. The decimal module would get you the results you need (are you allowed imports?). Or you could roll your own log implementation based on integer arithmetic. -- Mark Yes i am trying to first k digits of N!.I will try your method. Thank you I am out of luck.I put d_2=d_1-int(d_1)+25 instead of d_2=d_1- int(d_1)+K-1 but i am getting WA. The decimal module would get you the results you need (are you allowed imports?). Or you could roll your own log implementation based on integer arithmetic. I don't know if is possible to import this decimal module but kindly tell me.Also a bit about log implementation Thank you -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Optimization
On Feb 15, 1:07 am, Steve Howell showel...@yahoo.com wrote: On Feb 14, 11:52 am, Mark Dickinson dicki...@gmail.com wrote: On Feb 14, 4:53 pm, mukesh tiwari mukeshtiwari.ii...@gmail.com wrote: Hello everyone. I am new to python and previously i did programming in c/c++.Could some one please help me to improve the run time for this python program as i don't have idea how to optimized this code.This code also seems to be more unpythonic so how to make it look like more pythonic . I am trying for this problem(https://www.spoj.pl/problems/ FACT1/). Thank you One other thing: in the 'brent' function, you're setting m to randrange(1, n). What's the purpose of this? It looks to me as though m controls the number of Pollard-Rho iterations that are clumped together at one time (before doing a gcd operation), and using a random number for this doesn't make a lot of sense to me. The randomness also makes the algorithm a little buggy. I tried this input: 37^5 41^5 Usually I get the right answer. Other times I get this: 37^5 41^3 1681^1 And occasionally it appears to get into an infinite loop, which definitely could be a cause of slowness. :) Thank you all for help. I will try to improve these short comings. -- http://mail.python.org/mailman/listinfo/python-list
Python Optimization
Hello everyone. I am new to python and previously i did programming in c/c++.Could some one please help me to improve the run time for this python program as i don't have idea how to optimized this code.This code also seems to be more unpythonic so how to make it look like more pythonic . I am trying for this problem(https://www.spoj.pl/problems/ FACT1/). Thank you # To change this template, choose Tools | Templates # and open the template in the editor. __author__=Mukesh Tiwari __date__ =$Feb 10, 2010 1:35:26 AM$ import random from Queue import Queue def gcd(a,b): while b: a,b=b,a%b return a def rabin_miller(p,t=1): if(p2): return False if(p!=2 and p%2==0): return False s=p-1 while(s%2==0): s=1 for i in xrange(t): a=random.randrange(p-1)+1 temp=s mod=pow(a,temp,p) while(temp!=p-1 and mod!=1 and mod!=p-1): mod=(mod*mod)%p temp=temp*2 if(mod!=p-1 and temp%2==0): return False return True def brent(n): if(n%2==0): return 2; x,c,m=random.randrange(0,n),random.randrange(1,n),random.randrange(1,n) y,r,q=x,1,1 g,ys=0,0 while(True): x=y for i in range(r): y,k=(y*y+c)%n,0 while(True): ys=y for i in range(min(m,r-k)): y,q=(y*y+c)%n,q*abs(x-y)%n g,k=gcd(q,n),k+m if(k= r or g1):break r=2*r if(g1):break if(g==n): while(True): ys,g=(x*x+c)%n,gcd(abs(x-ys),n) if(g1):break return g def factor(n): Q_1,Q_2=Queue(),[] Q_1.put(n) while(not Q_1.empty()): l=Q_1.get() if(rabin_miller(l)): Q_2.append(l) continue d=brent(l) if(d==l):Q_1.put(l) else: Q_1.put(d) Q_1.put(l/d) return Q_2 if __name__ == __main__: while(True): n=int(raw_input()) if(n==0):break L=factor(n) L.sort() #print L i=0 s= while(ilen(L)): cnt=L.count(L[i]) #print %d^%d%(L[i],cnt) s+=str(L[i])+^+str(cnt)+ i+=cnt print s[:-1] -- http://mail.python.org/mailman/listinfo/python-list
Problem Regarding Queue
Could some one please tell what is wrong with this code. I am trying to use Queue in this program but i am getting error Traceback (most recent call last): File /home/user/NetBeansProjects/NewPythonProject2/src/ Pollard_rho.py, line 80, in module factor(n) File /home/user/NetBeansProjects/NewPythonProject2/src/ Pollard_rho.py, line 59, in factor Q_1=Queue() NameError: global name 'Queue' is not defined. As i am new to python so kindly pardon me if i sound stupid. Here is the code # To change this template, choose Tools | Templates # and open the template in the editor. __author__=Mukesh Tiwari __date__ =$Feb 10, 2010 1:35:26 AM$ import random import sys def gcd(a,b): while b: a,b=b,a%b return a def rabin_miller(p): if(p2): return False if(p!=2 and p%2==0): return False s=p-1 while(s%2==0): s=1 for i in xrange(10): a=random.randrange(p-1)+1 temp=s mod=pow(a,temp,p) while(temp!=p-1 and mod!=1 and mod!=p-1): mod=(mod*mod)%p temp=temp*2 if(mod!=p-1 and temp%2==0): return False return True def pollard(n): if(n%2==0): return 2; x=random.randrange(2,100) c=random.randrange(2,100) y=x d=1 while(d==1): x=(x*x+c)%n y=(y*y+c)%n y=(y*y+c)%n d=gcd(x-y,n) if(d==n): break; return d; def factor(n): #if(rabin_miller(n)): # print n # return #d=pollard(n) #if(d!=n): # factor(d) # factor(n/d) #else: # factor(n) Q_1=Queue() Q_2=Queue() Q_1.put(n) while(not Q_1.empty()): l=Q_1.get() if(rabin_miller(l)): Q_2.put(l) continue d=pollard(l) if(d==l):Q_1.put(l) else: Q_1.put(d) Q_1.put(l/d) while(not Q_2.empty()): print Q_2.get() if __name__ == __main__: while(True): n=input(); factor(n) -- http://mail.python.org/mailman/listinfo/python-list
Last M digits of expression A^N
Hello everyone. I am kind of new to python so pardon me if i sound stupid. I have to find out the last M digits of expression.One thing i can do is (A**N)%M but my A and N are too large (10^100) and M is less than 10^5. The other approach was repeated squaring and taking mod of expression. Is there any other way to do this in python more faster than log N. def power(A,N,M): ret=1 while(N): if(N%2!=0):ret=(ret*A)%M A=(A*A)%M N=N//2 return ret -- http://mail.python.org/mailman/listinfo/python-list