[Tutor] extracting information (images and text) from a PDF and creating a database from it

2009-12-28 Thread Shashwat Anand
I need to make a database from some PDFs. I need to extract logos as well as
the information (i.e. name,address) beneath the logo and fill it up in
database. The logo can be text as well as picture as shown in two of the
screenshots of one of the sample PDF file:
http://imagebin.org/77378
http://imagebin.org/77379
Would converting to HTML be a good option? Later on I need to apply some image
processing too. What would be the best way to approach this?
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] computer basics

2009-12-28 Thread Richard Hultgren
I am learning Python slowly.  I would like to begin learning all about how 
computers work from the bottom up.  I have an understanding of binary code.  
Where should I go from here; can you suggest continued reading, on line or off 
to continue my education?
                                            Thank You
                                                Richard




Re: [Tutor] Python and Computational Geometry

2009-12-28 Thread Kent Johnson
Googling "python computational geometry" points to
http://www.cgal.org/ and
http://cgal-python.gforge.inria.fr/

Kent

On Mon, Dec 28, 2009 at 6:13 PM, Abdulhafid Igor Ryabchuk
 wrote:
> Dear Pythonistas,
>
> I am starting a small project that centres around implementation of
> computational geometry algorithms. I was wondering if there are any
> particular Python modules I should have a look at.
>
> Regards,
>
> AH
>


[Tutor] Python and Computational Geometry

2009-12-28 Thread Abdulhafid Igor Ryabchuk
Dear Pythonistas,

I am starting a small project that centres around implementation of
computational geometry algorithms. I was wondering if there are any
particular Python modules I should have a look at.

Regards,

AH


Re: [Tutor] using mechanize to authenticate and pull data out of site

2009-12-28 Thread Norman Khine
hello,
thank you all for the replies.

On Mon, Dec 28, 2009 at 10:21 AM, Rich Lovely  wrote:
> 2009/12/26 Norman Khine :
>> Hello,
>>
>> I am trying to authenticate on http://commerce.sage.com/Solidarmonde/
>> using urllib but have a problem in that there are some hidden fields
>> that use javascript to create a security token which then is passed to
>> the submit button and to the header.
>>
>> Here is the output of the LiveHeader during authentication
>>
>> http://paste.lisp.org/display/92656
>>
>> Here is what I have so far:
>>
>> http://paste.lisp.org/+1ZHS/1
>>
>> print results
>> But the page returned prints out that the session is out of time.
>>
>> Here are details of the forms:
>>
>> http://paste.lisp.org/+1ZHS/2
>>
>> Any help much appreciated.
>>
>> Norman
>>
>
> The first thing to try is to attempt to login with javascript
> disabled.  If it will let you do that, transfer the relevant form info
> to the mechanize browser, and it should be fine.

It does not work; I need JavaScript enabled in order to log in.

>
> If not, you will need to look through all of the javascript files, to
> find out which one generates/receives the security token.  Looking at
> it, the element will be called "_xmlToken".

Looking at the javascript - http://paste.lisp.org/+1ZHS/4

The function 'browser_localForm_form_onsubmit' has a contextKey passed
to it.

I think the check between the two tokens happens here:

securityToken = _browser.getElement("_xmlToken");
document.localForm.__sgx_contextSecurity.value = securityToken.value;

Also, there seem to be a lot of hash keys generated at the
beginning of the JavaScript files. Here are some examples:

http://paste.lisp.org/+1ZHS/3

>
> The "xml" suggests that it might be received over ajax, which means
> you will need to find the page that it comes from, and fake an ajax
> request to it - fortunately, this is just a simple http request, much
> like you are already doing - it's just handled under the surface by
> javascript.

How would I fake the AJAX request before I submit the form? Everything
seems to come from this page: /solidarmonde/defaultsgx.asp

thanks

>
> --
> Rich "Roadie Rich" Lovely
>
> There are 10 types of people in the world: those who know binary,
> those who do not, and those who are off by one.
>



-- 
%>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or
chr(97+(ord(c)-83)%26) for c in ",adym,*)&uzq^zqf" ] )


Re: [Tutor] using mechanize to authenticate and pull data out of site

2009-12-28 Thread Rich Lovely
2009/12/26 Norman Khine :
> Hello,
>
> I am trying to authenticate on http://commerce.sage.com/Solidarmonde/
> using urllib but have a problem in that there are some hidden fields
> that use javascript to create a security token which then is passed to
> the submit button and to the header.
>
> Here is the output of the LiveHeader during authentication
>
> http://paste.lisp.org/display/92656
>
> Here is what I have so far:
>
> http://paste.lisp.org/+1ZHS/1
>
> print results
> But the page returned prints out that the session is out of time.
>
> Here are details of the forms:
>
> http://paste.lisp.org/+1ZHS/2
>
> Any help much appreciated.
>
> Norman
>

The first thing to try is to attempt to login with javascript
disabled.  If it will let you do that, transfer the relevant form info
to the mechanize browser, and it should be fine.

If not, you will need to look through all of the javascript files, to
find out which one generates/receives the security token.  Looking at
it, the element will be called "_xmlToken".

The "xml" suggests that it might be received over ajax, which means
you will need to find the page that it comes from, and fake an ajax
request to it - fortunately, this is just a simple http request, much
like you are already doing - it's just handled under the surface by
javascript.
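
To make that concrete, here is a rough sketch of the idea, not a
working login for this particular site. It uses only the standard
library with Python 3 module names (in Python 2 the equivalents are
HTMLParser, urllib.urlencode and urllib2.Request). It collects the
hidden form fields from a page and builds a POST request that carries
them along. The sample HTML, the field names other than "_xmlToken",
the example.com URL and the X-Requested-With header are all
assumptions for illustration; whether the real server accepts a
request built this way is something you would have to test.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request

# Made-up page fragment for illustration; the real page will differ.
SAMPLE_HTML = """
<form name="localForm">
  <input type="hidden" name="_xmlToken" value="abc123">
  <input type="hidden" name="otherHidden" value="">
  <input type="text" name="user">
</form>
"""


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden input fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def build_login_request(url, html, extra):
    """Build (but do not send) a POST request with hidden fields plus extras."""
    data = hidden_fields(html)
    data.update(extra)
    body = urlencode(data).encode("utf-8")
    # Many JavaScript libraries send this header with ajax calls; it is
    # an assumption that this site looks at it at all.
    headers = {"X-Requested-With": "XMLHttpRequest"}
    return Request(url, data=body, headers=headers)


if __name__ == "__main__":
    req = build_login_request(
        "http://example.com/login",  # placeholder URL
        SAMPLE_HTML,
        {"user": "someone"},
    )
    print(req.get_method(), req.data)
```

To actually send it you would pass the request to
urllib.request.urlopen, ideally through an opener that keeps cookies,
since the token is probably tied to the session.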

-- 
Rich "Roadie Rich" Lovely

There are 10 types of people in the world: those who know binary,
those who do not, and those who are off by one.