Hello Peng Foo,

Thanks for your interest in participating in GSoC 2016 with Scrapy!
Sorry for not replying earlier.

Your proposal has also arrived in the mentoring dashboard at
https://summerofcode.withgoogle.com
We'll review it in the coming days and provide feedback if necessary.

Kind regards,
Paul.

On Saturday, March 12, 2016 at 4:47:11 PM UTC+1, Peng Foo wrote:
>
> Hi all,
>
> I'm Peng Foo, a postgraduate student majoring in CS at BUAA, aka BeiHang
> Univ. (BHU), China, which is one of the top universities for CS in China.
> (I know it may be unfamiliar to you, but it was shown in "CSI: Cyber"
> S2E11 :-) If you're interested, I'd love to tell you why my university
> changed its name xD)
>
> <https://pic3.zhimg.com/8559284960725caae8dc347cf4d7943a_b.png>
>
>
> *I am skilled in Python and have a lot of experience with web scraping and
> web development.*
>
>
> I major in ML/NLP, and Python is the most popular language in the field,
> so I write Python a lot. I also help my lab with frontend work using
> JavaScript, HTML5 and jQuery.
>
> I am an intern at Sogou Inc. from November 2015 to April 2016, and my main
> job is web scraping, data cleaning and data analysis with Python. During
> this time I have written a lot of spiders xD
> I've crawled big sites like Yahoo/Sina Finance, Tencent News, Yahoo News,
> Sina Weibo (the Chinese counterpart of Twitter) and so on.
>
> I also interned at Lenovo in 2014, where I did machine learning research
> with Python, mainly on gesture recognition. That is when I first met
> IPython Notebook (back then it was IPython Notebook rather than
> Jupyter :)
> Now IPython is my first choice for writing Python remotely, instead of
> writing Python locally, using the scp command to transfer it to the remote
> server, and running it there :)
>
>
> I was amazed at what a Python IDE in the browser can do, and I was shocked
> when I used IPython Notebook and matplotlib to draw a line graph in the
> IDE instantly!
>
> I really think combining Scrapy and IPython Notebook is a good thing to
> do.
>
> The following are some issues I think I may run into and problems I will
> need to solve.
>
> 1. There are indeed some problems with showing HTML inside HTML, such as
> CSS class conflicts and layout issues, because the Jupyter console is much
> narrower than the browser.
>
> 2. Cross-site/cross-domain issues, such as code that makes AJAX requests,
> and also "jump-out" JS code.
>
> 3. Security issues, such as malicious code or alert pop-ups.
>
> 4. Performance: what should we do if the user writes too much show-HTML
> code and the console becomes slow? It is likely to happen, because people
> like to use Jupyter for inline debugging, since Jupyter shows the results
> immediately :-)
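
One possible way to approach points 1 and 3 above, sketched under the
assumption that the scraped page is rendered inside a sandboxed <iframe>
through its srcdoc attribute. The proposal does not specify this, and the
helper name sandboxed_iframe is hypothetical. An iframe keeps the page's
CSS from clashing with the notebook's own classes, and an empty sandbox
attribute stops the page's scripts from running.

```python
import html


def sandboxed_iframe(page_html, width="100%", height=400):
    """Wrap scraped HTML in an <iframe> so it renders in isolation.

    Hypothetical helper for illustration, not part of Scrapy or Jupyter.
    """
    # srcdoc carries the page inline; escaping keeps quotes and '&'
    # from terminating the attribute value early.
    escaped = html.escape(page_html, quote=True)
    # An empty sandbox attribute applies every restriction:
    # no scripts, no form submission, no pop-ups.
    return (
        f'<iframe srcdoc="{escaped}" sandbox="" '
        f'width="{width}" height="{height}"></iframe>'
    )


snippet = sandboxed_iframe('<p class="x">hi</p><script>alert(1)</script>')
print(snippet)
```

In a notebook the returned string could be passed to
IPython.display.HTML to display it. Whether the frontend keeps the iframe
as written depends on its HTML sanitizing settings, so that would need
checking.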
>
>
> And the following are ideas I've always wanted to make happen, which I
> think may fit the Scrapy + IPython idea:
>
> 0. A formatted DOM tree shown in the Jupyter console. In web scraping, the
> DOM is as important as the HTML, and we could perhaps show both the HTML
> page and the DOM tree in the console.
>
> 1. Visual element selection. It's a feature of most browsers' developer
> tools (F12 in Chrome, Firebug in Firefox). It is such a good feature that
> every time I write a spider I use it to locate elements. I think it would
> be good if, when I select a node in the Jupyter console (building on
> idea 0), I could see the element highlighted in the HTML page :)
>
> 2. XPath/CSS selector generator. I've wanted to implement this idea for a
> long time xD. When I scrape something, I always wish the XPath and the CSS
> selector would pop up by themselves when I select something in the HTML
> page I'm scraping: a paragraph, a table, an <a> tag or a <ul> list. Once
> we can show the HTML page in the Jupyter console, it may come true! The
> XPath or CSS selector could be generated automatically when elements in
> the page are selected, and I'm sure it would help a lot!
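
To make ideas 0 and 2 concrete, here is a rough sketch using only the
Python standard library. It walks the HTML once and records, for every
element, its nesting depth (enough to print an indented tree) and a
positional XPath. The names XPathOutline and outline are hypothetical,
and this is only one possible approach. It works on the raw source, so
the paths can differ from what a browser reports after it repairs the
markup.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class XPathOutline(HTMLParser):
    """Collect (depth, positional XPath) for every element start tag."""

    def __init__(self):
        super().__init__()
        self.path = []      # XPath steps of the currently open elements
        self.counts = [{}]  # per open level: tag -> children seen so far
        self.rows = []      # (depth, xpath) in document order

    def handle_starttag(self, tag, attrs):
        seen = self.counts[-1]
        seen[tag] = seen.get(tag, 0) + 1
        step = f"{tag}[{seen[tag]}]"
        self.rows.append((len(self.path), "/" + "/".join(self.path + [step])))
        if tag not in VOID_TAGS:
            self.path.append(step)
            self.counts.append({})

    def handle_endtag(self, tag):
        # Close the innermost open element with this tag name;
        # stray end tags that match nothing are ignored.
        for i in range(len(self.path) - 1, -1, -1):
            if self.path[i].split("[")[0] == tag:
                del self.path[i:]
                del self.counts[i + 1:]
                break


def outline(page_html):
    """Return a list of (depth, xpath) tuples for the given HTML string."""
    parser = XPathOutline()
    parser.feed(page_html)
    parser.close()
    return parser.rows


SAMPLE = "<html><body><ul><li>a</li><li>b</li></ul><p>x<br>y</p></body></html>"
for depth, xpath in outline(SAMPLE):
    print("  " * depth + xpath)
```

Each printed line is indented by depth, which gives a crude tree view, and
the XPath on that line is what a selector generator could offer when the
corresponding element is picked. With lxml available, ElementTree.getpath
returns a similar positional path for a parsed element.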
>
>
>
>
> When I first searched the Python projects of GSoC 2016, I did not find
> one that I'd like to participate in. But earlier today, when I saw Scrapy
> among the Python projects of GSoC 2016, I was so excited and I thought
> "this is it!". I just want you to know that I am longing for a chance to
> work with you and to contribute code to the Scrapy project!
>
> If I am not a good fit for the "IPython IDE for Scrapy" idea, any other
> idea is okay for me to do :-)
>
>
> Best regards,
>
> Peng Foo

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.
