Hi Shane,

Thanks for your response. I looked at how Hadoop Streaming works, and I 
think I now have a rough idea of how this project could work. 
 
Suppose we do this for a language X: the user would call some functions in 
X, which would write their output to a file. That file would be given to 
Scrapy as input; we would parse it and pass its contents to the appropriate 
Scrapy methods. All of the results would then be written to another file as 
JSON responses, and that file would be given back to the language X program, 
which would store the results in some data structure.

Am I right? Right now I am going through the Scrapy codebase, after which I 
will provide some examples of what I am trying to do; a first rough sketch 
of the kind of program I have in mind is below.
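To make the flow above concrete, here is a minimal sketch of the "language X" 
side, written in Python only for readability. Nothing about the message 
format has been decided, so the one-JSON-object-per-line framing and the 
field names ("type", "url", "fields") are placeholders I made up for 
illustration:

    #!/usr/bin/env python
    # Sketch of the external spider program: read serialized responses on
    # stdin, write serialized requests and items on stdout, one JSON object
    # per line. Field names here are placeholders, not a real Scrapy format.
    import json
    import sys

    def handle(response):
        # Pretend parsing: emit one item and follow one more link.
        yield {"type": "item", "fields": {"page_url": response["url"]}}
        yield {"type": "request", "url": response["url"] + "/next"}

    for line in sys.stdin:
        response = json.loads(line)
        for message in handle(response):
            sys.stdout.write(json.dumps(message) + "\n")
        sys.stdout.flush()

The idea, as I understand it, is that Scrapy would run such a program as a 
subprocess, feed each downloaded response to its stdin, and turn every line 
it prints back into a Request or an Item, much like Hadoop Streaming drives 
mapper and reducer scripts.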

Thanks   

On Tuesday, February 18, 2014 2:20:14 AM UTC+5:30, shane wrote:
>
> Hi Mohammed,
>
> It's nice to hear you've found Scrapy useful and are interested in GSoC. 
> Answers to your questions below.
>
>
>
>> I was interested in this idea on the ideas page "Support for spiders in 
>> other languages". I had some questions regarding this:
>>
>> 1) Do we have to make wrappers, or should the code be written in the other 
>> language from scratch?
>>
> The other language part can be written from scratch.
>
>  
>
>>
>> 2) Quoting from the ideas page "The goal of this project is to allow 
>> developers to write spiders simply and easily in any programming language, 
>> while permitting Scrapy to manage concurrency, scheduling, item exporting, 
>> caching, etc."  Does this mean this project will enable any programming 
>> language to use Scrapy ... or will we be adding support for languages 
>> separately one by one?
>>
>
> It will enable any language to be used from Scrapy. Users will simply 
> write a program that can read serialized Scrapy responses (probably as 
> JSON) and write serialized Requests and Items.
>
> By adding support for a given language in the form of a library, we can 
> make it more pleasant to implement spiders in that language. I used the 
> example of Hadoop Streaming, which can be used from any language; however, 
> if you use a Python library like mrjob, hadoopy, dumbo, etc., it's a nicer 
> experience. I added this as a stretch goal - it's optional. I expect we can 
> add something for Python so that most existing Scrapy spiders run just by 
> changing an import, and possibly add another language or two depending on 
> time.
>
>  
>
>>
>> 3) Which language would be better? That depends on the target audience: 
>> developers or scientists? We can expect developers to be familiar with 
>> JavaScript/Ruby/Java/Python/etc., whereas scientists would know 
>> C/C++/Python/Java. This is just my view; I might be wrong!
>>
>
> I'm not sure :) I expect C & C++ are probably not that convenient or 
> common for spider code; Java, JS & Ruby would probably be used, and Python 
> could be useful for existing Scrapy users (e.g. running spiders that crash).
>
> Maybe someone reading this wants to make a case for a specific language? 
>
> Cheers,
>
> Shane
>
