I use Scrapy to scrape a social network and then store the data in a 
Neo4j database.

My challenge here is to relate two items to each other:

class Person(scrapy.Item):
    name = scrapy.Field()

class AnotherPerson(scrapy.Item):
    name = scrapy.Field()

I want to save those two items in my graph database by saying:

Person has a relationship with AnotherPerson

What I need here is to send two items to ONE pipeline. How can this be 
done? I tried yielding them together in a list, but Scrapy does not accept 
a list (or any other collection) in place of an item.

Here is my pseudo code:

1- I get a list of persons (each person has a profile and a list of friends, 
like on Facebook)

2- For each person in this list:

   - I open his profile (through a request, sending the response to a 
   callback)
   - I take the response, create an item, Person(), and fill it
   - I send the item with a "yield"
   - Then I open his friend list (through a request, sending the 
   response to another callback)
   - I have the friend list page
   - Then for each friend in this list (the page displays a name and a city):
      - I create an item: AnotherPerson()
      - I fill this item with the name and the city
      - I send the item with a "yield"
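
To make the friend-list step concrete, here is a minimal sketch of one idea: 
instead of yielding a bare AnotherPerson, the callback yields one item per 
friend that carries both ends of the relationship. The function name, the 
"person"/"friend" keys, and the shape of friend_rows are assumptions made up 
for this illustration. In a real spider the owner's name would have to travel 
with the friend-list request (for example in Request.meta), and the rows would 
come from selectors on the response. Plain dicts are used because Scrapy 
callbacks may yield dicts as items.

```python
def parse_friends(owner_name, friend_rows):
    """Yield one dict per friend, each holding both ends of the relationship.

    owner_name  -- name of the person whose friend list is being parsed
                   (hypothetical: in a spider it could come from response.meta)
    friend_rows -- iterable of dicts with "name" and "city" keys
                   (hypothetical stand-in for data extracted from the page)
    """
    for row in friend_rows:
        yield {
            "person": {"name": owner_name},
            "friend": {"name": row["name"], "city": row["city"]},
        }


if __name__ == "__main__":
    rows = [{"name": "Bob", "city": "Paris"}, {"name": "Carol", "city": "Lyon"}]
    for item in parse_friends("Alice", rows):
        print(item)
```

With this shape, a single pipeline sees the person and the friend together 
in every item, so nothing has to be matched up afterwards.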
   
I have two pipelines. They work well for saving the data in the database, 
but I have no idea how to relate the two items, because for that I need to 
handle both of them in the same process (i.e. the same pipeline).
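
As a rough sketch of what "one pipeline for both" could look like, assume 
each item is a dict with a "person" key and a "friend" key, so that both ends 
of the relationship arrive together. The class name, the FRIEND_OF 
relationship type, the Person label, and the run_query callable are all 
assumptions for illustration. run_query stands in for whatever actually sends 
Cypher to Neo4j, and it defaults to print so the sketch runs without a 
database.

```python
class RelationshipPipeline:
    """Single pipeline that stores both people and the link between them."""

    # Cypher using $parameters. The Person label and the FRIEND_OF
    # relationship type are assumptions chosen for this sketch.
    QUERY = (
        "MERGE (a:Person {name: $person_name}) "
        "MERGE (b:Person {name: $friend_name}) "
        "MERGE (a)-[:FRIEND_OF]->(b) "
        "SET b.city = $friend_city"
    )

    def __init__(self, run_query=print):
        # run_query is a hypothetical callable taking (query, params).
        # The default only prints, so no Neo4j connection is needed here.
        self.run_query = run_query

    def process_item(self, item, spider):
        # Items without both ends are passed on untouched.
        if "person" not in item or "friend" not in item:
            return item
        params = {
            "person_name": item["person"]["name"],
            "friend_name": item["friend"]["name"],
            "friend_city": item["friend"].get("city"),
        }
        self.run_query(self.QUERY, params)
        return item


if __name__ == "__main__":
    pipeline = RelationshipPipeline()
    pipeline.process_item(
        {"person": {"name": "Alice"}, "friend": {"name": "Bob", "city": "Paris"}},
        spider=None,
    )
```

MERGE is used rather than CREATE so that a person who shows up in several 
friend lists ends up as a single node.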

I'm not sure I've been clear, so don't hesitate to ask for clarification.

Thanks for your help guys

Regards,

Mayouf
