Both solutions mean that I cannot use the Beam IO classes that would give
me the distribution; instead I would have to fetch the data myself in a
ParDo. Is this something that will change in the future? I understand
that Spark has a push-down mechanism that passes the filter down to the
next level of queries.
chaim
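
For illustration only, the per-batch lookup discussed in this thread could
start from something like the sketch below. It is plain Java with no Beam or
Mongo dependency. The class name IdBatchFilters, its method names, and the
batch size are hypothetical, and it assumes the IDs are plain strings stored
in the _id field. It splits the IDs into batches and builds one "$in" query
string per batch; wiring such a string into MongoDbIO.read() or into a
custom DoFn is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper; not part of Beam or the MongoDB driver.
class IdBatchFilters {

    // Split ids into consecutive batches of at most batchSize elements.
    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Build a MongoDB query document, as a JSON string, that matches any id.
    // Assumption: ids are plain strings stored in the _id field.
    static String inFilter(List<String> ids) {
        StringJoiner joined =
            new StringJoiner(", ", "{\"_id\": {\"$in\": [", "]}}");
        for (String id : ids) {
            joined.add("\"" + escape(id) + "\"");
        }
        return joined.toString();
    }

    // Minimal escaping: backslashes and double quotes only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // One filter string per batch of ids.
    static List<String> buildFilters(List<String> ids, int batchSize) {
        List<String> filters = new ArrayList<>();
        for (List<String> batch : partition(ids, batchSize)) {
            filters.add(inFilter(batch));
        }
        return filters;
    }

    public static void main(String[] args) {
        // Made-up ids, purely for demonstration.
        List<String> ids = Arrays.asList("a", "b", "c");
        for (String filter : buildFilters(ids, 2)) {
            System.out.println(filter);
        }
    }
}
```

Batching keeps each query bounded in size, which matters when the ID list is
too large for a single filter. A suitable batch size depends on the
deployment and is not something this sketch tries to choose.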
On Mon, Oct 22, 2018 at 4:02 PM Jeff Klukas <jklu...@mozilla.com> wrote:
>
> Chaim - If the full list of IDs is able to fit comfortably in memory and the 
> Mongo collection is small enough that you can read the whole collection, you 
> may want to fetch the IDs into a Java collection using the BigQuery API 
> directly, then turn them into a Beam PCollection using 
> Create.of(collection_of_ids). You could then use MongoDbIO.read() to read the 
> entire collection, but throw out rows based on the side input of IDs.
>
> If the list of IDs is particularly small, you could fetch the collection into 
> memory and parse that into a string filter that you pass to MongoDbIO.read() 
> to specify which documents to fetch, avoiding the need for a side input.
>
> Otherwise, if it's a large number of IDs, you may need to use Beam's 
> BigQueryIO to create a PCollection for the IDs, and then pass that into a 
> ParDo with a custom DoFn that issues Mongo queries for a batch of IDs. I'm 
> not very familiar with Mongo APIs, but you'd need to give the DoFn a 
> connection to Mongo that's serializable. You could likely look at the 
> implementation of MongoDbIO for inspiration there.
>
> On Sun, Oct 21, 2018 at 5:18 AM Chaim Turkel <ch...@behalf.com> wrote:
>>
>> Hi,
>>   I have the following flow I need to implement.
>> From BigQuery I run a query and get a list of IDs, then I need to
>> load from Mongo all the documents matching those IDs and export them
>> as an XML file.
>> How do you suggest I go about doing this?
>>
>> chaim
>>
>> --
>>
>>
>> Loans are funded by
>> FinWise Bank, a Utah-chartered bank located in Sandy,
>> Utah, member FDIC, Equal
>> Opportunity Lender. Merchant Cash Advances are
>> made by Behalf. For more
>> information on ECOA, click here
>> <https://www.behalf.com/legal/ecoa/>. For important information about
>> opening a new
>> account, review Patriot Act procedures here
>> <https://www.behalf.com/legal/patriot/>.
>> Visit Legal
>> <https://www.behalf.com/legal/> to
>> review our comprehensive program terms,
>> conditions, and disclosures.

