Both solutions mean that I cannot use the Beam IO classes that would give me
the distribution; instead I would have to fetch the data myself in a ParDo.
Is this something that will change in the future? I understand that Spark has
a pushdown mechanism that passes the filter down to the next level of queries.

chaim

On Mon, Oct 22, 2018 at 4:02 PM Jeff Klukas <jklu...@mozilla.com> wrote:
>
> Chaim - If the full list of IDs is able to fit comfortably in memory and the
> Mongo collection is small enough that you can read the whole collection, you
> may want to fetch the IDs into a Java collection using the BigQuery API
> directly, then turn them into a Beam PCollection using
> Create.of(collection_of_ids). You could then use MongoDbIO.read() to read the
> entire collection, but throw out rows based on the side input of IDs.
>
> If the list of IDs is particularly small, you could fetch the collection into
> memory and parse that into a string filter that you pass to MongoDbIO.read()
> to specify which documents to fetch, avoiding the need for a side input.
>
> Otherwise, if it's a large number of IDs, you may need to use Beam's
> BigQueryIO to create a PCollection for the IDs, and then pass that into a
> ParDo with a custom DoFn that issues Mongo queries for a batch of IDs. I'm
> not very familiar with Mongo APIs, but you'd need to give the DoFn a
> connection to Mongo that's serializable. You could likely look at the
> implementation of MongoDbIO for inspiration there.
>
> On Sun, Oct 21, 2018 at 5:18 AM Chaim Turkel <ch...@behalf.com> wrote:
>>
>> hi,
>> I have the following flow i need to implement.
>> From the bigquery i run a query and get a list of id's then i need to
>> load from mongo all the documents based on these id's and export them
>> as an xml file.
>> How do you suggest i go about doing this?
>>
>> chaim
>>
>> --
>>
>> Loans are funded by FinWise Bank, a Utah-chartered bank located in Sandy,
>> Utah, member FDIC, Equal Opportunity Lender. Merchant Cash Advances are
>> made by Behalf. For more information on ECOA, click here
>> <https://www.behalf.com/legal/ecoa/>. For important information about
>> opening a new account, review Patriot Act procedures here
>> <https://www.behalf.com/legal/patriot/>. Visit Legal
>> <https://www.behalf.com/legal/> to review our comprehensive program terms,
>> conditions, and disclosures.
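
For reference, two of the options Jeff describes (turning a small ID list into a string filter, and issuing Mongo queries for a batch of IDs at a time) both rest on small helper steps that can be sketched in plain Java without any Beam or Mongo dependency. This is only a rough sketch under assumptions: the class name IdFilterSketch and both method names are hypothetical, the IDs are assumed to be plain strings stored in the _id field, and the resulting filter would still have to be handed to MongoDbIO.read() or to a Mongo client inside a DoFn in whatever way your Beam and driver versions support.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; not part of Beam or the MongoDB driver.
class IdFilterSketch {

    // Builds a JSON filter string of the form {"_id": {"$in": ["a", "b"]}}.
    // Assumes IDs are plain strings held in the _id field; backslashes and
    // double quotes are escaped so the result stays valid JSON.
    public static String buildInFilter(List<String> ids) {
        String joined = ids.stream()
                .map(id -> "\"" + id.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(", "));
        return "{\"_id\": {\"$in\": [" + joined + "]}}";
    }

    // Splits the items into consecutive batches of at most batchSize elements,
    // so that each batch can be sent to Mongo as one $in query.
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a1", "b2", "c3");
        // One filter for the whole (small) list of IDs.
        System.out.println(buildInFilter(ids));
        // One filter per batch, as a custom DoFn might issue them.
        for (List<String> batch : partition(ids, 2)) {
            System.out.println(buildInFilter(batch));
        }
    }
}
```

In the batched variant, each worker would call something like buildInFilter on its own batch, so the work is still spread across the pipeline even though the read itself happens in a ParDo rather than in an IO source.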