Hi All,

I have a requirement to call a REST service URL for 300k customer IDs.

Here is what I have tried so far:


# get all the customer ids and build an RDD
custid_rdd = sc.textFile('file:////Users/zzz/CustomerID_SC/Inactive User Hashed LCID List.csv')

profile_rdd = custid_rdd.map(lambda r: getProfile(r.split(',')[0]))

profile_rdd.count()


# getProfile is the method that does the HTTP call

import requests

def getProfile(cust_id):
    api_key = 'txt'
    api_secret = 'yuyuy'
    profile_uri = 'https://profile.localytics.com/x1/customers/{}'

    # start with None so the return below does not fail when cust_id is None
    data = None
    if cust_id is not None:
        data = requests.get(profile_uri.format(cust_id),
                            auth=requests.auth.HTTPBasicAuth(api_key, api_secret))
        # print json.dumps(data.json(), indent=4)
    return data


When I print the JSON dump of the data, I can see results coming back from the
REST call, but the count never finishes.


Is there an efficient way of dealing with this? Some posts say we have to define
a batch size etc., but I don't know how.
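
My rough understanding of the batching idea is sketched below. It is untested
against the real service and is only a guess at what those posts mean: handle
one partition at a time, and reuse a single HTTP session for every customer ID
in that partition instead of opening a new connection per ID. The names
fetch_partition, make_session and timeout are my own placeholders, not from any
library, and the key/secret values are the same dummies as above.

```python
API_KEY = 'txt'        # dummy value, as in getProfile above
API_SECRET = 'yuyuy'   # dummy value, as in getProfile above
PROFILE_URI = 'https://profile.localytics.com/x1/customers/{}'

def fetch_partition(lines, make_session=None, timeout=10):
    """Call the profile service for every CSV line in one partition.

    `lines` is any iterator of CSV lines (what mapPartitions would pass in).
    `make_session` is a hypothetical hook so a fake session can be used in
    tests; by default a real requests.Session is created.
    """
    if make_session is None:
        import requests  # imported here so it is loaded on the worker
        make_session = requests.Session

    session = make_session()                 # one session per partition
    session.auth = (API_KEY, API_SECRET)     # basic auth for every request
    try:
        for line in lines:
            cust_id = line.split(',')[0].strip()
            if not cust_id:
                continue                     # skip blank lines
            resp = session.get(PROFILE_URI.format(cust_id), timeout=timeout)
            # yield plain values rather than the response object itself
            yield cust_id, resp.status_code, resp.text
    finally:
        session.close()
```

I assume this would be plugged in with something like
custid_rdd.repartition(50).mapPartitions(fetch_partition), where the 50 is an
arbitrary number that would need tuning against whatever rate limit the
service has. The timeout is there so that one hung request cannot stall the
whole job.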


Appreciate your help


Regards,

Amit
