Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Yunti
Thanks - you've definitely given me some stuff to think about.  The scraper
makes XHR requests that return JSON (later it will probably fetch normal
pages too, so I will definitely look at your ETag suggestion - I'm not
familiar with it, so I will look into it).

Given it's XHR and JSON I presume ETag isn't relevant, so I think your idea
of setting a flag is a good one.  So for each row in each table (e.g. a
Supplier) that I re-scrape, I would get the instance from the database by
its unique_id, compare each attribute to the re-scraped JSON, and set the
flag and update the instance only if something differs.

The data will change only a fraction of a percent of the time (it is
constant most of the time), and there will be about 70k rows with 50-100
fields.  The DB is Postgres (on Heroku for now).
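
The per-row comparison described above can be sketched in plain Python. This
is only an illustration under assumptions: both the stored row and the
re-scraped JSON are taken to be dicts (in Django,
django.forms.models.model_to_dict is one way to get the former), and
changed_fields is a hypothetical helper name, not something from the thread.

```python
from typing import Any, Dict, Mapping, Tuple


def changed_fields(
    stored: Mapping[str, Any], scraped: Mapping[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (old, new)} for every scraped field that differs."""
    return {
        field: (stored.get(field), new_value)
        for field, new_value in scraped.items()
        if stored.get(field) != new_value
    }


# Hypothetical example data: one field out of three has changed.
stored_row = {"unique_id": 7, "name": "Acme", "rating": 4}
scraped_row = {"unique_id": 7, "name": "Acme", "rating": 5}

diff = changed_fields(stored_row, scraped_row)
if diff:
    print("changed:", diff)  # only now would the row need to be saved
```

An empty result means the row can be skipped entirely, which should be the
common case if the data changes only a fraction of a percent of the time.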

On Friday, 6 November 2015 17:12:05 UTC, Dan Tagg wrote:
>
> If you are web scraping you really need your code to be as efficient as
> possible and to do as little as possible. Firstly, make sure you are using
> everything the servers of the websites you are scraping are giving you to
> decide whether to bother downloading the page. For example, check the ETag
> and only bother to scrape if it is different from the last time you scraped
> the data. If you don't trust the server's ETag, you can hash the page when
> you download it and compare that against your stored hash to see whether
> it changed and whether it's worth processing.
>
> Your approach of trying a 'get' with all the properties set and picking up
> the exception has costs. Assuming your tables have enough rows that
> scanning the entire table for every "get" would be inefficient, you will
> need to have every column you use in your "get" indexed in the database.
> This has a storage cost as well as an additional insert/update cost, and
> the query costs more to run than a simple select against a single key.
> Whether that is more efficient than getting the result and comparing the
> fields in Python I don't know. I imagine it will depend on what your RDBMS
> is and how it is hosted, as well as how many rows and columns will be in
> your database table.
>
> You could initialise a flag to False and as you process your scraped data 
> you could compare it to the attributes of your instance and set the flag to 
> True if they have changed and then not bother saving if you get to the end 
> of processing your scraped data and the modified flag has not been set to 
> True.
>
> Dan 

Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Dan Tagg
If you are web scraping you really need your code to be as efficient as
possible and to do as little as possible. Firstly, make sure you are using
everything the servers of the websites you are scraping are giving you to
decide whether to bother downloading the page. For example, check the ETag
and only bother to scrape if it is different from the last time you scraped
the data. If you don't trust the server's ETag, you can hash the page when
you download it and compare that against your stored hash to see whether
it changed and whether it's worth processing.
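
A minimal sketch of both checks, using only the standard library. The helper
names are hypothetical, and how the ETag and hash are stored between runs is
left open. Sending the stored ETag in an If-None-Match header lets a
cooperating server answer 304 Not Modified instead of resending the body; the
hash check covers servers whose ETag you don't trust.

```python
import hashlib
from typing import Dict, Optional


def conditional_headers(stored_etag: Optional[str]) -> Dict[str, str]:
    """Request headers asking the server to skip an unchanged body."""
    return {"If-None-Match": stored_etag} if stored_etag else {}


def fingerprint(body: bytes) -> str:
    """SHA-256 hex digest of the downloaded page or JSON payload."""
    return hashlib.sha256(body).hexdigest()


def needs_processing(body: bytes, stored_fingerprint: Optional[str]) -> bool:
    """True if the body was never seen before or differs from last time."""
    return stored_fingerprint is None or fingerprint(body) != stored_fingerprint


# Hypothetical payload: the second check with the stored hash is a no-op.
payload = b'{"supplierId": 1, "supplierName": "Acme"}'
saved = fingerprint(payload)
print(needs_processing(payload, None))   # first run, nothing stored yet
print(needs_processing(payload, saved))  # same bytes as last time
```

Note that a byte-level hash also changes when only whitespace or key order
changes, so it can report a difference where the data itself is the same.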

Your approach of trying a 'get' with all the properties set and picking up
the exception has costs. Assuming your tables have enough rows that
scanning the entire table for every "get" would be inefficient, you will
need to have every column you use in your "get" indexed in the database.
This has a storage cost as well as an additional insert/update cost, and
the query costs more to run than a simple select against a single key.
Whether that is more efficient than getting the result and comparing the
fields in Python I don't know. I imagine it will depend on what your RDBMS
is and how it is hosted, as well as how many rows and columns will be in
your database table.

You could initialise a flag to False and as you process your scraped data
you could compare it to the attributes of your instance and set the flag to
True if they have changed and then not bother saving if you get to the end
of processing your scraped data and the modified flag has not been set to
True.
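
The flag idea could look roughly like this. It is a sketch under
assumptions: apply_scraped is a hypothetical helper, and SimpleNamespace
stands in for a model instance so the example runs without Django. With a
real model you would call instance.save() only when the function returns
True.

```python
from types import SimpleNamespace
from typing import Any, Mapping


def apply_scraped(instance: Any, scraped: Mapping[str, Any]) -> bool:
    """Copy differing values onto the instance; return the modified flag."""
    modified = False
    for field, value in scraped.items():
        if getattr(instance, field) != value:
            setattr(instance, field, value)
            modified = True
    return modified


# Stand-in for a Supplier instance loaded from the database.
supplier = SimpleNamespace(name="Acme", rating=4)

if apply_scraped(supplier, {"name": "Acme", "rating": 5}):
    print("changed, would save")  # supplier.save() with a real model
else:
    print("unchanged, skip the write")
```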

Dan

On 6 November 2015 at 16:12, Yunti  wrote:

> Hi Dan,
>
> Thanks for the suggestion. It's a web scraper (run as a Django management
> command) which saves the data to the database via the Django ORM.
> Given that it's a scraper rather than a form (or view), is the function
> suggested above an OK way to proceed, or would you suggest something more
> appropriate/best practice?

Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Yunti
Hi Dan,

Thanks for the suggestion. It's a web scraper (run as a Django management
command) which saves the data to the database via the Django ORM.
Given that it's a scraper rather than a form (or view), is the function
suggested above an OK way to proceed, or would you suggest something more
appropriate/best practice?



On Friday, 6 November 2015 14:40:59 UTC, Dan Tagg wrote:
>
> Hi Yunti,
>
>
> You could go up a level in the structure of your application and apply the 
> logic there, where there is more support.
>
> Are you using Django forms? The ModelForm class does most of what you
> want: it examines form data, validates it against its type and any
> validation rules you have set in the form or your model, and compares it to
> the instance's initial data, so you can check has_changed() and save only
> if there has been some kind of change.
>
> Dan

Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Dan Tagg
Hi Yunti,


You could go up a level in the structure of your application and apply the
logic there, where there is more support.

Are you using Django forms? The ModelForm class does most of what you
want: it examines form data, validates it against its type and any
validation rules you have set in the form or your model, and compares it to
the instance's initial data, so you can check has_changed() and save only
if there has been some kind of change.

Dan

On 6 November 2015 at 13:47, Yunti  wrote:

> Jani,
>
> Thanks for your reply - you explained it much more concisely than I did. :)
>
> Good to have it confirmed that update_or_create() doesn't quite do what I
> needed - I was confused as to whether it would or not.
>
> Thanks for taking the time to do that function, that looks ideal. I'll
> test it out.
Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Yunti
Jani,

Thanks for your reply - you explained it much more concisely than I did. :)

Good to have it confirmed that update_or_create() doesn't quite do what I 
needed - I was confused as to whether it would or not.

Thanks for taking the time to do that function, that looks ideal. I'll test 
it out.


On Friday, 6 November 2015 12:52:11 UTC, Jani Tiainen wrote:
>
> Your problem lies in the way Django actually carries out update_or_create().
>
> As the name suggests, it does either one or the other. But that's not what
> you want - you want a conditional update.
>
> You want to update only if certain fields have changed. This can be done
> in a few ways.
>
> So you want to do
> "update_only_if_at_least_one_of_default_fields_changed_or_create"
>
> The operation is simple: if the object is not found, create a new one using
> the defaults. If it is found, pull its values, compare them against the
> default values, and do an update if at least one differs. Otherwise don't
> do anything.

Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Jani Tiainen

Your problem lies in the way Django actually carries out update_or_create().

As the name suggests, it does either one or the other. But that's not what
you want - you want a conditional update.

You want to update only if certain fields have changed. This can be done
in a few ways.

So you want to do
"update_only_if_at_least_one_of_default_fields_changed_or_create"

The operation is simple: if the object is not found, create a new one using
the defaults. If it is found, pull its values, compare them against the
default values, and do an update if at least one differs. Otherwise don't
do anything.

So basically code would look something like this:

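# MyModel is a placeholder for your own model class.
def update_if_changed_or_create(**kwargs):
    # 'defaults' holds the fields to compare and write; the rest is the lookup.
    defaults = kwargs.pop('defaults', None) or {}

    qs = MyModel.objects.filter(**kwargs)
    count = len(qs)  # evaluates the queryset once

    if count == 0:
        # Model.save() returns None, so use create() to get the object back.
        obj = MyModel.objects.create(**dict(kwargs, **defaults))
        return obj, True  # Created object
    elif count == 1:
        obj = qs[0]
        changed = False
        for k, v in defaults.items():
            if getattr(obj, k) != v:
                changed = True
                setattr(obj, k, v)
        if changed:
            obj.save()
            return obj, False  # Updated object
        return obj, None  # No change.
    else:
        # Multiple objects... raising is one option; handle as you see fit.
        raise MyModel.MultipleObjectsReturned()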



Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Yunti
Carsten,

Thanks for your reply.

> A note about the last statement: If a Supplier object has the same
> unique_id, and all other fields (in `defaults`) are the same as well,
> logically there is no difference between updating and not updating – the
> result is the same.

The entry in the database is the same, apart from the last_updated field,
which keeps its old value only if the row is not rewritten.  This means I
can check for new data often and be alerted when there is an actual update
(i.e. a change to the data).  If it rewrites the data every time it checks,
then I have no idea when the data was actually updated.

> Have you checked? How?
> In your create_or_update_if_diff() you seem to try to re-invent
> update_or_create(), but have you actually examined the results of the
>
>  supplier, created = Supplier.objects.update_or_create(...)
>
> call?

I checked by seeing that the last_updated field in the database was updated
every time.  (I suppose the issue could be with how that field gets reset
the next time it's run; I didn't eliminate that possibility.)

Yes, I was worried that I might be recreating (a poor version of)
update_or_create(), but it didn't seem to have an option to skip the
database write when there was no change to the data.
Can it do this? And how would I verify whether an item has been updated or
created (or neither) - could I output to the console?

If it can, how do I call it so that it checks against all fields (unique_id
and defaults), updates using the defaults if it finds a difference, and
creates if it doesn't find the unique_id?

I'm still not sure if this is possible and how to call the function,
particularly how to pass in the remaining defaults to check against -
**kwargs = defaults isn't right but I'm not sure what it should be.

supplier, created = Supplier.objects.update_or_create(
    unique_id=product_detail['supplierId'],
    **kwargs=defaults,
    defaults={
        'name': product_detail['supplierName'],
        'entity_name_1': entity_name_1,
        'entity_name_2': entity_name_1,
        'rating': product_detail['supplierRating']})



On Thursday, 5 November 2015 20:05:39 UTC, Carsten Fuchs wrote:
>
> Hi Yunti,
>
> On 05.11.2015 at 18:19, Yunti wrote:
> > I have tried to use the update_or_create() method assuming that it would
> > either create a new entry in the db if it found none, or update an
> > existing one if it found one and had differences to the defaults passed
> > in - or wouldn't update if there was no difference.
>
> A note about the last statement: If a Supplier object has the same
> unique_id, and all other fields (in `defaults`) are the same as well,
> logically there is no difference between updating and not updating – the
> result is the same.
>
> > However it just seemed to recreate entries each time even if there were
> > no changes.
>
> Have you checked? How?
> In your create_or_update_if_diff() you seem to try to re-invent
> update_or_create(), but have you actually examined the results of the
>
>  supplier, created = Supplier.objects.update_or_create(...)
>
> call?
>
> > I think the issue was that I wanted to:
> > 1) get an entry if all fields were the same,
>
> update_or_create() updates an object with the given kwargs; the match is
> not made against *all* fields (i.e. for the match the fields in `defaults`
> are not accounted for).
>
> > 2) or create a new entry if it didn't find an existing entry with the
> > unique_id
> > 3) or if there was an entry with the same unique_id, update that entry
> > with remaining fields.
>
> update_or_create() should achieve this. It's hard to tell more without
> additional information, but
>
> https://docs.djangoproject.com/en/1.8/ref/models/querysets/#update-or-create
>
> explains the function well, including how it works. If you work through
> this in small steps, check examples and their (intermediate) results, you
> should be able to find what the original problem was.
>
> Best regards,
> Carsten
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/9b529e2d-7e2b-4194-a77c-8434efe6205d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: update_or_create() always creates (or recreates)

2015-11-06 Thread Yunti
Carsten ,

Thanks for your reply,

> A note about the last statement: If a Supplier object has the same
> unique_id, and all other fields (in `defaults`) are the same as well,
> logically there is no difference between updating and not updating –
> the result is the same.

The entry in the database is the same, apart from the last_updated field, 
which stays untouched if the row is not rewritten.  That means I can check 
for new data often and be alerted when there is an actual update (i.e. a 
change to the data).  If it rewrites the data every time it checks, then I 
have no idea when the data was actually updated.

> Have you checked? How?
> In your create_or_update_if_diff() you seem to try to re-invent
> update_or_create(), but have you actually examined the results of the
>
>  supplier, created = Supplier.objects.update_or_create(...)
>
> call?

I checked by seeing that the last_updated field in the database was updated 
every time.  (I suppose the issue could be with how that field gets reset 
the next time it's run - I didn't)
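
For context on the observation above: the Supplier model posted at the start of the thread declares last_updated with auto_now=True, and Django refreshes such a field on every save(). Since update_or_create() saves the matched row whenever it finds one, the timestamp moves even when no data changed. One way to get an alert only on real changes is to compare the stored row with the re-scraped values before saving anything. A rough sketch, not tied to Django; `diff_fields` and `StoredSupplier` are hypothetical names used only here:

```python
def diff_fields(instance, new_values):
    """Return {field: (old, new)} for every field whose value differs."""
    return {
        name: (getattr(instance, name), value)
        for name, value in new_values.items()
        if getattr(instance, name) != value
    }


class StoredSupplier:
    """Hypothetical stand-in for a row already loaded from the database."""
    def __init__(self, **fields):
        self.__dict__.update(fields)


stored = StoredSupplier(name='Acme', rating='A')
same = diff_fields(stored, {'name': 'Acme', 'rating': 'A'})
different = diff_fields(stored, {'name': 'Acme', 'rating': 'B'})
print(same)
print(different)
```

An empty result means the row can be left alone, so last_updated keeps its old value. Note that the comparison is type-sensitive: a scraped string '5' will not equal a stored integer 5, so values may need converting first.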






Re: update_or_create() always creates (or recreates)

2015-11-05 Thread Carsten Fuchs

Hi Yunti,

On 05.11.2015 at 18:19, Yunti wrote:

> I have tried to use the update_or_create() method assuming that it would
> either, create a new entry in the db if it found none or update an
> existing one if it found one and had differences to the defaults passed
> in  - or wouldn't update if there was no difference.


A note about the last statement: If a Supplier object has the same unique_id, and all 
other fields (in `defaults`) are the same as well, logically there is no difference 
between updating and not updating – the result is the same.



> However it just seemed to recreate entries each time even if there were
> no changes.


Have you checked? How?
In your create_or_update_if_diff() you seem to try to re-invent update_or_create(), but 
have you actually examined the results of the


supplier, created = Supplier.objects.update_or_create(...)

call?


> I think the issue was that I wanted to:
> 1)  get an entry if all fields were the same,


update_or_create() updates an object matched by the given kwargs; the match is not made 
against *all* fields (i.e. for the match the fields in `defaults` are not accounted for).



> 2) or create a new entry if it didn't find an existing entry with the unique_id
> 3) or if there was an entry with the same unique_id, update that entry with
> remaining fields.


update_or_create() should achieve this. It's hard to tell more without additional 
information, but 
https://docs.djangoproject.com/en/1.8/ref/models/querysets/#update-or-create explains 
the function well, including how it works. If you work through this in small steps, 
check examples and their (intermediate) results, you should be able to find what the 
original problem was.
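
To make the matching rule above concrete, here is a toy in-memory stand-in that mirrors it. This is only an illustration over a list of dicts, not Django's implementation, and `toy_update_or_create` is a made-up name:

```python
def toy_update_or_create(rows, defaults=None, **lookup):
    """Toy stand-in for update_or_create() over a list of dicts.

    Not Django's code; it only mirrors the matching rule.
    """
    defaults = defaults or {}
    for row in rows:
        # Only the lookup kwargs decide whether a row matches.
        if all(row.get(k) == v for k, v in lookup.items()):
            # The defaults are written whether or not they differ.
            row.update(defaults)
            return row, False
    # No match: build a new row from lookup kwargs plus defaults.
    row = dict(lookup, **defaults)
    rows.append(row)
    return row, True


rows = []
first, created_first = toy_update_or_create(
    rows, unique_id=1, defaults={'name': 'Acme', 'rating': 'A'})
second, created_second = toy_update_or_create(
    rows, unique_id=1, defaults={'name': 'Acme', 'rating': 'B'})
print(created_first, created_second, len(rows), rows[0]['rating'])
```

The second call finds the row by unique_id alone and overwrites its other fields, so nothing is recreated, but the row is written on every call.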


Best regards,
Carsten



update_or_create() always creates (or recreates)

2015-11-05 Thread Yunti
I have tried to use the update_or_create() method assuming that it would 
either, create a new entry in the db if it found none or update an existing 
one if it found one and had differences to the defaults passed in  - or 
wouldn't update if there was no difference.  However it just seemed to 
recreate entries each time even if there were no changes.

I think the issue was that I wanted to:
1)  get an entry if all fields were the same,
2) or create a new entry if it didn't find an existing entry with the 
unique_id
3) or if there was an entry with the same unique_id, update that entry with 
remaining fields. 

The update_or_create() method doesn't seem to work as I had hoped when 
called as shown below - it just always seems to do an update if it finds a 
match on the given kwargs. 

Alternatively, I would have to pass in all the fields as keyword args to 
check that nothing had changed, but then that would miss option 3): finding 
an existing entry with the same unique_id whose other fields have changed.






supplier, created = Supplier.objects.update_or_create(
    unique_id=product_detail['supplierId'],
    defaults={
        'name': product_detail['supplierName'],
        'entity_name_1': entity_name_1,
        'entity_name_2': entity_name_1,
        'rating': product_detail['supplierRating']})





class Supplier(models.Model):
    unique_id = models.IntegerField(unique=True)
    name = models.CharField(max_length=255, unique=True)
    entity_name_1 = models.CharField(max_length=255, blank=True)
    entity_name_2 = models.CharField(max_length=255, blank=True)
    rating = models.CharField(max_length=255)

    last_updated = models.DateTimeField(auto_now=True)

    def __str__(self):
        return self.name


Not being convinced that update_or_create() would give me what I needed I made 
the below function:


def create_or_update_if_diff(defaults, model):
    try:
        instance = model.objects.get(**defaults)
        # if no exception, the product doesn't need to be updated
    except model.DoesNotExist:
        # the product needs to be created or updated
        try:
            model.objects.get(unique_id=defaults['unique_id'])
        except model.DoesNotExist:
            # needs to be created
            instance = model.objects.create(**defaults)
            # model(**defaults).save()
            sys.stdout.write('New {} created: {}\n'.format(model, instance.name))
            return instance, True
        else:
            # needs to be updated
            instance = model.objects.update(**defaults)
            sys.stdout.write('{}: {} updated \n'.format(model, instance.unique_id))
            return instance, True
    return instance, False


However I can't get it to be quite right.  I get a KeyError on update, possibly 
because the defaults passed in now include unique_id. Should the unique_id be 
separated and both passed into the function to fix this?  (And should I have 
created a function to achieve this, or would update_or_create() have been 
able to do this?)
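
One possible shape for such a function, as a sketch rather than a definitive answer: pass the lookup (unique_id) separately from the other fields, fetch the row by the lookup only, and save only when something differs. Note that model.objects.update(**defaults) acts on every row in the table and returns a row count rather than an instance, so the sketch sets the fields on the fetched instance instead. `FakeManager` and `FakeSupplier` are hypothetical in-memory stand-ins so the example runs without a database; with Django, a real model would be passed in.

```python
def create_or_update_if_diff(model, lookup, defaults):
    """Return (instance, created, updated), saving only when needed."""
    try:
        instance = model.objects.get(**lookup)
    except model.DoesNotExist:
        values = dict(lookup, **defaults)
        return model.objects.create(**values), True, False

    # Compare field by field; touch the row only if something differs.
    changed = [k for k, v in defaults.items() if getattr(instance, k) != v]
    for k in changed:
        setattr(instance, k, defaults[k])
    if changed:
        instance.save()
    return instance, False, bool(changed)


class FakeManager:
    """Hypothetical in-memory stand-in for model.objects (demo only)."""
    def __init__(self, model):
        self.model, self.rows = model, []

    def get(self, **lookup):
        for row in self.rows:
            if all(getattr(row, k) == v for k, v in lookup.items()):
                return row
        raise self.model.DoesNotExist()

    def create(self, **values):
        row = self.model(**values)
        self.rows.append(row)
        return row


class FakeSupplier:
    """Hypothetical stand-in for the Supplier model; counts save() calls."""
    class DoesNotExist(Exception):
        pass

    def __init__(self, **values):
        self.__dict__.update(values)
        self.saves = 0

    def save(self):
        self.saves += 1


FakeSupplier.objects = FakeManager(FakeSupplier)

lookup = {'unique_id': 7}
a, created1, updated1 = create_or_update_if_diff(
    FakeSupplier, lookup, {'name': 'Acme', 'rating': 'A'})
b, created2, updated2 = create_or_update_if_diff(
    FakeSupplier, lookup, {'name': 'Acme', 'rating': 'A'})
c, created3, updated3 = create_or_update_if_diff(
    FakeSupplier, lookup, {'name': 'Acme', 'rating': 'B'})
print((created1, updated1), (created2, updated2), (created3, updated3), c.saves)
```

Because the unchanged case never calls save(), an auto_now field such as last_updated would only move when the data really changed.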



supplier_defaults = {
    'unique_id': product_detail['supplierId'],
    'name': product_detail['supplierName'],
    'entity_name_1': entity_name_1,
    'entity_name_2': entity_name_2,
    'rating': product_detail['supplierRating']}


