Hi,
I have a scraper which inserts data into a table. However, if some data is
missing, the insert fails.
The MySQL insert code is below.
def _conditional_insert(self, tx, items):
    # Create the record if it doesn't already exist.
    # This whole block runs on its own thread.
    tx.execute("select product_id from data_price where product_id = %s",
               (items['product_id'][0], ))
    result = tx.fetchone()
    if result:
        log.msg("Item already stored in db: %s" % items,
                level=log.DEBUG)
    else:
        tx.execute(
            "insert into data_price (site_type, site_id, product_id, "
            "model_number, product_name, price) "
            "values (%s, %s, %s, %s, %s, %s)",
            (
                items['site_type'].encode('utf-8'),
                items['site_id'].encode('utf-8'),
                items['product_id'].encode('utf-8'),
                items['model_number'].encode('utf-8'),
                items['name'].encode('utf-8'),
                items['price'].encode('utf-8')
            )
        )
If, for example, the *model_number* field has no data, it causes the error
below:
*exceptions.KeyError: 'model_number'*
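As far as I can tell, the KeyError is just ordinary dict lookup behaviour: indexing a key that was never populated raises, whereas .get() returns a default instead. A minimal plain-Python illustration (the item contents here are made up):

```python
# Plain-dict illustration of the failure mode: 'model_number' was never
# scraped, so direct indexing raises KeyError, while .get() does not.
item = {'product_id': '123', 'name': 'Widget'}  # no 'model_number' key

try:
    value = item['model_number']
except KeyError:
    value = None
print(value)  # None, because the direct lookup raised

print(repr(item.get('model_number', '')))  # '' (default, no exception)
```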
My scraper code looks like this:
def parse_product(self, response):
    sel = Selector(response)
    # Load fields within the product container.
    pl = ProductLoader(selector=sel.css('section.product-details'),
                       response=response)
    # Product data
    pl.add_css('name', 'h1')
    pl.add_css('price', '.price-value')
    pl.add_xpath('model_number',
                 './/*[contains(text(),"Model Number")]'
                 '/following-sibling::dd/text()')
    pl.add_value('product_id', response.url, re=r'_p(\d+)$')
    # Site information
    pl.add_value('site_type', '1')  # 1 = Competitor
    pl.add_value('site_id', '1')    # 1 = Site ID number
Where should I handle NULL/missing data values so that the insert does not
fail? If a field has no data, I would just replace it with an empty string,
for example.
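Something like the following is what I have in mind: a small helper (the name fill_missing is just my sketch, not an existing API) that the pipeline could run on the item dict before building the values tuple, so every expected field falls back to an empty string:

```python
# Sketch only: fill in an empty-string default for any field the scrape
# did not populate, so the later INSERT never hits a KeyError.
FIELDS = ('site_type', 'site_id', 'product_id',
          'model_number', 'name', 'price')

def fill_missing(items, fields=FIELDS, default=''):
    """Return a dict with every expected field present."""
    return {field: items.get(field, default) for field in fields}

row = fill_missing({'product_id': '123', 'name': 'Widget'})
print(repr(row['model_number']))  # '' instead of raising KeyError
```

The insert code would then read from row instead of items, and missing fields simply become empty strings in the database.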
thanks
Brendan
--
You received this message because you are subscribed to the Google Groups
"scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.