Name equals the product name. 

Each product can have 1 to n prices, one per price list.

A single document represents that single product.

<doc>
        <field name="id">1</field>
        <field name="name">The product name.</field>
        <field name="price">1.00</field>
        <field name="priceList1Price">0.99</field>
        <field name="priceList2Price">0.98</field>
        <field name="priceList1500Price">0.85</field>
</doc>
<doc>
        <field name="id">2</field>
        <field name="name">The product name.</field>
        <field name="price">1.10</field>
        <field name="priceList1Price">1.09</field>
        <field name="priceList2Price">1.08</field>
        <field name="priceList1500Price">1.05</field>
</doc>
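
As a rough illustration of the layout above, here is a minimal Python sketch that builds one such <doc> from a product and its per-price-list prices. The function name product_to_doc and the price_list_prices mapping are hypothetical names made up for this example; only the field names (id, name, price, priceListNPrice) come from the documents shown above.

```python
import xml.etree.ElementTree as ET


def product_to_doc(product_id, name, base_price, price_list_prices):
    """Build one <doc> element for a single product.

    price_list_prices is a hypothetical mapping of price list number
    to the price on that list, e.g. {1: 0.99, 2: 0.98, 1500: 0.85}.
    """
    doc = ET.Element("doc")

    def add_field(field_name, value):
        # Each value becomes <field name="...">value</field>.
        field = ET.SubElement(doc, "field", name=field_name)
        field.text = str(value)

    add_field("id", product_id)
    add_field("name", name)
    add_field("price", f"{base_price:.2f}")
    # One field per price list, named priceList<N>Price as in the samples.
    for list_number in sorted(price_list_prices):
        price = price_list_prices[list_number]
        add_field(f"priceList{list_number}Price", f"{price:.2f}")
    return doc


doc = product_to_doc(1, "The product name.", 1.00, {1: 0.99, 2: 0.98, 1500: 0.85})
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

With this layout the number of fields per document grows with the number of price lists, which is the scaling concern raised in this thread.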

Yes, the number of price lists could grow from 1000 to 5000 as the user base 
grows.

There are currently about 150,000 products.

We do need to index the products, since they change frequently.

Thanks everyone for all your responses so far!!!!!

-----Original Message-----
From: kenf_nc [mailto:ken.fos...@realestate.com] 
Sent: Wednesday, April 13, 2011 1:15 PM
To: solr-user@lucene.apache.org
Subject: RE: Indexing Question for large dataset

Is NAME a product name? Why would it be multivalue? And why would it appear
on more than one document?  Is each 'document' a package of products? And
the pricing tiers are on the package, not individual pieces?

So sounds like you could, potentially, have a PriceListX column for each
user. As your User base grows, the number of columns you need may grow (you
already bumped up from 2000 to 5000 in the space of a couple posts :) ). Is
that right?

How many products (or packages of products) do you have? Could you flip this
on its ear and make a User the document? Then it could have just 3
multivalue fields (beyond any you need to identify the user, like user_id):
    product_id
    product_name
    product_price

Downside is if a new product is introduced you have to re-index all users
that have a price point on that product.  
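
To make the flipped layout concrete, here is a small Python sketch, not a definitive implementation. The helper names user_to_fields and users_to_reindex are hypothetical, and the sketch assumes the three multivalue fields stay aligned by position (entry i in each list describes the same product).

```python
def user_to_fields(user_id, priced_products):
    """Flatten one user's priced products into parallel multivalue fields.

    priced_products is a list of (product_id, product_name, product_price)
    tuples. Position i in each resulting list refers to the same product
    (an assumption of this sketch).
    """
    return {
        "user_id": user_id,
        "product_id": [p[0] for p in priced_products],
        "product_name": [p[1] for p in priced_products],
        "product_price": [p[2] for p in priced_products],
    }


def users_to_reindex(user_docs, product_id):
    """Return the ids of user documents that carry a price for product_id."""
    return [d["user_id"] for d in user_docs if product_id in d["product_id"]]


user_docs = [
    user_to_fields("u1", [(1, "The product name.", 0.99),
                          (2, "The product name.", 1.09)]),
    user_to_fields("u2", [(2, "The product name.", 1.08)]),
]

# A change to product 2 touches every user document that prices it.
affected = users_to_reindex(user_docs, 2)
print(affected)
```

The second helper shows the downside in miniature: a change to one product means finding and re-indexing every user document that lists it.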


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-Question-for-large-dataset-tp2816344p2816994.html
Sent from the Solr - User mailing list archive at Nabble.com.