Hello,

I recently asked one question about Solr limitations and another about joins. It turns out this question is about both at the same time. I am trying to figure out how to denormalize my data so that I need just 1 document in my index instead of performing a join. One way of doing this, I figure, is to store an entity as a multivalued field instead of as separate fields. Let me give an example. Consider the entities:

User:
    id: 1
    name: Joan of Arc
    age: 27

Webpage:
    id: 1
    url: http://wiki.apache.org/solr/Join
    category: Technical
    user_id: 1

    id: 2
    url: http://stackoverflow.com
    category: Technical
    user_id: 1

Instead of creating 1 document for the user, 1 for webpage 1 and 1 for webpage 2 (1 parent and 2 children), I could store the webpages in multivalued fields on the user, as follows:

User:
    id: 1
    name: Joan of Arc
    age: 27
    webpage1: ["id:1", "url: http://wiki.apache.org/solr/Join", "category: Technical"]
    webpage2: ["id:2", "url: http://stackoverflow.com", "category: Technical"]

It would probably perform better than the join, right? However, it made me think about Solr limitations again. What if I have 200 million webpages (200 million fields) per user? Or imagine a case where I could have 200 million values in a single field, for example if I need to index every HTML DOM element (div, a, etc.) of each web page the user visited.

I mean, if I need to run this query and it is a business requirement no matter what, then although denormalizing could be better than using query-time joins, I wonder if distributing the data held in this single document across the cluster wouldn't give me better performance. And this is something I won't get with block joins or multivalued fields...

I guess there is probably no right answer to this question (at least not a known one), and I know I should create a POC to check how each option performs... But do you think such a large number of values in a single document could make denormalization impossible in an extreme case like this? Would you agree with me if I said denormalization is not always the right option?

Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr