Re: SolrDocumentList - bitwise operation
Hi,

My apologies, I was confused about bitsets. I already have Shawn's suggested approach in my system. I want to try other approaches and compare their performance.

How can I use a join? I have 2 different Solr indexes:

localhost:8080/solr_1/select?q=content:test&fl=id,name,type
localhost:8081/solr_1_1/select?q=text:test&fl=id

After getting the results, I want to join them by id. How do I do this? Please suggest other ways to do this as well; the current method takes a lot of time.

Thanks,
Michael

On Tue, Oct 15, 2013 at 11:41 PM, Erick Erickson erickerick...@gmail.com wrote:
> Why do you think a bitset would help? [...]
Re: SolrDocumentList - bitwise operation
Why do you think a bitset would help? A bitset has a bit set for every matching document, keyed on the _internal_ Lucene document ID. It has nothing to do with the uniqueKey you have defined, nor with the foreign key relationship. So either I don't understand the problem at all, or pursuing bitsets is a red herring.

You might be substantially faster by sorting the results and then doing a skip-list sort of thing.

FWIW,
Erick

On Mon, Oct 14, 2013 at 1:47 PM, Michael Tyler michaeltyler1...@gmail.com wrote:
> Hi Shawn, This is a time-consuming operation. I already have this in my application. [...]
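One possible reading of the "sort, then skip" suggestion is a two-pointer merge over the sorted ids from both result sets. The sketch below is only an illustration under assumptions: the class and method names are hypothetical, the ids are assumed to be Strings already extracted from each SolrDocumentList, and both lists are sorted client-side so they share the same ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of SolrJ.
final class SortedIdMerge {

    // Returns the ids present in both inputs, in sorted order, without duplicates.
    static List<String> intersectSorted(List<String> idsA, List<String> idsB) {
        // Copy before sorting so the callers' lists are left untouched.
        List<String> a = new ArrayList<String>(idsA);
        List<String> b = new ArrayList<String>(idsB);
        Collections.sort(a);
        Collections.sort(b);

        List<String> common = new ArrayList<String>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).compareTo(b.get(j));
            if (cmp == 0) {
                String id = a.get(i);
                // Skip an id that was already emitted (duplicate ids in the input).
                if (common.isEmpty() || !common.get(common.size() - 1).equals(id)) {
                    common.add(id);
                }
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // a is behind, advance it
            } else {
                j++; // b is behind, advance it
            }
        }
        return common;
    }
}
```

After sorting, each list is walked only once. Whether this beats a hash-based lookup for a given result size would need to be measured.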
Re: SolrDocumentList - bitwise operation
Hi Shawn,

This is a time-consuming operation, and I already have it in my application. I was wondering whether I can get a bitset from each of the Solr indexes, AND the two bitsets, and then retrieve only the documents that match. I don't know how to retrieve the bitset; I wanted to try this and test the performance.

Regards,
Michael

On Sun, Oct 13, 2013 at 8:54 PM, Shawn Heisey s...@elyograg.org wrote:
> The following code is a reasonably fast way to handle this. [...]
Re: SolrDocumentList - bitwise operation
A join query might be helpful: http://wiki.apache.org/solr/Join

Join can work across indexes, but it probably won't work in SolrCloud.

Be aware that only the "to" documents are retrievable; if you want content from both documents, a join query won't work. Also, in Lucene the join query doesn't quite work with multiple join conditions; I haven't tested that in Solr yet.

I have a join case similar to yours. Eventually I chose to denormalize our data into one set of documents.

On 13 October 2013 22:34, Michael Tyler michaeltyler1...@gmail.com wrote:
> Hello,
>
> I have 2 different Solr indexes returning 2 different sets of SolrDocumentList. Doc Id is the foreign key relation. After obtaining them, I want to perform an AND operation between them and then return the results to the user. Can you tell me how I can do this? I am using Solr 4.3.
>
>     SolrDocumentList results1 = responseA.getResults();
>     SolrDocumentList results2 = responseB.getResults();
>
> results1 : d1, d2, d3
> results2 : d1, d2, d4
>
> Return : d1, d2
>
> Regards,
> Michael

--
All the best
Liu Bo
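For reference, the join syntax described on that wiki page has the form `{!join from=... to=... fromIndex=...}query`. The sketch below only assembles such a string; the class and method names are hypothetical, and the core name `solr_1_1` is an assumption taken from the URL paths quoted in this thread, which may not be the real core names. As far as I understand it, `fromIndex` refers to another core in the same Solr instance, so it would likely not apply as-is to two servers on different ports.

```java
// Hypothetical helper that only builds the query string; it does not talk to Solr.
final class JoinQuerySketch {

    // Builds a join clause: match innerQuery in fromIndex, then map the
    // fromField values of those hits onto toField in the core being queried.
    static String buildJoinQuery(String fromField, String toField,
                                 String fromIndex, String innerQuery) {
        return "{!join from=" + fromField
                + " to=" + toField
                + " fromIndex=" + fromIndex + "}"
                + innerQuery;
    }

    public static void main(String[] args) {
        // Assumed field and core names, for illustration only.
        System.out.println(buildJoinQuery("id", "id", "solr_1_1", "text:test"));
    }
}
```

One way to use the result would be as a filter on the first index, for example `q=content:test` with the join clause passed as `fq` and `fl=id,name,type`. The clause must be URL-encoded when sent over HTTP, and only fields from the queried ("to") core come back.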
Re: SolrDocumentList - bitwise operation
On 10/13/2013 8:34 AM, Michael Tyler wrote:
> Hello,
>
> I have 2 different Solr indexes returning 2 different sets of SolrDocumentList. Doc Id is the foreign key relation. After obtaining them, I want to perform an AND operation between them and then return the results to the user. Can you tell me how I can do this? I am using Solr 4.3.
>
>     SolrDocumentList results1 = responseA.getResults();
>     SolrDocumentList results2 = responseB.getResults();
>
> results1 : d1, d2, d3
> results2 : d1, d2, d4

The SolrDocumentList class extends ArrayList<SolrDocument>, which means that it inherits all ArrayList functionality. Unfortunately, there's no built-in way of eliminating duplicates with a Java List. It's very easy to combine the two results into another object, but that object will contain both of the d1 and both of the d2 SolrDocument objects.

The following code is a reasonably fast way to handle this. It assumes that results1 is the list that should win when there are duplicates, so it gets added first. It assumes that the uniqueKey field is named "id" and that it contains a String value. If these are incorrect assumptions, you can adjust the code accordingly.

    SolrDocumentList results1 = responseA.getResults();
    SolrDocumentList results2 = responseB.getResults();

    // results1 goes first so that its documents win on duplicate ids.
    List<SolrDocumentList> tmpList = new ArrayList<SolrDocumentList>();
    tmpList.add(results1);
    tmpList.add(results2);

    // Ids that have already been added to newList.
    Set<String> tmpSet = new HashSet<String>();
    SolrDocumentList newList = new SolrDocumentList();

    for (SolrDocumentList l : tmpList) {
      for (SolrDocument d : l) {
        String id = (String) d.get("id");
        if (tmpSet.contains(id)) {
          continue;
        }
        tmpSet.add(id);
        newList.add(d);
      }
    }

Thanks,
Shawn
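The loop above merges the two lists and drops duplicates, so for the sample data it yields d1, d2, d3, d4. If the goal is the AND of the two result sets (d1, d2), a hash-based intersection on the key field is one option. The sketch below is an illustration under assumptions: the class and method names are hypothetical, and documents are handled as `Map<String, Object>` so the example runs without SolrJ. To my knowledge SolrDocument implements that interface, so a SolrDocumentList should be usable as the `List<D>` argument, but verify that against your SolrJ version.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of SolrJ.
final class IdIntersection {

    // Returns the documents from 'first' whose keyField value also occurs in
    // 'second', preserving the order of 'first' and skipping repeated ids.
    static <D extends Map<String, Object>> List<D> intersectById(
            List<D> first, List<D> second, String keyField) {
        // Collect every id seen in the second result set.
        Set<Object> idsInSecond = new HashSet<Object>();
        for (D doc : second) {
            Object id = doc.get(keyField);
            if (id != null) {
                idsInSecond.add(id);
            }
        }

        List<D> result = new ArrayList<D>();
        Set<Object> emitted = new HashSet<Object>();
        for (D doc : first) {
            Object id = doc.get(keyField);
            // Keep the document only if the other side has the id and it
            // has not been emitted already.
            if (id != null && idsInSecond.contains(id) && emitted.add(id)) {
                result.add(doc);
            }
        }
        return result;
    }
}
```

With the lists from the question, a call such as `IdIntersection.intersectById(results1, results2, "id")` would keep the results1 copies of the shared documents. Each list is traversed once, so the cost grows linearly with the number of returned documents.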