Re: Indexing of documents in more than one step (SOLRJ)
Maciej:

you really have two choices:

1> re-index the entire document with fields a, b, c, d, e, f. In that case though, why bother indexing the first time ;)

2> use Atomic Updates: https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
but note the restrictions.

Best,
Erick

On Wed, Feb 15, 2017 at 3:45 AM, Emir Arnautovic wrote:
> Which version of Solr do you use? Is it always the same field? Again,
> without checking anything, see if it could be that the field is not
> multivalued and your value is.
>
> In any case, this is an inefficient way of indexing. If possible, stream both
> sources ordered by ID, merge them into one input doc, and send it to Solr.
>
> Emir
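For readers following option 2, here is a minimal sketch of the shape of an atomic-update document, written in plain Java without the SolrJ dependency. The class and method names (AtomicUpdateSketch, partialUpdate) are hypothetical, and the unique key is assumed to be called id, as in Maciej's message. The restrictions Erick mentions are described on the linked page; to my understanding they mostly concern fields having to be stored so that Solr can rebuild the whole document, but the page for your Solr version is the authority.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of SolrJ: builds the field values for a partial update. */
final class AtomicUpdateSketch {

    /**
     * Returns field name -> value pairs for updating one existing document.
     * Assumes the unique key field is named "id".
     */
    static Map<String, Object> partialUpdate(String id, Map<String, ?> newValues) {
        Map<String, Object> doc = new LinkedHashMap<>();
        // The unique key is sent as a plain value so Solr can find the document.
        doc.put("id", id);
        for (Map.Entry<String, ?> e : newValues.entrySet()) {
            // Wrapping the value in {"set": value} asks for this field to be
            // replaced while the document's other fields are kept.
            doc.put(e.getKey(), Collections.singletonMap("set", e.getValue()));
        }
        return doc;
    }
}
```

With SolrJ on the classpath, each entry of the returned map would be handed to SolrInputDocument.addField(fieldName, value), the same call quoted in the thread. Only the d, e, f fields are wrapped; the id is not.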
Re: Indexing of documents in more than one step (SOLRJ)
Which version of Solr do you use? Is it always the same field? Again, without checking anything, see if it could be that the field is not multivalued and your value is.

In any case, this is an inefficient way of indexing. If possible, stream both sources ordered by ID, merge them into one input doc, and send it to Solr.

Emir

On 15.02.2017 12:24, Maciej Ł. PCSS wrote:
> No, it's not the case. In both steps I'm indexing documents from the same
> set of IDs (I mean the values of the 'id').
>
> Maciej

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
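Emir's suggestion of streaming both sources ordered by ID and merging them can be sketched as a merge-join over two sorted iterators. This is only an illustration under assumptions: the names SortedSourceMerger, Partial and sink are hypothetical, IDs are compared as strings, and each source is assumed to contain a given ID at most once. The sink stands in for whatever sends the finished document to Solr.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper: merges two ID-sorted sources into one document per ID. */
final class SortedSourceMerger {

    /** A partial document: an ID plus some of its fields (hypothetical type). */
    static final class Partial {
        final String id;
        final Map<String, Object> fields;

        Partial(String id, Map<String, Object> fields) {
            this.id = id;
            this.fields = fields;
        }
    }

    /**
     * Walks both iterators once. Both must be sorted ascending by id.
     * Each merged document is passed to the sink as soon as it is complete,
     * so nothing but the current pair of records is held in memory.
     */
    static void merge(Iterator<Partial> first, Iterator<Partial> second,
                      Consumer<Map<String, Object>> sink) {
        Partial a = first.hasNext() ? first.next() : null;
        Partial b = second.hasNext() ? second.next() : null;
        while (a != null || b != null) {
            // cmp < 0: only the first source has this id; cmp > 0: only the second.
            int cmp = (a == null) ? 1 : (b == null) ? -1 : a.id.compareTo(b.id);
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", cmp <= 0 ? a.id : b.id);
            if (cmp <= 0) {
                doc.putAll(a.fields);   // fields a, b, c in the thread's example
                a = first.hasNext() ? first.next() : null;
            }
            if (cmp >= 0) {
                doc.putAll(b.fields);   // fields d, e, f in the thread's example
                b = second.hasNext() ? second.next() : null;
            }
            sink.accept(doc);
        }
    }
}
```

Because every document reaches Solr complete and only once, no partial-update syntax is needed in this variant.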
Re: Indexing of documents in more than one step (SOLRJ)
No, it's not the case. In both steps I'm indexing documents from the same set of IDs (I mean the values of the 'id').

Maciej

On 15.02.2017 at 11:07, Emir Arnautovic wrote:
> I did not have time to test it or look at the code, but can you check
> whether this happens when there is no existing document with fields a, b, c
> and you are trying to update it with d, e, f using partial update syntax?
>
> Emir
Re: Indexing of documents in more than one step (SOLRJ)
I did not have time to test it or look at the code, but can you check whether this happens when there is no existing document with fields a, b, c and you are trying to update it with d, e, f using partial update syntax?

Emir

On 15.02.2017 09:25, Maciej Ł. PCSS wrote:
> Dear All,
> how should I handle the following scenario using SolrJ? Index a collection
> of documents (fill fields a, b, c). Then index the same collection, but this
> time fill fields d, e, f.
>
> In pseudo-code it would be:
>
>     step1(collectionX);
>     step2(collectionX);
>     solrCommit();
>
> See my observations below:
> - the first step is done by calling SolrInputDocument.addField(fieldName, value); and this works fine;
> - if I do the same for the second step, then all fields in my documents get removed;
> - for that reason I need to call SolrInputDocument.addField(fieldName, Collections.singletonMap("set", value)); and then it's fine;
> - but for some fields, if I do the call from above, the indexed values are like "{set=value}" instead of just "value".
>
> Can somebody explain this strange behaviour to me?
>
> Regards
> Maciej
Indexing of documents in more than one step (SOLRJ)
Dear All,

how should I handle the following scenario using SolrJ? Index a collection of documents (fill fields a, b, c). Then index the same collection, but this time fill fields d, e, f.

In pseudo-code it would be:

    step1(collectionX);
    step2(collectionX);
    solrCommit();

See my observations below:
- the first step is done by calling SolrInputDocument.addField(fieldName, value); and this works fine;
- if I do the same for the second step, then all fields in my documents get removed;
- for that reason I need to call SolrInputDocument.addField(fieldName, Collections.singletonMap("set", value)); and then it's fine;
- but for some fields, if I do the call from above, the indexed values are like "{set=value}" instead of just "value".

Can somebody explain this strange behaviour to me?

Regards
Maciej
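One observation that may help narrow down the last point: "{set=value}" is exactly the string that Java produces when the singleton map itself is converted to text. That suggests the map reached the index as a literal value for those fields instead of being interpreted as an update instruction. Why that happens for some fields and not others is not established in this thread. The snippet below only demonstrates the string form; the class and method names (SetMapToString, literalForm) are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical demo class: shows the text form of the "set" map from the question. */
final class SetMapToString {

    /** Returns what the map looks like if it is stringified instead of interpreted. */
    static String literalForm(Object value) {
        Map<String, Object> op = Collections.singletonMap("set", value);
        return String.valueOf(op);
    }

    public static void main(String[] args) {
        System.out.println(literalForm("value")); // prints {set=value}
    }
}
```

Comparing the schema definitions of an affected field and an unaffected one would be a reasonable next step, in line with the questions about Solr version and field type raised in the replies.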