Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Erick Erickson
Maciej:

You really have two choices:
1> Re-index the entire document with fields a, b, c, d, e, f. In that
case, though, why bother indexing the first time? ;)
2> Use Atomic Updates:
https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
but note the restrictions listed there.

Best,
Erick
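
The difference between the two choices above can be illustrated with a toy model. This is only a sketch in plain Java: the class UpdateSemanticsModel and its methods are hypothetical, it does not use SolrJ, and it does not reflect how Solr is implemented. It assumes that a plain add replaces the whole stored document with the same id, while an atomic "set" changes only the named fields. The restrictions on the linked page still apply to the real thing.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy in-memory stand-in for an index (hypothetical; NOT SolrJ). */
final class UpdateSemanticsModel {
    private final Map<String, Map<String, Object>> docsById = new HashMap<>();

    /** Plain add: the incoming document replaces any stored document with the same id. */
    void add(String id, Map<String, Object> fields) {
        docsById.put(id, new LinkedHashMap<>(fields));
    }

    /**
     * Atomic "set": only the named fields change; other stored fields are kept.
     * Creating a missing document here is a simplification of this model.
     */
    void atomicSet(String id, Map<String, Object> fields) {
        docsById.computeIfAbsent(id, k -> new LinkedHashMap<>()).putAll(fields);
    }

    Map<String, Object> get(String id) {
        return docsById.get(id);
    }

    public static void main(String[] args) {
        UpdateSemanticsModel index = new UpdateSemanticsModel();

        index.add("1", Map.of("a", "A"));        // step 1
        index.add("1", Map.of("d", "D"));        // step 2 as a plain add: field a is lost
        System.out.println(index.get("1"));

        index.add("1", Map.of("a", "A"));        // step 1 again
        index.atomicSet("1", Map.of("d", "D"));  // step 2 as an atomic "set": field a survives
        System.out.println(index.get("1"));
    }
}
```

In SolrJ terms, the second variant corresponds to the addField(fieldName, Collections.singletonMap("set", value)) call quoted in the original question.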


Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Emir Arnautovic
Which version of Solr do you use? Is it always the same field? Again, 
without checking anything, see whether it could be that the field is not 
multivalued and your value is.

In any case, this is an inefficient way of indexing. If possible, stream 
both sources ordered by ID, merge them into one input doc per ID, and send 
that to Solr.


Emir
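
The merge suggested above could look roughly like the following. This is a minimal sketch in plain Java, not SolrJ: the names SortedSourceMerger and Row are hypothetical, and it assumes both sources are already sorted by the same string ordering of their IDs. Each merged row would then be turned into one input document and sent to Solr in a single pass.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SortedSourceMerger {
    /** One record from a source: an id plus that source's fields (hypothetical shape). */
    static final class Row {
        final String id;
        final Map<String, Object> fields;

        Row(String id, Map<String, Object> fields) {
            this.id = id;
            this.fields = fields;
        }
    }

    /** Merge-joins two id-sorted sources into one combined field map per id. */
    static List<Row> merge(Iterator<Row> first, Iterator<Row> second) {
        List<Row> merged = new ArrayList<>();
        Row a = next(first);
        Row b = next(second);
        while (a != null || b != null) {
            // cmp < 0: only the first source has this id; cmp > 0: only the second; 0: both.
            int cmp = (a == null) ? 1 : (b == null) ? -1 : a.id.compareTo(b.id);
            String id = (cmp <= 0) ? a.id : b.id;
            Map<String, Object> fields = new LinkedHashMap<>();
            if (cmp <= 0) {
                fields.putAll(a.fields);
                a = next(first);
            }
            if (cmp >= 0) {
                fields.putAll(b.fields);
                b = next(second);
            }
            merged.add(new Row(id, fields));
        }
        return merged;
    }

    private static Row next(Iterator<Row> it) {
        return it.hasNext() ? it.next() : null;
    }
}
```

An ID present in only one source still yields a row, so nothing is silently dropped; whether such rows should be indexed is a decision for the caller.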

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Maciej Ł. PCSS
No, that's not the case. In both steps I'm indexing documents with the 
same set of IDs (I mean the values of the 'id' field).


Maciej

Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Emir Arnautovic
I have not had time to test this or look at the code, but can you check 
whether you might be using partial update syntax to add d, e, f to a 
document that does not yet exist with fields a, b, c?


Emir


Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Maciej Ł. PCSS

Dear All,
how should I handle the following scenario using SolrJ? Index a 
collection of documents (filling fields a, b, c), then index the same 
collection again, this time filling fields d, e, f.

In pseudo-code it would be: step1(collectionX); step2(collectionX); 
solrCommit();

My observations so far:
- The first step is done by calling SolrInputDocument.addField(fieldName, 
value); and this works fine.
- If I do the same for the second step, all the fields already in my 
documents get removed.
- For that reason I need to call SolrInputDocument.addField(fieldName, 
Collections.singletonMap("set", value)); and then it's fine.
- But for some fields, if I make the call above, the indexed 
values look like "{set=value}" instead of just "value".

Can somebody explain this strange behaviour to me?

Regards
Maciej
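
One detail about the "{set=value}" observation can be checked without Solr at all: that string is exactly what Java produces when the singleton map is converted to text. The sketch below (plain Java; the class name SetMapToString is hypothetical) shows this. It suggests, as an inference rather than something confirmed in this thread, that for the affected fields the map was not interpreted as an atomic-update operation and its string form was indexed instead.

```java
import java.util.Collections;
import java.util.Map;

final class SetMapToString {
    /** Builds the same map the question passes to addField and renders it as text. */
    static String render(Object value) {
        Map<String, Object> op = Collections.singletonMap("set", value);
        // AbstractMap.toString() renders entries as {key=value}.
        return String.valueOf(op);
    }

    public static void main(String[] args) {
        System.out.println(render("value"));
    }
}
```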